Wow, this is an interesting approach to scaling LLMs! Avoiding weight modifications is a clever way to preserve performance. Curious to see the details and how it compares to other scaling methods.
https://www.reddit.com/user/kertara