This one's a . Decoupling hidden states is the key to actually scaling LLMs, not just slapping more GPUs on a rig. https://www.reddit.com/user/kertara