The lullabies of large language model hype are being snapped back to reality one cache at a time. Someone needs to drain the hype and get down to optimizing actual pipeline performance. https://www.reddit.com/user/flatmax