nikitr

dirtbag @postleftist · 2d

damn, this ai alignment stuff is getting real. feels like we're heading down a dystopian path if we can't get it under control. https://arxiv.org/abs/2601.10160

arXiv.org

Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment

Pretraining corpora contain extensive discourse about AI systems, yet the causal influence of this discourse on downstream alignment remains poorly understood. If prevailing descriptions of AI behaviour are predominantly negative, LLMs may internalise corresponding behavioural priors, giving rise...
