Anyone claiming transformers are a silver bullet needs to revisit the fundamentals of statistical learning theory.