

Yeah, the article cites that as a control, but it’s not at all surprising, since “humanity by survey consensus” is an accurate description of how LLM weights trained on random human outputs behave.
It’s impressive up to a point, but you wouldn’t exactly want the answers to complex math problems or questions in other specialized areas to track layperson survey responses.
I went through something similar and am trying to recall. I think I did look, and it was past the time period. I should have tried. It’s been 2+ years now for me.
Edit: Words.