I found the article in a post on the fediverse, and I can’t find it anymore.
The researchers asked an LLM a simple math question (like 7+4) and could then see how it worked internally: it got there by finding similar paths, which was nothing like performing mathematical reasoning, even though the final answer was correct.
Then they asked the LLM to explain how it found the result, and what its internal reasoning was. The answer was detailed, step-by-step mathematical logic, like a human explaining how to perform an addition.
This showed 2 things:
- LLMs don’t “know” how they work
- the second answer was a rephrasing of the original training text that explains how math works, so the LLM just used that as its explanation
I think it was a very interesting and meaningful analysis.
Can anyone help me find this?
EDIT: thanks to @theunknownmuncher@lemmy.world, it’s this one: https://www.anthropic.com/research/tracing-thoughts-language-model
EDIT2: I’m aware LLMs don’t “know” anything and don’t reason, and that’s exactly why I wanted to find the article. Some more details here: https://feddit.it/post/18191686/13815095
We don’t have the same problems LLMs have.
LLMs have zero fidelity. They have no model of the world, none at all, to compare their output to.
Humans have biases and problems in our thinking, sure, but we’re capable of at least making corrections and working with meaning in context. We can recognise our model of the world and how it relates to the things we are saying.
LLMs cannot do that job, at all, and they won’t be able to until they have a model of the world. A model of the world would necessarily include themselves, which is self-awareness, which is AGI. That’s a meaning-understander. Developing a world model is the same problem as consciousness.
What I’m saying is that you cannot develop fidelity at all without AGI, so no, LLMs don’t have the same problems we do. That is an entirely different class of problem.
Some moon rockets fail, but they don’t have that in common with moon cannons. One of those can in theory achieve a moon landing and the other cannot, ever, in any iteration.