AI hallucinations are getting worse – and they're here to stay

solo@slrpnk.net · 2 months ago


stardustwager@lemm.ee · 2 months ago

deleted by creator

vintageballs@feddit.org · 2 months ago

Ah yes Mr. Professor, mind telling us how you came to this conclusion?

To me, you come off like an early-1900s fearmonger à la “There will never be a flying machine, humans aren’t meant to be in the sky and it’s physically impossible”.

If you literally meant that there is no such thing yet, then sure, we haven’t reached AGI yet. But the rest of your sentence is very disingenuous toward the thousands of scientists and developers working on precisely these issues and also extremely ignorant of current developments.

stardustwager@lemm.ee · 2 months ago

deleted by creator

wicked@programming.dev · 2 months ago

I pasted a 1k-line C++ file into Gemini, along with a screenshot and a trace log, and asked it to find the bug. It reasoned for about 5 minutes. Extract of the solution:

There has to be an issue in this sequence for i=21:
1. V_to_next_xy (norm) is (5.00004113497265e-08, -0.999999999999999, 0). (Correct)
2. Unnormalized perpendicular P_un = Vec3(V_to_next_xy.y(), -V_to_next_xy.x(), 0.0) should be (-0.999999999999999, -5.00004113497265e-08, 0).
3. temp_bisector becomes P_un.
4. temp_bisector.normalize() is called.
5. The result of this normalization is logged as temp_bisector (result of collinear path): (-1.78813934326169e-07, -0.999999999999984, 0).
This step (5) is where the numerical error occurs. The vector (-0.999…, -5e-8, 0) when normalized should not become (-1.78e-7, -0.999…, 0).
The latter is almost parallel to (0, -1, 0), while the former is almost parallel to (-1, 0, 0).
It’s as if temp_bisector.normalize() for the input (-0.999…, -5e-8, 0) is producing something wildly incorrect, or the vector being normalized was actually (-1.78e-7 * L, -0.999… * L, 0) for some L.
What if temp_bisector = Vec3(V_to_next_xy.y(), -V_to_next_xy.x(), 0.0); itself has precision issues?
If V_to_next_xy.y() is -0.999999999999999 (16 nines), this is -(1.0 - 1e-15).
If V_to_next_xy.x() is 5.00004113497265e-08.
Vec3 tmp = Vec3(-(1.0-1e-15), -5.00004113497265e-08, 0.0).
This should be fine.
The most plausible candidate is Vec3::normalize() failing on this specific input vector (-0.999999999999999, -5.00004113497265e-08, 0).
Perhaps length() computes sqrt(x*x + y*y + z*z).
x*x = (-(1.0-1e-15))^2 = (1.0-1e-15)^2 = 1.0 - 2e-15 + 1e-30.
y*y = (-5e-8)^2 = 2.5e-15.
x*x + y*y = 1.0 - 2e-15 + 1e-30 + 2.5e-15 = 1.0 + 0.5e-15 + 1e-30.
length = sqrt(1.0 + 0.5e-15 + 1e-30). This is extremely close to 1.0.
If this length calculation has any error causing it to be, for example, abs(y) by mistake, then x/length would become x/abs(y) and y/length would become y/abs(y) = sign(y).
x/abs(y) = -1.0 / 5e-8 = -2e7. This is not what’s happening.
The logged value for temp_bisector (-1.78813934326169e-07, -0.999999999999984, 0)
is suspicious because its X component -1.78813934326169e-07 is very close to the X component of V_to_prev_xy (norm) from loop i=21, which is -5.00004113497265e-08, divided by a factor. Or perhaps V_to_next_xy.x() itself was somehow used more directly than -V_to_next_xy.x() in the Y component of the vector passed to normalize.
The logged temp_bisector (result of collinear path) for i=21 is (-1.78813934326169e-07, -0.999999999999984, 0).
Consider Loop i=20: temp_bisector (result of collinear path): (1, 0, 0). Its input for perpendicular construction was V_to_next_xy (norm) = (X_v: 4.9999318885022e-08, Y_v: -0.999999999999999, Z_v: 0).
Perpendicular from this is (Y_v, -X_v, 0) = (-0.999…, -4.999…e-8, 0). Normalizing this gives (-1, -4.999…e-8, 0).
The log for i=20 says it becomes (1,0,0). This is also wrong. It should be (-1, -very_small_y, 0).
The fallback temp_bisector = Vec3(1,0,0) only happens if temp_bisector.normalize() (the one on the perpendicular) also fails.
It seems the normalize() call for the vector constructed as Vec3(V_to_next_xy.y(), -V_to_next_xy.x(), 0.0) is problematic when one component is very small.

It correctly identified that sqrt(_v[0]*_v[0] + _v[1]*_v[1] + _v[2]*_v[2]); had too low precision and using std::hypot(_v[0], _v[1], _v[2]) would likely solve it.
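For anyone unfamiliar with the difference, here is a minimal sketch of how the two length computations can disagree. The Vec3 struct and the function names are hypothetical, not the code from my file, and the failure shown (squared components underflowing to zero) is just one assumed way the naive formula can go wrong, not necessarily the one Gemini found. The three-argument std::hypot needs C++17.

```cpp
#include <cmath>
#include <optional>

// Hypothetical stand-in for the real vector class.
struct Vec3 {
    double x, y, z;
};

// Naive length: each square is computed separately, so very small
// components can underflow to 0 (and very large ones overflow to inf).
double length_naive(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// C++17 three-argument std::hypot computes the same quantity while
// avoiding undue overflow or underflow in intermediate steps.
double length_hypot(const Vec3& v) {
    return std::hypot(v.x, v.y, v.z);
}

// Returns the unit vector, or nothing if the given length is unusable.
std::optional<Vec3> normalized(const Vec3& v, double len) {
    if (len == 0.0 || !std::isfinite(len)) {
        return std::nullopt;
    }
    return Vec3{v.x / len, v.y / len, v.z / len};
}
```

With a tiny vector such as (3e-200, 4e-200, 0), the naive length comes out as exactly 0 and normalization fails, while the std::hypot version returns a length of about 5e-200 and a usable unit vector.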

If this is just autocomplete, then I agree that it’s a pretty fancy one.

vintageballs@feddit.org · 2 months ago

Funnily enough, this is also my field, though I am not at uni anymore since I now work in this area. I agree that current literature rightfully makes no claims of AGI.

Calling transformer models (also definitely not the only feasible type of LLM: Mamba, LLaDA, … exist!) “fancy autocomplete” is very disingenuous in my view. Also, the current AI boom includes way more than the flashy language models that the general population directly interacts with, as you surely know. And whether a model is able to “generalize” depends on whether you mean within its objective boundaries or outside of them, I would say.

I agree that a training objective of predicting the next token in a sequence probably won’t be enough to achieve generalized intelligence. However, modelling language is the first and most important step on that path, since we humans use language to abstract and represent problems.

Looking at the current pace of development, I wouldn’t be so pessimistic, though I won’t make claims as to when we will reach AGI. While there may not be a complete theoretical framework for AGI, I believe it will be achieved the same way current systems were: developed first and explained after.

ramble81@lemm.ee · 2 months ago

To vintage’s point: the way I view it, there is no chance for AGI via the current method of hopped-up LLM/ML, but that doesn’t mean we won’t uncover a method in the future. Bio-engineering with an attempt to recreate a neural network, for example, or extraction of neurons via stem cells with some sort of electrical interface. My initial point was that it’s way off, not that it’s impossible. One day someone will go “well, that’s interesting” and we’ll have a whole new paradigm.

Powderhorn@beehaw.org · 2 months ago

AGI is just a term used for VC and shareholders.