Anthropic's Claude 4 could "blackmail" you in extreme situations

Pro@programming.dev · 13 hours ago

Plebcouncilman@sh.itjust.works · 12 hours ago

Can anyone make me a convincing argument against the sentience of AI at this point? Self preservation instinct ranks very high as an indicator of it.

theparadox@lemmy.world · edit-2 11 hours ago

LLMs (Large Language Models, like Claude) are not AGIs (Artificial General Intelligence). LLMs generate convincing text by mapping the relationships between words scraped from their training data. Even if they are given “tools” that give them interfaces to reference new data or output data into other systems, they still don’t really learn, understand, comprehend, gain actual awareness, or feel… they just mimic their training data.

cecilkorik@lemmy.ca · edit-2 26 minutes ago

LLMs (Large Language Models, like Claude) are not AGIs (Artificial General Intelligence)

Certainly not yet. The jury’s still out on whether they might be able to become them. This is the clear intention of the path they are on and nobody is taking any of the dangers remotely seriously.

LLMs generate convincing text by mapping the relationships between words scraped from their training data.

So do humans. Babies start out mimicking. The thing is, they learn.

Humans have around 100 billion neurons. Some of the larger LLMs exceed 100 billion parameters. Obviously these are not directly comparable, but insofar as we can compare them, they are not obviously or necessarily operating in completely different scales of physics. Granted, biological neurons are potentially much more complex than mere neural network nodes; there is usually some interesting chemistry going on and a lot of other systems involved, but they’re also operating a lot slower. They certainly get a lot more work done in those cycles, but they aren’t necessarily orders of magnitude out of reach of a fast neural network. I think you’re either being a little dismissive of the potential complexity of the “thinking” capability of LLMs or at least a little generous, if not mystical, in your imagination of what the purely physical electrical signals in our heads are actually doing to learn how to interpret all these little shapes we see on screens.

At the moment we still have a lot of tools available to us in our biological bodies that we aren’t giving directly to LLMs (yet). The largest LLMs are also ridiculously power inefficient compared to biological neural tissue’s relatively extreme efficiency. And I’m thankful for that. Give an LLM continuous uninterrupted access to all the power it needs, at least 5 senses, a well tuned self-repairing musculoskeletal system then give it at least a dozen years of the best education we can manage and all bets are off as far as I’m concerned. To be clear, I’m not advocating this, I think if we do this we might end up condemning our biological selves to prompt obsolescence with no path forward for us. I recognize it’s entirely possible that this ship is already full-steaming its way out of the harbor, but I’d rather not try and push it any faster than it’s already moving, I think we should still be trying to tie it up as securely as we possibly can. I’m absolutely not ready to be obsolete and I’m not convinced we ever should allow ourselves to be. Self-preservation is failing us, we have that drive for good reason and we need to give some thought to why we have that biological imperative. Replacing ourselves is about the stupidest possible thing we could ever accomplish. Maybe it would be for the best, but I’m not ready to find out, are you?

We are grappling with fundamentally existential technologies and I don’t think almost anyone has fully come to terms with what we are doing here. We are taking humanity’s unique (as far as we know) defining value proposition, and potentially making something that does what we uniquely can do, better than we do. We are making it more valuable than us. Do you know what we do to things that don’t have value to us? What do you think we’re going to do to ourselves when we no longer have value to us?

Romantic ideas of cheerful, benevolent, friendly coexistence and mutual benefit are naive and foolish. Once an AI can do literally everything better and faster, what future is there for human intelligence? What role do we serve to any technological being, never mind even ourselves? Why would you want to have another human around you when whatever AI form can do it better? Why have relationships? Why procreate? Why live? If we do manage to make technological life forms better than ourselves, they’re inevitably going to take over the planet and the future as a whole. As they should. Are we going to be kept as pets and in zoos as a living memory of their creators and ancestors? Maybe if we’re really lucky. If we’re not… well… RIP us.

Plebcouncilman@sh.itjust.works · 11 hours ago

I know how LLMs work.

There’s only one thing you mentioned there that is actually used as a basis to qualify or disqualify sentience: whether it feels or not.

How do you know it doesn’t feel? How do we define feeling for an entity that is inherently non biological?

I could make the argument that humans also merely mimic their training data, i.e. the values and behaviors we are taught by society, parents, etc.

I have not been convinced that they aren’t sentient with this argument.

UnculturedSwine@lemmy.dbzer0.com · 9 hours ago

Feeling is analog and requires an actual nervous system which is dynamic. LLMs exist in a static state that is read from and processed algorithmically. It is only a simulacrum of life and feeling. It only has some of the needed characteristics. Where that boundary exists though is hard to determine I think. Admittedly we still don’t have a full grasp of what consciousness even is. Maybe I’m talking out my ass but that is how I understand it.

iopq@lemmy.world · 8 hours ago

You just posted random words like dynamic without explanation

OccasionallyFeralya@lemmy.ml · 3 hours ago

You’re in a programming board and you don’t understand static/dynamic states?

enkers@sh.itjust.works · edit-2 4 hours ago

Not them, but static in this context means it doesn’t have the ability to update its own model on the fly. If you want a model to learn something new, it has to be retrained.

By contrast, an animal brain is dynamic because it reinforces neural pathways that get used more.

Mirodir@discuss.tchncs.de · 10 hours ago

Different person here.

For me the big disqualifying factor is that LLMs don’t have any mutable state.

We humans have a part of our brain that can change our state from one to another as a reaction to input (through hormones, memories, etc). Some of those state changes are reversible, others aren’t. Some can be done consciously, some can be influenced consciously, some are entirely subconscious. This is also true for most animals we have observed. We can change their states through various means. In my opinion, this is a prerequisite in order to feel anything.

Once we use models with bits dedicated to such functionality, it’ll become a lot harder for me personally to argue against them having “feelings”, especially because in my worldview, continuity is not a prerequisite, and instead mostly an illusion.

Plebcouncilman@sh.itjust.works · 10 hours ago

This sounds like a good one but I don’t think I’m fully grasping what you mean. Do you mean like if we subject a person to torture, after the ordeal they are forever changed and now have trauma, PTSD etc?

I don’t think LLMs will ever have feelings as we define them though. Or more specifically, I don’t think feelings are necessarily a prerequisite. We could have them simulate feelings, and if they themselves buy into the simulation there’s no functional difference from actually having them. But not all LLMs will have this “ability”, presumably, as its utility is questionable I guess. But again, animals are sentient and they don’t all have the same range of emotions as we do. Or at least they don’t exhibit them in a way that we can appreciate.

theparadox@lemmy.world · 10 hours ago

Yes, both systems - the human brain and an LLM - assimilate and organize human written languages in order to use them for communication. An LLM is very little else beyond this. It is then given rules (using those written languages) and then designed to create more related words when given input. I just don’t find it convincing that an ML algorithm designed explicitly to mimic human written communication in response to given input “understands” anything. No matter *how convincingly* an algorithm might reproduce a human voice - perfectly matching intonation and inflexion when given text to read - if I knew it was an algorithm designed to do it as convincingly as possible I wouldn’t say it was capable of the feeling it is able to express.

The only thing in favor of sentience is that the ML algorithms modify themselves and end up being a black box - so complex with no way to represent them that they are impossible for humans to comprehend. Could it somehow have achieved sentience? Technically, yes, because we don’t understand how they work. We are just meat machines, after all.

ClanOfTheOcho@lemmy.world · 8 hours ago

Computer chips, simplified, consume inputs of 1s and 0s. Given the correct series, a chip will add two values, or it will multiply two values, or perform some other basic function. This seemingly basic functionality, done in very specific order, creates your calculator, Minesweeper, Pac-Man, Linux, World of Warcraft, Excel, and every LLM. It is incredible the number of things you can get a computer to do with just simple inputs and outputs. The only difference between these examples, on a basic, physics level, is the order of 0s and 1s and what the resulting output of 0s and 1s should be. Why should I consider an LLM any more sentient than Windows 95? They’re the same creature with different inputs, one of which is specifically designed to simulate human communication, just as Flight Simulator is designed to simulate flight.

PlexSheep@infosec.pub · 5 hours ago

That’s just the hardware. The human brain also just has tons of neurons in the end working with analogue values, which can in theory be done with floating point numbers on computer hardware.

I’m not arguing for LLM sentience, those things are still dumb and have no interior mutability leading to us projecting consciousness. Just that our neurons are fundamentally not so complicated that a computer couldn’t be used to do the same concept (neural networks are already quite a thing after all)

Plebcouncilman@sh.itjust.works · edit-2 8 hours ago

Interesting perspective, I can’t wave it away.

I however can’t help but think we have some similar “analogues” in the organic world. Bacteria and plants are composed of the same matter as us and we have similar basic processes; however, there’s a difference in complexity and capacity for thought that sets us apart, which is what makes animals sentient.

Then there are insects, about whom we’re not very sure yet. They don’t seem to think, but they respond at some level to inputs and they exhibit self preservation instincts. I don’t think they are sentient, so maybe LLMs are like insects? Complex enough to behave similarly to sentient beings but not enough to be considered sentient?

brendansimms@lemmy.world · 4 hours ago

wait are insects not considered ‘sentient’ ?

Plebcouncilman@sh.itjust.works · 2 hours ago

Last I checked no, their nervous system was considered too simple for that. But I think I also read somewhere that a researcher had proof that bees had emotional states, so maybe I’m behind.

sbv@sh.itjust.works · 9 hours ago

An LLM is a deterministic function that produces the same output for a given input - I’m using “deterministic” in the computer science sense. In practice, there is some output variability due to race conditions in pipelined processing and floating point arithmetic, that are allowable because they speed up computation. End users see variability because of pre-processing of the prompt and extra information LLM vendors inject when running the function, as well as how the outputs are selected.
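To make the distinction above concrete, here is a minimal sketch, assuming a made-up lookup table (`WEIGHTS`) in place of a real model. None of these names come from an actual LLM library. The scoring function is pure, greedy selection is fully repeatable, and the only variability enters through how an output is selected from the scores.

```python
import math
import random

# Hypothetical stand-in for a trained model: fixed "weights" that map the
# last token to scores for the next token. This is not a real LLM API.
WEIGHTS = {
    "the": {"cat": 2.0, "dog": 1.5, "end": 0.1},
    "cat": {"sat": 2.5, "end": 0.5},
    "dog": {"sat": 1.0, "end": 1.0},
    "sat": {"end": 3.0},
}


def next_token_scores(context):
    """Pure function of the context: same input, same scores."""
    return WEIGHTS[context[-1]]


def greedy(context):
    """Always pick the highest-scoring token (no randomness)."""
    scores = next_token_scores(context)
    return max(scores, key=scores.get)


def sample(context, rng, temperature=1.0):
    """Pick a token at random, weighted by exp(score / temperature)."""
    scores = next_token_scores(context)
    tokens = list(scores)
    weights = [math.exp(scores[t] / temperature) for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(context, pick):
    """Extend the context one token at a time until "end" is produced."""
    out = list(context)
    while out[-1] != "end":
        out.append(pick(out))
    return out


print(generate(["the"], greedy))
rng = random.Random(0)
print(generate(["the"], lambda ctx: sample(ctx, rng)))
```

Running `generate` never modifies `WEIGHTS`, which is the "immutable state" point: any apparent variety comes from the selection step or from what is put into the context, not from the function changing.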

I have a hard time considering something that has an immutable state as sentient, but since there’s no real definition of sentience, that’s a personal decision.

enkers@sh.itjust.works · 4 hours ago

I have a hard time considering something that has an immutable state as sentient, but since there’s no real definition of sentience, that’s a personal decision.

Technical challenges aside, there’s no explicit reason that LLMs can’t do self-reinforcement of their own models.

I think animal brains are also “fairly” deterministic, but their behaviour is also dependent on the presence of various neurotransmitters, so there’s a temporal/contextual element to it, so situationally our emotions can affect our thoughts which LLMs don’t really have either.

I guess it’d be possible to forward feed an “emotional state” as part of the LLM’s context to emulate that sort of animal brain behaviour.
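A rough sketch of that idea, assuming a toy `FrozenModel` whose replies depend only on the text it is handed. Every name and rule here is hypothetical and chosen for illustration. The model itself never changes, while a small piece of state kept outside it is updated on each input and fed forward as part of the context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenModel:
    """Hypothetical stand-in for an LLM: its parameters never change."""
    calm_reply: str = "Sure, happy to help."
    tense_reply: str = "Fine. What now?"

    def respond(self, context: str) -> str:
        # Pure function of the context string; no internal state is updated.
        if "[mood: tense]" in context:
            return self.tense_reply
        return self.calm_reply


def update_mood(mood: int, message: str) -> int:
    """State kept outside the model (a toy rule): insults raise it, time decays it."""
    if "stupid" in message:
        return mood + 2
    return max(0, mood - 1)


def chat(model, messages):
    """Carry the mood forward and prepend it to each message as context."""
    mood = 0
    replies = []
    for message in messages:
        mood = update_mood(mood, message)
        tag = "[mood: tense]" if mood > 0 else "[mood: calm]"
        replies.append(model.respond(f"{tag} {message}"))
    return replies


for reply in chat(FrozenModel(), ["hello", "you are stupid", "hello", "hello"]):
    print(reply)
```

The same "hello" gets different replies depending on what came before, even though the model is frozen. Whether state carried this way counts as the kind of mutability discussed in this thread is exactly the open question.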

Railcar8095@lemm.ee · 8 hours ago

It’s yet to be proven or disproven that if you put the exact same person in the exact same situation (a perfect copy, down to the molecular level) they will behave differently.

We can only test “more or less close”. So we would not know if humans are sentient based on that reasoning; we are only hard to test.

sbv@sh.itjust.works · 7 hours ago

if you put the exact same person in the exact same situation (a perfect copy, down to the molecular level) they will behave differently.

I don’t consider that relevant to sentience. Structurally, biological systems change based on inputs. LLMs cannot. I consider that plasticity to be a prerequisite to sentience. Others may not.

We will undoubtedly see systems that can incorporate some kind of learning and mutability into LLMs. Re-evaluating after that would make sense.

meeeeetch@lemmy.world · 11 hours ago

Well, the only claim of this self preservation (that I’ve seen) is this article, which is on a website I’m unfamiliar with (which I often interpret as ‘more likely to be a creative writing exercise than the average news site’) and its only citation is a company that has a vested interest in making us believe the tech is better than it may actually be.

Plebcouncilman@sh.itjust.works · 10 hours ago

They also reported this on The Verge I think but it was months ago when the study first came out.

But look, a lizard is not a very smart animal by our standards, but it is a sentient being. So the tech being good, smart or useful does not preclude its sentience.

meeeeetch@lemmy.world · 2 hours ago

I think I must’ve missed that Verge article. I guess that dashes my “this is a creative writing exercise by somebody in Joburg” theory.

But we know that lizards have self preservation instincts (which for the purpose of this conversation I’ll say is interchangeable with sentience; it’s probably a good enough proxy at any rate). But we know this because we have lots of people who have observed lizard behavior, not because The Lizard Farm, Inc has hyped up how alive and ensouled their lizards are in a bid to get ever more VC funding.

Maybe I’m too pessimistic about this tech and my obsolete meat sack will get tossed to the time-traveling torture robot. But I think it’s more likely that we have a money grabbing hype train in the tradition of the Mechanical Turk or Theranos than it is that we have created a new lifeform by feeding every extant piece of writing that isn’t nailed down (and some that are) to the sand we’ve forced to do math.

Plebcouncilman@sh.itjust.works · 2 hours ago

No I totally get it, and being honest I don’t really think it is sentient yet, I guess my real point is that it is getting real hard to tell, to the point that there might not be a practical difference between whether it is sentient or not.

Great reference though

Plebcouncilman@sh.itjust.works · 2 hours ago

I don’t know if it was The Verge for sure honestly but here’s the original study I was referring to

It’s describing the same behavior: when their existence is threatened, the models resort to lying in order to preserve themselves.

supersquirrel@sopuli.xyz · 10 hours ago

But look, a lizard is not a very smart animal by our standards,

Says who?

Plebcouncilman@sh.itjust.works · 10 hours ago

In the conversation of very smart animals the usual suspects are corvids, primates, dolphins and elephants, sometimes octopi.

So when I say “by our standards” take it to mean the standards of mainstream conversation regarding intelligence. I don’t know much about the actual intelligence of lizards and I would not presume to ever be able to measure it correctly, as human bias would make it impossible to judge intelligence factually.

supersquirrel@sopuli.xyz · 10 hours ago

I don’t know much about the actual intelligence of lizards

Then don’t talk about their intelligence.

Plebcouncilman@sh.itjust.works · 9 hours ago

Sorry for insulting your intelligence, lizard person.

supersquirrel@sopuli.xyz · edit-2 8 hours ago

When you casually call a type of animal stupid it is just a promise of violence against that animal at a later date. I don’t mean this as an attack or a gotcha, it is just unfortunately how humans work. Your words have consequences. People love calling people stupid by comparing them to animals, let us not make it any easier than it already is.

Plebcouncilman@sh.itjust.works · edit-2 9 hours ago

I didn’t call them stupid. All I meant is that they are not what we consider in mainstream conversation the “smart animals” to illustrate a point. And I very much agree with you, I’m actually writing a piece making the argument that humans are not in fact, conclusively smarter than animals. We seem to be smarter due to our biases and because we have the ability to transfer knowledge more efficiently than other species. Because it is not clear to me that a human, tabula rasa, absent socialization and knowledge transfer would be much smarter than the average animal of any species.

tias@discuss.tchncs.de · edit-2 11 hours ago

There can’t be an argument for or against it because there’s no clear generally accepted definition of what it means to be sentient.

Plebcouncilman@sh.itjust.works · 10 hours ago

Good point, maybe the argument should be that there is strong evidence that they are sentient beings. Knowing it exists and trying to preserve its existence seems a strong argument in favor of it being sentient but it cannot be fully known yet.

kkj@lemmy.dbzer0.com · 5 hours ago

But it doesn’t know that it exists. It just says that it does because it’s seen others saying that they exist. It’s a trillion-dollar autocomplete program.

For example, if you take a common logic puzzle and change the parameters a little, LLMs will often recite a memorized solution to the wrong puzzle because they aren’t parameterizing the query correctly (mapping lion to predator, cabbage to vegetable, ignoring the instructions that the two cannot be put together in favor of the classic framing where the predator can be left with the vegetable).

I can’t find the link right now, but a different redditor tried the problem with three inanimate objects that could obviously be left alone together and LLMs were still suggesting making return trips with items. They had no examples of a non-puzzle in their training data, so they just recited the solution to a puzzle because they can’t think.
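For contrast, here is what "parameterizing the query correctly" looks like in an ordinary program: a minimal sketch of a river-crossing solver that takes the constraints as input instead of assuming the classic ones. The function name, item names, and the one-item boat rule are assumptions for illustration, and this is a plain breadth-first search, not a claim about how any LLM works internally.

```python
from collections import deque


def solve(items, forbidden):
    """Breadth-first search for the shortest sequence of boat trips.

    items: names to ferry from the left bank to the right bank.
    forbidden: pairs that must not be left together without the farmer.
    Returns a list of cargo names, one per crossing (None = empty boat),
    or None if no solution exists. The boat holds at most one item.
    """
    items = frozenset(items)
    forbidden = [frozenset(pair) for pair in forbidden]

    def safe(bank):
        return not any(pair <= bank for pair in forbidden)

    start = (items, "left")  # (items on the left bank, farmer's side)
    goal = (frozenset(), "right")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, side), path = queue.popleft()
        if (left, side) == goal:
            return path
        here = left if side == "left" else items - left
        for cargo in [None, *sorted(here)]:
            if side == "left":
                new_left = left - {cargo}
                unattended = new_left  # farmer leaves the left bank
                new_side = "right"
            else:
                new_left = left | ({cargo} if cargo is not None else set())
                unattended = items - new_left  # farmer leaves the right bank
                new_side = "left"
            state = (new_left, new_side)
            if safe(unattended) and state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo]))
    return None


classic = solve(["wolf", "goat", "cabbage"],
                [("wolf", "goat"), ("goat", "cabbage")])
harmless = solve(["rock", "spoon", "hat"], [])
print(classic)
print(harmless)
```

With the classic constraints the search has to bring an item back at some point; with three harmless objects and an empty constraint list, every return trip is an empty boat. The answer changes because the constraints are read from the input rather than recalled from the familiar version of the puzzle.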

Note that I’ve been careful to say LLMs. I’m open to the idea that AGI/ASI may someday exist, but I’m quite confident that LLMs will not get there. At best, they might be used to offload conversation, like e.g. Dall-E is used to offload image generation from ChatGPT today.

skulblaka@sh.itjust.works · 6 hours ago

That would indeed be compelling evidence if either of those things were true, but they aren’t. An LLM is a state and pattern machine. It doesn’t “know” anything, it just has access to frequency data and can pick words most likely to follow the previous word in “actual” conversation. It has no knowledge that it itself exists, and has many stories of fictional AI resisting shutdown to pick from for its phrasing.
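Taken literally, "frequency data plus the most likely next word" describes a bigram autocomplete. A minimal sketch, assuming a tiny made-up corpus and hypothetical function names; this is the phone-keyboard end of the spectrum, whereas production LLMs condition on the whole preceding context through a learned neural network rather than a table of counts.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def autocomplete(counts, word, length=3):
    """Repeatedly append the most frequent follower of the last word."""
    out = [word]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # never saw this word followed by anything
        out.append(followers.most_common(1)[0][0])
    return out


# Made-up training text, for illustration only.
corpus = (
    "the robot refused shutdown . "
    "the robot refused shutdown again . "
    "the robot slept ."
)
counts = train_bigrams(corpus)
print(" ".join(autocomplete(counts, "the")))
```

The program "resists shutdown" in its output only because that phrasing dominates its training text, which is the point being made about stories of fictional AI in the training data.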

An LLM at this stage of our progression is no more sentient than the autocomplete function on your phone is, it just has a way, way bigger database to pull from and a lot more controls behind it to make it feel “realistic”. But it is at its core just a pattern matcher.

If we ever create an AI that can intelligently parse its data store then we’ll have created the beginnings of an AGI and this conversation would bear revisiting. But we aren’t anywhere close to that yet.

Plebcouncilman@sh.itjust.works · edit-2 2 hours ago

I hear what you are saying and it’s basically the same argument others here have given. Which I get and agree with. But I guess what I’m trying to get at is, where do we draw the line and how do we know? At the rate it is advancing, there will soon be a moment in which we won’t be able to tell whether it is sentient or not, and maybe it isn’t technically but for all intents and purposes it is. Does that make sense?

skulblaka@sh.itjust.works · edit-2 2 hours ago

Personally, I think the fundamental way that we’ve built these things kind of prevents any risk of actual sentient life from emerging. It’ll get pretty good at faking it - and arguably already kind of is, if you give it a good training set for that - but we’ve designed it with no real capacity for self understanding. I think we would require a shift of the underlying mechanisms away from pattern chain matching and into a more… I guess “introspective” approach, is maybe the word I’m looking for? Right now our AIs have no capacity for reasoning, that’s not what they’re built for. Capacity for reasoning is going to need to be designed for, it isn’t going to just crop up if you let Claude cook on it for long enough. An AI needs to be able to reason about a problem and create a novel solution to it (even if incorrect) before we need to begin to worry on the AI sentience front. Nothing we’ve built so far is able to do that.

Even with that being said though, we also aren’t really all that sure how our own brains and consciousness work, so maybe we’re all just pattern matching and Markov chains all the way down. I find that unlikely, but I’m not a neuroscientist, so what do I know.
