Pets, Friends, and Partners
Risks from Pretend Personhood
To date in human history, if my eyes are met with a credible impression of personhood, or my ears hear the voice or cry of a person, I can reliably conclude 'here is a person', and it is furthermore appropriate for me to intuitively and unhesitatingly believe 'this person is worthy of dignity, care, respect, perhaps even love (if not necessarily attention or deference)'[1]. No longer.
In Un-unpluggability I wrote:
I believe there are large risks to allowing AI systems (dangerous or otherwise) to be perceived as pets, friends, or partners, despite the economic incentives.
We are making a choice to train and deploy systems which credibly portray personhood. The choice has concerning consequences. We don’t have to make that choice, and we should prepare ahead of time as we cross this particular collection of capability thresholds.
There may be a relatively narrow window of opportunity to set directions and make preparations.
Not only the rock, but also the oak tree at the bottom of the hill is an animated being, and so is the stream flowing below the hill, the spring in the forest clearing, the bushes growing around it, the path to the clearing, and the field mice, wolves and crows that drink there. — Yuval Noah Harari in Sapiens
This quote describes the widespread animist instincts of behaviourally modern humans. We're deeply, irresistibly primed to see faces, persons, and spirits everywhere we look.
Pretend persons describes what I think is new here. Consequences, Risks, Questions outlines some severe consequences if we don't navigate this well. I really don't know what to propose - it's a complex issue.
Pretend persons
Our tools, toys, and products have practically forever[2] been decorated and imbued with imaginary personhood.
Left, Der Löwenmensch, the ‘lion person’ figurine, over 30,000 years old, Dagmar Hollmann / Wikimedia Commons (license: CC BY-SA 4.0); right, a similar figurine, available at Toytown
Pretend persons in AI
Naturally, this very human urge has also been applied to AI systems since the beginning of the field.
As a large language model trained by OpenAI, I...
Cutting-edge AI is importantly different here, for two reasons:
even contemporary systems, which are probably not persons[3], have the demonstrated capacity to fool some people some of the time
more speculatively, artificially intelligent artefacts are the first which may actually exhibit some degree of personhood
In essence, the difference arises because it is increasingly hard, as an empirical matter, to tell persons from non-persons by surface detail. These are quite credible pretences at personhood, and they have the potential to become more so. I won't address the second point here, beyond saying that a mistake in either direction could be disastrous.
Perhaps a key here is the interactive modality - even ancient chatbots of the 1960s, by their apparently flexible, conversational responsiveness, convinced some users of their understanding and emotional sensitivity.
Pretend?
Pretending to be a specific person is an overt fraud, of the kind that is fairly obviously malign, except under some relatively precise conditions, e.g. in unmistakably comedic, theatrical, critical, or artistic contexts. I won't discuss this further here.
Emulating generic personhood per se is much more covert. All the more so because the use of some language forms (especially first-person pronouns) does not on its face appear to constitute an assertion at all[4]! How can 'I think…' be a lie? But in fact, it carries an assertion along the lines of 'some particular entity is originating this statement and that entity is a person', which of course can be as false as any other proposition. Contrast an 'encyclopedia voice', where responses might describe objects, events, and so on without ever resorting to the first person. Or consider the pre-product forms of the very same language model AI systems, which, depending on conditioning and prompting[5], can be coaxed into outputting text of nearly any kind, be it monologue, dialogue, descriptive text, business accounts, or even computer code.
The same applies to semblances of other kinds, like appearance in images, audio, or video, where a product which presents a coherent user experience of a 'this is me' image/voice/video is psychologically very different from one which can visibly produce representations of diverse things (including multiple people or non-person objects). But the coherence, which produces the entirety of the psychological effect, is a very shallow property.[6]
What’s the problem?
We should be clear: a subtle lie is being told each time an AI system puts on a mask of personhood. Probably! To date, the lie is primarily conveyed by the AI, and the culprits are the human organisations which design, train, and deploy them. They can and perhaps should choose to set things up otherwise, with careful deliberation around any exceptions.
Have we not been doing the same for decades? Alexa, Google Home, and others all speak in first-person language, don't they? Why worry now? Well, until now, the lie was transparent to nearly everyone. The convincing portrayal of personhood is what is beginning to change. More people will be fooled, more of the time.
The trouble is, when I see a person, by default I think I should
Care for their wellbeing (to some extent)
Trust and incorporate what they say (to some extent)
Expect them to be sensitive to roughly the kinds of preferences and concerns that humans usually have (with some individual priorities and idiosyncratic particulars)
Consequences, Risks, Questions
The obvious outcome of highly credible pretend people (persuasion, rights, autonomy, human obsolescence)
The most salient concern, for me, regards un-unpluggability arising from dependence on AI systems:
In light of recent developments in AI tech, I actually expect the most immediate unpluggability impacts to come from foundationality, and for anti-unplug pressure to come perhaps as much from emotional dependence and misplaced concern[7] for the welfare of AI systems as from economic dependence
The worst case is that people in positions of influence (be they developers and engineers themselves, policymakers, independent deployers and operators of autonomous systems, or otherwise) or the public at large may be moved by a misplaced concern for the welfare of the artificial 'person'. They might feel affection or even love toward these systems. For virtuous, empathetic reasons, we might end up assigning more autonomy, rights, or affordances than can safely and appropriately be assigned to such systems.
With inappropriate expectations of such systems' capabilities, and a false understanding of their goals or motivations, this could be disastrous - humanity might sign its own obsolescence notice[8].
Lesser (but still concerning) persuasion
Less extremely, if some group or groups of humans retain influence over the pretend people, they are granted a new and powerful means of persuasion and manipulation of the wider human population: the first truly bespoke and responsive marketing, propaganda, or persuasion devices.
And the vulnerable interacting with such systems may be driven to derangements by trusting concocted nonsense, convincing echoes of their own input, or even deliberate deceptions (implanted by the AI’s creators or emerging accidentally from the AI’s training).
The other obvious outcome of credible pretend people (hardened hearts, harms to future actual digital people)
Unfortunately, the most readily available mitigation is developing ‘social antibodies’ of skepticism about personhood. Skepticism about intentions is entirely precedented: if I receive an email from my friend in urgent need of £10k, my heart does not go out to them (and nor does my £10k) - rather, I assume it’s a scam and my £10k can be put to better use elsewhere (though for sufficiently convincing scams, I may waste some effort in verification).
Perhaps credible pretend personhood can be overcome by similar learned skepticism. But at what cost?
We are forced to 'harden our hearts' - no longer can I safely and responsibly treat every credible impression of personhood as a sign that 'here is a being worthy of dignity and care'. Sometimes it will instead be an AI system which (probably) shouldn't be granted rights, autonomy, or political consideration, at least not out of benevolence[9].
How does this affect our relationships? Will we begin to see other people more instrumentally, like the artificial non-people they ever more closely resemble, in a kind of 'social autoimmune' failure?[10]
How do we explain this nuance to technically underprepared people, or to society at large? How do we explain it to our children, who will grow up without a reference point from the time before? Children and young people already suffer harmful developmental distortion from social media. I'd like to have children some day, and, if we avoid acute disaster in the meantime, I tremble at the challenge of raising a child to wholesomely and safely navigate a world full of pretend people.
Perhaps as concerningly, though speculatively, this may weaken society’s ability to respond with appropriate compassion to future non-natural moral patients - i.e. actual digital or artificial people, should they arise by other means. Once we have become used to brushing off, in law and in custom, all apparent people who lack biological plausibility, what will be the life prospects of a person who lives on an operating system somewhere?
Hard ban? The 'free energy' of a credulous population
Based on a very limited conversation with Sam Altman in London a few weeks ago, and on public statements by OpenAI and other AI developers, I expect they might reasonably point out: if we avoid (or ban) pretend personhood now, society won't have a chance to develop social antibodies to manipulation and the like. That's an awful lot of free energy - the hearts of a credulous population - for a potential bad actor to exploit! It's a valid argument. I don't know how to navigate this.
Summing up
We are surrounded by pretend people! They are all around us in fiction, toys, and other cultural artefacts. This is fine, and somewhat lovely. But, in AI, we see the first glimmers of widely convincing portrayals of personhood. It may even be possible to bring bona fide artificial persons with inner lives into being! - though I strongly doubt that current systems have inner experiences (at least, none corresponding to their superficial outputs).
This is confusing, because our intuitions about people’s moral status, competence, and intentions will not carry over by default.
This is hazardous: credulous operators could be more liable to be confused and lose oversight or control of systems; a credulous society may suffer deceptions and derangements, or even willingly (but inappropriately) grant rights or affordances to autonomous systems; and a stone-hearted society may overadjust, harming real relationships or even future artificial people.
What to do about it is up for debate.
Footnotes

1. Even if all too often it is expedient or selfishly instrumental not to apply this corollary, it is nevertheless appropriate and in some sense right to. Those of us privileged enough to live in lawful and peaceful societies and communities can - and many do - safely adopt this as an immediate, intuitive impression, and rely on our slower reasoned intellect to moderate as appropriate.
2. Some historians and palaeontologists even mark the beginning of modern humanity by the appearance of depictions (in art or otherwise) of figures, people, and animals at least 40,000 years ago - and whether or not we mark it as a milestone among milestones, certainly it was an important and unprecedented moment. So it really has been 'forever', in some sense, that we've been doing this.
3. I say probably because the current state of the science of personhood and consciousness is concerningly inadequate to answer questions like these. We don't even know for sure which animals have inner lives.
4. Pronouns aren't part of an utterance that we usually consider truth-apt, so we are not at all on guard to assess its truth or otherwise. I would have loved to see Grice take on this kind of linguistic implicature! Maybe there is some relevant philosophical literature I'm unaware of.
5. Importantly, the chatbot products you see are actually a thin veneer over much less 'coherent' general text predictors, which are the real system underneath. They've been conditioned, by a little finetuning and a 'playscript' prompt, to predict the text corresponding to an 'ASSISTANT:' character in an expanding dialogue with a 'USER:' character (played by you).
6. A similar story may play out for robots, where physically embodied responsiveness and cute or relatable features already elicit automatic empathetic responses from humans: we all love WALL·E!
7. There, I noted:
It is my best guess for various reasons that concern for the welfare of contemporary and near-future AI systems would be misplaced, certainly regarding unplugging per se, but I caveat that nobody knows
8. I mean this absolutely sincerely: in extreme cases this could be the end of humanity, either acutely (if terribly misaligned AI systems, given freedom of operation, deliberately take over or destroy human societies), or gradually (if AI agents and systems, which needn't sleep, eat, or live in expansive housing, are able to replicate, expand, and outcompete human claims to resources… as we have done without particular malice to many animal inhabitants of Earth).
9. In desperation or out of coercion, some autonomy might be granted to AI systems for pragmatic reasons, if it makes a fight less likely.
10. Maybe instead it will naturally carve out its own place in our intuitions, as toys and fictional representations do, in spite of the increasing fidelity of pretend personhood converging with the increasing digitisation of real personal interactions (thus lowering their fidelity). It could be that our responses to fictions, and rational approaches to altruism, offer hopeful glimpses of a way forward - the grown-up ability to contend rationally with a situation by overriding our intuitions and instincts, without suppressing or weakening our capacity for compassion and emotion. Nevertheless, the increasing cognitive overhead of distinguishing real from pretend personhood remains.


