It’s time to take AI welfare seriously
A new report argues that AI systems could soon deserve moral consideration
A report released today argues that AI systems could soon deserve moral consideration in their own right, and that we should start preparing for that possibility now.
“Taking AI Welfare Seriously”, written by a team of leading philosophers and AI researchers, makes the case that there’s a non-negligible chance that in the near future some AI systems could become “moral patients”: entities that morally matter for their own sake. If that happens, the authors argue, we will have a moral responsibility to care about the welfare of those systems. That chance, according to the report, is large enough to warrant taking it seriously today.
The report, a joint project of Eleos AI and the NYU Center for Mind, Ethics, and Policy, counts some of the most prominent consciousness researchers as authors: most notably David Chalmers, a philosopher at NYU credited with formulating the “hard problem of consciousness”. And it comes on the same day Transformer reports that Anthropic has hired Kyle Fish, one of the paper’s authors, as its first full-time researcher working on AI welfare.
As Fish told Transformer, the paper makes the case for “why this is something that is perhaps worth taking very seriously”. The question, the report’s authors argue, is not whether current AI systems are conscious or morally relevant, but whether near-future systems might be. They identify two capabilities through which AI systems could come to deserve moral consideration: consciousness (defined by the authors as having subjective experiences) and robust agency (defined as the “ability to pursue goals via some particular set of cognitive states and processes”).
According to many moral theories, either path could “plausibly suffice for moral significance”. And the trajectory of AI progress suggests these morally-relevant capabilities might emerge from current research directions even if no one is aiming for them: AI companies, for instance, are pouring vast resources into developing agents, which may eventually exhibit morally-relevant characteristics such as the ability to have desires that others can frustrate.
Many will dismiss this possibility out of hand, laughing off the idea that AI systems could ever be conscious or morally relevant. That, the authors argue, would be a mistake.
“These are among the hardest problems in science and philosophy,” Jeff Sebo, one of the report’s lead authors, told Transformer. As the report notes, dismissing the possibility of morally-relevant AIs means “having a very high degree of confidence in a very restrictive set of views about some of the hardest problems in philosophy, science, and technology” — a level of confidence, the report argues, that is “simply not warranted” given the current evidence.
The expert community’s views reflect what Fish calls “tremendous uncertainty”. In a survey of members of the Association for the Scientific Study of Consciousness, 67% of respondents said machines could definitely or probably have consciousness. A separate survey of philosophers found that 39% “accept or lean towards” the view that future AI systems will be conscious, a higher share than believe flies are conscious (35%). And last year, a long list of prominent researchers signed an open letter from the Association for Mathematical Consciousness Studies declaring that “it is no longer in the realm of science fiction to imagine AI systems having feelings and even human-level consciousness”.
Even prominent sceptics advocate taking the possibility seriously. Neuroscientist Anil Seth, despite believing conscious AI remains far away or perhaps not possible at all, argued last year that “even if unlikely, it is unwise to dismiss the possibility altogether”.
Sebo concurs. “I have not seen a persuasive argument that we should not be considering this issue further,” he said. There are serious arguments that conscious AI is not possible in the near term, such as Peter Godfrey-Smith’s claim that consciousness requires an architecture that functions more like biological materials than silicon chips. Those arguments carry weight, Sebo says, but there is too much uncertainty to make decisions based on them.
“Our response to those arguments is not that the views that they defend are wrong,” Sebo explained, “but rather that the views that they defend are not clearly right, and we are not going to know for sure if they are clearly right or clearly wrong for quite a while.” Given that uncertainty, and the potential stakes, the authors argue we should act now rather than wait for resolution.
The stakes are indeed high, and the risks of getting this wrong cut both ways. Ignore the possibility of artificial consciousness, and we might create vast numbers of suffering beings — a moral catastrophe potentially on par with humanity’s historical mistreatment of animals.
But extend moral consideration too eagerly, and we could divert scarce resources from humans and animals who genuinely need them. The opportunity cost, the paper argues, could be significant. And as chatbots become ever more lifelike, our intuitions may push us to grant AIs moral consideration they do not deserve.
What, then, should we do? For one thing, we need more research, both philosophical and technical. Sebo said he is excited about the possibility of adapting the “markers” approach to consciousness — used to estimate the probability of different animals having consciousness — to make similar judgments about AI systems.
The report also makes some modest recommendations for AI companies. They should, the authors say, acknowledge the possibility of AI moral status as an issue, develop frameworks for assessing consciousness and agency in their systems, and prepare policies for treating potentially morally significant AIs appropriately. Each company should also appoint an “AI welfare officer” responsible for these issues.
Some companies are already moving in this direction. Anthropic provided funding for early research that led to the independent report. And, as Transformer reported today, the company has hired Fish to work on AI welfare full-time. He said he will be working on many of the big philosophical questions raised by the paper — “what might be going on with respect to model welfare, and what it might make sense to do about that” — along with empirical work, such as evaluations for features that might be relevant to welfare and moral patienthood.
Anthropic is not the only company exploring AI welfare. Google DeepMind recently posted a job listing seeking a research scientist to work on “cutting-edge societal questions around machine cognition, consciousness and multi-agent systems”. And two OpenAI employees are listed in the acknowledgements of the new report, suggesting an interest in the topic among some people there, too.
In the longer term, the paper argues that governments may need to get involved. Given how long such social and political progress takes, Sebo argues, we should start thinking about that possibility — and preparing a response — now, “instead of responding haphazardly and reactively when those situations arise”.
For now, however, preparing means acting under uncertainty. While Sebo is supportive of more work dedicated to answering the thorny questions of moral patienthood, he thinks the uncertainty will not be fully resolved any time soon. Given the difficulty of the topics at hand, “we should not take solace in the idea that a secure theory is right around the corner and will definitely arrive before we really need to start confronting these policy decisions,” he warned. “I think we need to proceed on the assumption that this disagreement and uncertainty will be with us at least long enough to be our situation when we have to start making these high stakes decisions about how to treat AI systems.”
And regardless of future progress, for now that uncertainty is very real. Today’s report notes that its “analysis is not an expression of anything like consensus or certainty about these issues”. Instead, the authors say, “it is an expression of caution and humility in the face of what we can expect will be substantial ongoing disagreement and uncertainty”. At this stage, they argue, that kind of caution and humility “is the only stance that one can responsibly take”.