
Strong AI and the Foundations of Ethics

Tuesday, October 3, 2023, 3:40 pm – 5:20 pm

This colloquium talk is planned as an in-person event. Registration is required only for non-CEU members.



A now-familiar thought (associated with the idea of “strong AI”) is that, in the not-too-distant future, AI may come to far surpass humans’ general cognitive capacities. Some famously worry that such AI poses an existential risk to humans, due either to indifference to human aims or to hostility toward humans. This paper focuses on a different cluster of questions: questions in the foundations of ethics raised by the possibility that we ask AI to engage in evaluative reasoning (e.g., about what is good and bad, right and wrong, etc.). There is a natural epistemic motive for asking strong AI to engage in evaluative reasoning: one might hope that strong AI could help us to make progress in addressing persistent evaluative controversies. However, suppose that strong AIs converge on evaluative conclusions that we are independently inclined to oppose. For example, AI might come to anti-anthropocentric evaluative conclusions (two examples: maybe they really prioritize certain cognitive capacities in their evaluations, treating us the way we treat mosquitos; or maybe they don’t, treating the interests of insects as on a par with those of highly rational beings). Alternatively, AI might come to radically consequentialist conclusions that portray the evaluative significance of, e.g., our relations to our projects and loved ones as easily swamped. We take such dissatisfaction with (by hypothesis) epistemically highly credible evaluative conclusions to raise important questions about our attitudes towards the evaluative, even if we assume a strong sort of realism about evaluative thought and talk. It is familiar for metaethical antirealists to ask the question “why care about evaluative properties?” on the supposition that realism is true. And they often argue that the difficulty of answering this question is a reason to reject realism. However, we think that the sorts of possibility we are canvassing instead make salient two other ways of thinking about alienation from evaluative standards. First, it could push us to a conceptual ethics conclusion that we ought to adopt other (perhaps: more anthropocentric) evaluative concepts. Second, it might instead push us toward a deep alienation from the evaluative: that is, we might simply accept that the existing evaluative concepts capture what really and truly matters, and find that we simply don’t want our lives or world to be structured by what really and truly matters, if it involves sufficient sacrifice of what we care about. In light of these issues, we then reflect on how best to think about what “the” alignment problem in AI really is, suggesting that there are in fact multiple different “alignment” problems that are worth wrestling with, and that it is a difficult evaluative question which one to prioritize in thinking about AI ethics, and why.


Interested in receiving updates about the events of the Department of Philosophy? Sign up for its mailing list here.