The AI Responsibility Dilemma

This essay is cross-posted from https://thegreymatter.substack.com/p/the-ai-responsibility-dilemma

In an article in The Conversation, Professor of Philosophy David Danks and Professor of Computing Mike Kirby present the following scenario:

A self-driving taxi has no passengers, so it parks itself in a lot to reduce congestion and air pollution. After being hailed, the taxi heads out to pick up its passenger – and tragically strikes a pedestrian in a crosswalk on its way.

Who or what deserves praise for the car’s actions to reduce congestion and air pollution? And who or what deserves blame for the pedestrian’s injuries?

They tell us:

One possibility is the self-driving taxi’s designer or developer. But in many cases, they wouldn’t have been able to predict the taxi’s exact behavior. In fact, people typically want artificial intelligence to discover some new or unexpected idea or plan. If we know exactly what the system should do, then we don’t need to bother with AI.

Alternatively, perhaps the taxi itself should be praised and blamed. However, these kinds of AI systems are essentially deterministic: Their behavior is dictated by their code and the incoming sensor data, even if observers might struggle to predict that behavior. It seems odd to morally judge a machine that had no choice.

They refer to this uncertainty about responsibility as the “responsibility gap.” To explore this, they draw on medieval philosophical debates about God’s omniscience and human free will. Medieval thinkers like Thomas Aquinas and William of Ockham questioned how humans could be morally accountable if an omniscient God had created them and already knew their future actions.

Aquinas introduced the concepts of “intellect” and “will” to explain moral responsibility. The “intellect” provides possible actions, while the “will” involves choosing among them. Moral responsibility depends on the interplay between the two:

If the intellect provides only one possible course of action, the individual isn’t morally responsible, as they had no other option.

If the intellect offers multiple possibilities and the will selects one, the individual is morally responsible for that choice.

Danks and Kirby apply this framework to AI:

Developers provide the AI’s “intellect” by programming its capabilities and constraints.

The AI system exercises a form of “will” when it selects actions within those constraints, especially in unforeseen situations.

In the self-driving taxi scenario, the authors say:

The details matter. For example, if the taxi developer explicitly writes down how the taxi should behave around crosswalks, then its actions would be entirely due to its “intellect” – and so the developers would be responsible.

However, let’s say the taxi encountered situations it was not explicitly programmed for – such as if the crosswalk was painted in an unusual way, or if the taxi learned something different from data in its environment than what the developer had in mind. In cases like these, the taxi’s actions would be primarily due to its “will,” because the taxi selected an unexpected option – and so the taxi is responsible.

Why Medieval Models Miss the Mark

While I appreciate the depth of thought Danks and Kirby bring to this issue, I can’t help but feel they’re overcomplicating matters. Their approach of assigning moral responsibility based on AI “intellect” and “will” has two significant issues.

First, it potentially creates bad incentives for developers. If programmers can evade liability by designing AI systems in specific ways, they face a trade-off between prioritizing legal protection and ensuring optimal safety. Our message to developers should be clear and simple: save as many lives as possible. How they achieve this goal—whether through machine learning, good old-fashioned AI (GOFAI) like symbolic logic, or a combination of approaches—should be entirely up to them. The focus must remain squarely on maximizing safety and minimizing harm, not on engineering systems to shift moral responsibility.

But this isn’t even the biggest problem. The biggest problem is this: what does it even mean to say “the taxi is responsible”? This idea of attributing responsibility to non-sentient objects isn’t new—it’s an ancient confusion we’re at risk of repeating.

Let’s travel back well before Aquinas, to the time of Theagenes, a celebrated athlete from the Ancient Greek city of Thasos. A statue of Theagenes was erected and, after his death, one of his enemies came to it and flogged it every night. One night, the statue fell on the man and killed him. The sons of the dead man then prosecuted the statue for murder, following the laws of Draco that allowed for the trial of inanimate objects. The Thasians found the statue guilty and threw it into the sea.1

We instinctively recognize the absurdity here. We don’t imprison statues or hurl them into the sea as punishment because we understand that inanimate objects lack the traits required for moral responsibility. But what exactly are these traits? For an entity to be considered morally responsible, it must possess the following:

Consciousness and self-awareness—the ability to recognize one’s own existence, thoughts, and actions

Moral agency—the capacity to discern right from wrong and act accordingly

Choice—the ability to select or decide between two or more possible options or courses of action

Sentience—the capacity for subjective experiences and sensations, especially the ability to feel pain, pleasure, and other emotional states

The notion of choice is the closest parallel to Aquinas’s “intellect” and “will”. But applying the framework of will versus intellect to AI is bound to create confusion. We could examine the probability distributions in the final layer of these models, arguing that a model’s temperature—where a temperature2 of zero represents pure intellect and a nonzero temperature introduces AI “will”—is what matters. If this sounds bizarre or confusing, good. This is a path we don’t want or need to go down.
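To see how strange this gets in practice, here is a minimal sketch in Python of what such an argument would actually be pointing at. It is purely illustrative: the toy logits, the sample_action helper, and the “maneuver” framing are my own assumptions, not anything from Danks and Kirby or from a real driving stack. At temperature zero the system always picks the highest-scoring option; at a nonzero temperature it samples from a softened distribution.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the example is reproducible

def sample_action(logits, temperature):
    """Pick one action index from raw model scores (logits).

    temperature == 0 -> deterministic argmax (all "intellect", no "will")
    temperature > 0  -> stochastic sample from a softmax distribution
    """
    logits = np.asarray(logits, dtype=float)
    if temperature == 0:
        return int(np.argmax(logits))        # same choice every time
    scaled = logits / temperature            # higher temperature flattens the distribution
    probs = np.exp(scaled - scaled.max())    # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Toy scores for three hypothetical maneuvers near a crosswalk
logits = [2.1, 1.9, 0.3]
print(sample_action(logits, temperature=0))    # always index 0
print(sample_action(logits, temperature=1.0))  # usually 0, occasionally 1 or 2
```

Nothing here looks any more like “will” at temperature 1 than at temperature 0: the variation comes from a seeded random number generator, not from anything resembling deliberation.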

Fortunately, the criterion of sentience is clearer. Self-driving taxis are not currently sentient. Assigning responsibility to an insentient object is irrational because we can’t influence it with incentives. It neither relishes the taste of a carrot nor fears the strike of a stick.3

Our justice system works only because people prefer not being punished to being punished: they have a subjective experience of “I am not enjoying being here” when they are in prison. By definition, insentient objects have no such subjective experience.

While AI systems cannot be morally responsible, the companies that develop and deploy them can and should be held accountable. But companies also fail the moral responsibility test, so how does this work? It works because companies are moral pass-throughs: they are collectives of moral agents. Legally, companies are often treated as “persons” capable of bearing rights and responsibilities. This is a useful legal fiction that allows us to hold organizations accountable for their collective actions.

Safety Over Blame

As AI systems infiltrate everything from our traffic lights to our operating rooms, we’re hurtling towards a future where algorithms will routinely make life-or-death decisions. We need to carefully consider what roles AI can and should play in our society. While AI can be causally responsible for outcomes, it cannot currently be morally responsible in the way humans are. Until we have AI systems with the traits that moral responsibility requires (consciousness, moral agency, choice, and sentience), we shouldn’t place them in positions that demand it. Instead, we must ensure that humans remain ultimately responsible for the deployment and outcomes of AI systems.

This brings us back to the question that motivated Danks and Kirby in the first place: who is responsible for a death caused by a self-driving car?

Danks and Kirby say the taxi might be responsible, separate from the developers. I say this makes no sense; it’s the developers who are responsible. But we need to take a step back and consider our goals in this effort. As a society, we should support the adoption of self-driving cars once we believe they are safer than human drivers. The benefit is so great that there is a clear moral imperative. As I said in a recent post, “Reducing the number of traffic accidents by half would save more lives than ending all war, terrorism, and murder combined.”

We should provide developers with a clear directive: reduce the number of traffic accidents to a rate significantly below that of human drivers. I don’t know the exact details of how this should be implemented, but developers and governments should work together to establish comprehensive safety guidelines and testing protocols. There should be incentives for developers to continue to increase safety. But we shouldn’t penalize them for an imperfect solution that is better than what we have today. If developers meet these rigorous standards, we should enact laws that protect them from excessive liability in the event of accidents, provided there’s no evidence of misconduct or attempts to circumvent safety requirements.

This approach aligns incentives with our primary goal of saving lives, while still maintaining accountability. It acknowledges that some accidents may still occur, but focuses on overall harm reduction.
