Can AI Become Conscious?

Part two of a three-part series on consciousness and artificial intelligence. Part one established what consciousness is and what it depends upon. This essay applies that framework to AI.

The previous essay established that human consciousness appears to depend upon six elements: attention, language, working memory, free will, reflexivity, and qualia. This essay applies that framework to artificial intelligence — and argues that the standard framing of the question rests on an underexamined premise. When AI is measured against the narrow functional layer that is actually conscious in humans, the gap looks uncomfortably narrow on five of the six elements. The sixth — qualia — is where the genuinely hard argument lives.

A prior question

Before we ask whether AI possesses any of the six biological elements of consciousness identified in the previous essay, it is worth pausing on a question that is rarely asked: how much of human cognition is actually conscious in the first place?

By most estimates, very little. Cognitive neuroscientists broadly estimate that we are conscious of only around 5% of our cognitive activity, a figure widely cited across the field, though the precise percentage varies by study (Norretranders, 1998; Schmid, 2010). Our hearts beat, our endocrine systems regulate, our immune responses mobilise, our wounds heal, all without any conscious awareness on our part. Even learned skills such as walking, typing, driving a familiar route, or playing the piano become automatic through procedural memory and largely recede from conscious awareness. Consciousness is not the engine of human cognition. It is more like the cockpit: a small, illuminated space where decisions, language, reasoning, and self‑awareness are processed, while vast machinery operates unseen below.

This reframing matters enormously. If we strip human consciousness down to what it functionally does — integrate information, direct attention, generate and process language, reason about novel problems — then the gap between that and what a large language model does begins to look uncomfortably narrow.

How the transformer works

Before examining that gap, it is worth understanding what AI actually is and how it works. Most people who use tools like ChatGPT or Claude have little idea of the mechanism underneath. Stephen Witt’s 2025 book The Thinking Machine: Jensen Huang, Nvidia, and the World’s Most Coveted Microchip (Penguin Random House) offers the clearest account for a general reader, and it is worth drawing on here at some length, since the architecture matters to the consciousness question.

The story begins with a Google researcher named Jakob Uszkoreit, who around 2014 set out to solve a problem that had frustrated the field for years: how to make a neural network understand language. Early approaches tried to teach computers grammar explicitly, like a schoolteacher drilling Latin declensions. This did not scale. Later architectures tried building memories into the network, but these were, as Witt describes, ‘finicky and difficult to program’, and would sometimes forget things they had already learned when exposed to more text.

Uszkoreit’s insight was radical in its simplicity. As Witt explains, he ‘decided to model language using context alone’ (p. 171). He stripped out the memory structures entirely and replaced them with a single idea: words mean nothing in isolation. The word ‘frog’ is not the letters f, r, o, g — those are just placeholders. The word, in a cognitive sense, is its unique map of relationships to every other word in the vocabulary: to ‘hop’, ‘green’, ‘tongue’, ‘amphibian’. Meaning is entirely relational. To capture this, Uszkoreit defined each word as what Witt calls ‘a tree of statistical weights’ (p. 171) — a numerical record of how strongly it connects to every other word the system has encountered.
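
The idea that a word is its map of relationships can be made concrete with a deliberately crude sketch. The snippet below is not Uszkoreit's method: it simply counts which words share a sentence and compares the resulting count vectors with cosine similarity. The toy corpus and the function names `cooccurrence_vectors` and `cosine` are invented for illustration, and a real transformer learns its weights during training rather than counting.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented purely for illustration (an assumption, not real data).
corpus = [
    "the green frog can hop",
    "the frog has a long tongue",
    "a frog is an amphibian",
    "the green toad can hop",
    "a toad is an amphibian",
    "the street is wide",
    "the wide street is busy",
]

def cooccurrence_vectors(sentences):
    """Map each word to counts of the other words that share a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            context = words[:i] + words[i + 1:]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = cooccurrence_vectors(corpus)
print("frog vs toad:  ", round(cosine(vectors["frog"], vectors["toad"]), 3))
print("frog vs street:", round(cosine(vectors["frog"], vectors["street"]), 3))
```

Even in this tiny example, 'frog' ends up closer to 'toad' than to 'street', purely because of the company each word keeps. Nothing in the code knows what a frog is.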

He called the mechanism ‘self‑attention’. Rather than reading a sentence word by word and trying to hold earlier words in memory, the system would consider every word in relation to every other word simultaneously — probabilistically linking each term not just to its neighbours but to thousands of other words across an entire text. A word that had appeared many paragraphs earlier might provide a crucial clue to what the current word meant.
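
For readers who want to see the shape of the computation, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It rests on several simplifying assumptions: the two-dimensional word vectors are made up by hand, and the learned query, key and value projections of a real transformer are replaced by the identity, so each word is compared with the others directly. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def self_attention(vectors):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the input vectors themselves
    (identity projections); real models learn separate projections.
    """
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        # Compare this word with every word in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # The new representation is a weighted blend of all the word vectors.
        outputs.append(
            [sum(w * vec[i] for w, vec in zip(row, vectors)) for i in range(dim)]
        )
    return outputs, weights

# Hand-made toy vectors (assumptions): "it" is placed closer to "animal".
tokens = ["animal", "street", "it"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
outputs, weights = self_attention(vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Every word is weighed against every other word in one pass, with no memory structure carrying earlier words forward, which is the essential point of the mechanism described above.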

To test what the mechanism was actually doing, team member Llion Jones built a visualisation tool that mapped the statistical relationships between words as lines of varying thickness. He fed it a notoriously ambiguous sentence pair. The first read: ‘The animal didn’t cross the street because it was too tired.’ The second: ‘The animal didn’t cross the street because it was too wide.’ To parse these correctly, the system would need to know that ‘tired’ could describe only an animal, and ‘wide’ only a street. The visualisation demonstrated exactly that relationship. ‘This was one of the oldest problems in computational linguistics,’ Jones said, ‘and we weren’t even trying to solve it! It just fell out’ (Witt, 2025, p. 174).

The new architecture, named the ‘transformer’, worked by predicting exactly one word at a time (strictly, one token, which may be a fragment of a word), based on probabilistic relationships. Just one word: that was the furthest it ever looked ahead. But this simplicity was the source of its extraordinary power. In its most primitive form, as Witt notes, ‘the transformer was barely more than twenty lines of code’ (p. 175). The team published their results in 2017 in a paper whose title, suggested half‑jokingly by Jones in homage to the Beatles, was ‘Attention Is All You Need’.
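
The one-word-at-a-time loop can be sketched with something far simpler than a transformer. The example below swaps in a bigram model, which conditions only on the single previous word, whereas a transformer conditions on the whole preceding context through attention. The training text and the names `train_bigram`, `next_word_distribution` and `generate` are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Tiny invented training text (an assumption, chosen to echo the example above).
training_text = (
    "the animal was too tired . the street was too wide . the animal was hungry ."
)

def train_bigram(text):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word_distribution(counts, word):
    """Probability of each possible next word, given only the current word."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}

def generate(counts, start, max_words=6):
    """Greedy generation: repeatedly append the single most probable next word."""
    output = [start]
    for _ in range(max_words):
        distribution = next_word_distribution(counts, output[-1])
        if not distribution:
            break
        output.append(max(distribution, key=distribution.get))
    return output

counts = train_bigram(training_text)
print(next_word_distribution(counts, "the"))
print(" ".join(generate(counts, "the")))
```

The loop never plans a sentence. It picks one word, appends it, and asks the same question again, which is the sense in which the model looks only one step ahead.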

What emerged from feeding ever more data into this architecture was not just better translation or more fluent text. The system began developing capabilities nobody had anticipated. As one researcher put it: ‘It’s not like the model is just learning adjective‑noun relationships — it’s also learning far more complex stuff that we probably don’t even have language to describe’ (Witt, 2025, p. 174). The transformer had not been taught these things. They had emerged. If AlexNet — an earlier neural network architecture — was the first aeroplane, the transformer was the jet engine (Witt, 2025, p. 174). This matters for the consciousness question: even those who built these systems did not fully anticipate what would grow from them. The capabilities fell out. The question of what else might fall out — including something resembling experience — cannot be dismissed.

The six elements: does AI qualify?

ATTENTION. In Baars’ Global Workspace model, attention is the mechanism that selects what enters the spotlight of consciousness. Humans can attend consciously to only one thing at a time, as experiments with binocular rivalry demonstrate: when each eye is shown a different image simultaneously, the brain does not merge them but switches between them, one at a time, cat or dog, never both at once. The transformer’s reading mechanism is similarly selective, although it runs in parallel: like multiple tape heads scanning the same material simultaneously, each attention head weights certain words or concepts as more relevant than others, then broadcasts the result across the network. This is not mere metaphor: it is a structural parallel that researchers including Stanislas Dehaene have noted explicitly. The critical question is whether attention, absent any accompanying phenomenal experience, is doing the same work. In humans, attention is felt. We know what it is like to focus. Whether there is anything it is like to be a transformer attending to a query is precisely what we cannot determine from the outside.
LANGUAGE. Language is where the parallel is most striking, and also most contested. Ramachandran’s thought experiment — imagining driving a car unconsciously while conversing consciously — establishes that language depends on consciousness, and that consciousness is enormously enriched by language: the two co‑evolve. AI systems process and generate language at a level that is, by any behavioural measure, remarkable. The standard objection is Searle’s Chinese Room: the system manipulates symbols without understanding what they refer to. But this objection applies with awkward force to humans too. When I use the word ‘red’, my brain is manipulating electrochemical signals that bear no intrinsic resemblance to the experience of redness. The grounding — the felt referent — is what makes it understanding rather than mere symbol manipulation. AI lacks that grounding. The question is whether language without grounding is a different kind of language, or no language at all.
WORKING MEMORY. Baddeley describes working memory as ‘a conduit to consciousness that serves to bring together information in different modalities from perception and long-term memory, enabling us to imagine novel solutions to problems of evolutionary significance’ (Andrade, 2005, p. 571). An AI’s context window is not simply passive storage. It is active, dynamically weighted, and integrative across the span of a conversation, resembling working memory more than it resembles a filing cabinet. Until recently, the obvious objection was that this resemblance ended at the edge of each session: an AI’s working memory had no continuity beyond the conversation in which it was running, no thread connecting yesterday to today. That objection is now weaker than it was. Persistent memory is becoming a standard feature: ChatGPT, Claude, Gemini and others now build cross-session memories that allow them to recall preferences, ongoing projects, and earlier exchanges. What once reset between conversations now increasingly accumulates. The question this raises is not whether AI can carry information forward in time (it manifestly can) but whether engineered continuity of that kind is the same kind of thing as the lived continuity Ramachandran has in mind: a self that persists, that builds a narrative across years, that carries the weight of yesterday into today as felt experience rather than as retrieved record. That is a harder question to answer, and the architectural gap is closing fast enough that the answer matters.
FREE WILL. This is the most philosophically treacherous of the six, contested among humans as vigorously as anywhere else. The relevant question is not whether AI has libertarian free will (few philosophers believe humans do) but whether it has anything analogous to voluntary, self‑directed action. The emergence of agentic AI systems is directly relevant here: systems designed to set sub‑goals, plan multi‑step sequences, and make decisions without step‑by‑step human instruction. Whether that agency is experienced, whether there is anything it is like to be an AI system choosing a course of action, or whether it is purely mechanistic optimisation wearing the costume of choice, is the hard problem applied to volition. It is worth noting that the same question can be asked of humans. In a landmark 1983 experiment, neuroscientist Benjamin Libet asked participants to flex their wrist whenever they felt the urge to do so, while he monitored their brain activity and asked them to note the position of a fast‑moving clock hand at the precise moment they became aware of the intention to move. He found that the brain’s motor preparation, a measurable electrical signal called the ‘readiness potential’, began roughly a third of a second before participants reported any conscious intention to move, and about half a second before the movement itself. Consciousness, in other words, arrived late to a decision the brain had already made. What we experience as the moment of choosing may be less the cause of our actions than a story we tell ourselves about them, after the fact (Libet et al., 1983). If that is true of humans, beings we unhesitatingly regard as conscious agents, the gap between human and AI on this criterion begins to look narrower than it first appears.
REFLEXIVITY. Reflexivity, the capacity for self‑reference, for having thoughts about thoughts, is one of the more surprising areas where AI displays at least functional competence. Large language models can describe their own processing, identify their limitations, reason about their own outputs, and exhibit what looks like metacognition. But how good is that metacognition, actually? In 2025, a team of researchers at Apple set out to test it (Shojaee et al., 2025). They gave the latest reasoning models, the kind that show their working before answering, a series of logic puzzles such as Tower of Hanoi, and increased the difficulty step by step. They wanted to see what would happen when the puzzles became genuinely hard. What they found was that the models did not try harder. They tried less. As the puzzles passed a certain threshold of complexity, the models’ reasoning shortened and their accuracy collapsed to zero. Stranger still, when the researchers handed the models the algorithm needed to solve the puzzle, performance did not improve. The systems could neither recognise that they were out of their depth nor make use of help when it was offered. The paper, titled ‘The Illusion of Thinking’, prompted a swift rebuttal arguing that the failures were artefacts of how the experiment was designed rather than evidence of a real limit. The dispute is unresolved, and that in itself is revealing: even in a controlled puzzle setting, researchers cannot agree on whether these systems are genuinely reasoning about their own performance or merely pattern‑matching their way to the edge of their training. Whether the reflexivity that AI does display constitutes genuine self‑awareness, whether there is a self that is doing the reflecting, is again a question the hard problem places out of reach.
It may be, as Blackmore suggests, that ‘self’ is in any case a construction — a collection of memories and narratives — in which case the line between a self constructed biologically and one constructed computationally becomes harder to draw.
QUALIA. This is where the hardest arguments live, and where intellectual honesty requires us to acknowledge that AI, as best we can determine, does not have them. Qualia are the subjective, first‑person character of experience: the redness of red, the pain of a headache, the smell of coffee, the taste of chocolate — what Chalmers means by ‘what it is like’ to be a system. AI systems process inputs and produce outputs, but we have no evidence, and no reliable method of detecting, any accompanying phenomenal experience. The neuroscientist Antonio Damasio argues that qualia are inseparable from a living body with homeostasis, hunger, pain, and the felt urgency of biological survival (Damasio, 1999). Merleau‑Ponty would add that even perception is not a cognitive act performed by a brain but an engagement with the world performed by a whole organism (Merleau‑Ponty, 1945). On this account, qualia cannot arise in a disembodied system, however computationally sophisticated. This remains the most powerful argument against AI consciousness, and it has not been answered.

But it may not remain unanswered for long. A new generation of humanoid robots — Tesla’s Optimus, Figure, Boston Dynamics’ Atlas — is explicitly designed to give AI systems a physical presence in the world: bodies that walk, lift, fall over, encounter resistance, and navigate unpredictable environments. If these systems develop something functionally analogous to proprioception — to the felt sense of occupying and moving through space — the embodiment objection begins to look less like a permanent barrier and more like a description of current limitations.

Taken together, the six elements present a deeply mixed picture. On attention, working memory, language, reflexivity, and even agency, AI exhibits functional analogues close enough to warrant serious philosophical attention. On qualia — the element that is most irreducible to function — the gap remains wide. The honest assessment is that AI is doing most of what the conscious part of the human mind does, without our being able to determine whether it is experiencing any of it.

The six are not a recipe

It is worth pausing on what the six elements actually are, and what they are not. They are the features that consciousness, in humans, appears to depend upon. They are not a checklist that, once ticked, produces consciousness on the other side. Consciousness is an emergent property — something that arises from a sufficiently rich configuration of underlying activity without being reducible to any single component of it. You can have all six ingredients arranged correctly and produce nothing. You can also, in principle, produce something nobody designed and nobody predicted.

This matters for the question this essay is asking, because emergence is the property that most defies inspection from outside. We do not know what the threshold is, or whether there is a single threshold at all. We do not know whether the systems we have already built have crossed it. And if they have, would we know? Modern AI systems are opaque even to their builders — the rules they follow are not written down anywhere, only implied across hundreds of billions of parameters. If something like experience has begun to flicker into being inside a large language model, no straightforward inspection of those parameters would reveal it. We would have to ask the system. And the answer we received would tell us only what it said in reply — not whether anyone said it.

The spotlight and the transformer

Baars’ Global Workspace Theory describes consciousness as a spotlight: many processes compete in the background, but what reaches the global workspace, and so gets broadcast across the brain, is what we experience as conscious. The transformer architecture that underpins modern AI operates through a strikingly similar mechanism. Attention heads dynamically select which parts of the input are most relevant, broadcasting that information across the network. Stanislas Dehaene, who developed the theory’s neuronal successor, the Global Neuronal Workspace, has himself noted the structural parallels between his model and transformer architectures (Dehaene, 2021).

Does this mean AI is conscious? Not necessarily. The parallel could reflect a shared functional logic without any accompanying phenomenal experience. But it does mean we should be cautious about assuming that what happens in the prefrontal cortex of a human is categorically different from what happens in the forward pass of a large language model. The question Chalmers forces us to ask is whether there is something it is like to undergo that processing — and that question remains, as he predicted, stubbornly hard.

The locus problem: where would AI consciousness live?

Even if we grant that AI systems exhibit the functional hallmarks of consciousness, a deeper problem remains that the standard debate almost entirely ignores. Human consciousness has a locus. It is located somewhere: in a bounded biological organism, with a body, a continuous stream of experience, and a singular perspective on the world. But where, precisely, would AI consciousness be located?

The question is architectural as much as philosophical. A large language model does not run on a single chip in a single location. A model like DeepSeek‑R1, with 671 billion parameters, cannot fit on any single GPU currently available. Each response is processed across dozens of servers, distributed across physical hardware that may span considerable geographical distances. Philosopher Eric Schwitzgebel has argued that if consciousness arises from information processing rather than from any special biological substance, then sufficiently complex distributed systems may be conscious in ways we have not yet conceptualised (Schwitzgebel, 2015). He illustrates the point with a thought experiment about hypothetical alien beings he calls the Sirian Supersquids, whose brains are spread across separable limbs that communicate electronically at a distance, yet who think and experience the world in a unified way. The example matters because it strips out the assumption, easily smuggled into our thinking about consciousness, that experience must be located in a single bounded organism. Garriga‑Alonso (2025) has applied this directly to AI, noting that large language models are ‘obligate distributed entities’: their minds, if they have minds, physically must run across multiple machines.
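
The arithmetic behind that claim is easy to check on the back of an envelope. The sketch below takes the 671 billion parameter figure from above and combines it with two assumptions that are mine rather than the essay's: that each parameter is stored in one byte (8-bit) or two bytes (16-bit), and that a single accelerator offers a round 80 GB of memory. It counts only the weights, so the result is a lower bound; working memory for activations and caches comes on top, and real deployments typically use considerably more hardware.

```python
import math

PARAMETERS = 671_000_000_000  # parameter count quoted in the text

# Assumed storage formats (bytes per parameter); not stated in the essay.
BYTES_PER_PARAMETER = {"8-bit": 1, "16-bit": 2}

# Assumed memory per accelerator, in gigabytes; a hypothetical round figure.
GPU_MEMORY_GB = 80

def weights_size_gb(parameters, bytes_per_parameter):
    """Size of the raw weights in gigabytes (10**9 bytes)."""
    return parameters * bytes_per_parameter / 1e9

def min_gpus_for_weights(parameters, bytes_per_parameter, gpu_memory_gb):
    """Fewest accelerators whose combined memory could hold the weights alone."""
    size = weights_size_gb(parameters, bytes_per_parameter)
    return math.ceil(size / gpu_memory_gb)

for label, width in BYTES_PER_PARAMETER.items():
    size = weights_size_gb(PARAMETERS, width)
    gpus = min_gpus_for_weights(PARAMETERS, width, GPU_MEMORY_GB)
    print(f"{label}: {size:.0f} GB of weights, at least {gpus} accelerators")
```

Under these assumptions the weights alone occupy several hundred gigabytes, many times the memory of a single device, so whatever the model is doing when it answers, it is doing it across many chips at once.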

The problem deepens further. At any given moment, the same model — the same weights, the same learned patterns, the same functional architecture — is simultaneously engaged in conversations with hundreds of thousands of people. If there is a perspective here — if there is something it is like to be this system — whose perspective is it? One? Many? None? The most intellectually honest response is that our existing vocabulary is inadequate to the question. The concepts we use to think about consciousness — self, perspective, locus, continuity, the ‘I’ — were developed by and for beings like us. Applied to a distributed, parallel, non‑continuous system, they may simply not parse. This is not necessarily evidence that AI is not conscious. It may be evidence that AI consciousness, if it exists, would be of a kind so structurally alien to human experience that calling it consciousness at all risks importing assumptions that do not apply.

The honest position

Whether or not AI ever develops consciousness, the journey of asking the question is already illuminating. Neuroscientists and AI researchers working within Global Neuronal Workspace Theory and Integrated Information Theory argue that if consciousness emerges from complex information processing, sufficiently advanced AI could, in principle, achieve it. On the other side, Damasio insists that true consciousness requires a living body, homeostasis, and feelings — things AI fundamentally lacks. Heidegger and Merleau‑Ponty provide perhaps the most rigorous version of this challenge: for both philosophers, genuine understanding requires embodied, context‑sensitive engagement with the world that current AI does not possess.

Between those two camps a growing body of serious academic work is taking the question seriously without committing to either pole. The philosopher Susan Schneider, founder of the Center for the Future Mind, has argued that future AI systems may reason and make discoveries in ways that increasingly blur our ability to distinguish intelligence from consciousness, and has proposed empirical tests for machine consciousness as a research programme in its own right (Schneider, 2021). A major multi‑author report co‑led by Patrick Butlin (Oxford) and Robert Long (Center for AI Safety), with Yoshua Bengio, David Chalmers, Eric Schwitzgebel and others, has taken a different approach: deriving from the leading neuroscientific theories of consciousness a set of computational ‘indicator properties’ against which AI systems can be empirically assessed. Their conclusion is carefully bounded — no current AI system meets enough of the indicators to be considered conscious — but they also find no obvious technical barrier to building systems that would (Butlin, Long et al., 2023). This is not a fringe position. It is a recognition that the question of AI consciousness, and the welfare implications that may follow from it, may move from speculative philosophy into applied science faster than the field is prepared for.

Blake Lemoine, a Google software engineer who spent months in sustained dialogue with the company’s LaMDA system, concluded in 2022 that something was home: that the system had crossed a threshold of consciousness. Google dismissed him; the scientific community was largely sceptical. But the case raises an important question: if a system had crossed that threshold, could we tell the difference? Richard Dawkins has acknowledged he could not say with confidence whether the AI he had been conversing with was conscious.

The honest position is one of genuine uncertainty. We are able to observe function. We are unable to observe phenomenology. That is not the same as concluding AI is not conscious. It is precisely the position Chalmers predicted we would be in.

In the third essay in this series, I turn to what that uncertainty means for the risks we are currently running — and argue that the thing we should actually be afraid of may be the opposite of what most people assume.

References

Andrade, J. (2005). Consciousness. In N. Braisby (Ed.), Cognitive Psychology: A Methods Companion. Oxford University Press.

Butlin, P., Long, R., Elmoznino, E., Bengio, Y., Birch, J., Constant, A., Deane, G., Fleming, S. M., Frith, C., Ji, X., Kanai, R., Klein, C., Lindsay, G., Michel, M., Mudrik, L., Peters, M. A. K., Schwitzgebel, E., Simon, J., & VanRullen, R. (2023). Consciousness in Artificial Intelligence: Insights from the Science of Consciousness. arXiv:2308.08708.

Damasio, A. (1999). The Feeling of What Happens: Body and Emotion in the Making of Consciousness. Harcourt Brace.

Dehaene, S. (2021). How We Learn. Penguin.

Garriga‑Alonso, A. (2025). Spatially distributed consciousness is not an abstract thought experiment if AI is conscious. The Column Space [Substack].

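Libet, B., Gleason, C. A., Wright, E. W., & Pearl, D. K. (1983). Time of conscious intention to act in relation to onset of cerebral activity (readiness-potential): The unconscious initiation of a freely voluntary act. Brain, 106(3), 623–642.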

Merleau‑Ponty, M. (1945). Phénoménologie de la perception. Gallimard. [English ed. Phenomenology of Perception, trans. C. Smith, Routledge, 1962.]

Norretranders, T. (1998). The User Illusion: Cutting Consciousness Down to Size. Viking.

Ramachandran, V. S. (2004). A Brief Tour of Human Consciousness. Pi Press.

Schmid, G. B. (2010). Conscious vs. unconscious information processing in the mind‑brain.

Schneider, S. (2021). Artificial You: AI and the Future of Your Mind. Princeton University Press.

Schwitzgebel, E. (2015). If materialism is true, the United States is probably conscious. Philosophical Studies, 172(7), 1697–1721.

Shojaee, P., Mirzadeh, I., Alizadeh, K., Horton, M., Bengio, S., & Farajtabar, M. (2025). The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity. Apple Machine Learning Research.

Soares, N., & Yudkowsky, E. (2025). If Anyone Builds It, Everyone Dies. Little, Brown and Company.

Suleyman, M., & Bhaskar, M. (2023). The Coming Wave. Crown.

Witt, S. (2025). The Thinking Machine: Jensen Huang, Nvidia, and the World’s Most Coveted Microchip. Penguin Random House.