Homo Ex Machina
We put attention at the center of AI—now we have to attend to who we are in the face of what we’ve made.
By Joshua Rio-Ross
In 2017, a team of Google researchers released a seminal paper titled “Attention Is All You Need.” AIAYN introduced a new machine learning model architecture called “the transformer.” This architecture dramatically decreased the amount of time it took to train a model to perform common language tasks like translation, captioning, and chat. Today, the large language models (LLMs) propelling the AI revolution, such as ChatGPT, Gemini, and Claude, all use this transformer architecture. What’s followed—besides a rising chorus of AI-generated songs about cats and a rash of suspiciously coherent student essays—are haunting questions about what these technologies mean for our future and, more disturbingly, what about our humanity is reducible to a machine’s algorithm. In retrospect, AIAYN was a groundbreaking work of the humanities precisely because of the technical breakthrough it introduced.
As the snappy title suggests, transformers were revolutionary because of their novel implementation of “attention,” formally the “attention mechanism.” Attention is used throughout the transformer architecture. In technical terms, there’s encoder self-attention, decoder self-attention, and encoder-decoder attention. Narrowed to the context of chatting with an LLM, those technical terms roughly correspond to the model focusing on what the user’s prompt means, focusing on what the model’s own response means, and focusing on how the user’s prompt and the model’s response are relating to each other while the response is being generated. (Language models don’t “think” conceptually first and then articulate that thinking; they generate a word based on two things: the prompt and the response generated so far. They are “speaking on the fly.”)
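For readers curious about the machinery behind these terms, here is a minimal sketch, in Python with NumPy, of the scaled dot-product attention at the heart of the transformer. The shapes and example data are invented for illustration; in a real transformer the queries, keys, and values are learned projections of the token embeddings, and the three flavors of attention above differ mainly in which sequence supplies the queries and which supplies the keys and values.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Bare-bones attention: queries, keys, values each have shape (num_tokens, dim)."""
    dim = queries.shape[-1]
    # Score how strongly each token's query matches every token's key.
    scores = queries @ keys.T / np.sqrt(dim)
    # Softmax turns each row of scores into positive weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each token's output is a weighted blend of every token's value.
    return weights @ values, weights

# Self-attention over a toy "prompt" of three tokens with four-dimensional embeddings.
rng = np.random.default_rng(0)
prompt = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(prompt, prompt, prompt)
print(weights.round(2))  # each row is one token's attention over the whole prompt
```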
The attention mechanism is how a transformer weighs what matters most in the context of its objective. Let’s take as an example encoder self-attention, the case where the objective is to interpret a user’s prompt. Suppose someone mischievous prompts a language model with, “Give me fair grounds for arguing that all models are flawed.” The model might have baseline ways of encoding what each of these words means (called “embeddings”), but it (like us) has to determine, for instance, that the user doesn’t want “fair grounds” where they can buy cotton candy and ride a Ferris wheel, but rather “fair grounds” in the sense of a reasonable basis for something. And the LLM has to do this by attending to the rest of the sentence. Very likely, the words that clue you and me into this are the same ones to which the model is attending: “arguing” and “models” and maybe “flawed.” Though the user could still want a carnival-like venue for hosting an argument, we’d need some convincing before assuming so. With no further context to pull us in that direction, we conclude (without absolute certainty) that the user wants help thinking through something. All of this interpretive work is present for us as readers just as it’s present for the model. But how is the model’s “attention” to the sentence similar to our own—and how is it different?
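Before chasing that question, it helps to see how mechanical the model’s side of the work is. Here is a toy calculation of where the token “grounds” might spend its attention across the prompt. The relevance scores are invented for illustration (a trained model learns its own), but the softmax step is the one the attention mechanism actually uses, and it surfaces the same words we noticed: “arguing,” “models,” “flawed.”

```python
import numpy as np

# Hypothetical relevance scores for the token "grounds" attending to every token
# in the prompt (itself included, as in self-attention). The numbers are invented
# for illustration; a trained model learns its own from data.
tokens = ["Give", "me", "fair", "grounds", "for", "arguing",
          "that", "all", "models", "are", "flawed"]
scores = np.array([0.1, 0.1, 1.5, 2.0, 0.2, 3.0, 0.3, 0.4, 2.5, 0.3, 1.8])

# Softmax: the same normalization attention uses, so the weights are positive
# and sum to 1.
weights = np.exp(scores) / np.exp(scores).sum()

# Print the prompt's words from most to least attended.
for token, weight in sorted(zip(tokens, weights), key=lambda pair: -pair[1]):
    print(f"{token:>8}  {weight:.2f}")
```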
The question matters because it forces us to consider our language about artificial intelligence. When we talk about large language models, are we actually modeling something, or does “model” have a different sense when it’s used like this? Has the word’s meaning evolved, and if so, why? Or if LLMs are modeling something, what are they modeling? Are they doing it well? Do they need to do it well, or do they just need to do it functionally? When we say “artificial intelligence,” is intelligence modeled after human intelligence or our conception of intelligence in general? And can these two things actually be distinguished, or is all human thought about thinking damned to only trace our own horizons?
I explained the model’s attention by using our own disambiguation process as the guiding analogy. But by introducing an intuitive analogy based upon human cognition, I’m being dangerously suggestive—I’m suggesting that the model’s “encoder self-attention” is doing the same thing we’re doing when we attend to something. I can even take the analogy further: I might say, for example, that the attention mechanism allows the model to focus on which parts of the prompt are most important for choosing what to say next. And this is right—up to how fuzzily you are willing to use anthropomorphic terms such as “focus” and “important” and “choosing.”
We’re baited into these analogies by the term “attention” in the first place. We’re actually baited into this as soon as we hear the terms “machine learning” and “artificial intelligence.” As soon as we start thinking about machine learning, we use our own understanding of learning as a baseline for what the model is doing. But once we can think about how the model is structured or how it’s working on our own terms, we’re immediately tempted to turn the dynamic around and think of ourselves on the model’s terms. We’re already familiar with this in other casual contexts: We say the computer has memory, but then we turn around and say we “don’t have enough memory” to “process” a lecture; we say we’re “hardwired” or “programmed” to chew loudly; we measure our “output” and “upgrade” our tastes and “double-click” on a point in conversation.
Plenty of thinkers have explored the ramifications of this phenomenon. Hegel feared that the scientific revolution would preclude any further revolutions in how humanity lives and thinks, that we’d moved too far from the source of language. Bonhoeffer thought that the technologies we build wield us more than we wield them. Ratzinger traced out an epistemological history wherein Western thinkers go from conceiving knowing as a participation in divine knowing to (after several stops along the way) knowing as only possible with respect to what we have made.
In the context of machine learning, these thinkers lead us to think there’s a risk in naming computer functions after human faculties: If we believe we can only know that which we’ve made, then we reduce our ability to know to the functionality of what we’ve made. Said differently: We must at least ask whether framing machine learning architectures in terms of our own faculties can—for example—result first in an anemic conception of human faculties and, in turn, actually result in anemic faculties. And if that actually happens, the very process of naming what we’ve made is the process by which we close ourselves off to our own potential. We limit what our own minds can do. As soon as we invent a model, we limit alternative conceptions of how and what we can think. The model frames what it models in absolute terms.
But what’s the alternative here? I suspect eschewing all projections of ourselves onto our technologies is extreme, especially considering that while the language for cognition might be uniquely human, cognition is not. Maybe my dog’s attention to the rabbit loafing on my patio is qualitatively different from my attention to my dog watching the rabbit on my patio, but apart from a robust theological anthropology, that distinction isn’t at all obvious. We’re both harnessing our faculties, perhaps in varying degrees, toward an object or activity. Likewise, most theological language suggests that God attends to creation. But in all of these cases, we cannot presume that the term “attention” is used identically; it is used analogically.
So my earlier statement was incomplete. We have to ask how the model’s attention is similar to our own because it forces us to think about not only our language but also about ourselves. It forces us to consider how our language forms our self-perception. And it forces us to consider how we relate to what we make. Like the contextual reasoning we used earlier, analogical reasoning is classical human work. Theological anthropology is rife with it, and we can learn from its conscientious practitioners, like Aquinas and Bonaventure. These and other great thinkers sought to illuminate both divinity and humanity through analogical language—a conceptual framework formally referred to as the analogia entis. Essential to this framework is the conviction that analogies have dual usefulness. They are not statements of identity but rather comparisons that offer insight through both similarity and dissimilarity. Likewise, comparisons between humans and machine learning models can illuminate each, but only if we attend to both sides of the analogy. If we speak of humanity and machines univocally, then our language either fails to articulate the sense of what it represents (i.e., it will be false or nonsensical) or does violence to what it represents by reducing the reality it speaks of.
Two examples stand out in the case of “attention.” First, in machine learning, attention to one thing can mean attention to another. Attention is a weighted relationship. To think about how the user meant “fair grounds,” we have to look to the other words and their relationships. So we have the noun phrase we’re interested in (fair grounds), the question we want answered (does a given word nuance the meaning of that noun phrase?), and the rest of the words of the sentence, which may or may not matter to us for answering the question. Attention is the weight of relevance between the words given the question we want answered. So here, in the dividing up of the model’s resources, “arguing” might get high attention as it relates to “fair grounds,” whereas “give” and “me” get little. Similarly, for humans we typically speak of “paying attention” as though attention is a scarce resource that must be rationed. Our intuition is that if one pays attention to everything, then one is not actually paying attention at all but rather is scatterbrained, frantic, distracted, confused.
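A small numerical sketch, with scores invented for illustration, shows how literal that rationing is: because the softmax weights always sum to one, giving more attention to “arguing” necessarily takes some away from “give” and “me.”

```python
import numpy as np

def softmax(scores):
    """Convert raw relevance scores into weights that are positive and sum to 1."""
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

# Invented scores for a handful of the prompt's words, before and after the model
# finds "arguing" even more relevant. Because the budget always sums to 1,
# every other word's share necessarily shrinks.
tokens = ["give", "me", "arguing", "models", "flawed"]
scores = np.array([0.2, 0.2, 2.0, 1.5, 1.0])

before = softmax(scores)
boosted = scores.copy()
boosted[2] += 2.0  # "arguing" becomes even more relevant
after = softmax(boosted)

for token, b, a in zip(tokens, before, after):
    print(f"{token:>8}  before={b:.2f}  after={a:.2f}")
print(f"  totals  before={before.sum():.2f}  after={after.sum():.2f}")  # both 1.00
```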
The second example is a notable difference between human attention and ML attention: the notion of an attention span, i.e., a measure of how long one can sustain an act of attention. Human wakefulness is a sustained arbitration of attention, a perpetual directing of one’s faculties toward the world. One’s ability and one’s willingness to maintain attention become characteristic of who and how one is in the world. Living entails deciding what to pay attention to, as well as how long to pay that attention before moving on.
This concept isn’t meaningfully present in machine learning. We can speak of a model’s attention being sustained while it performs its calculations, but this is merely a question of computational performance; there’s no threat of digression or distraction, no cutting its attention short to move from one thing to another. Whenever it attends, it performs its work until completion, and then the transformer moves on to other steps in its algorithm.
But this isn’t to say that the notion of an attention span couldn’t be meaningful for machine learning. Today, many machine-learning models are “continually trained,” meaning that the model is continually updated to account for new data coming through whatever system it serves. Continual training allows the model to respond to changing circumstances, needs, or user preferences. Recommendation systems for music or movies or series are good examples. So are forecasting models that use traffic patterns, weather, and other features to predict when you will arrive somewhere by car. Continual learning opens up the possibility of a simulated “awareness” with which the model processes and responds to a continual stream of stimuli, just as we have to. And once that’s the case, questions arise of both what the model attends to and how long the model attends to them. Given that attention as we’ve described it for transformers is guided by some objective to accomplish, we have to ask about the model some of the same haunting questions we ask about ourselves: What should hold its attention? For how long? Does something need to break that attention? On what basis does the model prioritize one thing over another? Just as for humans, decisions about “attention” in a world of infinite choices force questions of “character.”
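As a rough sketch of what “continually trained” means mechanically, here is a toy loop, with all data simulated, in which a scikit-learn regressor is updated batch by batch on a drifting stream of arrival-time data rather than being fit once and left alone.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# A toy sketch of continual training: the model is never fit once and frozen but
# updated batch by batch as new data arrives. All data here is simulated, with
# two features standing in for "traffic level" and "weather severity" and travel
# time (in minutes) as the target.
rng = np.random.default_rng(42)
model = SGDRegressor(learning_rate="constant", eta0=0.01)

for day in range(30):  # each iteration = one new day's worth of observations
    traffic = rng.uniform(0, 1, size=100)
    weather = rng.uniform(0, 1, size=100)
    features = np.column_stack([traffic, weather])
    # The world drifts: traffic matters a little more with each passing day.
    minutes = 20 + (10 + 0.5 * day) * traffic + 5 * weather + rng.normal(0, 1, 100)
    model.partial_fit(features, minutes)  # update on the newest batch only

print(model.coef_.round(2))  # the learned weights track the most recent regime
```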
Here, even a short analogical treatment of human attention and ML attention offers fruitful ground in two directions, so long as the analogy is maintained and the two aren’t allowed to collapse into each other. Rather than explaining human attention wholesale, the model’s attention opens questions about us that we should take seriously, even if those questions highlight characteristics we have that the model does not. Conversely, ML’s most significant advancements have always been precipitated by advancements in how we understand our own learning and thinking. This identifies theology and the humanities as compass and wind for AI exploration rather than dead weight. These disciplines can help us understand ourselves so that we can both build better models that interact with the world and better interact in the world ourselves. If we’re lucky—no: attentive—they’ll also help us remember why we’re building all this in the first place.
Joshua Rio-Ross is a data scientist interested in how AI can streamline cancer diagnosis and treatment. He received his Master’s in Philosophical Theology from Yale Divinity School. In his free time, he enjoys talking about Dostoyevsky with his wife, strangers, and, occasionally, his dogs.