“I am still struggling to find a proper definition of AGI… ur thoughts”

(From an email conversation with a friend. I thought that it might be useful to put this snippet here.)

The short answer is that ‘AGI’ (artificial general intelligence) is a bit meaningless as we don’t really know what ‘I’ (intelligence) is. The best approach is to read AGI as meaning human-like intelligence in machines while accepting that human-like intelligence is something of an unknown.

Definitions of intelligence are generally somewhat hand-wavy, along the lines of “the ability to learn or understand or to deal with new or trying situations”, “the ability to solve complex problems or make decisions with outcomes benefiting the actor”, or even just “the ability to acquire and apply knowledge and skills”. These definitions rely on subjective assessments—how complex is complex?—or on fuzzy concepts like skill, whose definition morphs and evolves over time.

Cognitive science researchers use a reductionist approach to making intelligence researchable by associating it with (reducing it to) a set of (human) abilities. Intelligence tests are designed to measure each of these abilities, allowing us to combine the scores into an overall measure of intelligence. There’s something in this, as different intelligence tests provide similar results. (Researchers call the hidden variable g.) The problem is the usual one with this approach—the thing researchers are studying is not necessarily the same as the thing that we commonly think of as intelligence. For them, intelligence is what the tests measure: a set of abilities that we can separate, observe and quantify. It’s a phenomenological definition (how we perceive the thing), not a structural one (how the thing works internally), nor ecological (how it relates to the things around it), nor subjective (how we relate to the thing). Our natural inclination is to think of intelligence as including all of these.
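To make the idea of g concrete, here is a minimal sketch in Python (the test names, factor loadings and noise levels are invented for illustration) of why different tests end up telling a similar story: if every test leans on one hidden ability, the scores correlate, and a single component captures most of the shared variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people = 2000
g = rng.normal(size=n_people)   # the hidden "general" factor

# Each (hypothetical) test leans on g to a different degree, plus its own noise.
loadings = {"vocabulary": 0.8, "matrices": 0.7, "digit_span": 0.6, "arithmetic": 0.75}
scores = np.column_stack([w * g + rng.normal(scale=0.5, size=n_people)
                          for w in loadings.values()])

# The tests correlate because they share the same hidden factor, and the first
# principal component of the correlation matrix soaks up most of the variance.
corr = np.corrcoef(scores, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]
print("correlations between tests:\n", corr.round(2))
print("share of variance on the first component:", round(eigvals[0] / eigvals.sum(), 2))
```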

AI research has a similarly reductionist approach to understanding intelligence:

  • Humans are intelligent because of some set of skills & abilities.
  • We can identify these skills & abilities and find technological or algorithmic solutions to each (like the classic A* search algorithm; see the sketch after this list).
  • Combining the solutions will result in ‘artificial intelligence’.
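For a feel of what an ‘algorithmic solution to one ability’ looks like in this programme, here is a minimal A* sketch in Python. The grid, costs and Manhattan heuristic are illustrative choices of mine rather than anything from a particular system; the point is that one well-defined ability (route finding) becomes one well-defined, reusable algorithm.

```python
import heapq

def a_star(grid, start, goal):
    """Shortest path on a grid of 0 (open) / 1 (wall) cells, or None if unreachable."""
    def h(cell):
        # Manhattan distance: an admissible heuristic on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start, [start])]   # entries: (f, g, cell, path so far)
    best_g = {start: 0}
    while frontier:
        _, g, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = cell[0] + dr, cell[1] + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                if g + 1 < best_g.get((r, c), float("inf")):
                    best_g[(r, c)] = g + 1
                    heapq.heappush(frontier,
                                   (g + 1 + h((r, c)), g + 1, (r, c), path + [(r, c)]))
    return None

grid = [[0, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(a_star(grid, (0, 0), (2, 3)))   # e.g. [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 3)]
```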

This reductionist programme is the basis of the research project outlined at the Dartmouth meeting in the 1950s. It is also the view of Peter Norvig (who wrote the book on AI), and is why he published Artificial General Intelligence Is Already Here (co-authored with Blaise Agüera y Arcas from Google Research), which posits that “‘General intelligence’ must be thought of in terms of a multidimensional scorecard.” Norvig believes that ‘frontier models’ (the recent wave of LLMs) have achieved a significant level of general intelligence—they are competent in a wide range of tasks across diverse media, and demonstrate surprising, ‘emergent’, abilities—and that current work to bolster their performance via techniques like integrating knowledge graphs will make them even more intelligent. AGI is within our reach—it’s just a question of tweaking.

The problem is that it’s not clear that we can decompose intelligence into a set of things, address the things separately, and then combine our solutions to create ‘intelligence’. The root of this problem is that many of the phenomena we’re interested in might be verbs (processes in contexts) rather than nouns (things). It might not be possible to understand intelligence via a reductionist approach, as decomposing it in this way eliminates the interactions between processes from which the emergent properties of intelligence arise.

Take memory. Is memory a thing—a place in our head where perception is encoded for later recall—or a process—where context (past experience) and stimuli result in (imperfect) (re)construction of perception, an echo of the past?

The most interesting research in psychology takes an ecological approach to understanding human performance. That is, it posits that human development and performance are strongly influenced by the environment one finds oneself in. Rather than looking for and studying things, nouns, inside the individual (‘memory’), these researchers look for and study processes, verbs, that connect the individual to the world around them (‘recall’).

Their starting point was to ask a simple question: “Where does the mind stop and the rest of the world begin?” Putting this another way—do we think inside our heads? The answer is no, we don’t. Instead, we think by interacting with the world around us.

The most common example is the outfielder problem. We don’t catch a ball (when fielding in baseball) by sensing its trajectory and then rushing to where we calculate it will land. Rather, we adjust our own motion so that the ball appears to be moving in a straight line, dodging obstacles in our path as they appear. You might also consider long division, which we’re taught to do with pen and paper. Does the thinking of long division occur in our heads? That can’t be entirely true, as we rely on the pen and paper—at least some of the thinking is on the paper.
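The outfielder example can be made concrete with a toy simulation. In the sketch below (Python; gravity is real, but the launch speeds, gain and sprint limit are arbitrary illustration values), the fielder never predicts a landing point: they simply nudge their running speed so that the ball’s elevation keeps rising at a steady rate, and they drift toward the right spot as a side effect of that interaction.

```python
DT = 0.01          # simulation timestep (seconds)
G = 9.81           # gravity (m/s^2)

# The ball: launched towards the fielder with horizontal and vertical speed.
bx, bz = 0.0, 1.8
bvx, bvz = 20.0, 16.0

# The fielder: starts down-field, runs only along x, never computes a landing point.
fx, fv = 60.0, 0.0
GAIN = 60.0        # how strongly the fielder reacts to optical acceleration
MAX_SPEED = 8.0    # sprinting limit (m/s)

tan_history = []   # recent samples of tan(elevation angle) of the ball

while bz > 0.0:
    # Ball flies under gravity.
    bx += bvx * DT
    bz += bvz * DT
    bvz -= G * DT

    # What the fielder sees: how high the ball sits above the horizon.
    d = fx - bx
    if d > 0.1:
        tan_history.append(bz / d)
        if len(tan_history) >= 3:
            # Optical acceleration ~ second difference of tan(elevation).
            opt_acc = (tan_history[-1] - 2 * tan_history[-2] + tan_history[-3]) / DT**2
            # Image accelerating upward -> ball will carry over us: back-pedal.
            # Image decelerating -> ball will drop short: run in.
            fv += GAIN * opt_acc * DT
            fv = max(-MAX_SPEED, min(MAX_SPEED, fv))
    fx += fv * DT

print(f"ball comes down near x = {bx:.1f} m; fielder ends at x = {fx:.1f} m")
```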

It’s commonly assumed that we think in our heads, and so that thinking must be computation. This assumption is particularly strong in the AI community.

A stronger version of the reductionist approach to understanding intelligence, above, is:

  • Humans compute—ergo intelligence is computation.
  • The Church–Turing thesis tells us that any two (universal) computers are functionally equivalent—ergo once the right algorithms are discovered we will have replicated human-level intelligence.
  • Mechanical computers are more capable than human computers, as they don’t have the same limitations (memory capacity, processing speed, precision)—ergo mechanical intelligence will be more capable than human intelligence.

This is the root of the whole ‘singularity’ thing—that eventually the algorithms underpinning general intelligence will be discovered, enabling artificial general intelligence, which will be more capable than human intelligence as it doesn’t have the same limits (due to its mechanical underpinnings).

We can see that this line of reasoning is built on shaky foundations. 

While humans can compute, it doesn’t necessarily follow that humans are computers, that we think inside our heads. The history of creativity research is a case in point. At first, creativity was thought to be the essential attribute of an individual, the lone genius, recipient of a gift from god or, later, the beneficiary of a unique genetic inheritance. This was supplanted by a reductionist view in which the ‘normal’ person replaces the genius and creativity is conceptualised as a quality of the (lone) individual with a ‘creative personality’ (a skill, implying that creativity can be taught). Most recently, a systems approach (or ‘social creativity’) holds that creativity is the result of human interaction and collaboration—a generative and ecological phenomenon, the result of interactions between situated actors. This is part of a general shift toward understanding human performance in ecological terms, by understanding the systems that humans are part of. Many of the outcomes we’re interested in—such as creativity and intelligence—cannot be inferred from the properties of the individuals themselves. Instead, we need to consider the extended socio-technical system that the individuals are part of, just as the research psychologists mentioned earlier have done.

The limitations of today’s AI solutions—such as autonomous cars—may well be due to us trying to situate synthetic humans in established human environments. The act of driving a car, for example, has been automated for some time—the first self-driving car was likely a vision-guided Mercedes-Benz robotic van that drove on an autobahn near Munich, Germany, in the 1980s. However, modern self-driving solutions that showed promise in limited trials are struggling when rolled out at scale, likely because driving in traffic is both a technical and a social challenge. In those trials, the self-driving cars were able to find their way through an environment dominated by human drivers, likely because the humans allowed for their quirks. Self-driving cars don’t have the same flexibility as humans, though, and the problems we’re seeing are emergent behaviours that manifest once there is a non-trivial number of self-driving cars in the traffic system. It’s easy to forget just how good humans are at collectively navigating the chaos of modern traffic—including navigating the quirks of not-quite self-driving cars.

When work is automated it is typically transformed. It’s rare for a human worker to be replaced by a robot performing the same tasks. Instead, the socio-technical system is reorganised—rather than de-skilling, expertise is redistributed, and the new instruments and tools become extensions of the worker’s body. Developing a successful self-driving solution may require us to engineer the environment to remove the social aspect of driving in traffic—by eliminating all human drivers—though this is likely to have undesirable infrastructure and social consequences. Developing smart human-car hybrids—redistributing driving expertise—might be a more tractable approach.

Similarly, creating AGI—human-like intelligence in machines—will likely require us to adopt a more ecological approach. Human intelligence is inherently social—as historical cases of feral children make clear. It’s not enough for our AI to be trained on human language; it needs to be a full participant in the complex and messy process whereby we (collectively) understand the world. However, it’s quite possible that the limitations of human intelligence are natural consequences of being part of the world, of thinking by interacting. These limitations will apply equally to human and machine general intelligence. The fact that computers have greater memory capacity, processing speed, and precision than humans might be inconsequential—being better computers doesn’t imply that they can be more intelligent, as the limitation might be in the environment, not in the individual actor.

The whole AGI thing is a big unknown. Arguments over pattern-based (neural network) vs symbolic processing (formal logic) vs hybrid approaches miss the point. We’re replicating phenomenological aspects of intelligence with no understanding of how the phenomena arise.

One consequence of this approach is that we ascribe the capabilities of solutions that surprise us to the enabling technology, the algorithms used, rather than to structural or environmental factors. LLMs, for example, are powerful language prediction solutions, but the intelligence we ascribe to them might be more a consequence of the language they’re processing (their training set) than of the power of their predictions. It’s a bit like seeing a bacterium follow a concentration gradient through a maze and assuming that it ‘solved the maze with intelligence’. Similarly, we assume that solutions which don’t quite make it, such as autonomous cars, just need better algorithms.
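The bacteria analogy is easy to make concrete. In the toy sketch below (Python; the maze layout is made up, and a flood fill stands in for chemical diffusion), the ‘solution’ lives entirely in the environment, a gradient spreading from the food source through the corridors, while the agent itself is a memoryless hill-climbing rule a couple of lines long.

```python
from collections import deque

MAZE = [
    "#########",
    "#S..#...#",
    "##.##.#.#",
    "#..#..#.#",
    "#.##.##.#",
    "#....#.G#",
    "#########",
]
open_cells = {(r, c) for r, row in enumerate(MAZE)
              for c, ch in enumerate(row) if ch != "#"}
start = next((r, c) for r, row in enumerate(MAZE)
             for c, ch in enumerate(row) if ch == "S")
food = next((r, c) for r, row in enumerate(MAZE)
            for c, ch in enumerate(row) if ch == "G")

# "Diffusion" of an attractant from the food source: a breadth-first flood
# fill through the corridors, so concentration falls off with path distance.
distance = {food: 0}
queue = deque([food])
while queue:
    r, c = queue.popleft()
    for cell in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
        if cell in open_cells and cell not in distance:
            distance[cell] = distance[(r, c)] + 1
            queue.append(cell)

# The "bacterium": no map, no memory, no plan. At each step it simply moves
# to the neighbouring open cell where the attractant is strongest.
pos, steps = start, 0
while pos != food:
    r, c = pos
    pos = min(((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)),
              key=lambda cell: distance.get(cell, float("inf")))
    steps += 1

print(f"reached the food in {steps} steps by pure gradient following")
```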

There are signs that the current wave of AI is running out of steam, that “there are reasons to believe that we have reached a plateau”. It’s also possible that there’s a lot of potential yet to exploit. However, we need to move on from trying to make machines that “think inside their heads” to systems—networks of humans and machines—that think by interacting with each other.