Embodied cognition and agentic AI
Daniel Lemire's blog
Where is your intelligence located? In your brain?
That is a simplistic answer. A better model is that your intelligence is embodied.
Consider a cook working at an expensive restaurant. He has all his favorite knives and cooking instructions, placed exactly where he wants them. His kitchen is part of his intelligence, of his skills. The same cook working in your kitchen can probably cook better than you do, but he can’t reproduce the same meals he would prepare in his favorite kitchen.
We often assess computer programmers using whiteboard tests. It is an endless source of complaints. Programmers rightly point out that it forces them out of their element. They are just not as good when you take away their laptop. This is not an excuse but a real issue: you are cutting them off from part of what makes them so intelligent.
To sum it up, the model of intelligence as a brain in a jar, disconnected from anything else, is ridiculous.
If you accept the idea of embodied intelligence, then many actions that we view as a consequence of our intelligence are actually part of our intelligence. First and foremost, language. Our ability to talk or write to each other means that I am not limited by my own person. Have you ever heard of human beings isolated in small tribes making technological breakthroughs? Nah. Progress requires lots of people communicating together. Up until a few decades ago, progress required cities. Today I am less certain that it does, as I can increasingly communicate with anyone in the world from anywhere. But language is still critical: we have not invented anything better. Similarly, having hands and the ability to build sophisticated tools (like laptops) allows us to extend our intelligence.
At the end of 2022, we got a breakthrough technology: ChatGPT. It built on several pre-existing ideas such as (large) language models, neural networks, and so forth. That’s the ‘GPT’ part. But an important, if underappreciated, part of the breakthrough was the ‘Chat’ component. Someone had the idea of connecting a large language model with a chat interface. Maybe this came naturally and obviously to people building this system, but it should not be assumed to be trivial or unimportant.
Language is a key component of our intelligence, and, thus, it makes sense that it would be pivotal for machine intelligence.
We embodied the AI software in a chat box.
The next step was what we call today ‘agentic AI’. We keep the chat box, but we add the ability for the AI software to interact with tools, and to make plans to use them. In effect, we give the AI more agency: it can do stuff and learn from the results as they happen. It is starting to resemble a human being with hands and tools.
I was talking with a colleague this week. My colleague is all in on the AI revolution. He uses his AI to help him write better and faster, and to get his data analysis done faster, without so much help from technical experts.
But my colleague was not aware of the agentic AI approach. I tried to explain on the phone. What does it mean to give the AI access to tools? Is this only about saving the effort of copying and pasting the AI’s response?
I ended up making a video where I start an AI in a shell within something called RStudio. It is an environment people use to program in R, to do data analysis. I don’t use R or RStudio, but thanks to the AI, I was able to build an entire climate research project in a few minutes, complete with the retrieval of the data from the web.
How did the AI do it? I recorded it. It tried a few things, initially struggling to download the data. At some point, it found out that it needed new R packages, so it installed them. Once they were installed, it proceeded to generate figures, verifying that everything worked.
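To make that loop concrete, here is a toy sketch in Python. It is not what the AI in the video actually ran: the `Sandbox` class, the `scripted_policy` function (a hard-coded stand-in for the language model), and the package names `httr` and `ggplot2` are all assumptions chosen for illustration. The only point is the shape of the loop: act, observe the result, and pick the next action based on what happened.

```python
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    """Toy stand-in for the shell: it only tracks installed packages."""
    installed: set = field(default_factory=set)

    def install(self, package: str) -> str:
        self.installed.add(package)
        return f"installed {package}"

    def run_analysis(self) -> str:
        # Hypothetical script that needs two packages before it can plot.
        for package in ("httr", "ggplot2"):
            if package not in self.installed:
                return f"error: missing package {package}"
        return "ok: figure written"


def scripted_policy(observation: str) -> tuple:
    """Stand-in for the model: maps the last observation to the next action."""
    prefix = "error: missing package "
    if observation.startswith(prefix):
        return ("install", observation[len(prefix):])
    if observation.startswith("ok"):
        return ("stop", "")
    return ("run", "")


def agent_loop(sandbox: Sandbox, max_steps: int = 10) -> list:
    """Act, observe, and decide again until the script succeeds."""
    log = []
    observation = "start"
    for _ in range(max_steps):
        action, arg = scripted_policy(observation)
        if action == "stop":
            break
        if action == "install":
            observation = sandbox.install(arg)
        else:
            observation = sandbox.run_analysis()
        log.append(f"{action} {arg}".strip() + " -> " + observation)
    return log


if __name__ == "__main__":
    for line in agent_loop(Sandbox()):
        print(line)
```

Without the tools, the same policy could only say "you probably need some packages". With them, each failure becomes an observation that drives the next step, which is more than saving a copy and paste.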
Agentic AI greatly extends machine intelligence by improving the embodiment of AI.
I believe that it is not yet understood as well as it should be.
In Montreal, the most established professor in the field of AI is Yoshua Bengio. He started his own non-trivial enterprise a few years ago (Element AI). His latest venture is LawZero, which aims to create a Scientist AI. The first goal of this project is to build AI without the agentic component. It should be a disembodied AI that has no goal of its own, no agency.
I fear that Bengio suffers from what Kevin Kelly called Thinkism. Let me quote from Kelly’s 2008 essay.
No intelligence, no matter how super duper, can figure out how human body works simply by reading all the known scientific literature in the world and then contemplating it. No super AI can simply think about all the current and past nuclear fission experiments and then come up with working nuclear fusion in a day. Between not knowing how things work and knowing how they work is a lot more than thinkism. There are tons of experiments in the real world which yields tons and tons of data that will be required to form the correct working hypothesis. Thinking about the potential data will not yield the correct data. Thinking is only part of science; maybe even a small part. (…) Thinkism is not enough. Without conducting experiments, building prototypes, having failures, and engaging in reality, an intelligence can have thoughts but not results. It cannot think its way to solving the world’s problems. (…) The Singularity is an illusion that will be constantly retreating — always “near” but never arriving. We’ll wonder why it never came after we got AI. Then one day in the future, we’ll realize it already happened. The super AI came, and all the things we thought it would bring instantly — personal nanotechnology, brain upgrades, immortality — did not come. Instead other benefits accrued, which we did not anticipate, and took long to appreciate. Since we did not see them coming, we look back and say, yes, that was the Singularity.
I believe that university professors are especially prone to thinkism. They view intelligence as being centered on what is happening in their brains. When you live in an ivory tower, it is easy to dismiss the real world as the core source of intelligence. Further, they are often people who did quite well in school, where thinkism is naturally prevalent.
I have been a professor most of my life. However, I tire quickly of talking with other professors. What I most enjoy is working with people who have new tools that they apply in the real world. Unsurprisingly, I spend most of my time working with software that people deploy in the real world.
What Kelly is saying is that a high degree of intelligence is not enough to do much of anything. The real world is not the final stage of your thinking process. It is maybe the most important part of it.
And thus, when you connect your AI with the real world, giving it the ability to run experiments (as virtually all software developers do today), you get impressive results that go far beyond what AI software can do on its own.
Agency is not a feature. Agency is primary.