It is now beyond doubt that there are multiple players in the field of ‘artificial intelligence’ focused on leveraging the hype they have themselves created in order to attract investment. The lifeline of the world of IT and computing (of which something called AI is a part) is data centres, and the lifeline of data centres is power supply, which explains the continuous search for power-efficient locations. And, of course, for fresh investment.
“We don’t know what AGI (artificial general intelligence) will ultimately deliver, but we know it is coming. And we must plan now to make sure we have the energy infra to support it.” – Eric Schmidt, former CEO, Google
The sheer pompousness of the remark is nauseating. The most important question here is: who is ‘we’? ‘We know it is coming’ – remarks like this are simply outrageous, especially when sentence structure is still foreign to search engines. And it is outrageous that such an obviously investment-slanted observation is given respect it does not deserve. AGI refers to the way human beings think – not on any particular subject or topic, but the ability to think as such. Let me say this boldly: AGI is far away in the distant future. Human beings can learn an entirely new field, even if we have to start from scratch; everyone has that potential. Every new breakthrough in knowledge is, in itself, a distinctly human achievement. Not to sound glib, but can we imagine an LLM doing what Newton and Einstein accomplished, well before the addictive use of technology in knowledge discovery?
LLMs and AGI
Yann LeCun of Meta explains why LLMs will never reach the level of AGI:
- Current LLMs need 400,000 years’ worth of text to train.
- A 4-year-old takes in more data through just 16,000 hours of vision.
- The physical world is exponentially more complex than language (which, too, is complex, we must add).
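The first two figures invite a back-of-envelope sanity check. Here is a minimal sketch in Python; every constant below is a rough, commonly cited estimate assumed for illustration, not a number taken from LeCun or from this article:

```python
# Back-of-envelope check of LeCun's two figures.
# Every constant here is a rough assumption for illustration,
# not a number taken from LeCun or from this article.

TOKENS_IN_CORPUS = 2e13          # assumed training-set size: ~2 x 10^13 tokens
TOKENS_PER_MINUTE = 300          # assumed reading speed: ~250 words/min
HOURS_READING_PER_DAY = 8        # assumed reading time per day

minutes_per_year = HOURS_READING_PER_DAY * 60 * 365
reading_years = TOKENS_IN_CORPUS / (TOKENS_PER_MINUTE * minutes_per_year)
print(f"Years of reading to cover the corpus: {reading_years:,.0f}")  # ~380,000

BYTES_PER_SECOND_VISION = 2e6    # assumed optic-nerve bandwidth: ~1 MB/s per eye
WAKING_HOURS_BY_AGE_4 = 16_000   # the 16,000 hours in LeCun's comparison

vision_bytes = BYTES_PER_SECOND_VISION * WAKING_HOURS_BY_AGE_4 * 3600
text_bytes = TOKENS_IN_CORPUS * 4  # assumed ~4 bytes of text per token
print(f"Visual intake by age 4: {vision_bytes:.1e} bytes")  # ~1.2e14
print(f"Training text:          {text_bytes:.1e} bytes")    # ~0.8e14
```

Whatever the exact constants, the orders of magnitude are the point: a lifetime of reading cannot cover the corpus, while a small child’s visual intake rivals it within a few years.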
In a speech titled, and subsequently written up as an article, ‘Why I don’t believe in AGI’, Arthur Derderian explains that by AGI he refers “to a computer system capable of interpreting and understanding its environment. In other words, a system that can think for itself and is self-aware”. A little later he adds: “What makes us human and gives us consciousness lies in our senses: we feel, touch, see, smell… All of this is data – data that could, theoretically, be collected and fed into a machine via various sensors. But the data generated by just one minute of human existence would be millions of times larger than all the data used to train LLMs like GPT. So, as you can see, it’s physically impossible to use an LLM to reach AGI”.
Computational model
As several writers have reiterated, an LLM is a computational model based on probabilities and trained to predict responses. Any machine-learning system is built by training on data, and to produce even a simple response the system needs large amounts of it – either what is available on the internet or what can be fed into the system. This is why marketers seek out data: the more the data, the better the system’s responses. It also needs vast amounts of computing power. Calculation and memory are central to the functioning of LLMs, and, as Derderian emphasizes, they are physically limited by the energy they require and by the GPUs needed to make quick calculations.
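To make ‘a computational model based on probabilities and trained to predict responses’ concrete, here is a deliberately tiny sketch in Python. A real LLM replaces the word-pair counting below with a neural network over billions of parameters, but the prediction step is the same in kind: choose a statistically plausible continuation. The corpus is invented for illustration:

```python
import random
from collections import Counter, defaultdict

# Invented toy corpus; real models train on trillions of tokens.
corpus = ("the court held that the contract was void "
          "because the contract was not signed").split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(prev):
    """Sample the next word in proportion to how often it followed `prev`."""
    options = follows[prev]
    if not options:            # dead end: `prev` never had a successor
        return None
    words = list(options)
    return random.choices(words, weights=[options[w] for w in words])[0]

# A 'response' is nothing more than repeated probability lookups.
word, output = "the", ["the"]
for _ in range(8):
    word = next_word(word)
    if word is None:
        break
    output.append(word)
print(" ".join(output))
```

Scale aside, this is the sense in which ‘the more the data, the better the responses’ holds: the counts (or, in a real model, the learned weights) simply become better statistical estimates.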
Now go back to Schmidt’s remark about the need to plan for the energy infrastructure. The dominant dimension is marketing, pure and simple.
What distinguishes the human is the ability to think of something completely new. For example, the discovery of chaos theory from the study of weather patterns. Or Einstein’s space-time and relativity. Or the Poincaré conjecture and its eventual proof by Grigori Perelman. Or Fermat’s Last Theorem: that xⁿ + yⁿ = zⁿ has no positive integer solutions for any n greater than 2. This is what humans are capable of doing. This is why it is exasperating to read the dumbing down of human intelligence and the false elevation of computational models above it. It does not strike such writers that they are lowering their own dignity.
Reasoning and ‘reasoning’
Let us take a specific domain such as Law. In an extremely interesting article titled ‘Why You’re Thinking About “Reasoning” All Wrong’, with the subtitle ‘The Illusion and Appeal of LLM Reasoning’, Ivy B. Grey, Chief Strategy & Growth Officer at WordRake and previously a bankruptcy lawyer for ten years, makes a sharp observation: “Words like reasoning, thinking, and writing are the working tools of the legal profession. But with the rise of large language models, like OpenAI’s GPT, Anthropic’s Claude, and Google’s Gemini, these words are now used in a different way. If we don’t confront their false familiarity, we risk misunderstanding the capabilities of these tools and misplacing our trust in them”.
More than any other discipline, the subject and practice of law is extremely vulnerable to language and to the reasoning built on it, keeping in mind that the language of law is often distinctly different from ‘natural’ language. Any discussion of a court judgement revolves around the ‘ratio decidendi’ (the reason for the decision), which is the result of a battle between at least two opposing interpretations – two arguments, one of which will triumph. And this battle is built on multiple foundations, including past decisions, especially when they serve as precedents, and the many legal conventions followed by the profession, including the judiciary. The precision with which lawyers use language, down to a comma, can decide the fate of a case.
According to Grey, “much of the recent interest in GenAI reasoning seems driven by two hopes. First, we hope reasoning might help LLMs avoid hallucinations: if we can get LLMs to explain their steps, they should be more accurate. Second, we hope reasoning is the last missing piece in building a truly useful GenAI coworker: an assistant that can draft, evaluate, prioritize, and even identify and escalate important issues – like a junior associate. Both hopes are understandable, but they’re leading us to misunderstand what LLMs with ‘reasoning capabilities’ can actually do. Even with newer reasoning models, LLMs are still just generating plausible strings of text based on statistics”.
This is the point made earlier about LLMs resting on probability and calculation. The ‘reasoning’ attributed to LLMs is essentially calculation; it looks like human reasoning, but it is not. Human beings can explain the logic behind their reasoning, and that logic is not based on calculation, except where calculation is actually relevant.
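In code terms, the ‘reasoning’ of newer models is produced by the very same loop that generates any other output. The sketch below assumes a hypothetical next_token_probs() function standing in for a trained model (the prompt and probabilities are invented); the point is that the visible chain of thought is sampled text, not a separate logic engine:

```python
import random

def next_token_probs(context):
    """Hypothetical stand-in for a trained model: maps the text so far
    to a probability distribution over next tokens (numbers invented)."""
    table = {
        "Q: Is the contract valid? Reasoning:":
            {" The": 0.9, " A": 0.1},
        "Q: Is the contract valid? Reasoning: The":
            {" contract": 0.8, " clause": 0.2},
    }
    return table.get(context, {" [...]": 1.0})

def generate(context, steps):
    # The 'reasoning trace' and the final answer come from the same
    # statistical sampling loop; there is no separate reasoning module.
    for _ in range(steps):
        probs = next_token_probs(context)
        tokens = list(probs)
        context += random.choices(tokens, weights=[probs[t] for t in tokens])[0]
    return context

print(generate("Q: Is the contract valid? Reasoning:", 4))
```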
Conclusion
Let me end with the remarkably resilient observation, attributed to Edmund Burke two centuries ago: eternal vigilance is the price of liberty. It is one of my two favourite quotations. The other is equally relevant here. Kafka: “I see an infinity of hope, but not for us.” Can we prove Kafka wrong?