One of the greatest minds ever, Claude Shannon, deserves wider recognition than he has been accorded. His insights, set out in just two papers, one from 1948 and another from 1950, show remarkable prescience.

In everyday life, repetition can be jarring and a hindrance to communication as you ‘tune out’ and stop listening. In a communication system, however, repetition is a boon: predictable patterns simplify transmission and even reduce the time taken to transmit. This is an insight of great importance and we owe it to one of the unsung heroes of modern science – Claude Shannon.

1948

Claude Shannon’s 1948 paper ‘A Mathematical Theory of Communication’ underlies modern communication systems, as he treated photographs, novels and paintings alike as examples of information. As is obvious from this approach, Shannon abstracts from the substance of each type of information and treats it in a formal manner, which means that all information, irrespective of its specific character, can be treated as similar. You will no doubt wonder about the ‘size’ of information, and Shannon provided a way to compare relative sizes. Since the comparison is relative, it has to be free from the substance of the information, and that is why he furnished a numerical measure. Most of you will recall from school or college days that only pure numbers, free from units such as metres or litres, can be compared directly.
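To make the idea of a numerical ‘size’ concrete, here is a minimal sketch of Shannon’s measure, entropy, expressed in bits per symbol. It assumes that the frequency of each character within the message itself can stand in for its probability, which is a simplification; the function name entropy_bits is my own, chosen for illustration.

```python
import math
from collections import Counter


def entropy_bits(message: str) -> float:
    """Average information per symbol, in bits, estimated from symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    # Each symbol with probability p contributes p * log2(1/p) bits.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(entropy_bits("aaaaaaaa"))  # 0.0: a fully predictable message carries no information
print(entropy_bits("aabb"))      # 1.0: two equally likely symbols need one bit each
print(entropy_bits("abcdefgh"))  # 3.0: eight equally likely symbols need three bits each
```

Notice that the measure never asks what the characters mean. It is a pure number, so a line of poetry and a line of gibberish can be compared on the same scale.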

The fundamental problem of communication, Shannon wrote, is that of “reproducing at one point either exactly or approximately a message selected at another point”. To reiterate what I have just said above, Shannon informs us that we must ignore the meaning (substance) of the message: “These semantic aspects of communication are irrelevant to the engineering problem”. Once meaning has been divorced from the message, the communication can be reduced to formal elements, which can be numerically expressed.

Those of you who have chatted with any chatbot will recall the Q&A structure that is central to communication. In 1948, Shannon found that the Q&A format will facilitate the communication of all messages. And we think an ‘AI agent’ has worked a miracle!
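One way to read the Q&A idea, and this is my own illustration rather than Shannon’s wording, is that a message can be singled out from a known set of possibilities by a series of yes/no questions, each answer being one bit. The sketch below assumes the possible messages are simply numbered from 0 upwards; count_questions is a hypothetical helper.

```python
import math


def count_questions(secret: int, size: int) -> int:
    """Count the yes/no questions needed to identify `secret` among `size` numbered messages."""
    low, high = 0, size - 1
    asked = 0
    while low < high:
        mid = (low + high) // 2
        asked += 1  # one question: "Is the message number at most mid?"
        if secret <= mid:
            high = mid
        else:
            low = mid + 1
    return asked


print(count_questions(5, 8))       # 3 questions for 8 possible messages
print(count_questions(500, 1024))  # 10 questions for 1024 possible messages
print(math.ceil(math.log2(1000)))  # 10: the worst case for 1000 possible messages
```

Each question halves the remaining possibilities, so the number of questions grows only with the logarithm of the number of possible messages.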

Once you grasp this, you can see that reducing communication to formal elements actually simplifies transmission, because the system can exploit patterns to reduce the time taken to transmit. This point is critical: patterns can be used in a system only when they are reduced to formal elements, because only then can you apply statistics and mathematical techniques such as differential and integral calculus.
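The effect of patterns on transmission can be seen with any general-purpose compressor. The sketch below uses zlib from Python’s standard library, which implements DEFLATE, a later technique and not Shannon’s own coding scheme; the sample texts and the fixed random seed are assumptions made purely for illustration.

```python
import random
import zlib

# A highly patterned message: the same sentence repeated many times.
patterned = b"the cat sat on the mat. " * 200

# A message of the same length with no pattern to exploit (seeded for repeatability).
rng = random.Random(0)
patternless = bytes(rng.randrange(256) for _ in range(len(patterned)))

for label, data in (("patterned", patterned), ("patternless", patternless)):
    compressed = zlib.compress(data)
    print(f"{label}: {len(data)} bytes -> {len(compressed)} bytes")
```

The patterned message should shrink to a small fraction of its length, while the patternless one should barely shrink at all. Fewer bytes to send means less time on the wire, which is the sense in which repetition is a boon.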

(While I have long been familiar with Shannon’s paper, it is only recently that I really understood it, thanks to a delightful book, ‘Einstein’s Fridge’ by Paul Sen.)

We can now see the connection between mathematics, statistics and communication, and artificial intelligence is nothing but communication. This is an ordinary observation now because that is how a computer system functions. That said, let me proceed.

Once any communication is found to be reducible to a set of formal elements, this approach can be applied to any ‘subject’ – a communication system is indifferent to the ‘message’. Whatever the message, it is treated the same in a communication system.

Today, we understand that anything can be seen as information, whatever it is in its specific form, be it in biology, economics, physics and so on.

The significance of Q&A

Shannon found that all communication can be formalised in a Q&A framework, because there is learning. Since he was interested in chess (recall that one of the earliest uses of ‘AI’ was in chess – a structured game that can be grasped as a set of Q&A), he applied his information theory to it. Stanley Joseph, in ‘Applied Information Theory’, details Shannon’s contribution to AI: “He used information theory to measure the entropy or unpredictability of natural languages, such as English, and to compare them with artificial languages, such as Morse code or binary code. He also used it to model the structure and syntax of natural languages using probabilistic grammars and Markov chains. He also used it to generate random sentences or texts that mimic the style or content of a given source, such as a book or an author” (https://medium.com/@staneyjoseph.in/the-foundations-of-artificial-intelligence-how-claude-shannon-applied-information-theory-140c02b9920d).
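The Markov chain idea mentioned in the quotation can be sketched in a few lines. This is a minimal word-level version, assuming that the next word depends only on the current one; the sample sentence, the function names and the fixed seed are mine, for illustration only.

```python
import random
from collections import defaultdict


def build_chain(text: str) -> dict:
    """Map each word to the list of words that follow it in the source text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain: dict, start: str, length: int, seed: int = 0) -> str:
    """Walk the chain from `start`, picking each next word at random from its followers."""
    rng = random.Random(seed)  # seeded so the same call gives the same text
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # the word never had a successor in the source
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


sample = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(sample)
print(generate(chain, "the", 8))
```

The generated text has no understanding behind it. It merely reproduces the statistics of the source, yet every pair of neighbouring words is one that the source itself contains, which is why the output can mimic a style.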

In his 1950 research paper ‘Programming a Computer for Playing Chess’, Shannon outlined, in the opening paragraphs, the immense possibilities that follow: “This paper is concerned with the problem of constructing a computing routine or “program” for a modern general purpose computer which will enable it to play chess.

Although perhaps of no practical importance, the question is of theoretical interest, and it is hoped that a satisfactory solution of this problem will act as a wedge in attacking other problems of a similar nature and of greater significance. Some possibilities in this direction are: –

  1. Machines for designing filters, equalizers, etc.
  2. Machines for designing relay and switching circuits.
  3. Machines which will handle routing of telephone calls based on the individual circumstances rather than by fixed patterns.
  4. Machines for performing symbolic (non-numerical) mathematical operations.
  5. Machines capable of translating from one language to another.
  6. Machines for making strategic decisions in simplified military operations.
  7. Machines capable of orchestrating a melody.
  8. Machines capable of logical deduction.

It is believed that all of these and many other devices of a similar nature are possible developments in the immediate future”.

75 years ago! Just remarkable.

Why chess

Shannon notes the considerable literature on chess-playing machines, tracing some work to the 19th century. He explains: “The chess machine is an ideal one to start with, since: (1) the problem is sharply defined both in allowed operations (the moves) and in the ultimate goal (checkmate); (2) it is neither so simple as to be trivial nor too difficult for satisfactory solution; (3) chess is generally considered to require “thinking” for skilful play; a solution of this problem will force us either to admit the possibility of a mechanized thinking or to further restrict our concept of “thinking”; (4) the discrete structure of chess fits well into the digital nature of modern computers”.

We can see that Shannon’s breaking-down of the chess game has implications for any computer system, simply because the same formal elements can be adapted to suit any ‘subject’.
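To show how a sharply defined game reduces to formal elements, here is a minimal sketch of minimax search, written in its compact ‘negamax’ form. Chess is far too large for a few lines, so I have assumed a toy game as a stand-in: players alternately remove one, two or three counters from a pile, and whoever takes the last counter wins. The game and the function names are my own choices, not taken from Shannon’s paper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def best_score(counters: int) -> int:
    """Return +1 if the player to move can force a win, -1 otherwise."""
    if counters == 0:
        return -1  # the opponent took the last counter, so the player to move has lost
    # A position is as good for me as the worst I can leave my opponent in.
    return max(-best_score(counters - take) for take in (1, 2, 3) if take <= counters)


def best_move(counters: int) -> int:
    """Choose how many counters to take so as to leave the opponent worst off."""
    legal = [take for take in (1, 2, 3) if take <= counters]
    return max(legal, key=lambda take: -best_score(counters - take))


print(best_score(4))  # -1: with 4 counters left, the player to move loses against best play
print(best_move(5))   # 1: take one counter and leave the opponent with 4
```

The allowed operations (the moves) and the ultimate goal (taking the last counter) are all the program knows. Swap in a different set of moves and a different goal, and the same search applies to a different ‘subject’.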

As I have written earlier, the ability to grasp and design formal elements is the crux of designing systems, and it calls for skills in statistics, mathematics and formal logic.

I have written this article not just to pay tribute to a great mind but also to reiterate my warning not to get taken in by the current AI circus.