Much of the modern concept of information was invented by Claude Shannon and his homies at Bell Labs in the 1940s. For example, in order to compare messages in different alphabets, they invented a minimal unit called the ‘bit,’ the amount of information contained in a single coin-flip. This concept, like some of their other work, has now made the transition from “wha?” to “Duh!” and it is hard to reconstruct a world where it isn’t obvious. But Shannon’s central idea about how to measure the information content of a message remains not only non-trivial but downright weird.
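To make the coin-flip definition concrete: Shannon's measure of entropy is H = −Σ p·log₂(p), summed over the possible outcomes. A fair coin works out to exactly one bit; a biased coin, being more predictable, carries less. A minimal sketch (the function name is mine, not Shannon's notation):

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin flip carries exactly one bit of information.
print(entropy_bits([0.5, 0.5]))   # 1.0

# A heavily biased coin is more predictable, so it carries less.
print(entropy_bits([0.9, 0.1]))   # ~0.47 bits
```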
If you asked most people to think about the information content of a message, they would probably focus first on what meaning the sender had in mind. Shannon ignored this and studied instead the message itself, along with the coding system, and he arrived at the conclusion that the information contained in a message is equal to its entropy. You will ask, “Doesn’t entropy measure the disorder and unpredictability of a system? And shouldn’t an orderly, patterned message contain more information than a chaotic and seemingly random one?” The answer to the first question is yes, that’s right. As for the second…*
Imagine you are reading an English text one letter at a time, and think about how much new information each new letter adds to what you know. If a sentence starts “They ate q…” then the next letter will almost certainly be ‘u’ and it will add almost nothing to what you know; you still won’t know if they ate quail or quiche or whatever. ‘Q’ is followed by ‘u’ in a very orderly manner, and that means that the ‘u’ doesn’t convey much information. On the other hand, if a sentence begins “He took a cra…” then the next letter is highly unpredictable and also highly informative. You can make an educated guess about where the text is going depending on which of the following you see:
He took a crap…
He took a cray…

He took a crac…
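This intuition can be made quantitative: an outcome with probability p carries −log₂(p) bits of "surprisal." The probabilities below are made-up numbers for illustration, not real English letter statistics:

```python
import math

def surprisal_bits(p):
    """Bits of information carried by an outcome with probability p."""
    return -math.log2(p)

# Made-up illustrative probabilities:
p_u_after_q = 0.99    # 'u' after 'q' is almost certain
p_p_after_cra = 0.25  # 'p' after "cra" is only one of several live options

print(surprisal_bits(p_u_after_q))    # ~0.014 bits: nearly no new information
print(surprisal_bits(p_p_after_cra))  # 2.0 bits
```

The near-certain ‘u’ contributes almost nothing; the unpredictable letter contributes orders of magnitude more.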
The point is that if a message contains something predictable, that part could be removed without losing information: a message with low entropy could be compressed into a much shorter one, while a message with high entropy cannot. (This is basically how compression software works—it finds patterns and replaces them with a shorter description that can be unpacked later.)
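You can watch this happen with any off-the-shelf compressor. A sketch using Python's built-in zlib: a patterned, low-entropy string collapses to a handful of bytes, while random bytes barely shrink at all:

```python
import os
import zlib

ordered = b"ab" * 500        # 1000 bytes of pure pattern (low entropy)
chaotic = os.urandom(1000)   # 1000 random bytes (high entropy)

print(len(zlib.compress(ordered)))  # a tiny fraction of 1000
print(len(zlib.compress(chaotic)))  # roughly 1000, sometimes slightly more
```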
So, strange as it sounds, the texts that pack in the most information will appear quite random, compared to less efficient ones. Of course, we are used to natural languages, which include redundancy at all levels—it’s not just things like ‘qu.’ For example, why do we say “I go” but “She goes”? The pronoun has already told us who is going, so “She go” would work just fine. Or consider “They arrived yesterday.” The information that the event happened in the past is already contained in “yesterday,” so the past-tense marker is redundant.
Don’t get me wrong, I’m not saying that we should strip these redundancies from language. They are present for good reasons, because they make parsing (decoding) and error-correction easier. You can see this by looking at text-messaging lingo, where the slowness of the input mechanism has created pressure for greater efficiency. This efficiency (shorter messages) comes at the cost of effort in understanding, and the risk of information loss if there is a typo. I have an in-law who tends to use lots of texting abbreviations, some of them rather erratic, and if she types even one letter wrong, you can end up with a huddle of people on the other end attempting to decrypt the opaque message (I think she once sent me one that read something like “d n I wnu 2 dnr,” which meant “Dana and I went to dinner”).
There’s a good deal of interesting stuff to be said about information entropy, but what especially fascinates me is the possibility of using it as a lens to understand art. Artistic forms are, in general, massively redundant (think of music or verse), and this is part of what they offer. Even novels contain many patterns both in their language and in their conventions (modes of description, plot devices). But most of us also look for some surprise, some unpredictability and even chaos, and the idea of information entropy is one way to describe this: a poem that is all form, a novel that is all convention, leaves us with almost no information. This is a gripe I have with conceptual art: a work of art that can be compressed into a statement about art seems to me a cheap substitute for the complex, irreducible real thing.
(By the way, some of you may wonder how entropy, which is a concept in thermodynamics, can be defined for abstract sequences of symbols. Shannon initially used the term by analogy, since his formula, involving the probabilities of different combinations of symbols, had a similar meaning and similar mathematical behavior. More recently, though, people have realized that it isn’t just an analogy: thermodynamic entropy can be expressed in terms of information. In fact, information is the key to the most famous paradox about entropy, in which an imaginary critter called Maxwell’s demon redirects molecules to make a system more orderly (lowering its entropy, contrary to the second law of thermodynamics). The problem is that the demon has to take notes to remember what it’s doing, thus adding entropy to the system.)
You can read more about Shannon in James Gleick’s The Information.
* I wrote a couple of pieces about entropy a while back: