Title: The Information: A History, a Theory, a Flood
Author: James Gleick
Publication Date: 2011
LC Call Number: Z665 .G547 2011
Shannon’s information theory is very much the sort of theory you’d expect an engineer to produce. He carefully removes meaning from the equation and defines information as the ability to reproduce at one end of the line the same signal that was put in at the other. But he also noted that if the signal is a message, it generally has a high degree of redundancy, which guards against “noise”: interference that corrupts messages and can cause information to be lost in transmission. He didn’t consider this redundancy to add any information. If you see a “q” in English, you can be fairly certain that the next letter will be “u,” and you gain almost no new information when it appears.
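The “q”-then-“u” point can be made concrete with a small calculation. This is purely illustrative: the toy corpus and the helper function are my own construction, not anything from the book. The per-symbol surprise of a distribution is its Shannon entropy, and conditioning on a preceding “q” drives that surprise to zero.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy, in bits, of a frequency distribution."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Tiny toy corpus; real English statistics would make the same point.
text = "quick quiet quilt quote queen quartz"

# Unconditional distribution over letters (ignoring spaces).
letters = Counter(ch for ch in text if ch.isalpha())

# Distribution over the letter that follows a "q".
after_q = Counter(b for a, b in zip(text, text[1:]) if a == "q")

print(f"H(letter)       = {entropy(letters):.2f} bits")
print(f"H(letter | 'q') = {entropy(after_q):.2f} bits")  # 'u' is certain here, so no surprise
```

In this corpus every “q” is followed by “u,” so the conditional entropy is zero: the “u” is redundant, exactly as Shannon says.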
This is curious, because it means that a random arrangement of letters (or numbers, or other symbols), being the least predictable, carries the most information. But at the same time, random signals are very likely noise, which is exactly what you don’t want to transmit. In fact, in some ways the redundancy itself is what tells us that something is a message: we look for patterns, and if we don’t find any, we know we can safely ignore what we’re seeing.
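The claim that unpredictability maximizes the measure can be checked with another toy sketch of my own (again, not from the book): compare the first-order entropy of a random string with that of a heavily patterned one of the same length.

```python
import math
import random
from collections import Counter

def bits_per_symbol(s):
    """First-order Shannon entropy of a string, in bits per symbol."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

random.seed(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
noise = "".join(random.choice(alphabet) for _ in range(10_000))
patterned = "abab" * 2_500  # same length, heavily redundant

print(f"random string:    {bits_per_symbol(noise):.2f} bits/symbol")      # close to log2(26)
print(f"patterned string: {bits_per_symbol(patterned):.2f} bits/symbol")  # exactly 1.00
```

Note that this first-order estimate actually understates the redundancy of “abab…”: since the pattern is fully predictable, its true entropy rate is essentially zero, which is the book’s point about patterns marking a signal as something other than noise.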
So getting from information to intelligibility was a problem that Shannon avoided entirely, but if you put it back in, everything gets complicated and messy. Margaret Mead asked him how body language fits into his theory; it’s an interesting question, but Shannon brushed it off, saying he was not interested in meaning. Yet asking it opens up all sorts of other questions. The meaning of a message does not depend only on the symbols used; most of it comes from the context in which they are presented. Language is a fairly simple example. In the prediction exercise I mentioned above, in which a reader attempts to guess the next character, I can be fairly certain that I would make much better predictions for English words than for, say, German ones. And that is still staying within the same set of characters! For a Hebrew or Chinese text, even if I were given a menu of characters to choose from, I’d do no better than chance, since I’m familiar with neither the symbols nor the rules by which words and sentences are formed in those languages. So even if we attempt to take a strictly scientific view of information and measure it quantitatively… we really can’t, because the amount of information varies depending on who is measuring it!
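That last point, that the measured information depends on who is measuring, can be caricatured in code. The two “readers” below are hypothetical models of my own invention: one has seen English-like text, the other has seen nothing. A reader whose expectations match the message finds each character less surprising (fewer bits) than a reader with no expectations at all.

```python
import math
from collections import Counter

def avg_surprisal(text, model_counts, alphabet_size):
    """Average bits per character under a Laplace-smoothed unigram model."""
    total = sum(model_counts.values()) + alphabet_size
    bits = 0.0
    for ch in text:
        p = (model_counts.get(ch, 0) + 1) / total
        bits -= math.log2(p)
    return bits / len(text)

# Two hypothetical readers: one trained on English-like text, one on nothing.
english_reader = Counter("the quick brown fox jumps over the lazy dog " * 100)
naive_reader = Counter()  # every character is equally surprising

message = "the dog jumps"
print(f"English reader: {avg_surprisal(message, english_reader, 27):.2f} bits/char")
print(f"naive reader:   {avg_surprisal(message, naive_reader, 27):.2f} bits/char")
```

The naive reader pays a flat log2(27) ≈ 4.75 bits for every character, while the English reader pays less on average, because its expectations already encode part of the message. Same symbols, different amounts of “information.”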
So intelligibility is a really complex thing, and very difficult to nail down. At times, the book seems to walk right up to the threshold of the moment when information is about to be understood, and just leaves it there. You’ve got Turing coming up with the Turing test, which turns the question of artificial sentience into a behaviorist one; you’ve got Shannon scolding people who take “information” beyond his narrow technical sense and try to apply the concept to their own disciplines. It gets a little frustrating to realize that there’s no way of identifying what makes something meaningful or intelligible; that’s one mystery that nobody in the book even attempts to address. There is a long discussion of random numbers, though, and of identifying patterns in them, which turns out to be really the same question, or at least a closely related one. It’s difficult to say whether a number is random or whether it means something. What, they ask, is the smallest uninteresting number? And if it’s the smallest uninteresting number, doesn’t that make it interesting? What about Borges’s Library of Babel, with every meaningful and meaningless book included in it? Am I making a deliberate reference to something totally unrelated with the title of this post? (It depends partly on whether you think I’d do that, and partly on whether you find any connection between those words and some other words. Coincidental? Hard to say!) You can often feel fairly certain that a book constitutes a meaningful message, but you can never prove that it isn’t just a random combination of signs.
Where am I going with this? I’m not sure at all. At the very least, it seems to suggest that meaning does not inhere in the information, or maybe anywhere in particular. Although it seems that meaning is something that exists apart from the arbitrary systems that humans use to represent it, the book strongly hints that there’s no consistent rule that will tell us whether something is meaningful or not.
I’m sort of fumbling with this and not expressing myself well, but the more you think about the question of what meaning is, the weirder it becomes.