Setting Things Straight: Silas Barta, information theorist by night

Monday, December 14, 2009

Silas Barta, information theorist by night

UPDATE 12/17/09: Steven Landsburg, after responding several times in the comments section here, posts a defense of his position on his blog, although without mentioning me or Bob Murphy. Hey, I can understand: if I were in his position, I'd hide the existence of me and Bob too!

***

Bob Murphy invokes my expertise on information theory to criticize (yet) another bizarre argument from Steven Landsburg, that the natural numbers are more complex than human life. Here's the mistaken part of Landsburg's reasoning:

...the most complex thing I’m aware of is the system of natural numbers (0,1,2,3, and all the rest of them) together with the laws of arithmetic ...

If you doubt the complexity of the natural numbers, take note that you can use just a small part of them to encode the entire human genome. That makes the natural numbers more complex than human life. Unless, of course, human beings contain an uncodable essence, like an immortal soul

Naturally, I don't necessarily agree with the broader theological points Bob makes in his reply, and such issues will remain even scarcer on this blog than on his. However, I will expand on point I made in discussion with Bob.

The error in Landsburg's line of reasoning is: the fact that you can use instances of X to build Y does not mean X is more complex than Y. Just the opposite, in fact: in order to describe Y, you must describe X as a substep. Like in the analogy I gave, you can use bricks and mortar to build a house, but that means it's the house that's more complex. To fully specify the house you must describe not only the bricks and mortar, but the form they take as a house -- how they're supposed to be put together.

As for arithmetic and natural numbers, it's their lack of complexity that makes them so useful. By appealing to it, you can make sense of a diverse array of phenomena. The more complex arithmetic were, the less helpful it would be in making sense of things.

Just to be clear, this doesn't mean it's easy to learn math (different people have different problems in different topics and levels), or that you can't do anything complex with math. The point is that no amount of complexity produced in using arithmetic could ever imply arithmetic's complexity, for the same reason that no matter how complex a house you make with one kind of brick, you can't make the brick more complex.

But of course, Landsburg's errors don't end there. He wants to go so far as to say that by merely encoding the genome in base 4, you've described human life. That's certainly the impression people get from discussions of DNA in the popular media and movies like Jurassic Park. Hey, all you need is a string of letters made up of A,G,C,T, and you've described someone completely!

To put it mildly: that's not how it works. First of all, you need to say what the letters actually mean. And then, even if you know that much, all you have are empty labels -- suggestively named LISP tokens. So you know that C is cytosine? Okay, but what's that? Now you need to describe where the carbons and nitrogens and oxygens go to make up cytosine. But wait -- what's this "nitrogen" thing, anyway? And so on.

Don't worry -- the process terminates: once you've described the generative model that puts all of these concepts together in a way that yields a description of human life as its output.

Needless to say, you're using more than a few integers by that point!

18 comments:

Steven E. Landsburg said...: Don't worry -- the process terminates: once you've described the generative model that puts all of these concepts together in a way that yields a description of human life as its output.

Yes, it terminates in a combinatorial structure, which is to say that it terminates in a fragment of arithmetic.

Indeed, it terminates in a particularly simple fragment of arithmetic---one that can be simulated (in principle) on a computer, which is decidedly not true of arithmetic as a whole. (This is essentially the content of Godel's first incompleteness theorem, or, better, Tarski's theorem on the undefinability of truth.); December 14, 2009 at 10:05 PM
Silas Barta said...: @Steven_E._Landsburg: Thank you for your comment.

However, you're still equivocating: between the building blocks, and what you can do with the building blocks. (And, again, between math in general and arithmetic, which you took pains to distinguish before, though I can't find the comment on your blog anymore.)

Arithmetic can be simulated on a computer; proofs for all true statements in math (by the theorems you gave) cannot. But the complexity of that proof set is not evidence for the complexity of arithmetic (or math, whichever meaning you want to use this time), any more than a complex structure made of bricks is evidence that the bricks are complex. You can't get there from here.

Your original claim as that the natural numbers, combined with the laws of arithmetic, are more complex than human life. But you can write a simple program that specifies the basic arithmetic operations and generates all natural numbers.

If you're going to count the complexity of the product of the tools toward the tools themselves, then math is not special in being complex -- in fact, nothing is, since by that premise, everything is complex, and the term has no meaning.; December 15, 2009 at 12:05 PM
Anonymous said...: This comment has been removed by the author.; December 15, 2009 at 1:38 PM
Steven E. Landsburg said...: Arithmetic can be simulated on a computer; proofs for all true statements in math (by the theorems you gave) cannot.

The theorems I gave are not about math; they are about arithmetic. No computer can generate all true statements about arithmetic; arithmetic is too complex for that.; December 15, 2009 at 1:39 PM
Silas Barta said...: No computer can generate all true statements about arithmetic; arithmetic is too complex for that.

That doesn't make arithmetic (or "the laws of arithmetic") complex. That means you can produce complex things within the system of arithmetic. Again, if you include that as part of the definition, then *everything* counts as complex, and the term has no meaning -- you can point to anything and say "no computer can generate all true statements about this".; December 15, 2009 at 3:32 PM
Steven E. Landsburg said...: you can point to anything and say "no computer can generate all true statements about this".

Well of course you can say this, but only if you're willing to be wrong. You would be wrong, for example, if you said it about euclidean geometry or the theory of real closed fields. But you'd be right if you said it about arithmetic. That seems like kind of an important distinction.; December 15, 2009 at 5:09 PM
Silas Barta said...: No, I wouldn't be wrong, and you've shown no distinction. Remember, you've decided to count any statement about X towards X's complexity. That was your justification for calling arithmetic (or math, or whatever) complex:

"No computer can generate all true statements about arithmetic."

Well, since you can just involve arbitrary arithmetic predicates in statements about apples, or Euclidean geometry, or whatever else, then, by that metric, everything is maximally complex.

But of course, no one else considers "the size of the set of true statements about X" to be the definition of X's complexity.

What you seem to want to do is use your own definition of complexity, involving how many statements can be generated about something, in one context, and then abandon it in others.; December 15, 2009 at 5:20 PM
Silas Barta said...: By the way, Steven_E._Landsburg: Do you now agree that encoding the entire human genome is insufficient to encode human life, even without resorting to the concept of a soul? Because at the very least, you need to list a large number of chemical reactions.; December 15, 2009 at 5:33 PM
Steven E. Landsburg said...: "No computer can generate all true statements about arithmetic."

Okay, I see your issue. By "all true statements about arithmetic", I meant "all true statements in the language of arithmetic". The language is imprecise, but I think appropriately so given that I wasn't writing a mathematics textbook. I am reasonably certain that when I refer to "statements about arithmetic", virtually all readers will interpret that as intended, e.g. statements like "two plus two equals four" and "every number is the sum of (at most) four squares". For a more sophisticated reader, I'd have expected this to be clear from context, given my invocation of Godel.

But since this wasn't clear to you, let me clarify it now: No computer program can generate all true statements in the language of arithmetic. One cannot say the same for euclidean geometry. This is a substantive and relevant distinction.

Do you now agree that encoding the entire human genome is insufficient to encode human life, even without resorting to the concept of a soul?

Again, you are quibbling about my informal use of language in an informal venue. Whether the genome is sufficient to "encode" life depends on what you mean by "encode". Again, I'd have thought the meaning was clear from context, and whether or not you like my choice of language, the fundamental point remains that all of the A's, C's, G's, T's, and the chemical reactions in which they take part, etc., etc., are all imbeddable in a finite fragment of arithmetic.; December 15, 2009 at 5:58 PM
Cyan said...: ...you can use just a small part of [the natural numbers] to encode the entire human genome. That makes the natural numbers more complex than human life.

You've permitted yourself the luxury of letting sophisticated readers infer the specific content of your informal statements about arithmetic by invoking Godel. You ought to permit your sophisticated readers the luxury of taking you to task when you write something that strongly and falsely implies that the human genome specifies human life without quibbling yourself.; December 15, 2009 at 6:25 PM
Silas Barta said...: By "all true statements about arithmetic", I meant "all true statements in the language of arithmetic". ... I am reasonably certain that when I refer to "statements about arithmetic", virtually all readers will interpret that as intended, e.g. statements like "two plus two equals four" and "every number is the sum of (at most) four squares". ... No computer program can generate all true statements in the language of arithmetic.

That still doesn't help you. Computer programs lack this ability because of the problem of distinguishing true from false statements. What definition of complexity hinges on "how difficult it is to make determinations about things your produce with it"?

One cannot say the same for euclidean geometry. This is a substantive and relevant distinction.

Then make the distinction using the appropriate terms. "Complexity" is already taken -- it means something else.

Again, you are quibbling about my informal use of language in an informal venue. Whether the genome is sufficient to "encode" life depends on what you mean by "encode".

No, it doesn't. No matter what definition of "encode" you use, you still need more than the genome in order to make sense of the base-4 string.

Again, I'd have thought the meaning was clear from context, and whether or not you like my choice of language, the fundamental point remains that all of the A's, C's, G's, T's, and the chemical reactions in which they take part, etc., etc., are all imbeddable in a finite fragment of arithmetic.

Your text (and discussion in your book) don't show an understanding of this distinction. Why bring up the genome at all, if you understand that it's not the same thing as human life?; December 15, 2009 at 7:07 PM
Steven E. Landsburg said...: "Complexity" is already taken -- it means something else. "Complexity" means many different things in many different contexts, some of those meanings being precise and formal and others being colloquial. The fact that it's a term of art in information theory does not detract from its usefulness as an everyday English word. If your complaint is that I didn't use the word "complexity" in the way it's used in the field in which you happen to be an expert, then I plead guilty. The fact remains that arithmetic, in the everyday sense of the word, is enormously more complex than, say, euclidean geometry, in the sense that all true statements of euclidean geometry can be generated from finitely many bits of data and there is a sense in which "most" true statements of arithmetic cannot. Come to think of it, that's pretty damn close in spirit to the technical Kolmogorov definition of complexity that you seem to think is the only allowable use of the word.

Re the genome, again---"genome" is a term of art in genetics, but it has a respectable history of everyday use as "complete set of instructions for assembling proteins into a living being", which includes not just a sequence of symbols but an attribution of meaning to each of those symbols, with the meanings incorporating the way the proteins fit together, etc etc. And what matters to my argument that the genome, in this broader more colloquial sense, is encodable as a fragment of arithmetic.

Incidentally, you used the word "produce" in one of your comments. Are you aware that "produce" is a term of art in the field of economics and that your use of that word does not coincide with the standard economic definition at all? Would it strike you as childish if I used that as a pretext to dismiss everything of content in your comments?; December 15, 2009 at 10:21 PM
Silas Barta said...: Re: Kolmogorov complexity. It's more than just an issue of you not using the standard definition of complexity in information theory; it's that that's the appropriate definition to use in the fields you discuss in your book and the blog post. I certainly don't deem it the "only allowable use of the word", but if you're going to deviate, you need to at least define complexity explicitly, which you did in neither your book nor the blog. Heck, you should have defined it even if you *were* just going with Kolmogorov complexity!

(Your insinuation I'm trying to force-fit things into a field just because I know a lot about it is noted, petty, and wrong. I do appreciate the promotion to expert though.)

But even if I'm wrong about K-complexity's appropriateness for this topic area, your definition *still* doesn't match the everyday use of the term, which actually does follow Kolmogorov complexity quite closely. It's why people intuitively see 1111111111 as being less complex than 1010101010, which in turn is less complex than 1001011110. Nowhere have I have seen a lay conception of "complexity of X" that hinged on how many bits of data you need to generate all true statements that involve permutations of components of X, as you seem to be defining the term. And such an esoteric definition of complexity takes most of the significance away from the conclusions predicated on it.

Yes, if what you were saying about *normal* complexity were true, that would be cool, but you're only saying it about Landsburg-complexity, which no one cares about.

(Incidentally, I've been giving your comparison of Euclidean geometry less attention, but that's just as wrong too -- you're placing a tighter constraint on what counts as a geometric statement compared to an arithmetic one. After all, you can line up squares and make statements about them that are isomorphic to statements about the natural numbers.)

"genome" is a term of art in genetics, but it has a respectable history of everyday use as "complete set of instructions for assembling proteins into a living being", which includes not just a sequence of symbols but an attribution of meaning to each of those symbols, with the meanings incorporating the way the proteins fit together, etc etc.

Not any place I'm aware of. Rather, people assumed that the literal genome would get them the benefits of the figurative genome, and the confusion has persisted despite this being proved wrong. Whatever history that is, isn't respectable. And you're not doing anyone a favor by perpetuating this misconception.

Incidentally, you used the word "produce" in one of your comments. Are you aware that "produce" is a term of art in the field of economics and that your use of that word does not coincide with the standard economic definition at all? Would it strike you as childish if I used that as a pretext to dismiss everything of content in your comments?

It strikes me as childish for you to make this comparison. If I were dealing with subject matter where "produce" in the economic sense is the standard definition, but used it in a completely different way, foreign to that subject matter and the lay sense, AND made an argument crucially dependent on the meaning of "produce", AND lacked a definition anywhere, AND which made my thesis trivial if my definition actually *were* used, then yes, all of the claims I've made about your use of "complexity" would be justified.; December 15, 2009 at 11:26 PM
Steven E. Landsburg said...: It's why people intuitively see 1111111111 as being less complex than 1010101010, which in turn is less complex than 1001011110.

Yes, and in pretty much the same way that the set of true statements in arithmetic is more complex than the set of true statements in euclidean geometry.

The various theorems to this effect (e.g. Tarski's) will not go away just because you happen not to understand them---as is well evidenced by your vision of "lining up squares". You might attempt the exercise of restating the Peano axioms in (say) Tarski's language of geometry if you want to see where your vision fails.; December 15, 2009 at 11:56 PM
Silas Barta said...: Yes, and in pretty much the same way that the set of true statements in arithmetic is more complex than the set of true statements in euclidean geometry.

No, not the same way. The strings I listed are more (or less) complex depending on how long a program it takes to describe them.

The two axiom sets that you listed differ in complexity -- by your definition -- because it takes a different number of bits to categorize, into true or false, the entire set of statements that can be generated from those axiom sets. That's *not* "pretty much the same way", and I don't know who speaks of complexity like that.

I'd go over the point about geometry vs arithmetic (the problem is not the theorem but your different standards for what counts as a statement within each of them) but that's a relatively minor error compared to the rest of the problems with your claim, which persist even if your definition were valid. To summarize:

- You say that math is complex because you can't generate all true statements about arithmetic on a computer.

- This only counts as complex because you use a non-standard, counterintuitive, never-used definition of complexity that you never spell out explicity (which would make you see its clumsiness).

- You want to import all of the standard implications and connotations about complexity as if you were using the normal definition.

Again, that's not how it works. The very reason people use math in the first place is because it's simple, which allows one to efficiently compress descriptions of our observations. If it were actually the most complex "thing in the universe" (and I haven't even touched your Platonism, which is another can of worms), there would be no point to expressing scientific laws in terms of it!

If I had to say, I think the word you're looking for is not "complexity" but "usefulness" or "expressiveness".; December 16, 2009 at 10:23 AM
Steven E. Landsburg said...: - You say that math is complex because you can't generate all true statements about arithmetic on a computer.

- This only counts as complex because you use a non-standard, counterintuitive, never-used definition of complexity that you never spell out explicity (which would make you see its clumsiness).

What (I think) you are missing is that the incompleteness of Peano arithmetic implies the existence of nonstandard models, which means that Peano arithmetic is NOT a complete description of arithmetic. Nor does any computable description exist.

I'll spell that out again: The fact that the true statements about arithmetic can't be generated by a computer *implies* that there are non-standard models of the axioms, which *means* that no computer can completely describe the natural numbers.

The non-existence of a computable description is very much in the spirit of the Kolmogorov definition of complexity.

Tell you what: Since you are not the only reader who is badly confused about this issue, I'll post a fuller explanation on my own blog tomorrow morning. Then you can let me know if it's still not clear.; December 16, 2009 at 4:14 PM
Silas Barta said...: What (I think) you are missing is that the incompleteness of Peano arithmetic implies the existence of nonstandard models, which means that Peano arithmetic is NOT a complete description of arithmetic. Nor does any computable description exist.

Actually, computable descriptions of addition, subtraction, multiplication, and division, applied to natural numbers, do in fact exist. That was your definition or arithmetic, remember?

Again, you're equating the complexity of X, with the complexity of the task of classifying things built from X. It's the same mistake as if you said, "It takes a lot of data to generate a list of all brick structures that won't fall down; therefore, bricks are complex."

Tell you what: Since you are not the only reader who is badly confused about this issue, I'll post a fuller explanation on my own blog tomorrow morning. Then you can let me know if it's still not clear.

Go for it! But you might want to settle on a definition of complexity before deciding that your views are the non-confused ones. First, you objected that Kolmogorov complexity isn't the only definition of complexity, then your said your usage is consistent with it. And you've added lots of analysis and (re)definition that wasn't in your blog post or book, which can't do anything to substantiate your *original* claims being correct.; December 16, 2009 at 10:46 PM
Joshua said...: Posting a comment here because Landsburg's post is now closed to comments. He substitutes well-ordering for individual well-ordering axioms for each subset of N and then writes:

"Okay, our description of the natural numbers just got infinitely long, but at least it’s infinitely long in a simple sort of way. We’ve added an infinite number of axioms, but they all fit the same simple pattern—a pattern that you could easily train your computer to recognize."

This misses a major issue. In order for that to be equivalent to well-ordering one needs an uncountable number of axioms (since the cardinality of the powerset of N is the cardinality of the continuum.) If one tries to do this in a computable fashion one ends up with a structure that is strictly weaker than what you get with full well-ordering. I can in fact add to his axiomatic system an axiom to the effect that there's a non-empty subset of N that has no least element, and that will be consistent but omega-inconsistent (I'm glossing over some details here but that's roughly what happens).; July 19, 2010 at 7:17 PM