January 27, 2021

Party Games

Like driftwood spars, which meet and pass
Upon the boundless ocean-plain,
So on the sea of life, alas!
Man meets man,—meets and quits again.
Matthew Arnold (1822-1881), "Switzerland"

In work and play. from hour to hour and year to year, we pass from one gathering to another. Sometimes we are surrounded by people we know; sometimes we find ourselves in the presence of strangers. The restless ebb and flow of human concourse appears to be random, yet there are curious patterns present in the composition of these random groupings of strangers and acquaintances who come and go.

Whether in large crowds, modest social gatherings, or small groups, people congregate in all sorts of combinations of strangers and acquaintances. Consider the varied possibilities in a party of six.

You're one of six people at a dinner party. You look across the table at your five companions. You would undoubtedly find that the dinner party includes either a group of at least three people who all know one another or a group of at least three people who don't know one another.

The reason for this certainty lies in a mathematical proof that any gathering of six people will always automatically include one or the other of these two groupings. No such guarantee is possible when five or fewer people are present.

To prove that any party of six must contain at least three acquaintances or at least three strangers, you could write down all the possible combinations, starting with a group in which everyone knows one another. However, there are 32,768 such possibilities.

This number reflects the fact that any pair in a group of six must be either acquainted or strangers. Moreover, there are six choices of people for the first member of a pair, which leaves five choices for the second member of the pair, for a total of 30 possibilities. Because the order of the choices in a particular case doesn't matter (picking Alice first and then Barry is the same as picking Barry, then Alice), the number of possibilities that count is down to 30/2, or 15. Overall, for two possible relations (stranger or acquaintance), the number of possibilities is 215, or 32,768.

Luckily, there's a shorter path to a proof. You need consider only two particular cases.

Suppose a party is attended by the conveniently named guests Alice, Barry, Charles, Diana, Edith, and Frank. Of the five partygoers Barry, Charles, Diana, Edith, and Frank, either at least three are friends of Alice or at least three are strangers to Alice.

Suppose Alice knows Barry, Charles, and Diana. If either Barry and Charles, Barry and Diana, or Charles and Diana are friends, then Alice and the acquainted pair make three people who know one another. Otherwise Barry, Charles, and Diana are mutual strangers.

In the second case, suppose Alice knows only two (or fewer) of the others, say, Barry and Charles. If either Diana and Edith, Diana and Frank, or Edith and Frank are strangers, then Alice and the unacquainted pair make three people who do not know one another. Otherwise, Diana, Edith, and Frank are mutual acquaintances.

This argument covers all possible cases for each of the six dinner-party guests.

The result represents a special case stemming from a branch of mathematics known as Ramsey theory, named for the mathematician Frank Plumpton Ramsey (1903-1930).

Instead of talking about six people, you could consider groups consisting of any number of people. Instead of specifying just two relationships (acquaintances and strangers), you could have any number of mutually exclusive relations. For example, you could consider people (or nations) who are allies, foes, and neutral parties—representing three mutually exclusive relationships—embroiled in a widespread conflict.

In general, Ramsey theory concerns the existence of highly regular patterns in sufficiently large sets of randomly selected objects, whether they are gatherings of people, sequences of randomly generated numbers, randomly placed points in the plane, or even stars in the night sky.

Tabulating the outcomes of 2,500 flips of a coin by sequentially filling in the squares of a 50-by-50 grid (black for tails, white for heads) produces a random array. It is not difficult, however, to pick out fragments of patterns among the black or white squares, in much the same way that you can find patterns among stars randomly scattered across the sky. In this case, one array represents tosses of a fair coin and the other represents tosses of a slightly biased coin. Can you tell which is which?

It's straightforward to convert a party puzzle into a graph theory problem. In general, a graph consists of an array of points, or vertices, connected by straight lines, which are often called edges.

In the dinner-party case, you can draw a point for each of the six people present in the group. You can then draw lines to join the points, with each line colored red to signify two people who know each other or blue to mean the two people are strangers. When all pairs of points are joined, the resulting network of points and lines is known as a complete graph.

Depending on the relationships specified in a given case, such a graph may contain only red lines (all acquaintances), only blue lines (all strangers), or a mixture of red and blue lines joining the points. The problem comes down to proving that no matter how the lines are colored, you can't avoid producing a red triangle (representing three mutual acquaintances) or a blue triangle (three strangers).

A graph depicting relationships among six people at a party, where red lines link acquaintances and blue lines connect strangers.

Such a coloring is guaranteed for a complete graph of six points, but not necessarily for a graph of five or fewer points, as the following figure demonstrates.

This graph representing the relationships of five people shows that it is possible to obtain a gathering in which there is no group of three strangers or three acquaintances. In other words, there is no triangle in the diagram that is made up entirely of red or blue lines.

Wolfgang Slany turned the party problem into a two-person game he called HEXI (see his paper "Graph Ramsey Games"). You and your opponent take turns coloring in the uncolored edges of a complete graph laid out as a hexagon, with each player using a different color. The first person to build a triangle with all three edges of the same color loses.

Slany also had a variant of the game in which a player may choose to color in more than one edge with each turn.

In both cases, Ramsey theory ensures that HEXI never ends in a draw.

Previously: The Long Run

January 24, 2021

Playing with Ruth-Aaron Pairs

On April 8, 1974, Henry (Hank) Aaron hit his 715th major league home run, surpassing the previous mark of 714 career home runs long held by baseball great Babe Ruth. Understandably, the event received considerable coverage in newspapers and magazines and on television.

However, those reports invariably overlooked the mathematical aspects of the achievement, particularly the curious properties of the two numbers 714 and 715. It took the efforts of mathematicians Carol Nelson, David E. Penney, and Carl Pomerance to call attention to this facet. (See Numberphile episode "Aaron Numbers" featuring Carl Pomerance).

Notice that 714 = 2 x 3 x 7 x 17 and 715 = 5 x 11 x 13; so 714 x 715 = 2 x 3 x 5 x 7 x 11 x 13 x 17. In other words, the product of the two consecutive integers 714 and 715 is equal to the product of the first seven prime numbers!

Pomerance and his colleagues wondered whether there were other pairs of consecutive numbers whose product is also the product of the first k primes.

The first few instances are easy to find: 1 and 2 (1 x 2 = 2), 2 and 3 (2 x 3 = 2 x 3), 5 and 6 (5 x 6 = 2 x 3 x 5), 14 and 15 (14 x 15 = 2 x 3 x 5 x 7), and 714 and 715.

The mathematicians then used a computer to search for such pairs, going as far as products of the first 3,049 primes (numbers up to 6,021 digits long). They found no additional examples in that range. It’s now conjectured that 714 and 715 is the last pair of consecutive integers whose product is the product of the first k primes for some k.

And there's more. Notice that the sum of the prime factors of 714 is 2 + 3 + 7 + 17 = 29, and the sum of the prime factors of 715 is 5 + 11 + 13 = 29. How often do two consecutive numbers have prime factors that add up to the same total?

Pomerance and his coworkers conducted another computer search, looking for such pairs up to a value of 20,000. Here are the first few examples:

Such numbers are now known as Ruth-Aaron pairs. Pomerance speculated that these pairs become less frequent as their size increases. However, he didn’t have a mathematical proof quantifying their scarcity.

Within a week of the appearance of these results in a 1974 issue of the Journal of Recreational Mathematics, Pomerance received a letter from legendary mathematician Paul Erdős (see "Paul Erdős and an Infinity of Problems"). Erdős offered to show Pomerance how to prove the conjecture.

Pomerance invited Erdős to Georgia, and the meeting resulted in a joint paper giving the proof, which was published in 1978. It was the first of more than 40 papers that the two mathematicians would write together.

During his lifetime, Erdős collaborated with so many mathematicians that these efforts have been captured in something called the Erdős number. Erdős is assigned the number 0. People who have coauthored a paper with him are given the number 1. People who have coauthored a paper not with Erdős but with someone who coauthored a paper with Erdős get the number 2, and so on.

Pomerance noted that many years after his initial collaboration with Erdős, Emory University awarded honorary degrees to both Erdős and Aaron. Pomerance was invited to that occasion, and he asked both recipients to autograph a baseball for him. In effect, Hank Aaron joined the elite ranks of those having Erdős number 1! Even though Aaron didn't have a joint paper with Erdős, he did have a joint baseball.

In the meantime, the search for Ruth-Aaron pairs continued. In 1996, John L. Drost reported that a computer search yielded 149 pairs below 1,000,000. Somewhat later, Joe K. Crump developed a method for generating Ruth-Aaron pairs. It hasn't yet been proved, however, that the number of such pairs is infinite.

Drost went on to look for other analogously interesting sets of numbers. For example, he found a triple with equal prime sums: 417162 = 2 x 3 x 251 x 277; 417163 = 17 x 53 x 463; and 417164 = 2 x 2 x 11 x 19 x 499. All have prime sums of 533. Is there another such triple? No one knows.

Going back to 714 and 715, there’s still more. Note that 714 + 715 = 1429. Notice anything about 1429? It’s a backwards-forwards-sideways prime, which means that 1429, 9241, 1249, 9421, 4129, 4219 are all prime numbers. What about 1492? That was the year that Christopher Columbus stumbled upon America.

Originally posted August 7, 2005

January 20, 2021

Poe's Secrets

Writer Edgar Allan Poe (1809-1849) is famous for his short stories of the mysterious and the macabre. Poe's tale "The Gold-Bug," published in 1843, is sometimes cited as one of the best works of fiction that turn upon a secret message.

"The Gold-Bug" is about a secret message written in invisible ink on a scrap of parchment. The deciphered message leads to a buried chest filled with fabulous treasure. Here's the coded message that Poe included in his story:

The clever treasure seeker in story, William Legrand, assumed that symbols stood for letters of the alphabet, with no spaces left between the words of the message. He noticed that the character "8" appears 33 times, far more often than any other character. In the English language, the letter that occurs most often is "e."

Starting with that clue, he went on to look for combinations of three characters that might represent "the"—a very common word in English. Legrand could then guess that the semicolon represents "t" and 4 represents "h."

Following such hints, Legrand deciphered the secret message. Clues contained in the mysterious message eventually led him to a fortune in gold and jewels.

Poe had a longstanding interest in cryptology. When he for a short time became editor of Graham's Lady's and Gentleman's Magazine in Philadelphia, he offered to solve cryptograms sent to him by readers and, while waiting for responses, wrote an essay about secret writing for the July 1841 issue of the magazine.

When Poe ended the contest 6 months later, he claimed to have solved the 100 or so legitimate ciphers that readers had submitted. In his final "addendum" on the topic, published in the December issue, Poe included two ciphers from "a gentleman whose abilities we very highly respect." He attributed the ciphers and accompanying notes to W.B. Tyler and challenged readers to solve the puzzles. Poe never published the solutions.

No one seemed to have paid much attention to the ciphers until 1985. That's when Poe scholar Louis A. Renza suggested in an essay, titled "Poe's Secret Autobiography," that Tyler and Poe were the same person.

English professor Shawn Rosenheim took up Renza's theory and, in his 1996 book The Cryptographic Imagination: Secret Writing from Edgar Poe to the Internet, furnished additional circumstantial evidence pointing to Poe as the author of the two ciphers.

Meanwhile, in 1992, intrigued by Rosenheim's ongoing research on the topic, Terence Whalen took a break from working on his dissertation to tackle the puzzle. He solved the first of Tyler's cryptograms, which consisted of a long string of various typographic symbols.

The key to the solution was Whalen's recognition that the three-symbol pattern ", † § "—repeated seven times in eight lines—represents the word "the."

The deciphered passage proved to be lines from the 1713 play Cato, a Tragedy by English essayist Joseph Addison (1672-1719). Whalen's solution, however, did not settle the question of who created the cipher.

The second cipher, which used several different symbols for each English letter in the text, was much more difficult.

Determined to obtain a solution, Rosenheim offered a prize of $2,500 for deciphering the cryptogram. The contest received a substantial boost in 1998 when Jim Moore of Bokler Software, which builds cryptographic components for software developers, created a website devoted to the puzzle.

Rosenheim and Moore fielded hundreds of inquiries from around the world. The puzzle was finally solved in July 2000 by Gil Broza, a software engineer in Toronto.

It turned out that the number of different symbols for a given letter depended on the relative frequency with which that letter appears in English text. So, there were two symbols standing for "z" and 14 symbols for "e." Given the brevity of the cipher, would-be sleuths couldn't count on applying information about letter frequencies to crack the code.

Broza assumed that the symbols in the cryptogram were properly broken up into English words and proceeded from there, eventually using a computer to check for patterns and sort through various possibilities.

Broza's solution revealed that the original cipher contained more than two dozen mistakes, which had been introduced by the typesetter or the puzzle's originator. The deciphered text begins, "It was early spring, warm and sultry glowed the afternoon. The very breezes seemed to share the delicious languour of universal nature,…"

That certainly doesn't sound like Poe, but the text echoes themes that he favored. The passage is probably taken from some unknown novel or story of the period, Rosenheim said. It's still possible, though not certain, that Poe himself composed both ciphers.

There is something mysterious even in the decrypted cipher, Rosenheim added, not only because we do not know who enciphered it but also because it reminds us of the uncanny and limited immortality writing sometimes affords.

Originally posted November 13, 2000

January 18, 2021

Cracking a Medieval Code

The first printed book on cryptology was written by Johannes Trithemius (1462-1516), an abbot in Germany who was one of the leading intellectuals of his day. Bearing the title Polygraphia, it was published in 1518 after Trithemius's death.

Polygraphia title page.

The first of the six books of Polygraphia contains 384 columns of Latin words, two columns per page. Each word stands for a letter of the alphabet. Here's a sample from the first page:

By taking the words standing for the letters of a secret message from consecutive columns, it's possible to construct passages that make sense as innocent prayers. For example, enciphering the letters of the word abbot would generate the Latin sentence DEUS CLEMENTISSIMUS REGENS AEVUM INFINIVET.

The remaining books of Polygraphia introduce additional cryptographic schemes, accompanied by lengthy tables, for ingeniously hiding information.

Polygraphia wasn't Trithemius's first venture into cryptology. In 1499, he had composed a controversial, cryptic volume called Steganographia (meaning "covered writing"). For years, it circulated privately in manuscript form before finally being printed in 1606, then placed on the official list of prohibited books in 1609. Ostensibly, it explained how to employ spirits to send secret messages.

Steganographia title page.

The first two books of Steganographia contain numerous examples of some simple types of ciphers. For instance, in the message beginning PARAMESIEL OSHURMI DELMUSON THAFLOIN PEANO CHARUSTREA MELANY LYAMUNTO…, the first nonsense word signals the specific cryptographic system being used. The decipherer then knows to extract every other letter of every other word, starting with the second word, to get the message (in Latin): Sum tali cautela ut….

Book III consists largely of tables of numbers, whose columns are headed by zodiacal and planetary symbols, suggesting astronomical data. Unlike the first two books, however, there are few clues to help decipher the contents.

For centuries, scholars debated whether the incomplete third book of Steganographia contains any examples of enciphered messages. Many concluded that it does not hold cryptographic secrets and merely represents magical operations of interest only to occultists.

Nonetheless, the preface of Book III begins by announcing the provocative goal of presenting a method of transmitting messages afar without the use of word, book, or messenger. Trithemius warns, however, that he has deliberately expressed himself obscurely:

"This I did that to men of learning and men deeply engaged in the study of magic, it might, by the Grace of God, be in some degree intelligible, while on the other hand, to the thick-skinned turnip-eaters it might for all time remain a hidden secret, and be to their dull intellects a sealed book forever."

Those were fighting words to mathematician and cryptanalyst Jim Reeds, who couldn't resist taking on the centuries-old challenge.

"On receiving a photocopy of the Steganographia, I decided to see if I could find any hidden messages in Book III," Reeds recounted in a paper published in 1998 in the journal Cryptologia.

"I knew that Book III was probably in a draft state." he added. "Hence, it might be missing important information; the printed version had of course not received the author's proof corrections."

At the same time, he noted, "if Book III was anything like Books I and II, it was probably pointless to try to follow the instructions given in the text. Moreover, I could expect that any plain [deciphered] texts would be short and banal."

Reeds made a lucky guess. He assumed that the cipher was numerical and that the tables were to be read in columns vertically. He also decided that the table accompanying the preface was in the form of a key, in which successive lines describe blocks of 25 numbers each, which might specify distinct letters-to-numbers encryption formulas.

Reeds started by rewriting the first numerical table, turning the columns into rows, excluding all headings and data not appearing within the original columns, and replacing any nonnumerical symbols with a / sign. Here's a sample:

/ 644 650 629 650 645 635 646 636 632 646 639 634 641 642 649 642 648 638 634 647 632 630 642 633 648 650 655 626 650 644 638 633 635 642 632 640 637 643 638 634 / 669 675 654 675 670 660 675 661 651 671 664 659 666 667 674 667 673 663 659 672 657 655 667 658 673 675 660 651 675 669 663 658 660 667 637 665 662 668 663 659 / 694 700 679 700 695 685 696 686

He noticed that the slashes divide the first 160 numbers into four blocks of exactly 40 numbers each. Moreover, almost all the numbers in each block fall within a particular numerical range.

Reeds wrote the four, 40-number blocks in four rows, one underneath the other, to see if there was any similarity in structure among the rows:

He found that, with few exceptions, a number in a given row is 25 greater than the corresponding number in the row above.

"Although I still did not know that there was a cipher present," Reeds said, "it was clear from the emergence of this pattern that there was enough truth in my initial guesses about column reading and the importance of the number 25 to continue further."

"And if there were a cipher present, this finding would surely be due to the presence of four copies of an isolog: four copies of the same plain text encrypted in different but related ways," he continued.

"If I knew how to read those parts of the text encoded with numbers in the range 626 through 650, I could probably use the same recipe to read those parts encoded with numbers in the range 651 through 675: Simply subtract 25 from each number and proceed as before."

Reeds went on to check how often each of the 25 different numbers of a row were used:

The counts looked uneven enough to be consistent with a Latin or German text rather than just random outcomes.

A little bit of experimentation revealed that a reversed 22-letter alphabet apparently fit the observed frequency distribution: 650 = A, 649 = B, and so on, through an alphabet consisting of the letters A, B, C, D, E, F, G, H, I, L, M, N, O, P, Q, R, S, T, U, X, Y, Z, along with three additional symbols beyond Z (arbitrarily labeled α, β, and γ).

Applied to the 40 letters of the first row, this guess yields: gazafrequenslibicosduyitca?[γ]agotriumphos. That's certainly pronounceable, and it has hints of meaningful Latin words.

Additional clues helped Reeds unveil the scheme that Trithemius had used. For example, the symbol β is really the common German letter sequence sch, and what Reeds had thought to be x is w. He then discovered that α is tz and γ is th.

"One final piece of luck cinched the identification of the cipher alphabet," Reeds said. He performed an Internet search for the two-word phrase gaza frequens and came up with the Latin passage Gaza frequens Libycos duxit Carthago triumphos…. This confirmed that γ is th and suggested that the letter Reeds had labeled as y is actually x.

The Book III ciphers turn out to be numerical substitution ciphers, with multiple numerical equivalents supplied for each plain text letter, Reeds concluded.

What did Trithemius go to so much trouble to encipher? The text is somewhat garbled, probably indicating that pieces have been lost over the years or were missing to start with. Thus, the text now available represents little more than a collection of isolated sentence fragments.

Those fragments reveal nothing astonishing: mundane Latin and German phrases, including one that can be loosely translated as "the bringer of this letter is a bad rogue and a thief."

"Book III contains cryptograms," Reeds said. "Like those in Books I and II, they are disguised and presented in a context of angelic magic." However, the cryptographic technique is different because the letters are represented by numbers, disguised as astronomical data, instead of being hidden within a larger mass of letters.

Trithemius might have chosen angel language not to promote magic but as a ploy to capture the reader's interest. "If so," Reeds says, "he was vastly successful, even if he completely miscalculated how his book would be received."

In one final twist to this tale, it turned out that Reeds was not the first to reveal the ciphers in Book III of Trithemius's Steganographia (see New York Times report).

German language scholar Thomas Ernst had resolved the problem several years before when he was a graduate student at the University of Pittsburgh. He had written a paper describing his solution, which was published in 1996 in the German literature and culture journal Daphnis but apparently drew little attention.

In 1676, an obscure figure named Wolfgang Heidel had also claimed that he had deciphered Book III, but his findings were disputed when he insisted on writing about the discovery in his own secret code. Ernst also managed to crack Heidel's coded commentary.

The magic of Trithemius's cryptic work has finally evaporated in the face of determined code-breaking.

Originally posted May 4, 1998

January 12, 2021

The Limits of Mathematics

At the beginning of the 20th century, the German mathematician David Hilbert (1862–1943) advocated an ambitious program to formulate a system of axioms and rules of inference that would encompass all mathematics, from basic arithmetic to advanced calculus. His dream was to codify the methods of mathematical reasoning and put them within a single framework.

Hilbert insisted that such a formal system of axioms and rules should be consistent, meaning that you can't prove an assertion and its opposite at the same time. He also wanted a scheme that is complete, meaning that you can always prove a given assertion either true or false. He argued that there had to be a clear-cut mechanical procedure for deciding whether a certain proposition follows from a given set of axioms.

Hence, it would be possible, though not actually practical, to run through all possible propositions, starting with the shortest sequences of symbols, and check which ones are valid. In principle, such a decision procedure would automatically generate all possible theorems in mathematics.

What Hilbert was saying is that "we can solve a problem if we are clever enough and work at it long enough," mathematician Gregory J. Chaitin wrote in his 1998 book The Limits of Mathematics: A Course on Information Theory and the Limits of Formal Reasoning. "He didn't believe that in principle there was any limit to what mathematics could achieve."

In the 1930s, Kurt Gödel (1906–1978), followed by Alan Turing (1912–1954) and others, proved that no such decision procedure is possible for any system of logic made up of axioms and propositions sufficiently sophisticated to encompass the kinds of problems that mathematicians work on every day.

"More precisely, what Gödel discovered was that the plan fails even if you just try to deal with elementary arithmetic, with the numbers 0, 1, 2, 3,… and with multiplication and addition," Chaitin wrote in his 2004 paper META MATH! The Quest for Omega. "Any formal system that tries to contain the whole truth and nothing but the truth about addition, multiplication, and the numbers 0, 1, 2, 3,… will have to be incomplete."

In Gödel's realm, no matter what the system of axioms or rules is, there will always be some assertion that can be neither proved nor invalidated within the system. Indeed, mathematics is full of conjectures—assertions awaiting proof—with no assurance that definitive answers even exist.

Turing's argument involved mathematical entities known as real numbers. Suppose you happen upon the number 1.6180339887. It looks vaguely familiar, but you can't quite place it. You would like to find out whether this particular sequence of digits is special in some way, perhaps as the output of a specific formula or the value of a familiar mathematical constant.

It turns out that the given number is the value, rounded off, of the so-called golden ratio, which can also be written as (1 + √5)/2, an example of a real number. Given that expression, which represents an infinite number of decimal digits, you can compute its value to any number of decimal places. Going in the opposite direction from the given rounded-off number to the expression, however, is much more difficult and problematic.

For example, it's possible that if the mystery number were available to an extra decimal place, the final digit would no longer match the decimal digits of the golden ratio. You would have to conclude that the given number is not an approximation of the golden ratio. Indeed, the extended string of digits could represent the output of a completely different expression or formula, or even part of a random sequence. It's impossible to tell for sure. There isn't enough information available.

To sort through the relationship between random sequences and the types of numbers that mathematicians and scientists use in their work, Chaitin defined the "complexity" of a number as the length of the shortest computer program (or set of instructions) that would spew out the number.

"The minimum number of bits—what size string of zeros and ones—needed to store the program is called the algorithmic information content of the data," Chaitin wrote in the article "The Limits of Reason," published in the March 2006 Scientific American. "Thus, the infinite sequence of numbers 1, 2, 3,… has very little algorithmic information; a very short computer program can generate all those numbers."

"It does not matter how long the program must take to do the computation or how much memory it must use—just the length of the program in bits counts," he added.

Similarly, suppose a given number consists of 100 1s. The instruction to the computer would be simply "print 1, 100 times." Because the program is substantially shorter than the sequence of 100 1s that it generates, the sequence is not considered random. If a sequence is disorderly enough that any program for printing it out cannot be shorter than the sequence itself, the sequence counts as algorithmically random. Hence, an algorithmically random sequence is one for which there is no compact description.

Interestingly, the number pi (the ratio of a circle's circumference to its diameter), which is expressed by an infinite number of digits, has little algorithmic information content because a computer can use a relatively small program to generate the number, digit by digit: 3.14159…. On the other hand, a random number with merely 1 million digits has a much larger amount of algorithmic information.

Chaitin proved that no program can generate a number more complex than itself. In other words, "a 1-pound theory can no more produce a 10-pound theorem than a 100-pound pregnant woman can birth a 200-pound child," he liked to say.

Conversely, Chaitin also showed that it is impossible for a program to prove that a number more complex than the program is random. Hence, to the extent that the human mind is a kind of computer, there may be a type of complexity so deep and subtle that the intellect could never grasp it. Whatever order may lie in the depths would be inaccessible, and it would always appear to us as random.

At the same time, proving that a sequence is random presents insurmountable difficulties. There's no way to be sure that we haven't overlooked a hint of order that would allow even a small compression in the computer program that produces the sequence.

From a mathematical point of view, Chaitin's result suggests that we are far more likely to find randomness than order within certain domains of mathematics. Indeed, his complexity version of Gödel's theorem states: Although almost all numbers are random, there is no formal axiomatic system that will allow us to prove this fact.

Chaitin's work indicates that there is an infinite number of mathematical statements that you can make about, say, arithmetic that can't be reduced to the axioms of arithmetic. So there's no way to prove whether the statements are true or false by using arithmetic. In Chaitin's view, that's practically the same as saying that the structure of arithmetic is random.

"What I've constructed and exhibited are mathematical facts that are true… by accident," he said. "They're mathematical facts which are analogous to the outcome of a coin toss…. You can never actually prove logically whether they're true."

This doesn't mean that anarchy reigns in mathematics, only that mathematical laws of a different kind might apply in certain situations. In such cases, statistical laws hold sway and probabilities describe the answers that come out of equations. Such problems arise when you ask whether an equation involving only whole numbers has an infinite number of whole-number solutions, a finite number, or none at all.

"In the same way that it is impossible to predict the exact moment at which an individual atom undergoes radioactive decay, mathematics is powerless to answer particular questions," Chaitin stated. "Nevertheless, physicists can still make reliable predictions about averages over large ensembles of atoms. Mathematicians may in some cases be limited to a similar approach."

That makes mathematics much more of an experimental science than many mathematicians would like to admit.

Chaitin went further. Human creativity is absolutely necessary for mathematical work, he argued, and "intuition cannot be eliminated from mathematics."

Originally posted March 6, 2006

January 11, 2021

Coyote Fence Passageway


Dale Ball Trails South (Wilderness Gate), Santa Fe, New Mexico, 2021.

Photo by I. Peterson

January 10, 2021

Winter Grass


Feather reed grass (Karl Foerster), Santa Fe, New Mexico, 2021.

Photo by I. Peterson

January 9, 2021

Mountain Clouds


Sierra Del Norte (Dale Ball Trails North), Santa Fe, New Mexico, 2020.

Photo by I. Peterson

January 7, 2021

Dancing Musical Chaos

Point by glowing point, the image swirls into view. As it builds up on the computer screen, it begins to resemble a delicate, stylized butterfly with translucent wings held lazily askew.

It's called the Lorenz attractor, named for Edward N. Lorenz, who in 1963 discovered this curious form encoded in a set of equations describing air flows in the atmosphere. The computer image arises out of a chaotic—in the mathematical sense—system.

For a given starting point, the computer calculates the coordinates of each successive point as the dynamical system described by the equations evolves. It displays these points as luminous dots on the screen. They appear to sprinkle themselves randomly across the display, but gradually a distinctive butterfly pattern emerges.

Different starting coordinates typically lead to radically different sequences of calculated points. The overall pattern, however, can always be identified as the Lorenz butterfly. It's an example of both the sensitive dependence on initial conditions and the distinctive patterns that are characteristic of chaotic systems.

In the early 1990s when Diana Dabby was a graduate student in electrical engineering at the Massachusetts Institute of Technology, she could imagine using the mathematics of chaos to compose music. She envisioned "riding the back of the attractor" to create musical variations that stray in unexpected ways yet do not wander so far as to lose all ties with the original music.

A concert pianist before she came to MIT, Dabby devised a scheme for using the Lorenz attractor to generate variations on the sequences of notes in a piece of music. Her initial experiments were done on Bach's Prelude in C from the first book of The Well-Tempered Clavier. Audio (Tracks 3, 4, 5).

The x coordinates of the points that make up the Lorenz attractor for a given starting point fall within a certain range of numbers. Dabby's idea was to list the pitches of all the notes or chords of a musical piece and assign them one by one, in order, to the x coordinates of points belonging to the attractor. In this way, she paired up each of the pitches in the original music with a particular range of x values.

Then she could choose a second starting point only slightly different from the first to produce a new "trajectory", or sequence of points, making up the Lorenz attractor. Because this new trajectory generally does not track the original one perfectly, the x coordinates of the two trajectories differ and the musical notes corresponding to these new x coordinates may occasionally change from those in the original piece.

You can imagine that the initial "mapping" step lays down the musical landscape, and the second trajectory, representing the variation, takes a slightly different path through this terrain. Because the landscape incorporates features of the original musical piece, any variations created in this way usually sound consistent with the source piece. Indeed, no variation can include a pitch not present in the original.

"The musical variations that result can be close to the original, mutate almost beyond recognition, or achieve degrees of variability between these extremes," Dabby noted. "The technique can also be used to infuse a given work with the attributes of another, so that, for instance, a work by Bach can acquire attributes of a work by Gershwin."

To Dabby, this method of producing musical variations served as an idea generator. It brought a fresh perspective to familiar music. A musician could interact with the variation-producing software to select, edit, and record particularly pleasing passages, even weave them into new compositions.

Dabby applied the technique not only to works by Bach but also to a Gershwin piece and some of her own compositions. She also explored the use of chaotic mappings to create rhythmic variations via the y variable of a Lorenz attractor.

"Once variations of an entire piece are available, the composition can change with successive listenings, from performance to performance, or even within the same concert," Dabby remarked. "In a broad sense, the music has become dynamic. It changes with time much the same way as a river changes from day to day, season to season, yet is still recognized in its essence."

Inspired by Dabby's example, computer scientist Elizabeth Bradley and Joshua Stuart developed a similar scheme for dance. They used chaos to generate variations on movement sequences.

"We map a progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence onto a chaotic attractor, establishing a symbolic dynamics that links the movement progression and the attractor geometry," Bradley and Stuart reported in the article "Using Chaos to Generate Variations on Movement Sequences," published in the  December 1998 Chaos.

The researchers used special symbols to represent human body postures, encoding those positions by defining an axis and angle of rotation (given in the form of a mathematical expression called a quaternion) for each of the main joints. They then mapped a given motion sequence—whether a ballet jump or sequence of karate moves—onto a chaotic attractor. Following a new trajectory around the attractor produced a variation of the original motion sequence.

In effect, the choreographic software took an animation as input and generated an animation as output.

The researchers also had to adjust for the capabilities of the human body. "While musical instruments can play arbitrary pitch sequences, subject to instrument range and performer ability, both kinesiology and aesthetics impose a variety of constraints on consecutive body postures in dance and martial arts genres," Bradley and Stuart noted.

To smooth out abrupt transitions introduced by the chaotic mapping, the researchers developed schemes that captured and enforced particular dance styles. They dubbed the resulting software Chaographer.

The researchers showed the computer-generated animations to hundreds of people, including dancers and martial artists. "The consensus is that the variations not only resemble the original pieces but also are, in some sense, pleasing to the eye," Bradley and Stuart concluded. "They are both different from the originals and faithful to the dynamics of the genre. There are no jarring transitions or out-of-character moves."

They added, "Showing these results in a classroom is an enormously effective way to motivate students to learn the mathematics of rigid-body dynamics, chaos, and context-dependent grammars."

Originally posted January 11, 1999

January 5, 2021

Calendar Quirks

Calendars represent our efforts to create frameworks that allow us to reckon time over extended periods.

We normally count the day—the time it takes Earth to rotate once on its axis—as the smallest unit of calendrical time. The measurement of fractions of a day fits, by convention, into the category of timekeeping.

Of the dozens of calendars presently used in the world, the most common ones group days into weeks, months, and years. They follow two astronomical cycles in addition to the day: the year (based on the revolution of Earth around the sun) and the month (based on the revolution of the moon around Earth).

The trouble is that the cycles of revolution do not comprise an integral number of days (see "Fractions, Cycles, and Time"). Those quirks of the solar system add a maddening complexity to any calendar based on astronomical cycles—especially the 365 days, 5 hours, 48 minutes, and 46 seconds of the solar (tropical) year.

Moreover, because of the precession of Earth's axis, a solar year is actually defined not by the time Earth takes to make one revolution about the sun but as the average time between two vernal equinoxes. A vernal equinox represents the instant at which the sun lies exactly between the north and south celestial poles.

How best to handle that fraction of a day has long spurred calendar reforms. In 46 B.C., Julius Caesar instituted a calendar that made a year 365 days long, with every fourth year having an additional day.

However, because a year actually runs 365.2421896698 instead of 365.25 days, there was a slight discrepancy. Over ensuing centuries, the Julian calendar gradually went out of step with the seasons, muddling the timing of various festivals and religious observances.

In 1582, Pope Gregory XIII introduced a new calendar to replace the old Julian system. It was based on a 400-year cycle. Leap years followed the rule: Every year that is exactly divisible by four is a leap year, except for years that are exactly divisible by 100; and those years are leap years only if they are exactly divisible by 400. As a result, the year 2000 is a leap year, whereas 1900 and 2100 are not.

The Gregorian cycle of 400 years contains exactly 20,871 weeks. Hidden within the machinery is a bias toward certain days of the week landing on certain days of the month. For example, the 13th is more likely to be a Friday than any other day.

Indeed, Bernard D. Yallop of Her Majesty's Nautical Almanac Office at the Royal Greenwich Observatory a while ago determined that there are 688 Friday-the-thirteenths every 400 years, but only 684 Thursdays. That also means a month is most likely to begin on a Sunday.

You can find additional quirks by creating a giant table representing the Gregorian cycle, sorting the data in various ways, and compiling the results.

It's also relatively easy to write a computer program that tells you what day of the week a particular date is, how many days there are between two given days, and so on. Indeed, the calendar can be regarded as a positional number system, Ilan Vardi has remarked.

Here's a formula for computing the day of the week, W, for a given day, D, of the month, M, and the year 100C + Y:

where months are numbered beginning with March = 1. Dates in January and February are considered to be in the 11th and 12th months of the previous year. Days of the week are numbered W = 0 for Sunday, W = 1 for Monday, and so on. The symbolmeans take the integer part, x, of the decimal x.y; mod 7 means divide by 7 and retain only the remainder.

January 3, 2021

Fractions, Cycles, and Time

In our efforts to measure and understand the universe in which we live, we often find ourselves dealing with "messy" numbers. Our tendency is to replace those inconveniently long strings of digits by rough approximations.

In the everyday world, we say that an inch is about two and a half centimeters, the speed of light is nearly 300,000 kilometers per second, and the number pi is close to 22/7 or 3.14.

In ancient times, people had to confront awkward numbers in astronomical contexts when they compared the motions of the sun and moon. The unfailing, daily passages of the sun across the sky and the corresponding movements of the stars at night represented one highly predictable cycle. The periodic changes in the moon's appearance and position represented another cycle.

Full moon in the night sky.

Closer observations over months and years revealed subtle shifts in these patterns. The sun, for instance, doesn't rise at precisely the same point on the horizon every day. The location of sunrise drifts back and forth along the horizon. These recurring excursions define a longer cycle tied to the changing seasons.

Similarly, particular stars rise at different locations along the horizon and, at certain times, disappear from the sky for lengthy periods. These movements also have a definite rhythm attuned to the seasons.

In modern terms, taking the day as the standard unit of measurement, the seasons recur every 365.242199 days (a year), while the period of the moon's phases is 29.530588 days (a month).

And to be really precise, we must also note that these numbers decrease each century by 1 or 2 in the last decimal place because tidal friction is slowing Earth's rotation and making the day longer. Indeed, official timekeepers add a second (a leap second) every year or so to keep their clocks in sync with Earth's rotation rate.

The ancients didn't use decimals, but they could represent these cycles with remarkable precision by considering ratios. The Athenian astronomer Meton (5th century B.C.), for example, noted that 235 months very nearly equals 19 years. The so-called Metonic cycle is still used to determine the Hebrew (Jewish) calendar and set the date of Easter.

The following table gives the error when various numbers of months are compared with the corresponding numbers of years. The listed entries represent successive improvements in the accuracy of the ratio of months to years used to approximate a cycle. Each line is obtained by adding a certain multiple of its predecessor to the one before that. For example, to get 99 months in line 5, you add two times 37 (fourth line) to 25 (third line).

Meton's approximation is off by just 2 hours 4.4 minutes! And it's bettered only by comparing 4,131 months with 334 years.

Now consider the successive fractions (number of months divided by the corresponding number of years): 12/1, 25/2, 37/3, 99/8, 136/11, 235/19, 4131/334. Those fractions, in turn, can be written in the following manner:

Such expressions are known as continued fractions. They can be used in designing gear trains, including those that might be found in a planetarium to simulate the relative motion of the sun and moon with respect to Earth.

The so-called Antikythera mechanism, apparently constructed in the first century B.C., recovered in 1900 from a Mediterranean shipwreck, and analyzed in recent times, is one of the most striking examples of such engineering in the ancient world.

The mechanism contained a system of gears whose gear ratios corresponded to well-known astronomical cycles involving the moon, including the Metonic cycle. The mechanism was clearly a type of analog computer, using fixed gear ratios to make calculations displayed as pointer readings on a dial.

The Antikythera mechanism—the sole survivor of what was undoubtedly a long tradition of astronomical automata—served primarily as an elegant simulation of the heavens. It was a tabletop monument to Greek and Alexandrian astronomy. Such ingenious devices also illuminated the intimate link between mathematics and astronomy, especially the role of number in astronomical prediction.

By demonstrating an ability to predict the movements of the moon, the rising and setting times of stars, and the changes of the seasons, astronomers could please their rulers while contemplating the subtleties of the apparent mathematical order displayed in the heavens.

Originally posted October 13, 1997

January 1, 2021



2021 = 43 x 47 (the product of two consecutive prime numbers: A006094). As the product of two primes, 2021 is a semiprime.

2021 = 452 − 4  (A028347).

The Euler polynomial x2 + x + 41 generates an unbroken sequence of 40 primes, starting with x = 0. 2021 is the third composite number that the formula produces (for x = 44) (A145292).

2021 is the sum of the first 24 pairs of consecutive primes (A102724).

The sum of the divisors of 2021 (1, 43, 47, 2021) is palindromic: 2112.