Sunday, September 12, 2010

Collatz conjecture and binary palindromes

The Collatz conjecture, or 3n + 1 problem, is about unexplained number patterns.
Take any natural number n. If n is even, divide it by 2 to get n / 2, if n is odd multiply it by 3 and add 1 to obtain 3n + 1. Repeat the process indefinitely. The conjecture is that no matter what number you start with, you will always eventually reach 1.
One wonders if the conjecture is true or false, and why. Might the existence of a path to 1 be undecideable, where we can't know if the process will ever end? [In 2006, researchers Kurtz and Simon, building on earlier work by J.H. Conway in the 1970s, proved that a natural generalization of the Collatz problem is undecidable.[2]]

But the appeal of the conjecture has more to due with the strange patterns that occur in the paths that numbers take on the way to reaching 1. The length of the path, the stopping time, is highly variable and anything but random. What's the pattern?

It is easy to find the stopping time for any small integer. A short Matlab program was able to find the first 10,000,000 stopping times within a few minutes, without any optimization. Jacob programmed a Collatz path calculator in Scratch, that shows the intermediate path values.

The first path that begs for an explanation is 27. Integers lower than 27 have stopping times less than 23 steps, but at 27 the number of steps jumps to 111!  What's so special about 27, or other jumps in stopping time? Is it because it is a power of 3 (1000 base 3)? No, other powers of 3 are easily checked with no obvious extremes or pattern.

Some patterns are easily explained. Any power of 2 (1000... in base 2) will have a small stopping time, because repeated division by 2 (removing 0's from the right repeatedly) brings it to 1 directly, without increases. Similarly, there are special binary palindromes (bit sequences that read the same backwards and forwards) that have short stopping times: any integer of the form 101010....1, with alternating 0's and 1's, immediately precedes a power of 2, so will also have a small stopping time. The reason for this is that 3n + 1, where n is of this form, is a power of two:

    101010...10      2*n
+    101010...1        n
=   1111111...1      3*n
+                      1
=  10000000...0      3*n + 1
Do all binary palindromes have short paths? No. In fact, 27 = 11011 base 2 is a binary palindrome and has a very long path relative to its length. Another binary palindrome, 22-bits long:
3732423 base 10 = 1110001111001111000111 base 2
also has an extreme stopping time, larger than any integer before it.

Is this a coincidence, a long stopping time for some palindromes? No, it is unlikely. There are 2^11 22-bit palindromes (one for every possible leading 11 bits), but 2^22 22-bit numbers. That means these palindromes only occur for 1 of 2^11 = 1 / 2048 integers. Accounting for the first bit always being 1, there is ~1/1000 chance that one of these numbers is a palindrome. 3732423 is only the 10th extreme stopping time integer. That is long odds -- out of 10 randomly picked 22 bit integers, there is less than a 1% (1/100) chance of any one of them being a palindrome.

Is it also a coincidence that this long stopping time palindrome (1110001111001111000111) also has long palindromes within its two halves (111000111)? Is there a way to predict which palindromes will have long stopping times?

I see the algorithm as a symbolic dynamics system on binary words, the base 2 representation of the integers. Each step transforms a string of 1's and 0's into a new string of 1's and 0's, and the process is repeated until there is a single 1 in the string. For example multiplication by 2 just shifts bits left, appending a 0, and division by 2 removes a right 0. The conjecture is that any string of 1's and 0's will end up as a single letter word, '1'.

Why are palindromes special with respect to these string operations?

To do: Write a program that shows every step of an integer's stopping sequence as a binary word.

[ From Collatz conjecture as a tag system
The conjecture can also be mapped to a different symbolic dynamics system, a very short 2-tag system with production rules abc, ba, caaa. In this system, the positive integer n is represented by a string of n a's, and iteration of the tag operation halts on any word of length less than 2. The Collatz conjecture equivalently states that this tag system, with an arbitrary finite string of a's as the initial word, eventually halts. See Example: Computation of Collatz sequences for a worked example.  
 Adapted from De Mol, "Tag systems and Collatz-like functions", Theoretical Computer Science, 390:1, 92-101, January 2008 (Preprint Nr. 314)   ]


  1. As you note: it is the pattern of alternating 1s and 0s that make a short sequence, not the fact that it is a palindrome. 27 is a long sequence because it shortly becomes a Mersenne Number (all 1s). Blocks of contiguous 1s at the LSB cause divergence, the more 1s, the greater the divergence. What's important is how the propagating carry bits interact with the paatern, for instance, whether a carry bit can cross a fire-beak (a 0 in the pattern). Once a caertain pattern is established, often the pattern resonates when multiple copies of the pattern are present. And this often results in a palindromic overall pattern, but not in all cases. Sometimes a certain 3-bit pattern repeated dozens of times will resonate (the same pattern keeps repating albeit one rep-unit shorter), but often it requires a resonator (an extra 2 bits at the LSB. The presence of a resonator automatically means the pattern is not palindromic.

  2. Here's some more on resonance.