Beal Conjecture Counterexample Search

06/16/2013

In 1994, Andy Beal searched all combinations of variable values through 99 for a Beal conjecture counterexample. This required "thousands of cumulative hours" of compute time. A Beal conjecture counterexample search utility written by Peter Norvig was made available sometime before December 2000. It completed values through 100 in 0.6 hours. In addition, Peter Norvig's search extended the results using non-equal values for maximum base and maximum exponent. Dan Vanderkam released an improved version of Peter Norvig's search around February 2008. This search completed all base and exponent values through 1000 in less than 17 hours.

In 2013 Andy Beal increased the reward for a proof or counterexample to US $1,000,000, resulting in renewed interest in counterexample search. The new counterexample search utility described on this page reduces the runtime of Peter Norvig's 2008 search of base and exponent values to 1000 from 17 hours to 10 minutes. The new search also completes 2000x2000 in less than 4 hours. It easily handles base/exponent search ranges such as 100x10000, 100000x30, 250000x10, and 250000x30. After several days of searching, the new program found no counterexample.

Why is the new utility faster? One reason is that desktop computer hardware has improved. The new search results are from the Intel Core i7-2600K processor running at 4.0 GHz. Here are some software improvements that reduce search time:

C language replaces Python language
A multiple stage lookup table replaces hash table
Multithreading lets all the processor cores contribute

Like Dan Vanderkam's program, this utility uses modulo arithmetic to avoid the processing overhead of extended integer math. The use of modulo arithmetic makes the A^X + B^Y calculation fast. Once the easy work of finding an A^X + B^Y sum is complete, the more difficult job of quickly testing the sum to see if it is a power of an integer remains. Previous work uses a hash table for this purpose. This utility instead creates a lookup table containing a 32-bit remainder for each possible C^Z value. Searching the lookup table for a match is far too slow. A second table eliminates the need to search the lookup table for every A^X + B^Y sum. This second table is a bitmap containing 2^32 entries. Each bit indicates the presence of a C^Z modulus in the lookup table. With the second table, a single memory read tells the program if a value is present in the C^Z table. If the value is present, it can then be found using a linear search.

This solution is still not optimal. One problem is that many A^X + B^Y sums have a remainder that aliases to a C^Z table entry, even though the sum is not actually a power of an integer. This triggers an unneeded table search. To reduce the occurrence of this problem, the bitmap is supplemented with two additional bitmaps. These two additional bitmaps are built exactly the same as the first, but using a different 32-bit modulus value. The code then checks all 3 bitmaps before proceeding to the lengthy table search. In order for the code to check all 3 bitmaps requires calculating the A^X + B^Y sum 3 times instead of once. But because modulo arithmetic is used, the overhead of the extra two calculations is negligible when compared to the time saved by reducing unneeded table searches.

The above approach works well, though profiling shows the code spends most of its time waiting for the single memory read needed to lookup a value in the first bitmap. Because the bitmap read address is effectively random and the bitmap size of 2^29 bytes is far larger than the processor cache size, the read often requires an access of relatively slow DRAM. To address this problem, each bitmap is supplemented with a compressed version. The compressed version is 2^22 bytes, which fits in the L3 cache of a modern processor. The compression works by discarding the lower 7 bits of the A^X + B^Y sum value that indexes the uncompressed bitmap. In other words, each block of 128 uncompressed bitmap bits is replaced by a single bit in the compressed bitmap. If one or more bits in the uncompressed 128 bit block is set, then the compressed bit value is one. This compression is not lossless, but the sparseness of the uncompressed bitmap makes it work well. Occasional aliasing due to compression causes unwanted accesses of the uncompressed bitmap. But the use of 3 independent modulo values reduces this effect. If all 3 compressed bitmaps show a match, then all 3 uncompressed bitmaps are checked. If all 3 uncompressed bitmaps show a match, then the C^Z lookup table is searched.

It is possible more than one match will be found when searching the C^Z lookup table. For each match, the A^X + B^Y = C^Z calculation is repeated with yet more modulo values. If all of these modulo calculations pass, then the result is logged. If the program logs a result, it must be manually confirmed using non-modulo arithmetic. So far, this program has not found A^X + B^Y = C^Z values that pass all modulo checks needed for logging. One way to test the logging mechanism is to search for Pythagorean triples. For example, bc minpower=2 minbase=2 quiet.

The program is built as a Windows x64 console application. The included executable was made by compiling with gcc 4.8.1 using mingw64. Microsoft's Visual Studio 2012 compiler also produces good results.

Revision History

version 1.0 06/16/2013 Initial version.