Fast Software LFSR
portable C and x86 optimized versions
How fast can a software LFSR run on a modern (2012) x86 processor? Here are measurements from an Intel Core i7-2600K running at 4.0 GHz. The numbers are measured nanoseconds per LFSR update cycle.
LFSR Width GPR
(portable) XMM (AVX) YMM (AVX2)
bits
MS gcc MS
gcc MS gcc
32
1.5 0.8 0.8
0.7 ?
?
64
1.5 0.8 1.0
1.0 ?
?
128
1.4 1.3 1.3
1.3 ?
?
256
2.1 2.2 1.8
2.0 ?
?
512
7.8 3.7 4.8
2.9 ?
?
1024
21.0 11.3 11.3 5.8
? ?
2048
42.3 22.0 22.3 13.0
? ?
4096
86.0 81.0 48.8 44.4
? ?
Notes:
1) The LFSR code and speed of execution are independent of polynomial selection.
2) LFSR width 8192, 16384 ... were not coded because primitive polynomials needed for maximal length sequence generation are not available. Once (2^8192 - 1) is successfully factored, LFSR width 8192 will be possible.
3) MS: Microsoft Visual Studio 2012 compiler.
4) gcc: version 4.7.3
(mingw-w64-bin-x86_64-20121103.7z) from http://www.drangon.org/mingw/.
note: gcc 4.8.0 and 4.8.1 produce slower code for this benchmark/processor
combination.
5) Results are for x64 build. The code also executes properly when compiled for x86.
6) The YMM (AVX2) columns will be filled out when the Intel Haswell processor is released in 2013. A significant speed improvement can be expected for LFSR width 256 and greater. The AVX2 code is included and works properly with Intel Software Development Emulator 5.31.
7) The software first runs the GPR (general purpose register) version of the code. Then the XMM version is run if both processor and OS support AVX. Finally, the YMM coded functions run if both the processor and OS support AVX2.
Revision History
version 1.0 11/12/2012 Initial version.
version 1.1 11/20/2012 Remove -flto from the gcc build command because AVX instructions were being generated where not intended, causing a crash on systems where AVX is not present.
version 1.2 12/30/2012 Add 4096 bit LFSR. Switch from VS2010 to VS2012 for Microsoft build.
version 1.3 08/14/2013 Correct allocation size for static array in lfsrymm.c. Add version number display to test program startup. Add obj file cleanup to test batch file.