Stan Chesnutt's Pentominoes Solver

Pentominoes

The pentominoes puzzle is composed of twelve pieces that are placed into a rectangular frame. Wikipedia has a good explanation of pentominoes here.

I've worked with the pentominoes puzzle, off and on, ever since reading Arthur C. Clarke's book Imperial Earth. I started playing with computer solutions in the late Seventies, with my first implementation written in 6502 assembly language.

Typical solutions to pentominoes are via brute-force methods, and usually over 95% of the computation time is spent trying to fit the last one or two pieces into the board.

A brute-force implementation took days to find all of the solutions using processor technologies of the Seventies and Eighties, but modern processors can give exhaustive solutions in a matter of minutes.

Brute-force methods can be hyper-optimized, and removing all un-necessary instructions from the code is an enjoyable challenge. I also enjoy brute-force solutions because they are often undebuggable by ad-hoc methods, since they require millions or billions of iterations. Brute-force mechanisms require careful thinking as well as careful optimization.

My first solution started with C code that was carefully pruned to the minimum number of steps. I then modified the program so that it would generate 80x86 machine code that included the simplest possible instructions. For example, the rectangle frame that is used to place the pieces is generally considered to be a rectangular array with dimensions such as 6x10, 5x12, or 3x20: my code takes the user-specified size of the frame and generates machine code specific to that frame size.

The result is code that will only run on the 80x86 processors, but it does so extremely quickly.

Similar solutions have been used elsewhere: for example, many of the early implementations of bitmap graphics would use on-the-fly rasterization routines that were optimized for the particular size and style of the blit operation needed. Also, Michael Abrash has described a similar approach to a solution of Conway's "Game of Life".

A few notes about the code:

The code can be found here.

My next foray into this area is a custom version for the 3x20 frame: it is possible to fit the entire frame representation into a single 64-bit processor register. This approach should be significantly faster than my current implementation, which uses an in-memory array to store the contents of the frame.

However, on the amd64 processor, this method is slower, for the following reasons:

Still, the implementation is instructive, and can be found here. I'll tune it a bit more, hopefully in the months to come.

$Header: /home/cvs/htdocs/stan/pentos.html,v 1.3 2006/06/11 04:06:33 chesnutt Exp $