The answer is not a mate in two, three, or even five... it's mate in fifteen at the minimum, and can take as long as twenty or more against a sufficiently desperate opponent. The tremendous horizon required to solve this puzzle is the reason, I think, why no computer I know of (today) can solve it.
The mate happens in three "acts", which I will explain in detail:
If you're not interested in the detailed solution, you can jump right to the philosophy at the end.
Act 1: what occurs right away to most people. 1. c3 Kc5; 2. Na4 Kb5; 3. c4 Ka6. Like most chess puzzles, these are all "forced" moves for Black. The only way to evade check is to move the King, and the King has only one square he can go to for each move.
But White cannot make another check. The white Bishop cannot go to b7 to complete the mate, since it's unprotected and the black King will take it. And White needs the Bishop on that diagonal to prevent any of those six terrifying pawns from being promoted, or from shuffling forward in tandem. None of the pawns have any protection on the white squares to which they must inevitably move. (The pawn at c7 may go to c5 but is promptly blocked by White's pawn, and leaves Black's d-pawn in no better position.)
Act 1 ends in a standoff: White's pieces are all now on white squares, leaving Black's black-squared bishop with nothing to do but dance around impotently. Black's King cannot move and any move by the non-blocked black pawns results in capture by the white Bishop, with no consequences for that Bishop.
Act 2: White realizes that if only the white King were positioned at c8 or a8 (stay on the white squares, remember), it could support Bb7 and checkmate. This journey, along white squares only, is at a minimum 7 moves away, a very distant horizon for a chess computer. But there is a problem. The King must pass over the white diagonal protected by the white Bishop. If he does, he breaks the line of attack presented by the Bishop to the black King, allowing the black King to pick up his skirts and flee, over b7 to the right side of the board, where he can begin nefariously aiding the promotion of the pawns. Disaster!
Worse, the white Bishop cannot slip up the diagonal (say up to a8) to let the white King by while keeping the black King penned in, because the white King now breaks the line of defense that the white Bishop has on the pawns, especially the terrifying sword of Damocles at h2.
The solution to the "Bishop passage", is, therefore, to let the white Bishop keep the black King penned in (Ba8) while the white King crosses at g2 and only at g2 -- thus allowing him to temporarily take over the guarding of black's h2h1 promotion.
Act 3: Now safely on the right side of the board, at h3, the white King trots unimpeded up the white diagonal from h3 to c8, whereupon the white Bishop descends to b7 for the coup de grace.
Chess-playing computers, even Deep Blue, run a fairly standard "minimax" algorithm, that goes something like "assuming you play the best move I can think of for you [worst for me], what's the best move for me," and so on, move after move. The problem is that in order to evaluate the "best move", the computer has to evaluate uncountable hordes of bad ones. So many, in fact, that it greatly reduces the "horizon" of the computer's thought process, roughly defined as the length of the longest chain of consecutive moves it can evaluate before having to move.
There are two amazing properties of human thought when faced with a puzzle like this that the computer doesn't (to date) have: (1) the ability to generalize a sequence of moves and (2) the ability to not even consider bad moves.
In other words, once we "get" the idea that dancing on the white squares allows us to ignore Black's bishop, we won't even consider moves onto black squares. And, once we realize that the King needs to travel up to c8, that's a single coherent idea to us, not one brilliant needle of consecutive moves among a haystack of worthless ones.
Of course, the Achilles heel of all this generalization and blindness to "bad" moves is that when our generalization is inaccurate or incomplete, we are completely vulnerable: we can't see the trees for the forest, and trip over the roots.
It took me only two hours to solve this problem. Yet
my low-end chess computers which will never be able to solve it,
regularly beat me when playing normal games of chess. Perhaps
in day-to-day reality, it's better to see trees than forests.