July 2, 2014

Solving Sudoku Puzzles

Recently, I got hooked on to Sudoku (thanks to a wonderful IPhone app). I found it to be a slightly more constructive time sink and plan to teach it to my daughter some day (she is one year old as of yet). For this, I thought of writing an assistive software to teach kids how to simplify a Sudoku. This motivated me to first build a Sudoku solver and this blog is about it.

A Sudoku is a matrix of cells, typically 9x9, but it can be any number. In this post, we consider N x N Sudoku (where N is a perfect square, i.e. 1x1, 4x4, 9x9, 16x16). Now for such a Sudoku, we first construct a groups of cells called as blocks. We get N row blocks, N column blocks and N sub-grid blocks. All the blocks contain N cells each. A cell can have a value from 1 to N, such that, if a cell is assigned a value, then no other cell in the same block can be assigned that value. To check if a Sudoku is solved, all the blocks must sum to N*(N+1)/2. Another check is one where number uniqueness constraints are satisfied per block and each cell is assigned a value from 1 to N.

To simplify a Sudoku by hand, I typically use the following techniques:
  1. A number is assigned to a cell,
    • If a cell can hold only that number.
    • If that number can be held by only that cell (in at least one block).
    Once a cell is assigned a number, other cells (in the same block) get their possible hints (or values) pruned.

  2. Two numbers are assigned to two cells if those two cells share at least one block (Note: two cells can share maximum of two blocks),
    • The two numbers are referred exclusively by these two cells in a given block. In this case other hints of the two numbers are pruned.
    • The two cells refer to those two numbers only. In this case the two numbers are removed from the hints of other cells in their shared block.

  3. To finally solve it, simplify Sudoku by applying (1) and (2) until further simplification is not possible. Then pick a cell with least number of hints. Assign it one of those hints and repeat this brute-force method. If failed, then try another hint. If all hints fail then Sudoku is not solvable.
Algorithm 1a, 1b are pretty straightforward. Algorithm 2a and 2b are also easy to see (if you have played Sudoku before). In spirit Algo 2 can be extended for k>2 but the benefit of simplification it provides is marginal compared to the cost of finding set of k cells or k numbers that satisfy the unique reference aspect. Now to encode these algorithms, I wrote a Java program which solves Sudoku of size N x N (where N is a perfect square).

The program is oriented to be event driven, where a cell is asked to remove its hint based on the logic of (1,2,3). In this process if a cell has only one hint remaining, it triggers a fix event. When the fix event is called, this cell is assigned the value and that value is removed from the hint of cells that share block with it. If at anytime any inconsistencies are found, the events return a “false”, indicating that Sudoku is not solvable. This is a fail fast technique and it quickly helps in determining if a Sudoku is non-solvable or not. Otherwise, we would wait for all cells to get filled and then realize that it failed (wasting precious time). A simple brute-force strategy is very easy to code, however the possibilities to check are huge. For e.g., we might end up checking O(9^81) possibilities (which can be prohibitive). The simplification algorithms (1) and (2) come to the rescue by drastically simplifying the Sudoku. For most of the easy problems - the recursive step (3) is not even triggered.

I searched online for Sudoku problems and found that Peter Norvig has a blog about it. He used algo(1,3) but not (2) and has shared a set of easy, hard, and hardest Sudoku puzzles. It amazes me to say that for most hard problem, I take at least 8-10 minutes (and sometimes a hint) but a computer can solve it in order of ~0.1 ms. If you are interested in the maths of Sudoku (e.g. number of possible solutions), please check out the wikipedia page: maths of sudoku.

In the first analysis below, we see how many cells different algorithms are able to solve.
 Input    (1)  (2a) (2b)  (2) (1+2a)  (1+2b)   (*)
Easy 35% 89% 54% 68% 70%   96%  94% 96%
Hard 25% 31% 26% 26% 26%   41%  31% 41%
Hardest 30% 45%  35%  35%  36%    50%  45% 50% 

Here we see that algorithm (1) itself can solve 89% of the cells for easy problems. However, the joint application of (1+2) is able to solve 96% of the easy cells and 41% of the hard cells (and roughly provide a 10-15% improvement. This is a great simplification, especially for easy Sudoku problems. Amongst (2a) and (2b), I find (2a) to be a better strategy to run in conjunction with (1) as (1+2a) has same result as (*).

In the next analysis, we see how many hints are left after application of different algorithms
 Input   (1)  (2a) (2b)  (2) (1+2a)  (1+2b)   (*)
Easy  474  29 126  89  82   10    16  10
Hard  544 230 252 260 250   184  228 181
Hardest  508 161  191  193  189    141  158 141 

Similar analysis as above. 2a appears better than 2b. But (*) must better than (1) or (2). So combining these strategies is much more effective. We see that clearly (1+2a) as good as (*) so we can simply drop algorithm 2b. To see how much (if at all) there is a penalty of running the (2b) algo, we turn to time taken.

For measuring time, I use Java's System.NanoTime() function, which gives very precise timing. Experiments are run on my MacBook Pro with 2.6Ghz intel I7 processor. To get the timings, I solve a puzzle 500 times and take the average solving time. We get the following timings.

 Brute Force      (1)   (1+2a)   (1+2b)      (*)
Easy    0.14 ms 0.04 ms  0.03 ms  0.03 ms   0.03 ms 
Hard    21 ms 1.16 ms  0.44 ms  0.55 ms  0.42 ms
Hardest    0.61 ms 0.17 ms  0.20 ms   0.19 ms   0.22 ms 

Clearly brute force without simplification is a bad idea. (*) improves timing over (1) for hard problems at least. For other problems, numbers are very close to conclude. My experience over other random problems that I tried, suggests that (*) leads to fastest simplification (i.e. reduced iterations) and hence better.

I also tried the problem that Peter mentions taking most time and noted its timing through my Java implementation. It took 6134 ms for algo (1), but only 10ms for (*). A clear evidence of the joint algorithms' usefulness. This improved timing comes from the fact that Sudoku gets drastically simplified before entering the brute-force stage and even for each choice of brute-force, it further gets simplified eliminating possible hints.

The Sudoku program is generic in the sense that it can solve 1x1, 4x4, 9x9, 16x16, … problems. A quick run of the program is as follows:

Solving a Sudoku puzzle of size 1 ...
 ===
| * |
 ===
[status] solved=0 hints=1
 ===
| 1 |
 ===
[status] SOLVED
Time taken=0.022 ms

Solving a Sudoku puzzle of size 4 ...
 ===== =====
| 4 * | 2 * |
| * * | 4 1 |
 ===== =====
| 1 * | * * |
| * * | 1 4 |
 ===== =====
[status] solved=7 hints=36
 ===== =====
| 4 1 | 2 3 |
| 2 3 | 4 1 |
 ===== =====
| 1 4 | 3 2 |
| 3 2 | 1 4 |
 ===== =====
[status] SOLVED
Time taken=0.132 ms

Solving a Sudoku puzzle of size 9 ...
 ======= ======= =======
| * * * | 7 * * | 6 * 3 |
| 7 * 3 | 9 * * | * 1 2 |
| * * * | 5 3 * | * * * |
 ======= ======= =======
| 1 * * | * * * | * * 7 |
| 9 * * | * * * | * * * |
| * 4 5 | * * * | * 9 6 |
 ======= ======= =======
| 3 * * | * * * | 9 * * |
| * * * | * * 1 | * 6 5 |
| * 1 * | 3 6 * | * 7 4 |
 ======= ======= =======
[status] solved=27 hints=486
 ======= ======= =======
| 5 9 4 | 7 1 2 | 6 8 3 |
| 7 8 3 | 9 4 6 | 5 1 2 |
| 6 2 1 | 5 3 8 | 7 4 9 |
 ======= ======= =======
| 1 6 2 | 8 5 9 | 4 3 7 |
| 9 3 7 | 6 2 4 | 1 5 8 |
| 8 4 5 | 1 7 3 | 2 9 6 |
 ======= ======= =======
| 3 5 6 | 4 8 7 | 9 2 1 |
| 4 7 8 | 2 9 1 | 3 6 5 |
| 2 1 9 | 3 6 5 | 8 7 4 |
 ======= ======= =======
[status] SOLVED
Time taken=2.198 ms

Solving a Sudoku puzzle of size 16 ...
 ============= ============= ============= =============
| 8  7  *  *  | *  *  *  *  | *  3  *  *  | 13 *  4  *  |
| *  5  14 *  | *  *  3  10 | 15 9  1  *  | *  6  *  *  |
| 16 *  *  *  | 5  8  7  *  | *  14 *  *  | 9  *  11 12 |
| *  *  4  *  | *  14 6  13 | *  11 10 12 | *  7  *  3  |
 ============= ============= ============= =============
| 14 *  *  8  | *  *  1  *  | *  *  *  3  | 7  4  12 *  |
| 9  *  *  *  | *  6  15 12 | *  *  13 14 | *  3  1  *  |
| 11 *  10 3  | *  *  13 *  | *  8  *  1  | *  *  6  *  |
| 6  *  *  1  | 14 *  4  *  | *  5  *  9  | 11 *  *  13 |
 ============= ============= ============= =============
| *  *  *  *  | 15 *  *  *  | *  *  9  *  | 5  *  2  10 |
| 10 1  *  *  | 6  *  5  *  | 13 15 7  16 | *  *  *  *  |
| *  *  16 11 | *  4  *  8  | 2  *  *  *  | *  13 *  7  |
| *  9  *  7  | 1  3  *  2  | 6  *  8  10 | 16 15 14 4  |
 ============= ============= ============= =============
| 7  *  13 *  | 9  16 *  5  | *  *  14 4  | 3  8  *  2  |
| *  *  3  *  | 10 *  *  *  | *  *  *  *  | *  16 15 *  |
| 1  *  9  *  | *  *  14 4  | *  *  *  *  | *  *  7  *  |
| *  6  8  *  | 3  *  *  *  | 10 7  *  *  | *  *  *  *  |
 ============= ============= ============= =============
[status] solved=116 hints=2240
 ============= ============= ============= =============
| 8  7  11 10 | 2  12 9  1  | 5  3  16 6  | 13 14 4  15 |
| 12 5  14 13 | 4  11 3  10 | 15 9  1  7  | 2  6  16 8  |
| 16 3  1  6  | 5  8  7  15 | 4  14 2  13 | 9  10 11 12 |
| 2  15 4  9  | 16 14 6  13 | 8  11 10 12 | 1  7  5  3  |
 ============= ============= ============= =============
| 14 13 15 8  | 11 2  1  9  | 16 10 6  3  | 7  4  12 5  |
| 9  4  7  5  | 8  6  15 12 | 11 2  13 14 | 10 3  1  16 |
| 11 2  10 3  | 7  5  13 16 | 12 8  4  1  | 15 9  6  14 |
| 6  16 12 1  | 14 10 4  3  | 7  5  15 9  | 11 2  8  13 |
 ============= ============= ============= =============
| 3  8  6  12 | 15 13 16 7  | 14 4  9  11 | 5  1  2  10 |
| 10 1  2  4  | 6  9  5  14 | 13 15 7  16 | 8  12 3  11 |
| 15 14 16 11 | 12 4  10 8  | 2  1  3  5  | 6  13 9  7  |
| 13 9  5  7  | 1  3  11 2  | 6  12 8  10 | 16 15 14 4  |
 ============= ============= ============= =============
| 7  11 13 15 | 9  16 12 5  | 1  6  14 4  | 3  8  10 2  |
| 5  12 3  14 | 10 7  8  6  | 9  13 11 2  | 4  16 15 1  |
| 1  10 9  2  | 13 15 14 4  | 3  16 5  8  | 12 11 7  6  |
| 4  6  8  16 | 3  1  2  11 | 10 7  12 15 | 14 5  13 9  |
 ============= ============= ============= =============
[status] SOLVED
Time taken=1.346 ms

Solving all puzzles from: ./src/sudoku/hardest.txt
Solved       :  11/11
Solved Cells :  270(input) 891(finally) 891(expected)
Cell Hints   :  5589(input) 0(finally)
Time Taken   :  2.59014144 milli-seconds
Avg time     :  0.23546740363636365 milli-seconds

Solving all puzzles from: ./src/sudoku/hard.txt
Solved       :  95/95
Solved Cells :  1953(input) 7695(finally) 7695(expected)
Cell Hints   :  51678(input) 0(finally)
Time Taken   :  41.772783616 milli-seconds
Avg time     :  0.4397135117473684 milli-seconds

Solving all puzzles from: ./src/sudoku/easy.txt
Solved       :  50/50
Solved Cells :  1418(input) 4050(finally) 4050(expected)
Cell Hints   :  23688(input) 0(finally)
Time Taken   :  1.868110848 milli-seconds
Avg time     :  0.03736221696 milli-seconds

A word of advice to readers. While building the Sudoku (or similar programs), you might be tempted to optimize it. I suggest its more important for the code to be readable and debuggable and concise. I tried to optimize my code, but realized its not worth it. Using some tricks i got speed up of 1ms for a specific alto, but that meant highly cryptic data structure. This would lead to increased timing in another algo and a nightmare to debug. Just do basic optimizations. As a side-effect of my attempt, i learnt few useful things to speed up the code in Java.
  1. bytes is more expensive that int. So use int.
  2. int check (e.g. i==0) is very-very slightly faster than boolean check. But don't use it if it compromises readability. 
  3. Exception handling (even if exception is never thrown) is expensive.
  4. Java doesn't do tail call optimization. So don't worry too must about trying formatting your recursive routine.
  5. keyword final has no effect on performance in Java. So if you think a one liner function is inlined, its not.
  6. Avoid using array list and use fixed array. Even if array list is of size say k and array of n where k < n, its faster to use array. UNLESS k <<< n.
  7. Hashmap, Hashsets would easily hog performance. It can easily double the running time. So if you can avoid it, its better.
  8. 1-d array is cheaper than 2-d array.
  9. Using objects is extremely cheap. I find next to nil penalty on using objects. So use them to simplify code.
  10. Caching can be useful. for example instead of referring A[i] several times, its faster to set x =A[i] and refer to x.
  11. private, public, and other qualifies only increase space of byte code but has no performance impact. So use them for good design principles and not for performance improvements.
Code is available at https://github.com/n1balgo/sudoku. Feel free to check it out and drop in comments or email for feedback (and suggestions and bugs).

May 2, 2014

Stick Breaking

Problem Set 1: Given a stick of length 1 meter. It is cut into two parts randomly.
  • Problem 1a: What is the expected length of the smaller stick?
  • Problem 1b: What is the expected length of the bigger stick?

Problem Set 2: Given a stick of length 1 meter. It is cut into three parts randomly.
  • Problem 2a: What is the expected length of the smallest stick?
  • Problem 2b: What is the expected length of the largest stick?
  • Problem 2c: What is the expected length of the stick with second smallest (or second largest) length?
  • Problem 2d: What is the expected length of the middle stick (stick that is between the two cuts)?
  • Problem 2e: What is the probability that the three sticks would form a triangle?
  • Problem 2f: What is the probability that the three sticks would form a right angle triangle?
  • Problem 2g: What is the probability that the three sticks would form an equilateral triangle?

Problem Set 3: Given a stick of length 1 meter. It is cut into two parts randomly. Then bigger stick is picked and is again cut into two parts randomly.
  • Problem 3a: What is the expected length of the smallest stick?
  • Problem 3b: What is the expected length of the largest stick?
  • Problem 3c: What is the expected length of the stick with second smallest (or second largest) length?
  • Problem 3d: What is the expected length of the middle stick (stick that is between the two cuts)?
  • Problem 3e: What is the probability that the three sticks would form a triangle?
  • Problem 3f: What is the probability that the three sticks would form a right angle triangle?
  • Problem 3g: What is the probability that the three sticks would form an equilateral triangle?

Generalized Cut Problem: Find the solution to problem 2a-2c, 3a-3c for n-1 cuts.

Solution: show

May 1, 2014

Defective Balls

For problem 1-3 consider a weighing scale (with two sides) and it only tells which side is heavier (it doesnt tell by how much).

Problem 1: You have 9 balls of same weight, except that 1 ball is defective and it is known to be lighter. Find the defective ball in two weighing.

Problem 2: You have 9 balls of same weight, except that 1 ball is defective (it can be heavier or lighter, we dont know). Find the defective ball in three weighing.

Problem 3: You have 9 balls of same weight, except that 2 balls are defective and they are known to be lighter. Find the defective balls in four weighing.

Problem 4: There are 9 machines that produce the balls. You get 9 balls from each machine during a production cycle. You are given a weighing scale that gives the exact weight. You are told that one machine produced defective balls in a given production cycle. Find the defective machine through a single weight measurement. Assume that a perfect ball weight 1lbs and defective ball weight 1.1lbs.

Problem 5: Consider problem 4. Now assume that the weight of the defective ball is unknown. Now find the defective machine in two weighing.

Problem 6: There are two jars A and B. A contains 50 perfect balls and B contains 50 defective balls. Arrange the 100 balls in the two jars in a way that the probability of randomly picking a ball (by first randomly picking a jar) is maximized.

Solution: show