Sudokus more random than random!
Darren Wraith pointed out this column about sudokus to me. It analyses the paper by Newton and De Salvo published in the Proceedings of the Royal Academy of Sciences A that I cannot access from home. The discussion contains this absurd sentence “Sudoku matrices are actually more random than randomly-generated matrices” which shows how mistreated the word “random” can be! The point is absurd because any random sequence is random, no more, no less than others. It is only the distribution behind the randomness that changes. I presume the author of the column interpreted a large entropy value as a sign of higher randomness and I would guess the authors did not make this claim. Although we are talking about distributions over a finite set, the uniform distribution over the set of all 1,2,…,9 valued matrices has simply nothing to do with the uniform distribution of the set of all sudoku matrices. Up to a factor , the later set is negligible within the first one. (It is impossible to create a sudoku by pulling 81 integers at random.) Therefore, comparing the entropies of both distributions hardly makes sense… This other remark in the column
“Newton and DeSalvo predict that this greater understanding of Sudoku could lead to better Sudoku-generating algorithms that create more difficult puzzles. Currently, Sudoku puzzles require at least 17 numbers to be given in their correct boxes in order for the puzzle solver to find a unique solution. The new study could decrease that number, making it more difficult to solve the puzzles.”
does not seem more relevant. Why should the randomness of sudokus be related to their difficulty or to the minimal number of entries for a unique solution?
January 28, 2012 at 12:18 am
[...] Monde puzzle of last weekend was about sudoku-like [...]
May 17, 2010 at 12:38 am
[...] After thinking about random sudokus for a few more weeks, I eventually came to read the paper by Newton and DeSalvo about the entropy [...]
April 19, 2010 at 10:35 pm
I don’t really care about Suduku but it’s fun to think that some simple Bayesian inference might resolve philosophical disputes about randomness.
April 19, 2010 at 9:45 pm
X:
I’d say it somewhat differently: It is the _procedure_ for creating the sequences, not the sequence itself, that is random. When you talk about sequences, certainly some are more random than others. 111111111 is much less random than 1000111000. If you wanted to, you could go Bayesian on this and say that your posterior probability that the underlying process is close to random is much lower, conditionally on observing 111111111 than after observing 1000111000.
April 19, 2010 at 10:04 pm
A: Right! This was hasty of me to phrase it this way. I was thinking about it today a wee further, but the coverage is too imprecise to make me understand how the sudokus were produced (and how the unconstrained matrices were simulated). If the sudokus were drawn uniformly over the set of all sudokus, the entropy should be
where N is the size of the set of all sudokus… Since the set of all matrices is much larger,
, we should have
. So some information is clearly missing!