The Map of Wordle

I’ve been busy doing all kinds of Wordle analysis — as I mentioned in the last post I’ve been testing various strategies to assess the “difficulty” of Wordle. This led to some interesting speculation about which word is best to guess first. Obviously, the best word to guess first is the correct word, but you can’t do that better than chance (unless you have the oracle, in which case you’re cheating.)

Another interesting question I had is, how many possible responses are there to a guess? The entire set is { green, yellow, nothing } across five letters, so 243. However, five of those possibilities are invalid, since one yellow can’t happen — if you have four of the letters right, by definition the fifth letter cannot be out of position.

So I mapped every possible guess to every possible answer to look at how many responses each first guess gives, on average, against every answer. This is an answer set of (2315 * 2315), or about 5.3 million possible guess/answer combinations. I did a bunch of analysis I’ll share later, but then I got an interesting idea. Can we visually plot this map?

The image above is exactly that — a 2315×2315 image showing every guess along the X axis and every answer along the Y axis. I simply averaged the (r,g,b) values across the letters. So five yellows becomes pure yellow, the correct answer becomes pure green, no correct letters is black, and all other combinations generate some color in the yellow/green spectrum. The idea was simple enough. I quickly ran through the file and generated a P3 PPM, then converting it to PNG on the command line.

I expected to see a bright green diagonal line and sparser areas away from it. I also reasoned that guess/answer pairs would be sort of symmetrical, in that guessing A for B would give something similar to guessing B for A.

What I did not expect was this rich map of words colliding into the diagonal axes, pronounced squares coming out of the diagonal axis, and yellow and green swirls showing words that are probably related. This isn’t a map of how closely related words are to each other in any sense other than letters in their positions, and it only uses the words that can be actual Wordle answers, not the full guess set. So it really is a map of how Wordle starts, the very top of the decision tree in two dimensions.

I want to look at it more closely and talk about what generates the columns, but I was taken aback by the simple beauty of this image and wanted to share it before doing anything else. Enjoy 🙂

UPDATE: Since this image represents one word in each axis with a single pixel, I realized that the sizes of the squares probably correlates to the letters of the alphabet. I applied a hue modification to each letter combination, which produced this:

Same image — one pixel per guess, but colors have been tinted by the first letters of the words they represent.

This image is less helpful from a word perspective, but it sure is pretty!