Wordle-solving state of the art: all optimality results so far (poirrier.ca)
246 points by a3bf6b on Jan 26, 2022 | 106 comments


Everyone is missing the core game mechanic - the ability to share a "spoiler free" report of your game on social media. For example, my (human operated) game from today:

Wordle 222 3/6

⬛⬛⬛⬛⬜

⬛⬜⬛⬜⬛

⬜⬜⬜⬜⬜

However, this isn't truly spoiler free. It still leaks information. From what I posted above alone, you can infer that the last letter is probably a relatively common one.

With the help of Twitter's APIs (or in my case, a janky screen-scraping script), you can trivially gather thousands of these emojified game logs. I do not consider this cheating, because sharing these is part of the game.

You can use these, in aggregate, to perform frequency analysis. I haven't shared my code yet because I'm still working on it, but as an example, here's the output from my script for day 216:

https://twitter.com/David3141593/status/1484669351261347842

Out of all the possible words, it ranked the correct solution at rank 5 - and that's before providing a single input to the game.
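
As a rough illustration of the kind of aggregate analysis described above (a minimal sketch, not the script mentioned; it assumes the scraped emoji rows have already been translated into "B"/"Y"/"G" strings, and that `candidates` and `guess_words` word lists are available):

  from collections import Counter

  def wordle_pattern(guess, answer):
      # Colour pattern of `guess` against `answer` as a B/Y/G string,
      # with the usual duplicate-letter handling.
      result, left = ["B"] * 5, Counter(answer)
      for i, (g, a) in enumerate(zip(guess, answer)):
          if g == a:
              result[i] = "G"
              left[g] -= 1
      for i, g in enumerate(guess):
          if result[i] == "B" and left[g] > 0:
              result[i] = "Y"
              left[g] -= 1
      return "".join(result)

  def rank_candidates(observed_patterns, candidates, guess_words):
      # Score each candidate answer by how many plausible guesses could have
      # produced each scraped pattern against it; higher total = more consistent.
      scores = {}
      for answer in candidates:
          table = Counter(wordle_pattern(g, answer) for g in guess_words)
          scores[answer] = sum(table[p] for p in observed_patterns)
      return sorted(candidates, key=scores.get, reverse=True)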

Edit: Replaced green blocks with "⬜" because HN doesn't like them.


Another project that solves Wordle by reverse-engineering the emoji tweets: https://twitter.com/davidebbo/status/1482911439731912706

Code in C#: https://github.com/davidebbo/WordleReverseSolver


Damn, this is much smarter than my approach, to be honest.

Although my current implementation is O(n) compared to his O(n^2), he's getting much better results.


I don’t think that’s a core game mechanic, it’s the sharing mechanic - the metagame. I agree it’s also great, though!

Yet another thing that’s easily overlooked is that you don’t share a URL with a preview, as most developers would naturally think of; it’s just plain text! So it’s absolutely trivial to cheat, much simpler than extracting the word list or the solution from the code.

But why would you cheat? What’s the point?


The core mechanic is that the word is the same for everyone each day, so you can compete with friends. If you could play a different word whenever you wanted, I don't think it would have gone viral.


I enjoy the game itself and don’t have much interest in game solvers generally. But the same thought process happens when I see the emojis shared, especially now that I’m in a group where people share their results. It’s not cheating, but my temptation to gain information while playing would ruin the game for me. So my personal rule is not to look at anyone else’s emojis until I’ve completed the game for the day.


You can do much better than frequency. With just six clues you can reduce it to fewer than 200 candidates and solve it in 2-3 guesses.

https://medium.com/@furstenheim/learning-from-the-mistakes-o...


There are a few things that I think would be really interesting to do with this data:

1. Provide a different scoring mechanism. Number of guesses is a bit simplistic. I'd like to see something where blank squares are 2 points, yellow is 1 point, and green is 0 (see the sketch after this list). Then a total score could be computed across all guesses, with the goal being the lowest number of points. It would incentivize "hard mode", I think.

2. Provide a difficulty rating of the daily word based on average number of guesses

3. Given a "spoiler free" answer, make a game out of guessing what the original player guessed.
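
As a rough illustration of the scoring proposed in item 1 (a minimal sketch; the example grid is made up and the tiles are the standard Wordle emoji):

  # Proposed scheme: blank/grey = 2 points, yellow = 1, green = 0; lower is better.
  TILE_POINTS = {"⬛": 2, "⬜": 2, "🟨": 1, "🟩": 0}

  def grid_score(rows):
      # rows: one 5-tile emoji string per guess
      return sum(TILE_POINTS[tile] for row in rows for tile in row)

  # Made-up 3-guess game: 9 + 8 + 0 = 17 points
  print(grid_score(["⬛⬛⬛⬛🟨", "⬛🟨⬛🟨⬛", "🟩🟩🟩🟩🟩"]))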


> Edit: Replaced green blocks with "⬜" because HN doesn't like them.

Thanks for clarifying; my first impression was inverted, i.e. that the solid ones were "green".


> It still leaks information. From what I posted above alone, you can infer that the last letter is probably a relatively common one.

I don't get what you mean about leaking information. You can infer that all the letters are "probably" relatively common ones; that's just what relatively common means: higher probability.


Let's call the last letter L4. Given no prior information, the probability that L4 == X, for each value of X in the alphabet, gives you a histogram, which is always the same and can be calculated from the game's dictionary wordlist.

If I give you the new information "I guessed L4 on my first guess", you can use this to update the histogram. The more-probable entries in the histogram are now even more probable, and the less-probable entries less probable.

For the specific numbers, have a look at Bayesian Inference

https://en.wikipedia.org/wiki/Bayesian_inference
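
To make the update concrete, a minimal sketch (the word lists `solution_words` and `first_guesses` are hypothetical stand-ins for the game's dictionary and a scraped corpus of first guesses):

  from collections import Counter

  def last_letter_dist(words):
      # Empirical distribution of the final letter over a word list
      counts = Counter(w[-1] for w in words)
      total = sum(counts.values())
      return {c: n / total for c, n in counts.items()}

  def posterior_given_green_last(prior, guess_dist):
      # Bayes: P(L4 = x | a first guess hit it green) is proportional to
      # P(L4 = x) * P(a typical first guess also ends in x)
      unnorm = {c: p * guess_dist.get(c, 0.0) for c, p in prior.items()}
      z = sum(unnorm.values()) or 1.0
      return {c: v / z for c, v in unnorm.items()}

  # prior     = last_letter_dist(solution_words)            # always the same histogram
  # guesses   = last_letter_dist(first_guesses)             # distribution of guessed last letters
  # posterior = posterior_given_green_last(prior, guesses)  # common letters get boosted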


oh, you meant the position contains a popular letter because that brick shade is green! I thought it was yellow


Congratulations on playing and solving Wordledles: meta Wordle!


https://twitter.com/WordleStats gathers and tweets the aggregates daily


Interesting, but discards a lot of useful information.


Wordle is so amazingly well-designed. I’m starting to feel it’s comparable to Tetris in how deceptively simple but well-balanced it is. The choice of 5-letter words and 6 guesses gives just the right level of difficulty; these optimality results back up that intuition pretty well.


The author also did a great job of building a game that scales. No server, everything is in a very small amount of HTML and JavaScript, easily cached. He can have the world play with no ads at very low cost. Of course, this means people can easily cheat, but where's the fun in that?


It also demonstrates how easy it is to ruin things by caring about monetizing them. A lot of what's great about it wouldn't really be possible if you cared about making money off it.


All the Wordle dev would have to add is a single link to a place to donate. He would make a fair amount given its current popularity. But I agree with your point.


Without diminishing any accolades to the author, this is exactly how I would design the game - and the most sensible approach. It's the bulk of the web that is shit these days, and the low expectations and surprise I see expressed is a testament to how bad things have gotten.


The entire allowed word list is in the JavaScript?


Also, the entire list of solutions for every single day.


I think it would be possible to build an even faster, more robust solution with plain HTML, CSS and a Rails / PHP server on the backend. The less propagation of JavaScript, the better.


The current implementation has easily survived a worldwide hug of death due its massive viral success. It’s plenty robust.

Your server-side PHP version would be slower for most users and might get expensive to run.


Rails and PHP would be the last thing on my mind when scaling a project like this one. Wordle only has to serve a bunch of static files to clients (the exact same files to everyone). How would this be improved by having a back-and-forth with a server to play the game?


I'm curious how adding network requests and server dependencies to code makes it faster and more robust than code that works entirely offline.


Wouldn't that mean a network request and full page reload for every single guess? Genuinely curious how that could be faster than it is.


It is also a fun little game to program/make a clone of

I made one simple web page with JS, HTML, and CSS, and now I can play it whenever I want (or show it off to my friends who introduced me to Wordle a few days ago).

Essentially Wordle is just Hangman with a neat twist.


> Essentially Wordle is just Hangman with a neat twist.

I'd say the closer comparison is to Mastermind[0].

[0]: https://en.wikipedia.org/wiki/Mastermind_(board_game)


Wordle is basically a clone of Lingo, so that'd be the closest comparison available.

https://en.wikipedia.org/wiki/Lingo_(American_game_show)


I was thinking about Mastermind too. I played that one as a kid, but Wordle is a better version of the same idea:

- Mastermind is based on abstract codes, Wordle is words. Words are fun!

- Wordle is much easier to explain. I could explain it to somebody who hadn’t heard of it and we could be playing on pen and paper inside two minutes.


I initially thought it's just like Mastermind, but with Mastermind you are not told which colours (corresponding to letters) are right: you are just told there is one right in the right place and two right in the wrong places, for example.


I had the same exact idea. It turned into a fun weekend project[0] that allowed me to dust off my vanilla JS skills and dive deeper into Tailwind CSS and put my own spin on the concept by basically turning it back into an asymmetric multiplayer game.

I do commend Josh Wardle for implementing Wordle with native web components, I personally wouldn't have the patience.

[0] https://word.rodeo


It reminds me of the puzzles that had a starting word and an ending word, where you could only change one character at a time to make a new word until you finally reached the ending word, mixed in with the game "Mastermind".


I believe they are called “word ladders” [0]. Coincidentally, I was listening to Genesis’s song “Supper’s Ready” yesterday. It includes two word ladders in its lyrics:

  Mum to mud to mad to Dad

and

  Dad to dam to dum to Mum

[0]: https://en.m.wikipedia.org/wiki/Word_ladder


that's the one!


Games with less complexity are generally easier to balance, so one shouldn't be too surprised.

I'd consider Tetris to be far more impressive design-wise in that it's easy to learn but difficult to master, despite its simplicity. It's a mind-blowing experience to watch the best Tetris players in the world. The skill gap between a beginner Wordle player and the best in the world simply cannot be that great, by the nature of the game's design. I guess a large part of the reason is that the player can't alter the game state: they can only gain information. It doesn't allow for complexity to emerge.


That's fair -- I definitely agree that Tetris is deeper, in the "minutes to learn, a lifetime to master" sense.

Partly that's state, as you say, but I think it's also significant that Tetris is a realtime arcade-style game. Experts are vastly faster than newbies, and the main difficulty increase mechanism is that the blocks drop faster.

Wordle might have some similarities there. People don't commonly share their times, but I bet even among competent players who routinely solve it in 4-5 moves every day, there are some expert players who can do it much faster -- just like some people can solve the daily crossword in a few minutes, while others work on it gradually over the whole day.

You could also distinguish between people who usually take 4-5 moves and those who can do it in 3-4 (e.g. optimal play with its average of ~3.4 moves). It's just not obvious immediately because there's a lot of variance; if a great player solves it in 3 you could write it off as a fluke, and if a poor player is genuinely lucky and gets it in 2 you could mistake that for skill.


Right, if time to solve were included in the shared results, you’d be able to pick up on skill variation a lot easier, and players would start to regard speed as something to improve at. I think it would make the game more interesting (not that it isn’t already).


Hmm, I feel like 6 guesses is fine for your first few games. But once you begin to think about pruning and letter frequency strategies, you'd probably never hit 6.


Luck takes over after a certain point and I don't know that the solver programs and averages really reflect this. If you're 3 guesses in and looking at _ _ACK, skill isn't necessarily the thing that's going to distinguish a 4 result from a 6.


Maybe it's just me that's very bad. I don't use any self-computed or computed word lists though.. :)

We can't share the wordle emoji thing here because hacker news strips something from it that removes the colors.


I don't feel like I'm good at this but I hit 6/6 only once - on my first day.

I'm in a group chat where we all share our Wordle results for the day. It seems everyone is at least as good as me.


2048 gave me a similar feeling.


2048 could be defeated with a simple repeated key sequence. Not so this. No comparison.


I find this much easier than 2048.


It's also much less time-consuming.


You mean Threes, I hope!


Threes is the better game, IIRC there's only one rule difference but it makes all the difference. And I really feel for the small team that spent months designing Threes only for a clone of a clone to be the one that everyone remembers.


No, that's a completely different game, with different mechanics and a very different feeling. Threes feels more slow and methodical and 2048 feels more fluid and slidey.

Both are very good games, but only one of them became a world wide sensation.


I don’t recall 2048 being nearly as big as Wordle, or say Flappy Bird, but of course it’s fairly subjective.


I think it's too derivative of Fallout's computer hacking to garner that much praise.


What I really like about Wordle is that the very reason all of these fun and interesting math analyses are being done on it is that the entire corpus is easily accessible right there in the code. It's not something kept away server-side. It's so easy to cheat on it that there's no point in doing so.

In addition, if it were an ad-filled, high-growth commercial venture, this approach would likely never have been chosen. (Not to mention the entire game mechanic of one word per day, instead of trying to show you a stream of ads every minute or so after each word you play.)

I guess what I wanted to say is: thanks, open web!


You jinxed it. Tomorrow's word will be HONDA


I'm afraid to tell you that you like something for the wrong reasons.

The justification you just provided is wrong. People aren't digging through because they can; they're digging through because they like the game, and it sits in a perfect state of simple, popular, and fun where people want to see behind the curtains of the game.

This would still be reverse engineered if it had to be decompiled first.

Your second statement kind of proves that you don't believe in your first statement, and in fact it is definitely the simplicity, popularity and fun that comes from the game that spawned intrigue.

What you really wanted to say was: "thanks, fun games!" because this has nothing to do with the game being open.


I disagree, the openness of the implementation does help. People have different skills - not everyone doing these statistical analyses would be able to decompile a game, say.

Flappy Bird and Desert Golfing are both really simple, and both were viral successes; but neither was directly analysed so rapidly and so openly, because it’s much more difficult to do that on iOS.


That's the thing. More people are going to analyse it if it's open, but they wanted to analyse it either way. If it were closed, they would have found the several people who did put in the work of decompiling or reverse engineering it and used their findings.

It's made easier by open code, but not started by it.


Hmm, I don't entirely disagree, but friction is a real thing and can make a massive difference in whether something becomes a viral hit or not.

If something is a huge success it's likely to be hacked, sure. But being open and hackable can contribute to its success, so if it's closed, it might be less likely to be a success in the first place. In which case nobody will bother hacking it.


I spent a while looking for a different type of optimal Wordle solution as well, one that I haven't seen anyone else try: what's the minimum number of words such that you can guess that same fixed set of X words and then have enough information to guarantee solving the puzzle on the next guess? After running a search overnight I found a fair number of 8-word sets that satisfied this property, but nothing lower.


This is by definition suboptimal, as you'd ignore the information gained from each guess.

Only the first guess is "unbounded"


This is a specific limitation I'm imposing because it makes for an interesting problem, not because it's meant to be optimal.


If you were careful, you could cover 25 of 26 letters in 5 guesses.



The "worst-case number of guesses" objective considered here is pretty insensitive to differences in quality of the decision tree. It seems to me that a better objective would be: minimize the number of secret words that take more than 6 guesses (hopefully to zero), then minimize how many that take 6 guesses, and so on.


This has been done

https://jonathanolson.net/experiments/optimal-wordle-solutio...

Even with adversarial Wordle, the upper limit on guesses is 5 (i.e. after 3 guesses the maximum remaining word pool is 2).

The hardest words in Wordle are BOOBY / BOOZY; no optimizer would include those letters in the first two guesses ...


What you described is not what joshbuckler described.

joshbuckler wants an algorithm that has the fewest words that take at least 6 guesses, and also among all algorithms that tie for that number of words that take at least 6 guesses, has the fewest number of words that take at least 5 guesses, and also among all the algorithms that tie for that number of words that take at least 6 and at least 5 guesses, has the fewest number of words that take at least 4 guesses, and also...

The page you linked to doesn't seem to say it does that.


As I noted, adversarial wordle shows that no words take 6 guesses and only 2 words take 5 guesses (BOOZY and BOOBY.)

If you removed just 1 of those 2 words (leaving 2312 possible Wordles) all Wordles can be solved in max 4 guesses.

Ignoring the 2 scenarios where your first two "optimal" guesses are the Wordle

X% of the time you have 1 word left (guesses= 3)

Y% of the time you have 2 words left (guesses =3.5)

(100-Y-X)% of the time there is a pool of words left, requiring one more "optimal" guess

And either your optimal guess #3 is the Wordle Z% of the time (guess=3) or there is now one word remaining (guess ≈ 4)

Where Z in this case is just a function of the number of words remaining (ie Y is just a special case of Z)

So you want to maximize X and the weighted average of Z across pools.

By sheer brute force you can do so

1) For each 2-guess combo,

2) discard any "suboptimal" combo where for any remaining response-state word pool there is no optimal guess #3 (i.e. not possible to definitively guess in 4 guesses),

3) calculate the average remaining guesses,

4) identify optimal word 1 (minimum of the sum from step 3 per first guess),

5) within combos with word 1, identify the optimal guess 2 for each response state.

And the weighted average of step 3 for these combos is the global minimum for Wordle.

I believe this is the algorithm you're after? In this case, we're making a first guess that maximizes the chance we will get 3 guesses instead of 4.

Per this algorithm, optimal guess 1 is RAVED.


Here's a variant question about Wordle strategy that's AFAIK unsolved, which I'm currently working on (my program is too slow). Namely: yes, any optimal strategy will involve adapting your second guess to the response to the first guess, but what if you'd like to play two blind guesses instead of one? (Many human players seem to do this in practice.) In that case, what would be the best two starting words (and decision tree thereon), and how much worse is it than the globally optimal strategy?

In other words: what's the cost of a second blind guess?

Maybe going even further: is it possible to still win the game with three blind guesses, or four? What is the smallest subset of "blind" guesses after which the possible solution set is "small" (say at most a word or a handful: less than some specified k).


I saw an analysis for a four-word blind combination, but I believe it didn't fully determine the word by the 5th guess: http://www.garytay.net/4-words-to-win-wordle-in-less-than-a-...


I play a fixed first word, and then one of 6 second words depending on the number of matches (any color) I get. I can remember these seven words, so it doesn't feel like cheating. Remembering words for all the 200-odd possible results of the first guess doesn't seem feasible, and referring to a cheat sheet seems to be, I guess, cheating.

First guess: ARISE

Second guess (for 0-5 matches): COULD, CLOUT, CLOUT, PLANT, PWNED, EIGHT


Another comment on this thread tried something similar:

https://news.ycombinator.com/item?id=30094398


There will be Wordle-inspired interview questions coming to a big tech company near you in the next couple of months.


Great writeup! I'm self-taught so I worked kind of naively on a different approach: https://news.ycombinator.com/item?id=30050231

I'm hoping to get the 3.4 result, soon, in a Rust program I'm writing!


I’ve been playing with your Python version and comparing it to my heuristic approach.

We’re both basically playing hard mode with letter position frequencies, but for some reason yours is slightly better, although I’m not sure why. Also, yours runs much faster than mine, probably because I’m using a regex instead of a proper positioning index.

I’ve been playing with various strategies, but position frequency of single letters has been the best one I’ve come up with. Even factoring in the frequencies of bigrams is worse. Kind of depressing that my first guess was the best.

I’ve been explicitly avoiding incorporating knowledge that there is a target dictionary and legal guess dictionary. It seems like cheating, but it looks like the optimal bots use that information.

My next strategy is to explicitly calculate coverage of the words, instead of using probabilities. (FWIW, all my statistical approaches suck horribly. :/ )

https://github.com/jonathankoren/wordle-solver


I spent a bit of time using cProfile in Python to see where my program was slow, and optimize. Some dict initialization was also slowing me down, but my feeling also is that regex will probably be a fair bit slower than the string comparison.

Also, yeah, my next attempt is going to do some other stuff. I moved to Rust to speed it up, but it's mostly a learning exercise.

Thanks for sharing!


FWIW, feeding both of ours only the target dictionary:

"Score" is the sum of all the guesses, with 7 for losses.

Yours:

  Wins 2302 Losses: 13 Surrenders: 0 Played: 2315 WinPct 99.438 %
  Number of attempts to win: mean: 3.644811
  Score (lower better) 8485
  Winning Histogram
   1 |   0.04 % |  (1)
   2 |   6.34 % | ### (146)
   3 |  37.36 % | ################### (860)
   4 |  43.31 % | ###################### (997)
   5 |  11.08 % | ###### (255)
   6 |   1.87 % | # (43)
Mine:

  Wins 2297 Losses: 18 Surrenders: 0 Played: 2315 WinPct 99.222 %
  Number of attempts to win: mean: 3.626034
  Score (lower better) 8455
  Winning Histogram
   1 |   0.04 % |  (1)
   2 |   5.75 % | ### (132)
   3 |  40.81 % | #################### (937)
   4 |  40.72 % | #################### (935)
   5 |  10.37 % | ##### (239)
   6 |   2.31 % | # (53)


IMO using the guess list is fine.

Since the game rejects words that are not in the guess list without penalty, it could easily have been brute forced if it was not easily obtainable from the implementation.


Beautiful. This is exactly what I was hoping to see.

This sub-link is especially good. https://alexpeattie.com/blog/establishing-minimum-guesses-wo...

I really, really need to learn how to use a SAT solver. The problems they can solve are downright magical as far as I'm concerned. Does anyone know of some good tutorials?


It probably really depends on what sort of problem you're working on, but in my experience it was easier to represent things as mixed integer linear programming (MIP) rather than as a SAT instance. They're both NP-complete representations, but with the MIP representation you can naturally express objectives like "maximize the sum of these various things" as well as the sort of boolean constraints that work naturally with SAT.

And for MIP solving, the Python mip package is quite convenient.

https://python-mip.readthedocs.io/en/latest/quickstart.html
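
For a flavour of the API, here's a toy knapsack model along the lines of that quickstart (the values and weights are made up):

  from mip import Model, xsum, maximize, BINARY

  values   = [10, 13, 18, 31, 7, 15]
  weights  = [11, 15, 20, 35, 10, 33]
  capacity = 47

  m = Model()  # uses the bundled CBC solver by default
  x = [m.add_var(var_type=BINARY) for _ in values]

  # Objective and constraint read almost like the math
  m.objective = maximize(xsum(v * xi for v, xi in zip(values, x)))
  m += xsum(w * xi for w, xi in zip(weights, x)) <= capacity

  m.optimize()
  print("selected items:", [i for i, xi in enumerate(x) if xi.x >= 0.99])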


I would strongly recommend using minizinc ( https://www.minizinc.org/ ) or Essence ( https://conjure.readthedocs.io/en/latest/essence.html ) (other systems exist too), which provide higher-level interfaces to SAT (and other system) solving.

You can view these as compilers to SAT, much like "no-one writes in assembler", most people don't write raw SAT either :)


I agree with the sibling comment's recommendation to start with a modelling language. On the MiniZinc homepage, there are Coursera courses linked that are quite nice as an introduction to combinatorial modelling.

Apart from that, it is worth noting that the solver used in the article, OR-Tools, is not strictly a SAT solver. I would rather describe it as a hybrid portfolio solver, since it combines constraint programming, SAT, MIP, and local search in a single solver. It is also possible to use that solver as a back-end when solving a model written in MiniZinc.


One thing I'm not clear about: in the version that knows the 2315-word list, is it allowed to only guess words from the 2315, or can it also guess words outside of it, but in the 12972-word list?

It seems to me like with more flexibility in what words to guess, it might be able to eliminate words faster and build a more-optimal tree.


Guesses come from the 12972-word list in both cases.


I took a stab at generating a visualization of all the solutions with one brute force based approach: https://twitter.com/kunalbhalla/status/1486189627644035078


> While many focused on finding the best “first word” to play as a guess, the first guess is just a tiny part of what we really want: A decision tree.

some algorithms can be described as a decision tree, but not all.

still curious if they all converged to the same seed word (or basket of seed words with similar characteristics).


But all algorithms that deterministically play a finite combinatorial game can be represented as a decision tree. (there may be other, far more succinct representations)


ahh yeah.

Still curious what the first node looks like though ... (and if it's similar across what people are finding)


A good first guess determined by an algorithm would feel less like cheating and more like a 'pro tip', IMO.

My wife plays and has tried strategies like focusing on high frequency letters or maximizing vowels used in the first word, for example. It would be neat to know if one of these approaches yielded more information than others.


There's a lot of analysis on how to use a computer program to solve Wordle. Have people been able to use these to extract any suggested techniques for humans on easy or hard mode besides optimal first guesses?


In the Dutch version of Lingo (a game show that has been televised for over thirty years now and is basically Wordle, but in more and different configurations), there was an episode where two brothers memorized an optimized set of words for determining letters and their positions. One of them wrote a write-up here: https://vincenttunru.com/hacking-a-gameshow/

In the show, you usually get one letter for free in the six-letter words, so there's a better word set available with more words to remember. Despite this, the episode was quite boring to watch because the optimized guessing strategy takes all the fun out of the game by guessing the same words every day.


The linked site has easy to follow decision trees in text file format: https://sonorouschocolate.com/notes/index.php?title=The_best...

If you want something much less elaborate (i.e. something you can memorize), you will find plenty of heuristics (e.g. "what are the optimal two starting words based on letter distributions") in the earlier discussions. Whether a heuristic is optimal necessarily depends on your overall strategy, though, and unless you have actually memorized the solution/trial word lists, such a strategy will remain impossible to formalize.


I don't find the "limit possible solutions to the 2315-word list" problem interesting. Is there any reason to think this isn't just the first 2315 words of a lazily-evaluated ordered sequence of all possible solutions? Presumably the complete list of possible solutions would be "all 5-letter combinations one could reasonably argue are English words" minus any words considered profane or otherwise inappropriate.


The main problem I have with these optimal solutions isn’t that they’re using the target dictionary, it’s that they’re taking advantage of the fact that there is a dictionary of valid words, and smaller dictionary of target words. That’s why none of them solve it in one guess.


The game itself prevents you from guessing words that aren't in the 12972-word list. It doesn't count as a guess, and doesn't go through. So if you want to try anything with a larger list, you're playing something Wordle-like, not actual Wordle.

But I agree that using the 2315-word list is basically cheating. On the other hand, are words in that list more common English words than words in the 12972-word list? If so, then humans in a sense intuitively take advantage of the 2315-word list by guessing more common English words rather than infrequent English words. So since humans sort of use the 2315-word list intuitively, it sort of makes sense for a computer to use it as well.


> are words in that list more common English words than words in the 12972-word list?

Subjectively, yes - those are the 2315 words that the dev's partner recognised out of the 12972.

https://pressnewsagency.org/wordle-is-a-love-story


I agree, the answer list is clearly curated to avoid unusual words.

This leaves a fuzzy middle ground for human players: there are accepted guess words like SOARE that can be good as a first guess but are almost certainly not going to be the answer. Conversely, FAVOR is not a good first guess; but of the two words, human players know which is more likely to be the answer.

For computer implementations that want to avoid "cheating" (not use the answer list), there would still seem to be room for evaluating how likely a given guess is to be in the curated list.

That could be done by examining a corpus for frequency of each guess word, which if done right should give the same insight we have as humans.


I spent a few hours this weekend writing a single-threaded, zero-memory-allocation, exhaustive decision-tree solution-space explorer in C++, just to see how far brute force could go.

Here is the code if anyone is interested : https://gist.github.com/unrealwill/83dc7a1d0675ac0751535a6c4...


I wonder whether you can improve on this, because people that play regularly can eliminate words that were a solution on a previous day. That's assuming solutions don't repeat of course.

Could eliminating previous solutions help reduce the search space meaningfully?


Yes, of course it can. And this doesn't seem like cheating. I would be interested in knowing how the difficulty (worst and average case) for the different models changes over time.

Of course, on the last day (in 2027?), there will only be one word left, so the game can be solved with one query.

That's not interesting, but for how long will the game stay interesting? How hard will it still be in 2023, 2024, etc.?


The "rigorous proof" points to a Wiki page with a rambling, unrigorous explanation of a tree search.


The program is the proof. It's rigorous in the sense that it's an exhaustive tree search (with lower and upper bounds used for pruning), without any heuristics like "pick a word that leaves the smallest set remaining in the worst case" or anything that amounts to "let's not explore that part of the tree; hopefully there will be nothing of significance there". (Also, it's someone's blog—the author's page says "My name is Alex Selby. I am a mathematician living in Cambridge, UK"—just happens to use Mediawiki.)


I don't doubt the author's credentials, and the program is probably correct (I don't care enough to check). But if you write a program and tell me it does what you say it does with a high level reason why you think so, that's not a "rigorous" proof. It's a sketch of a proof.


Wordle solvers are like the Hello World of AI programming. Or maybe not AI, just statistical analysis.


You appear to be writing this from the perspective of someone who hasn't tried writing a Wordle solver yet. The "obvious" approaches have unreasonable algorithmic complexity, so you have to get at least a little bit creative.


I've written a Wordle solver (and tic-tac-toe, chess, Leduc poker CFR, Mastermind, and others) and I do agree it's a good start for AI. It's not trivial but still quite approachable, much like the game itself. My solver takes a very simple approach and has algorithmic complexity O(N^2).

The word list for the initial guess can be precomputed (I handle this simply as well: top 16 words with 3+ vowels, no duplicated letters, and maximized pairwise similarity). The easiest first step that goes a long way on its own is simply maintaining constraints. Having applied constraints, the next issue is selecting from the narrowed word lists. To do that I compare every word to every other word by spelling, the idea being that the highest-scoring word will be the most informative, but also constrained to maximize difference from the previous guess (although this step is rarely needed).

As available guesses are already significantly whittled down even by the second guess, it runs in << 1s and completes in 3-4 guesses.
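
Not the solver described above, but a minimal sketch of one with that shape, assuming the feedback so far has already been folded into `greens`, `yellows`, and `greys` constraints (and glossing over duplicate-letter edge cases):

  from collections import Counter

  # greens:  {position: letter} known exact placements
  # yellows: {letter: set of positions where that letter is NOT allowed}
  # greys:   set of letters known to be absent
  def matches(word, greens, yellows, greys):
      if any(word[i] != c for i, c in greens.items()):
          return False
      for c, bad_positions in yellows.items():
          if c not in word or any(word[i] == c for i in bad_positions):
              return False
      return not any(c in word for c in greys)

  def pick_guess(candidates):
      # O(N^2) heuristic: prefer the word sharing the most positions and letters
      # with the rest of the remaining pool, i.e. the most "central" candidate
      def overlap(a, b):
          return sum(x == y for x, y in zip(a, b)) + sum((Counter(a) & Counter(b)).values())
      return max(candidates, key=lambda w: sum(overlap(w, o) for o in candidates if o != w))

  # remaining  = [w for w in word_list if matches(w, greens, yellows, greys)]
  # next_guess = pick_guess(remaining)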


View page source, it's all there.



