I Simulated 38,612 Countryle Games to Find the Best Strategy

A developer used data science and Shannon entropy to analyze the geography game Countryle, simulating over 38,000 games to find the most efficient starting countries and strategies.

I have always enjoyed geography games. For a long time I was obsessed with GeoGuessr, and more recently I got pulled into the collection of daily geography puzzles on the web. Games like Globle, Travle, and Countryle all test your geographic intuition in different ways, asking you to use hints and spatial reasoning to work your way toward the correct country.

Today's game really tested my patchy knowledge of West African geography.

In Countryle, after entering a country, the game tells you in which cardinal direction the target lies. It also reveals whether you matched the correct continent or hemisphere, and whether the target country is larger or smaller in population and warmer or colder in average temperature.

While playing, I kept asking myself the same questions: Is there a best opening guess? And once I got the feedback, what is the best next move?

The easiest way to “solve” Countryle is to decrypt the daily target country string the backend provides the frontend with. Since everything else from there on happens in the browser, the decryption key has to be shipped somewhere in the JavaScript bundle.

That route is possible, but it is also boring. I wanted to solve the game the same way a player does: only using the feedback that appears after each guess. So instead of reading the answer directly, I extracted the dataset that Countryle uses internally and rebuilt the decision process from scratch.

Countryle’s feedback is provided through five channels: hemisphere, continent, cardinal direction, population, and average temperature.

For population and temperature, the game effectively operates with buckets. In my solver these become three states: wrong, close, and correct. "Wrong", shown by Countryle in red, means the target lies clearly outside the tolerance range. "Close", shown in yellow, means it is in the broader neighborhood. "Correct", shown in green, means it falls within a narrow band around the true value. These buckets matter because they determine how aggressively the remaining search space can be filtered after each guess.
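The bucket logic can be sketched roughly as follows. This is a minimal illustration, not the solver's actual code: the function name `bucket` and both tolerance parameters are hypothetical, and the real thresholds Countryle uses are not stated here.

```python
def bucket(guess_value: float, target_value: float,
           close_tol: float, correct_tol: float) -> str:
    """Classify a numeric comparison as 'correct', 'close', or 'wrong'.

    close_tol and correct_tol are assumed absolute tolerances in the same
    unit as the values (for example degrees Celsius). They are placeholders,
    not Countryle's real thresholds.
    """
    diff = abs(guess_value - target_value)
    if diff <= correct_tol:
        return "correct"  # green: within the narrow band
    if diff <= close_tol:
        return "close"    # yellow: in the broader neighborhood
    return "wrong"        # red: clearly outside the tolerance range


# Example with made-up temperature tolerances of 2 and 5 degrees.
print(bucket(10.0, 11.0, close_tol=5.0, correct_tol=2.0))
print(bucket(10.0, 14.0, close_tol=5.0, correct_tol=2.0))
print(bucket(10.0, 20.0, close_tol=5.0, correct_tol=2.0))
```

A narrower "correct" band removes more candidates when it fires, which is why the bucket widths have such a direct effect on filtering.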

Not all feedback channels are equally valuable. Hemisphere is useful once, but afterwards it often becomes irrelevant. That's also why I excluded it from any of the following figures. Continent can be highly informative when it isolates a small region, but much less so when the result is something like Africa, which still leaves dozens of possibilities. Direction, temperature, and population tend to carry much more granular information, especially when combined.

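I built five modules, one for each feedback type. Each module has two jobs: a filter that removes every candidate inconsistent with the feedback received, and a scorer that ranks the remaining candidates as possible next guesses. The filter always runs first. The scorer only decides what to guess next after the candidate list has already been reduced. This separation made the whole pipeline modular.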
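To make the filter step concrete, here is a small sketch under stated assumptions. The names `filter_candidates` and `continent_feedback` are hypothetical, and the three-country table is toy data, not Countryle's dataset. The idea is simply to keep every candidate that would have produced the feedback actually observed.

```python
from typing import Callable, List

# Toy data for illustration only.
CONTINENT = {"Ghana": "Africa", "Togo": "Africa", "Peru": "South America"}


def continent_feedback(guess: str, target: str) -> str:
    """Feedback the game would show for `guess` if `target` were the answer."""
    return "match" if CONTINENT[guess] == CONTINENT[target] else "miss"


def filter_candidates(candidates: List[str], guess: str, observed: str,
                      feedback_fn: Callable[[str, str], str]) -> List[str]:
    """Keep only candidates consistent with the observed feedback."""
    return [c for c in candidates if feedback_fn(guess, c) == observed]


remaining = filter_candidates(list(CONTINENT), "Togo", "match", continent_feedback)
print(remaining)
```

Because the filter only needs a function from (guess, target) to feedback, each module can plug in its own feedback function without changing the surrounding loop.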

The real question, then, is how to define “most informative”. My choice was Shannon entropy. In this context, entropy measures how evenly a guess splits the possible outcomes. A guess is good when its possible responses are well spread out, because any answer we get will eliminate a large portion of the remaining countries.

Imagine opening with Norway. For many targets, the cardinal direction feedback will simply be some variation of “south”, which is not especially revealing. A much better guess is a country whose feedback distribution is balanced across many possible directions. That kind of guess creates more informative answers, and entropy captures exactly that.
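As a rough sketch of that idea, the entropy of a guess can be computed from the distribution of feedback it would produce across all remaining targets: H = -sum(p * log2(p)) over the feedback outcomes. The function names and the `feedback_fn` parameter below are assumptions for illustration, not the solver's real interface.

```python
from collections import Counter
from math import log2
from typing import Callable, Hashable, Sequence


def guess_entropy(guess: str, candidates: Sequence[str],
                  feedback_fn: Callable[[str, str], Hashable]) -> float:
    """Shannon entropy (in bits) of the feedback distribution for `guess`.

    Each candidate is treated as an equally likely target.
    """
    counts = Counter(feedback_fn(guess, target) for target in candidates)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def best_guess(candidates: Sequence[str],
               feedback_fn: Callable[[str, str], Hashable]) -> str:
    """Pick the candidate whose feedback splits the candidates most evenly."""
    return max(candidates, key=lambda g: guess_entropy(g, candidates, feedback_fn))
```

A guess whose feedback is identical for every target scores 0 bits, while one that splits four targets into four different responses scores 2 bits. One possible way to combine channels, which is an assumption here, is to have `feedback_fn` return a tuple of all channel results so the entropy is taken over the joint feedback.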

It turns out that when it comes to cardinal direction feedback, Greece most evenly splits the world. That makes it a very strong guess if you want to get good directional feedback.

The direction module turned out to be more complex than I expected. I switched to rhumb lines, also called loxodromes, and this is where the Mercator projection comes in handy. Its key property is that it preserves angles locally, which means that a path with a constant compass bearing appears as a straight line on the map. That is exactly what Countryle uses to determine the cardinal direction feedback.
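A rhumb-line bearing can be computed with the standard Mercator formula: stretch the latitudes with ln(tan(pi/4 + lat/2)), then take the angle of the straight line between the two projected points. The sketch below assumes each country is represented by a single latitude/longitude reference point and that feedback is bucketed into eight compass directions; both are assumptions, as are the function names.

```python
from math import atan2, degrees, log, pi, radians, tan

COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def rhumb_bearing(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Constant compass bearing in degrees (0 = north, 90 = east).

    Inputs are degrees. Valid for latitudes strictly between -90 and 90.
    """
    phi1, phi2 = radians(lat1), radians(lat2)
    dlon = radians(lon2 - lon1)
    # Take the shorter way around the globe across the antimeridian.
    if dlon > pi:
        dlon -= 2 * pi
    elif dlon < -pi:
        dlon += 2 * pi
    # Difference in Mercator-projected ("stretched") latitude.
    dpsi = log(tan(pi / 4 + phi2 / 2) / tan(pi / 4 + phi1 / 2))
    return degrees(atan2(dlon, dpsi)) % 360.0


def compass_point(bearing: float) -> str:
    """Map a bearing to the nearest of eight compass directions (assumed)."""
    return COMPASS[round(bearing / 45.0) % 8]
```

For example, `compass_point(rhumb_bearing(0, 0, 0, 10))` gives "E", and a pair of points straddling the date line, such as longitudes 170 and -170 on the equator, also resolves to "E" rather than the long way west.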

Once all five modules were working, I combined them into a single scoring pipeline. Computing the theoretical entropy across all modules, the most informative starting country turned out to be Libya, which best balances the different kinds of feedback.

Once the solver worked, I simulated the entire game locally. Countryle uses 197 countries. That means I can start from every country and target every country, for a total of 197^2 = 38,809 games. Subtracting the 197 games where the starting and target country are the same leaves 38,612 games to simulate.
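The game count is just the number of ordered (start, target) pairs with distinct members, which a few lines can confirm. The helper name `count_games` is hypothetical, and the integer indices stand in for the real country list.

```python
from itertools import permutations


def count_games(n_countries: int) -> int:
    """Ordered (start, target) pairs, excluding start == target."""
    return n_countries ** 2 - n_countries


# Enumerating the pairs directly gives the same number.
pairs = list(permutations(range(197), 2))
print(count_games(197), len(pairs))
```

A full simulation would loop over exactly these pairs, running the solver once per pair and recording how many guesses it needed.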

The headline result is surprisingly strong: with all modules enabled, the solver reaches the target in 2.85 guesses on average.

Not all feedback channels in Countryle are equally informative. When used on its own, the continent hint performs extremely poorly, requiring nearly 25 guesses on average. Population and temperature provide substantially more useful signals, each reducing the average to roughly seven guesses. Direction performs even better, bringing the average down to around four guesses. Since Libya came out on top as the best starting guess, I also looked at its standalone performance in more detail. Unsurprisingly, it performs extremely well and leads to a very compact distribution of solve depths.

Source: Hacker News