2024-09 | pkmn

August’s momentum carried forward into September, and transitions was once again the prime focus. However, with that work nearing the finish line, I started getting excited about how to demonstrate its use, which led me to work on a damage calculator and 1v1 solver interface. The engine’s debug UI was completely rewritten in JSX and the protocol was extended with an extra data field to improve composability in the future (so that the visual components which make up the UI can be repurposed for other UIs or augmented with data from EPOké and 0 ERROR). Work was also done to reduce the resulting output size by more than 10× – a very important optimization, given that in the near future the debug logs will likely contain enough information to crash browsers upon opening the file otherwise. The debug files embed the raw binary data as a Base64-encoded string and prune any data that isn’t strictly required for rendering the page.
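As a rough illustration of the Base64 embedding idea, here is a minimal TypeScript sketch. Every name in it (`encodeBase64`, `decodeBase64`, `renderDebugPage`, `extractData`) and the page layout are assumptions made for the example, not the engine's actual debug code. It relies only on the standard `btoa`/`atob` globals.

```typescript
// Hypothetical helpers for illustration only; not the pkmn engine's real API.

// Encode raw bytes as Base64 so they can live inside an HTML document.
function encodeBase64(bytes: Uint8Array): string {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

// Decode the Base64 payload back into the original bytes.
function decodeBase64(encoded: string): Uint8Array {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Wrap the payload in a minimal page; a real page would also ship the UI code.
function renderDebugPage(data: Uint8Array): string {
  const payload = encodeBase64(data);
  return `<!doctype html><script id="data" type="text/plain">${payload}</script>`;
}

// Pull the payload back out of the page, as the rendering code would.
function extractData(html: string): Uint8Array {
  const match = /<script id="data"[^>]*>([^<]*)<\/script>/.exec(html);
  if (!match) throw new Error('no embedded data found');
  return decodeBase64(match[1]);
}
```

Base64 inflates the payload by roughly a third, so in a scheme like this any size savings have to come from pruning what goes into the bytes in the first place.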

In the process of reworking the debug code paths, I added support for dumping the final state that the fuzz test failed on to enable trivial reproducibility. The fuzz tests have always output the seed they failed on, but this had limitations – due to changes over time in the fuzzing logic itself, the seed may not reproduce the exact same failing scenario, and the failing state may come after numerous successful states, making debugging substantially more difficult. Often the scenarios involved a specific set of volatile statuses that required several turns of setup and good luck with the RNG to replicate in the standalone transitions harness – dumping the exact frame completely eliminates this manual work and enables tight debug loops.
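The difference between replaying a seed and replaying a dumped state can be sketched in TypeScript as follows. This is a toy model under stated assumptions: `Step`, `Failure`, `fuzz`, and `reproduce` are hypothetical names, the state is an opaque byte array, and the RNG is a simple linear congruential generator rather than whatever the real fuzzer uses.

```typescript
// Hypothetical model: a step function advances a serialized state and throws on a bug.
type Step = (state: Uint8Array, choice: number) => Uint8Array;

interface Failure {
  seed: number;      // what fuzzers traditionally report
  turn: number;      // how many successful steps preceded the failure
  state: Uint8Array; // snapshot taken just before the failing step
  choice: number;    // the input that triggered the failure
}

function fuzz(seed: number, initial: Uint8Array, step: Step, turns: number): Failure | undefined {
  let rng = seed >>> 0;
  let state = initial;
  for (let turn = 0; turn < turns; turn++) {
    // Simple LCG; changing this line would silently invalidate old seeds.
    rng = (Math.imul(rng, 1664525) + 1013904223) >>> 0;
    const choice = rng >>> 24;
    const snapshot = state.slice(); // copy in case `step` mutates in place
    try {
      state = step(state, choice);
    } catch {
      return { seed, turn, state: snapshot, choice };
    }
  }
  return undefined;
}

// Replays only the failing step: no seed, no setup turns, no RNG luck.
function reproduce(failure: Failure, step: Step): boolean {
  try {
    step(failure.state.slice(), failure.choice);
    return false;
  } catch {
    return true;
  }
}
```

Because `reproduce` needs only the snapshot and the failing input, it keeps working even if the fuzzing logic that turns a seed into a sequence of choices changes later.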

Saving the minimal failing bytes also means that all interesting states can be saved to form a suite of regression test cases – something which proved necessary after playing whac-a-mole with division-by-zero glitch logic that plagued transitions work. Checking a bunch of binary files containing the failing fuzz test states into source control is slightly problematic, as it kind of ruins the ability to inspect their contents, but ChatGPT helped me figure out how to write a Visual Studio Code extension to display the state in the debug UI within the editor (or well, it did a pretty poor job and I ended up having to crib most of the code from Bun, but who’s keeping score?).

While not required for the debug UI, the damage calculator and 1v1 solver demo required that I actually figure out how to get the WASM addon building and loading correctly in the browser. This is slightly non-trivial, as composing JS-wrapping-Zig packages is still something of a research area: Zig prefers builds from source, so 0 ERROR depending on EPOké depending on the pkmn engine doesn’t result in 3 different WASM chunks. Instead, there’s one WASM blob with all of the Zig code, and the JS wrapper libraries need to understand this and handle loading correctly. Some work is still necessary (especially to get everything to play nicely with JS bundlers, which all have their own incompatible ways of handling things), but a lot of progress has been made. Trying to put together the demos also led to work on a minimal select/autocomplete component that will likely end up being borrowed for PocketMon’s UI (see! PocketMon is being actively developed!).
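One way to picture several JS wrappers sharing a single WASM blob is a memoized loader that each wrapper calls instead of instantiating the module itself. The TypeScript below is only a sketch of that general pattern: `createSharedLoader`, `Exports`, and `Instantiate` are hypothetical names, and nothing here is claimed to match how the pkmn packages actually handle loading.

```typescript
// Hypothetical types: `Exports` stands in for whatever the combined blob exposes.
type Exports = Record<string, unknown>;
type Instantiate = () => Promise<Exports>;

// Returns a loader that runs `instantiate` at most once and shares the result.
function createSharedLoader(instantiate: Instantiate): () => Promise<Exports> {
  let pending: Promise<Exports> | undefined;
  return () => {
    if (pending === undefined) {
      pending = instantiate().catch((error: unknown) => {
        pending = undefined; // allow a retry after a failed load
        throw error;
      });
    }
    return pending;
  };
}
```

In a browser, the `instantiate` callback could wrap something like `WebAssembly.instantiateStreaming(fetch(url), imports)` and return the instance's exports, with `url` pointing at the combined blob. Each wrapper library would then await the same loader rather than fetching its own chunk.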

Somewhat disheartened by the work involved with trying to tie up the loose ends in order to be able to ship v0.1 of the pkmn engine, I happened to regain motivation after deciding to rewatch The Dawn Wall, as I was struck by some of the similarities with Caldwell and Jorgeson’s efforts. Going from zero to one of anything is challenging, as there are a lot of unknowns and it’s not clear that there’s always a path forward. People have written Pokémon engines before, just as people have climbed mountains, but that doesn’t mean one particular way of doing so is going to work. Durations really feels like the “Pitch 15” of the engine, though after finishing the film and fixing the (hopefully) last bug related to Thrashing moves I am on the precipice of completing it. There’s still a lot of work to do before release, but it does seem like much of what lies ahead is well understood and tractable.

Finally, a couple of days were dedicated to the usage stats processing logic and ensuring it’s at parity with upstream. The last remaining piece is to complete some form of stats processing loop – hopefully there will be an opportunity this month to work on this, as stats is an important component of the plans for the pkmn.ai project.

— pre