SIMULATED ANNEALING · MADE PLAYABLE
User Research · Industrial Design · 2025

Maze Solver · Annealing Maze (退火迷宮)

A simulated-annealing maze game for a middle-school AI class. Built with Claude in twelve hours, refined across a second day. Less about writing code than about approaching two Others: the student who hasn’t agreed to learn yet, and the AI partner across the round-trip.

Year: 2025 · 12-hour build + iteration
Type: Quantum-annealing tutor · one of two paired modules
Methods: Python · pygame → JS port · VIBE coding with Claude
Project: Solo project

01 · What Is Annealing

The same word, in different fields

A middle-schooler hearing “simulated annealing” prepares to be bored. The word arrives wrapped in formulas, and the formulas have nothing to do with their day. The job is to find a version of the idea that lands before the word does. The idea has two homes: one in materials, one in computer science. The picture is the same in both.

Chemistry · Material annealing

Heat a metal until its atoms can move freely, then cool it slowly. Heating lets dislocations migrate; slow cooling lets the lattice settle into a low-energy state. The result: fewer defects, larger grains, better ductility, better toughness.

The temperature is not a fact about the metal. It is permission for the atoms to rearrange.

Algorithm · Simulated annealing

Same idea, in computer-science clothing. A virtual temperature controls how willing the searcher is to try a worse-looking move. High temperature, wide exploration: the searcher can jump out of dead ends. As temperature falls, the searcher commits to what now looks best.

The pattern shows up wherever the right answer hides behind wrong-looking ones. Puzzle-solving, route planning, and the one this game teaches: shortest path.
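
The willingness to try a worse-looking move can be written down in a few lines. What follows is a minimal sketch of the textbook Metropolis acceptance rule with a geometric cooling schedule, not the game's own code; the function names, the `rate` parameter, and the choice of schedule are assumptions made for illustration.

```python
import math
import random


def accept_move(delta_cost: float, temperature: float, rng: random.Random) -> bool:
    """Metropolis-style rule: always take improvements, sometimes take worse moves."""
    if delta_cost <= 0:
        return True  # better or equal: always accept
    if temperature <= 0:
        return False  # fully cooled: never accept a worse move
    # A worse move is accepted with probability exp(-delta / T).
    return rng.random() < math.exp(-delta_cost / temperature)


def geometric_cooling(t_start: float, rate: float, steps: int) -> list:
    """Hypothetical schedule: multiply the temperature by `rate` at every step."""
    return [t_start * (rate ** k) for k in range(steps)]


if __name__ == "__main__":
    rng = random.Random(0)
    for t in geometric_cooling(t_start=10.0, rate=0.5, steps=6):
        # Count how often a move that is worse by 1 gets accepted at this temperature.
        accepted = sum(accept_move(1.0, t, rng) for _ in range(1000))
        print(f"T={t:6.3f}  accepted {accepted}/1000 worse moves")
```

Run as a script, the loop shows the acceptance count for a fixed worse move shrinking as the temperature falls, which is the "explore first, commit later" behaviour described above.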

LOW T · SCATTERED, NOT FITTING | HIGH T · PIECES REARRANGE | COOLED · PERFECT INTERLOCK
Heat shakes the pieces loose; cooling clicks them into a tight pack. Shake to escape, settle to fit. After a slide by teammate Abbie Chang.

Teammate’s module

Classical annealing · observational

A two-phase simulation (heating then cooling) runs on screen while the student watches walls open and close, then BFS resolves a shortest path. The viewer observes the algorithm running on its own. The energy landscape is the artefact itself.

My module

Quantum annealing · embodied

Here the student plays the role of the searcher. The dice and shovel make exploration physical. The vision limit lets the student feel what blindness costs. The temperature gradient is something the player walks through, not something that runs on its own.

Two artefacts, two intuitions, one curriculum. Which one you remember depends on which one your hands touched.

02 · Play It

Take the artefact for a spin

Best of five, first to three wins. Watch one tournament loop, then take over Player 1.

Player 2 plays the same game you do, with one advantage: before spending a shovel, it mentally simulates the result and only commits if the tunnel actually shortens its remaining path. No training, no model — just the optimisation idea the game is teaching, turned back on the game itself.
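
Player 2's look-ahead can be sketched as a "simulate, compare, commit" check. The version below is a guess at the shape of that logic, not the shipped code: it assumes the maze is a list of strings where `#` is a wall cell and `.` is open floor, and the names `path_length` and `shovel_is_worth_it` are hypothetical.

```python
from collections import deque
from typing import List, Optional, Tuple

Cell = Tuple[int, int]


def path_length(grid: List[str], start: Cell, goal: Cell) -> Optional[int]:
    """BFS step count from start to goal over open cells; None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None


def shovel_is_worth_it(grid: List[str], pos: Cell, goal: Cell, wall: Cell) -> bool:
    """Simulate breaking `wall`; commit only if the remaining path gets strictly shorter."""
    before = path_length(grid, pos, goal)
    r, c = wall
    trial = list(grid)  # copy, so the real maze is untouched
    trial[r] = trial[r][:c] + "." + trial[r][c + 1:]
    after = path_length(trial, pos, goal)
    if after is None:
        return False
    return before is None or after < before


# Illustrative maze, not taken from the game.
MAZE = [
    ".#...",
    ".#.#.",
    "...#.",
]

if __name__ == "__main__":
    print(shovel_is_worth_it(MAZE, (0, 0), (0, 2), wall=(0, 1)))  # shortcut wall
    print(shovel_is_worth_it(MAZE, (0, 0), (0, 2), wall=(1, 3)))  # irrelevant wall
```

The comparison is strict on purpose: a tunnel that leaves the remaining path the same length would spend a shovel for nothing.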

Demo · AI vs AI
Public info: The goal sits at the furthest reachable cell from the start.
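
One way to place a goal at the furthest reachable cell is a breadth-first distance map from the start, followed by picking the cell with the largest distance. This is a sketch under assumptions: the string-grid maze format (`#` wall, `.` floor) and the function names are hypothetical, and ties are broken by BFS visiting order, which the game may handle differently.

```python
from collections import deque
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def distance_field(grid: List[str], start: Cell) -> Dict[Cell, int]:
    """BFS distances from `start` to every reachable open cell."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def furthest_cell(grid: List[str], start: Cell) -> Tuple[Cell, int]:
    """Return the reachable cell with the largest BFS distance, and that distance."""
    dist = distance_field(grid, start)
    cell = max(dist, key=dist.get)  # first maximum in BFS order wins ties
    return cell, dist[cell]


# Illustrative maze, not taken from the game.
MAZE = [
    ".#...",
    ".#.#.",
    "...#.",
]

if __name__ == "__main__":
    goal, steps = furthest_cell(MAZE, (0, 0))
    print(f"goal at {goal}, {steps} steps from the start")
```

Because unreachable cells never enter the distance map, a goal chosen this way is always in the same connected region as the start.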
03 · What the Mechanics Mean

Each move you just made, mapped to an annealing concept

You just played a round. Each thing your hands touched corresponds to one specific idea in the algorithm: the shovel, the dice, the colour gradient, the vision radius. None of them were decoration. Below is the table that names what you already know.

Abstract concept | Game mechanic | What the player learns by playing it
Quantum tunneling | Shovel that breaks one wall. | A particle can pass through a barrier without going over. The defining quantum feature, encoded as the player’s one cheap action.
Temperature | Maze size + shovel count. | Maze size and shovel count together set the temperature. More space and more wall-breaking tools mean more freedom to explore before the search commits.
Energy landscape | Path-distance colour field. | Cells render cool blue near the goal, warm red far. “Energy” becomes literally visible on the floor; players read terrain like a thermal map.
Local minimum | Vision radius of 3.5 cells. | The whole point of stochastic optimisation is that the searcher cannot see the answer. Full-map omniscience would void the lesson.
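
The energy-landscape row can be illustrated with a small colour mapping: a cell's path distance to the goal becomes a blend between a cool blue and a warm red. This is a sketch only; the RGB endpoints and the linear blend are assumptions, and the game's actual palette is not stated here.

```python
from typing import Tuple

COOL_BLUE = (40, 90, 220)   # assumed colour at the goal (distance 0)
WARM_RED = (220, 60, 40)    # assumed colour at the furthest cell


def energy_colour(distance: int, max_distance: int) -> Tuple[int, int, int]:
    """Linearly blend from COOL_BLUE (near the goal) to WARM_RED (far away)."""
    if max_distance <= 0:
        t = 0.0
    else:
        t = min(max(distance / max_distance, 0.0), 1.0)  # clamp to [0, 1]
    return tuple(round(a + (b - a) * t) for a, b in zip(COOL_BLUE, WARM_RED))


if __name__ == "__main__":
    for d in (0, 5, 10):
        print(d, energy_colour(d, max_distance=10))
```

In pygame, a tuple like this can be passed directly as the colour argument when filling a cell's rectangle.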

Of the four, the vision radius is the one that carries the lesson. A student playing with the whole map visible would have nothing to feel. The algorithm exists precisely because the searcher can’t see the answer.
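
A vision limit of 3.5 cells can be sketched as a distance test around the player. The version below assumes straight-line (Euclidean) distance between cell centres and ignores whether walls block the view; both are assumptions, and the game may well use a different rule.

```python
import math
from typing import Set, Tuple

Cell = Tuple[int, int]
VISION_RADIUS = 3.5  # value from the mechanics table


def visible_cells(player: Cell, rows: int, cols: int,
                  radius: float = VISION_RADIUS) -> Set[Cell]:
    """Cells whose centre lies within `radius` of the player's cell centre."""
    pr, pc = player
    reach = math.floor(radius)  # no cell further than this along one axis can qualify
    return {
        (r, c)
        for r in range(max(0, pr - reach), min(rows, pr + reach + 1))
        for c in range(max(0, pc - reach), min(cols, pc + reach + 1))
        if math.hypot(r - pr, c - pc) <= radius
    }


if __name__ == "__main__":
    seen = visible_cells((5, 5), rows=11, cols=11)
    print(len(seen), "cells visible from the centre of an 11x11 board")
```

Everything outside the returned set would be drawn as fog, so the player sees the local slope of the colour field but never the whole landscape.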

04 · VIBE Coding Workflow

Three turns from one build cycle

This page is a simulated-annealing algorithm dressed as a playable game. The two-day build cycled through three turns: one at the start (how to begin without misfiring), one in the middle (how to grow without breaking), one near the end (how to recover when it breaks anyway). The vibe-coding habits that made each turn cleaner are the kind 2024 manuals haven’t named yet, and 2026 ones still skip past.

Turn 01 · Spec the work, then let the AI question it back
Wenyu

Here’s the spec: a simulated-annealing pathfinder on a grid maze, two-day build, audience is fourteen-year-olds. Don’t start coding yet. Read it twice. Then ask me back: which parameters are under-specified, which UX assumptions could go either way, which design call would change the rest if you guessed wrong.

Claude

Three calls to settle first. Vision radius: full visibility kills the lesson, blackout kills the agency, where in between? Failure mode: when the algorithm can’t find a path, freeze on the dead-end or auto-restart? Control surface: one Try button, or also a Tournament mode where multiple runs race? Picking any of these wrong now means rewriting the demo loop.

2024 vibe coding’s default loop is prompt → output → iterate. Iteration is cheap, so why not? Except every wrong assumption baked into the first output multiplies through every following one. The AI’s strongest ability isn’t generation. It’s spotting where your brief is ambiguous. Use that ability before the first line of code, not after the third revision.
Turn 02 · Keep the debug surface small
Wenyu

Don’t try to build the GameBoy frame, the chip controls, and the algorithm all in one shot. You can hold the full spec at once but I can’t debug a black box. Bare canvas first, just the algorithm running. Let me see it find paths and fail to find paths. Chrome layers in after I trust the engine.

Claude

Stripped to bare canvas. Vision radius variable, jitter variable, click to set start and end. Watch the search emerge, watch it dead-end, tune the constants until the behaviour feels right. Chip controls, GameBoy frame, mascot. Each layer added only after the one below holds.

Pre-vibe-coding MVP was a resource discipline: ship small because building large was expensive. Post-vibe-coding MVP is a debug-surface discipline: generation is now cheap, but your debugging capacity hasn’t scaled. The AI can hold the full spec in its head simultaneously; you can only chase one bug at a time. Keep the surface small enough that when something breaks, you know where to look.
Turn 03 · Strip one variable, watch what survives
Wenyu

The algorithm isn’t finding paths on some mazes. Stop reading the code. We don’t actually know yet if it’s the algorithm or the maze. Hard-code a maze we KNOW has a solution, run the algorithm against it. If it finds the path, the maze generator is making unreachable cells. If it still fails, the algorithm has a problem. One variable at a time.

Claude

Hard-coded a small maze with a hand-laid solution. Algorithm finds it in eighty iterations. Original generator was carving disconnected regions: start and end could land in different components. Generator patched to ensure connectivity before exit. Algorithm untouched.

When something breaks, the AI’s instinct is to read the code and propose plausible-looking fixes. Half the time those fixes don’t address the actual cause. They just resemble fixes. The habit that works: don’t theorise, strip one variable, see what survives. The bug isn’t where the AI guesses. The bug is wherever the variable you just isolated stops mattering.
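
The isolation step in this turn can be sketched as a reachability check run against two mazes: one hand-laid maze known to have a solution, and one with the start walled off. The grids, the string format (`#` wall, `.` floor), and the name `same_component` are all hypothetical; the point is only that a check like this separates "the maze is unsolvable" from "the solver is wrong".

```python
from typing import List, Tuple

Cell = Tuple[int, int]


def same_component(grid: List[str], start: Cell, goal: Cell) -> bool:
    """Flood fill from `start`; True if `goal` is reachable through open cells."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    stack = [start]
    while stack:
        r, c = stack.pop()
        if (r, c) == goal:
            return True
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in seen:
                seen.add((nr, nc))
                stack.append((nr, nc))
    return False


# Hand-laid maze with a known route: (0,0) -> (0,1) -> (1,1) -> (2,1) -> (2,2).
KNOWN_GOOD = [
    "..#",
    "#..",
    "#..",
]

# Start cell (0,0) is sealed off, the kind of output a faulty generator could produce.
DISCONNECTED = [
    ".#.",
    "##.",
    "...",
]

if __name__ == "__main__":
    print(same_component(KNOWN_GOOD, (0, 0), (2, 2)))
    print(same_component(DISCONNECTED, (0, 0), (2, 2)))
```

A generator could call a check like this before returning and regenerate when it fails, which is one plausible reading of "ensure connectivity before exit".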
05 · Insight

Two Others, one practice of approaching

Across two days, the work split itself in two. The first half held a fourteen-year-old’s hand into an algorithm they hadn’t agreed to learn. The second half instructed an AI that doesn’t need to be asked. Both halves taught the same thing.

A middle-schooler will leave the room the moment they hear “algorithm.” What kept them in their seat was a handheld they could shake, a maze they could dig through, a colour gradient that meant something before it was explained. The teaching never began in theory. It began in the body that arrived already curious.

The AI half rhymed. Claude’s defaults (asking permission, generating before listening, accumulating syntax that almost meant the right thing) each kept the work moving until someone caught them. The corrective wasn’t prompt engineering. It was the same posture I’d used on the student: read what the partner is actually doing, not what the surface output claims, and route around tools when they fail.

What I leave this project knowing: directing an AI and directing a curriculum are the same practice in two registers. Each begins with someone whose horizon is not mine. Each asks me to approach, not to extract. The shortcut is never typing faster. It is being present sooner: to the student in front of the maze, and to the partner across the round-trip.