How This Works
About
Codex Arcade
Codex Arcade is an experiment in making real browser games small enough to live inside a URL and, eventually, inside a scannable QR code.
I have been using GPT models for coding experiments since the GPT-3 era, going back to the early days of testing how far natural-language programming could be pushed. After working at OpenAI as its first prompt engineer, I kept coming back to game design, because every new model seemed to reveal a different threshold of what was possible.
One of the earliest moments that really stuck with me was in 2022, when Instruct-era models started showing they could manipulate structured game state well enough to build tiny playable experiments. I wrote about that in "Building games and apps entirely through natural language using OpenAI’s code-davinci model", where I showed things like a minimal Legend of Zelda and other small game systems. That felt like the beginning of a very clear trajectory.
Inside One QR
Everything For A Game Can Live In Here
This is the current Dark Flyer payload as a real QR PNG. The code, graphics, sound, and gameplay logic are all packed into the URL behind it. Scan it, open the loader, and the browser inflates the payload back into the full game.
Why 2,048 Characters
Model coding capability has continued to climb from code completion into something much more agentic. As I write this, Codex can run in the background for hours, inspect screenshots, check playability, and work through problems that would have felt implausible a few model generations ago. That created an obvious temptation to make something bigger and more complicated. Instead, I wanted to see what happened if the constraint went in the other direction.
I had already played with extremely small game experiments, including things closer to shader toys and 240-character curiosities, but those were not usually fun in a durable way. They were impressive proofs of concept. For a game to start feeling real, it needed more room, but the cap also had to be meaningful rather than arbitrary.
The number that kept recurring was 2,048 characters. It shows up over and over as a practical ceiling for URLs, paste targets, link systems, and payloads that still have a chance of surviving as a QR code without becoming absurdly dense. You can absolutely go bigger, but then you start paying for it with scan reliability and portability. That made 2,048 a useful hard budget: large enough for a real idea, small enough to force discipline.
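As a rough sanity check on the density argument, the sketch below compares a URL length against the byte-mode capacity of the largest QR symbol (version 40) at each error-correction level. The capacity figures come from the QR code standard. The assumption that the URL is stored in byte mode (mixed-case base64 text generally requires it) is mine, and the site's actual QR settings are not stated here.

```python
# Byte-mode capacity of a version 40 QR symbol (the largest size)
# at each error-correction level, as given in the QR code standard.
V40_BYTE_CAPACITY = {"L": 2953, "M": 2331, "Q": 1663, "H": 1273}

BUDGET = 2048  # the hard cap used throughout this project


def levels_that_fit(url_length: int) -> list:
    """Return the error-correction levels whose version 40 capacity holds the URL.

    Assumes one byte per character, which holds for an ASCII-only URL.
    """
    return [level for level, cap in V40_BYTE_CAPACITY.items() if url_length <= cap]


print(levels_that_fit(BUDGET))  # ['L', 'M']
print(levels_that_fit(3000))    # []
```

Under these assumptions a full 2,048-character URL still fits at the two lower error-correction levels, while anything close to 3,000 characters no longer fits in a single symbol at all.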
I had also seen other people do impressive things in this space, including The Backdooms, a QR-doom-style project that showed how compelling this format could be. That gave me a target to aim at, but I wanted to get under the stricter 2,048-character line and see if the result would still feel like a game rather than just a stunt.
How The Games Get Made
The first project in that direction became XENO Realm 3D. I wanted to know whether Codex and GPT-5.4 could build a fake 3D corridor shooter inside the budget. Watching Codex solve that problem was one of the most interesting parts of the whole exercise. By researching old 1980s rendering tricks and converging on those up-and-down lateral corridor lines, it found a way to create a simulated 3D space that seemed almost impossible at that size. The final game came in under the limit, and the early prototype was even smaller than that.
The workflow usually starts with me proposing a concept in ChatGPT, because it is fast to iterate there and easy to see rough ideas in Canvas. Once a prototype feels promising, I start tightening the constraint and tell it to aim for 2,048 characters even if the first pass goes over. After that it moves into Codex, where the work becomes more aggressive: cut bytes, preserve feel, and decide where a smaller implementation can buy back enough room for a better mechanic or a clearer HUD.
The useful tricks are not just minification. These games lean on emoji as instant art assets, tiny procedural drawing, oscillator audio, repeated geometry, compressed payloads, and ruthless UI trimming. Emoji are absolutely fair game if they make the game clearer and still survive intact on modern devices. The rule is simple: if the whole thing is self-contained and runs without the network, it counts.
That is also where Codex becomes interesting as a collaborator rather than just a generator. It will suggest tradeoffs like adding recoil to a gun, reshaping a HUD, or compressing one section so another mechanic can fit. My role is to decide which of those ideas actually improve the game and which ones are just technically clever. The first eight games here were built in roughly a day and a half, including sleep, which still feels slightly absurd.
How The Payload Gets Small
There are really three stages here. The first is the game itself, written and then golfed by hand. I usually start by trying to make something that roughly works and lands somewhere in the neighborhood of 2,048 characters, even if it goes over. Then comes code golfing: the old discipline of shaving the program down without losing the effect. That means removing wrapper HTML that is not needed for a self-contained page, collapsing repeated values into one-letter variables, reusing the shortest possible literals, and leaning on browser behavior when it is reliable. A closing </canvas> tag, for example, costs nine characters and can often be omitted. If a long number or string appears several times, it is usually cheaper to assign it once and reuse the variable. If a color can be represented by a shorter literal, that matters too. Sometimes #f00 is shorter than a long hex value like #ff0000. Sometimes red is shorter than both. The point is not style. The point is byte economy.
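The byte accounting behind those golfing decisions can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the alias cost model assumes the variable is added to an existing comma-separated declaration (name, equals sign, literal, comma), which may not match how any particular game declares its variables.

```python
def alias_saving(literal: str, uses: int, name_len: int = 1) -> int:
    """Characters saved by assigning `literal` to a short variable once.

    Inline cost:  uses * len(literal)
    Aliased cost: name + '=' + literal + ',' for the declaration,
                  plus the name at every use site.
    A positive result means the alias is cheaper.
    """
    inline = uses * len(literal)
    aliased = (name_len + 1 + len(literal) + 1) + uses * name_len
    return inline - aliased


def shortest(*candidates: str) -> str:
    """Pick the shortest of several equivalent literals."""
    return min(candidates, key=len)


print(shortest("#ff0000", "#f00", "red"))  # red
print(alias_saving("'#ff0000'", uses=4))   # 20  (aliasing wins)
print(alias_saving("'#ff0000'", uses=1))   # -4  (not worth it)
print(len("</canvas>"))                    # 9
```

The same arithmetic explains why a literal used only once is almost never worth aliasing: the declaration costs more than it saves.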
The second stage is asking Codex to do another pass specifically for optimization. This is where GPT-5.4 is useful as a collaborator. It can suggest other ways to get to the same visual or gameplay result with less code, or point out places where a little compression buys enough room for a better mechanic. Emoji are part of that toolbox. A taxi emoji is far cheaper than drawing a taxi from rectangles. A large red button emoji can be cheaper than building a button from arcs and fills. If the whole thing still runs self-contained on a modern machine, it counts.
The third stage is the actual payload compression. Once the game HTML is as tight as it reasonably needs to be, the build step compresses the raw HTML with deflate-raw, converts the compressed bytes into URL-safe base64, prefixes the fragment with z, and turns the result into a URL like https://codexarcade.com/qr#z.... The loader page reads the fragment, restores the base64, inflates it back into HTML with DecompressionStream("deflate-raw"), and then writes that HTML directly into the document. That is how the final QR can carry the whole game even though the browser never fetches a separate gameplay file at launch.
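A minimal round-trip sketch of that packing scheme, written in Python for illustration. The site's real build step is not shown here, and its loader runs in the browser using DecompressionStream rather than anything like this. The function names, the compression level, and the choice to strip base64 padding are assumptions. Only the deflate-raw step, the URL-safe base64 step, the z prefix, and the loader URL come from the description above.

```python
import base64
import zlib

LOADER = "https://codexarcade.com/qr#z"  # loader route plus the z marker


def pack(html: str) -> str:
    """Raw-deflate the HTML, encode it as URL-safe base64, build the URL."""
    comp = zlib.compressobj(level=9, wbits=-15)  # negative wbits = raw deflate
    raw = comp.compress(html.encode("utf-8")) + comp.flush()
    b64 = base64.urlsafe_b64encode(raw).decode("ascii")
    return LOADER + b64.rstrip("=")  # assumption: padding is dropped


def unpack(url: str) -> str:
    """Reverse of pack(): read the fragment, restore base64, inflate."""
    fragment = url.split("#", 1)[1]
    if not fragment.startswith("z"):
        raise ValueError("fragment does not start with the z marker")
    b64 = fragment[1:]
    b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
    raw = base64.urlsafe_b64decode(b64)
    return zlib.decompress(raw, wbits=-15).decode("utf-8")


sample = "<canvas id=c><script>/* 🚕 game code goes here */</script>"
url = pack(sample)
print(len(sample), len(url))
print(unpack(url) == sample)  # True
```

Very short inputs like this sample can come out longer after packing, because the 28-character prefix and the base64 expansion outweigh what deflate saves. The scheme only pays off once the HTML is long and repetitive enough.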
To keep the challenge honest, the transport wrapper counts too. The current loader form, https://codexarcade.com/qr#z, costs 28 characters before the compressed payload even starts. A minimal raw data URL prefix, data:text/html followed by a comma, would cost 15 characters, so the website route is paying a 13-character overhead in the final QR URL. At the gallery's average compression ratio, those 13 URL characters correspond to only about 18 characters of raw HTML, not dozens. In other words, these games are already being designed to fit inside a smaller effective budget than the headline 2,048-character limit.
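The overhead arithmetic can be checked directly. The sketch below is one plausible way to arrive at the figure of about 18, not necessarily how it was originally computed. The averages passed in are round numbers close to the gallery's (roughly 2,690 raw characters and 1,980 URL characters), and the function name is hypothetical.

```python
LOADER_PREFIX = "https://codexarcade.com/qr#z"
DATA_PREFIX = "data:text/html,"


def raw_chars_per_url_char(avg_raw: float, avg_url: float) -> float:
    """Average raw HTML characters carried by each payload character of the URL.

    The payload is the URL minus the fixed loader prefix.
    """
    return avg_raw / (avg_url - len(LOADER_PREFIX))


overhead = len(LOADER_PREFIX) - len(DATA_PREFIX)
rate = raw_chars_per_url_char(avg_raw=2690, avg_url=1980)

print(len(LOADER_PREFIX), len(DATA_PREFIX), overhead)  # 28 15 13
print(round(overhead * rate))                          # 18
```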
You can also paste one of these payload URLs directly into a browser. In the purest version of the experiment, the code is already there and does not need a network round-trip to fetch its logic. The reason this site uses the loader route instead of a pure data URL is practical: many phone scanners and mobile browsers now refuse to launch raw data: payloads directly, so the website route is there to make the games actually run when somebody scans them.
- Build a raw prototype. Get the mechanic and the feel working first, even if it is over budget.
- Hand-golf the code. Remove waste, shorten literals, alias repeated values, and drop optional markup.
- Ask Codex for another optimization pass. Trade bytes for better mechanics, art shortcuts, or clearer HUDs.
- Compress the final HTML. Deflate it, encode it into a URL-safe fragment, and hand it to the loader route.
- Generate the QR. The QR points at the compressed payload URL, not a remote asset bundle.
Compression In Practice
Across the eight games in the table below, the raw HTML averages about 2,683 characters, while the final QR URL averages 1,980 characters. In other words, the raw game code runs about 31.0% over the hard limit on average, then gets squeezed back under the line through compression. The average reduction from raw HTML to final QR URL is about 26.2%. That is the difference between a game that almost fits and a payload that actually scans.
| Game | Raw HTML (characters) | QR URL (characters) | Savings |
|---|---|---|---|
| XENO Realm 3D | 2489 | 1898 | 23.7% |
| Dark Flyer | 2887 | 2044 | 29.2% |
| Loco Taxi | 2580 | 1944 | 24.7% |
| Night Racer | 3057 | 2007 | 34.3% |
| QROLF | 2651 | 1922 | 27.5% |
| Rescue Chopper | 2776 | 2043 | 26.4% |
| Space Admiral | 2571 | 1982 | 22.9% |
| Tomb Of The Wizard | 2450 | 2000 | 18.4% |
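The savings column and the remaining headroom under the cap can be recomputed from the table with a short script. The figures are copied from the rows above. The helper name and the output layout are illustrative only.

```python
BUDGET = 2048

# (raw HTML characters, final QR URL characters), copied from the table
GAMES = {
    "XENO Realm 3D": (2489, 1898),
    "Dark Flyer": (2887, 2044),
    "Loco Taxi": (2580, 1944),
    "Night Racer": (3057, 2007),
    "QROLF": (2651, 1922),
    "Rescue Chopper": (2776, 2043),
    "Space Admiral": (2571, 1982),
    "Tomb Of The Wizard": (2450, 2000),
}


def savings_pct(raw: int, url: int) -> float:
    """Percentage reduction from raw HTML length to final URL length."""
    return round((1 - url / raw) * 100, 1)


for name, (raw, url) in GAMES.items():
    print(f"{name:20} {savings_pct(raw, url):5.1f}%  headroom {BUDGET - url:4d}")

tightest = min(GAMES, key=lambda name: BUDGET - GAMES[name][1])
print(tightest, BUDGET - GAMES[tightest][1])  # Dark Flyer 4
```

Every game clears the cap, but not by much: the tightest entries have fewer than ten characters to spare.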
Copy-Paste Prompt
Make me a tiny, self-contained browser game for the Codex Arcade challenge.
Requirements:
- The final playable game should target a budget of 2,048 characters or less for the main game payload.
- No external assets, libraries, fonts, images, audio files, imports, or network requests.
- Everything must be generated in code: gameplay, graphics, sound, UI, and effects.
- Emoji are completely allowed and encouraged when they save bytes or improve readability.
- Output must be a single HTML file or a single payload that can be embedded in a URL or QR code.
- Design for a mobile browser first, using a 9:16 layout that feels right on a phone like an iPhone.
- It must run in modern desktop and mobile browsers.
- Controls should be playable on a phone: tap, swipe, hold, or very simple touch targets.
- Prioritize immediate playability, bold visual identity, and clever size-saving tricks.
- Use any aggressive but browser-safe compression tricks you want: terse variable names, procedural graphics, repeated geometry, oscillator audio, emoji, compact HUDs, and payload packing.
- Include restart behavior and a visible score or win/lose state if the concept supports it.
When you answer:
1. Start with a one-paragraph concept.
2. Return the final game as a single code block.
3. Report the character count of the main payload.
4. Briefly explain the tricks used to stay under budget.
5. If the game is over budget, aggressively compress and revise it until it is as close as possible to 2,048 characters.
What The Prompt Does
The prompt forces the model to think like a cartridge engineer: tiny budget, no external dependencies, no hidden assets, and no room for waste.
That constraint is the whole point. If the game can fit in the payload, then the graphics, sound, rules, and interaction all travel together inside the same tiny package.
In practice that means mixing techniques. Sometimes the cheapest art is lines and rectangles. Sometimes it is emoji. Sometimes the right answer is a tiny oscillator blip instead of silence, or a compressed URL fragment instead of raw HTML. The challenge is not purity. The challenge is making the game feel alive inside the byte budget.
If you want to follow the project, the best places are X and the Codex Arcade mailing list. There are already more experiments in progress than the current gallery shows.