Generating images with a 2020 iPhone

June 2026

Here is an image we generated entirely on an old iPhone 12 Pro that a colleague had lying around – using mlx-swift to run PrismML's Bonsai Image model in a small iOS app:

[Image: a lone lighthouse on a rocky cliff at golden hour, generated on an iPhone 12 Pro. Prompt: “a lone lighthouse on a rocky cliff at golden hour”]

And here it is running, with the two minutes or so of waiting trimmed from the middle:

The code to reproduce it is at github.com/duration-ai/bonsai-image-ios.

PrismML released their compact image-generation models in late May 2026, and their launch post reports an iPhone 17 Pro Max producing a 512×512 image in 9.4 seconds. At Duration we think a lot becomes possible once on-device generation pushes the marginal cost of inference toward zero.

(We're working pretty hard on on-device TTS at the moment – frontier open-weight models on both iOS and Android, with posts to come – but we were compelled by the sheer novelty of running image generation entirely on a phone.)

PrismML already ship Bonsai Studio, an iOS app that runs the whole image generation on-device. On the iPhone 12 Pro it produced 128×128 and 256×256 images without complaint, but 512×512 hit the app's two-minute timeout at every tiling setting. (Spoiler: we suspect a 512×512 image would finish on the 12 Pro if the app lifted that timeout.)

So we built our own prototype, to see if we could reproduce the app's results and unblock 512×512 generation on the iPhone 12 Pro. We first tried running the diffusion transformer on Apple's Neural Engine (ANE), but that came up short: the model converts to Core ML cleanly enough, but then won't actually run on the ANE. We fell back to mlx-swift instead, checking our port against the Python reference as we went. This was made much easier by the reference running on the same MLX framework as our Swift port.
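To give a flavour of what checking a port against a reference can look like, here is a minimal sketch, not the app's actual code. It assumes an intermediate tensor from each implementation has been dumped to a flat array of floats; the names `ParityReport` and `compareTensors` are hypothetical, and the sample values are made up for illustration.

```swift
/// Summary of how far a ported tensor drifts from the reference tensor.
struct ParityReport {
    let maxAbsDiff: Float
    let meanAbsDiff: Float
    let worstIndex: Int

    /// True when every element is within `tolerance` (false if any diff is NaN).
    func matches(tolerance: Float) -> Bool {
        return maxAbsDiff <= tolerance
    }
}

/// Compare two flattened tensors element by element.
/// Returns nil when the element counts differ or the inputs are empty.
func compareTensors(_ ported: [Float], _ reference: [Float]) -> ParityReport? {
    guard ported.count == reference.count, !ported.isEmpty else { return nil }
    var maxDiff: Float = 0
    var sum: Float = 0
    var worst = 0
    for (i, (p, r)) in zip(ported, reference).enumerated() {
        let diff = abs(p - r)
        sum += diff
        // A NaN diff is recorded as the worst case so it cannot slip through.
        if diff.isNaN || diff > maxDiff {
            maxDiff = diff
            worst = i
        }
    }
    return ParityReport(maxAbsDiff: maxDiff,
                        meanAbsDiff: sum / Float(ported.count),
                        worstIndex: worst)
}

// Made-up values: the third element of the "port" has drifted.
let referenceValues: [Float] = [0.10, -0.25, 0.50]
let portedValues: [Float] = [0.10, -0.25, 0.75]
if let report = compareTensors(portedValues, referenceValues) {
    print("worst index \(report.worstIndex), within 1e-3: \(report.matches(tolerance: 1e-3))")
}
```

Comparing stage by stage, rather than only the final image, points at the first layer where the two implementations diverge. The tolerance is a judgment call and depends on the precision the weights are stored in.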

In the end we didn't need VAE tiling at all. By loading each stage of the pipeline only when it was needed and freeing it before the next, we kept peak memory at around 3 GB, about half the 12 Pro's 6 GB of RAM. Even so, a 512×512 image takes two to three minutes to generate. While that is a long time, it is also astounding that it runs at all on a six-year-old phone – hats off to PrismML!
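The load-then-free pattern can be sketched in plain Swift. This is an illustration under stated assumptions, not the code from the repo: `MemoryLedger`, `LoadedStage` and `runStage` are hypothetical names, the stage list is a guess at a typical pipeline, and the sizes are arbitrary units rather than measurements.

```swift
/// Tracks simulated weight memory so the sketch can report a peak.
/// A real app would read memory usage from the OS or the ML framework instead.
final class MemoryLedger {
    private(set) var live = 0
    private(set) var peak = 0
    func allocate(_ units: Int) {
        live += units
        peak = max(peak, live)
    }
    func release(_ units: Int) {
        live -= units
    }
}

/// Stand-in for a loaded model: claims its "weights" on init, returns them on deinit.
final class LoadedStage {
    let name: String
    private let size: Int
    private let ledger: MemoryLedger

    init(name: String, size: Int, ledger: MemoryLedger) {
        self.name = name
        self.size = size
        self.ledger = ledger
        ledger.allocate(size)
    }

    deinit {
        ledger.release(size)
    }

    func run(_ input: String) -> String {
        return "\(input) -> \(name)"
    }
}

/// Load one stage, run it, and let ARC drop the only reference on return,
/// so the next stage never shares memory with this one.
func runStage(name: String, size: Int, ledger: MemoryLedger, input: String) -> String {
    let stage = LoadedStage(name: name, size: size, ledger: ledger)
    return stage.run(input)
}

// Hypothetical stages with made-up sizes (arbitrary units, not measurements).
let stages = [("textEncoder", 4), ("diffusionTransformer", 8), ("vaeDecoder", 2)]

let ledger = MemoryLedger()
var value = "prompt"
for (name, size) in stages {
    value = runStage(name: name, size: size, ledger: ledger, input: value)
}
print("peak: \(ledger.peak) units, live afterwards: \(ledger.live)")
// prints "peak: 8 units, live afterwards: 0"
```

With the stages run one at a time, the peak is the size of the largest stage (8 units here) rather than the sum of all three (14). A framework that caches freed buffers may also need that cache cleared between stages; that step is left out of this sketch.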

A couple of notes:

Sustained, back-to-back generation seems to run into thermal throttling – near the end of testing, images were taking as long as twelve minutes to generate! We suspect a working ANE port would help here.
We didn't attempt 1024×1024, mostly out of an unwillingness to wait around; the code is at github.com/duration-ai/bonsai-image-ios if you'd like to fork it and try yourself.

Ultimately, this was mostly an exercise to see if we could take someone else's open-weight frontier model and stand it up on whatever hardware happened to be lying around. It turns out we can. Maybe we'll have a look at Android next.