Field Note · N° 03 · AI & video
AI upscaling for old home videos.

AI video upscaling has gone from a research demo to a commercial service in about five years. Modern neural models can take a blurry 480-line VHS capture and produce 1080p output that looks dramatically sharper, cleaner and more modern. The results are real, and they’re not just marketing.
They’re also not magic. The same model that makes one home video look stunning can make another look plasticky and artificial. Understanding when AI upscaling helps (and when it doesn’t) saves money and avoids disappointment.
What the model actually does
An AI upscaler is a neural network trained on millions of pairs of low-resolution and high-resolution video frames. Given a new low-res frame, it predicts what the high-res version probably looks like, based on patterns it learned from training.
It’s not magnifying. It’s hallucinating plausible detail. For faces, this works remarkably well: the model has seen enough faces to know what eyes, skin texture, hair and mouths look like at high resolution. It can take a blurry face and reconstruct convincing detail.
For everyday objects, the same applies: trees look like trees, fabric looks like fabric, brick walls have plausible mortar lines. The reconstruction is statistically informed guesswork, but the guesses are accurate enough that the result feels real.
For unusual or specific things (a particular logo, hand-written text, an unusual object) the model often gets details wrong. You won’t notice unless you know what was originally there.
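For contrast, here is a minimal sketch of what plain magnification does, using nearest-neighbour enlargement on a made-up 2×2 grid of pixel values. The function name and the sample values are illustrative assumptions, not part of any real upscaler. The enlarged frame has more pixels but no pixel values that were not already in the source, which is why any extra detail in an AI upscale has to come from the model's predictions.

```python
def upscale_nearest(frame, factor):
    """Magnify a 2-D grid of pixel values by repeating each pixel factor x factor times."""
    out = []
    for row in frame:
        # Repeat every pixel horizontally...
        wide = [px for px in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 2x2 grayscale "frame" (values are arbitrary).
low_res = [[10, 200],
           [60, 120]]

high_res = upscale_nearest(low_res, 3)

unique_before = {px for row in low_res for px in row}
unique_after = {px for row in high_res for px in row}

print(len(high_res), "x", len(high_res[0]))   # 6 x 6
print(unique_after == unique_before)          # True: bigger, but no new information
```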
When AI upscaling really helps
Family portraits and indoor footage with good lighting. Faces shot at MiniDV or Hi8 quality, in living rooms with normal household lighting, often look noticeably better: sharper features, cleaner skin tones, more lifelike eyes. This is effectively the kind of footage the models were trained on.
Outdoor daylight scenes. Vacation videos, kids in the backyard, weddings outdoors. Bright, well-lit, low-noise source footage gives the AI a clean signal to work with.
Static or slow-moving scenes. A speech at a graduation, a wedding ceremony, an interview. When little changes from frame to frame, the AI can use temporal information across multiple frames to reconstruct each one more accurately.
Footage you’ll actually rewatch. The “wow” of an upscaled clip works best when you have an emotional reason to look closely. Generic footage benefits from cleaner color and noise reduction, but the upscale doesn’t shine.
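To see why static or slow-moving footage gives a model more to work with, here is a minimal sketch of the simplest multi-frame technique: plain temporal averaging. It is not how any particular upscaler works, since real models align and combine frames in far more sophisticated ways. The pixel value, noise level and frame count are made-up assumptions for illustration.

```python
import math
import random


def rms_error(frame, truth):
    """Root-mean-square difference between a frame and the true pixel value."""
    return math.sqrt(sum((px - truth) ** 2 for px in frame) / len(frame))


# All values below are illustrative assumptions, not measurements.
rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_VALUE = 100.0       # brightness of a perfectly static scene
NOISE_SIGMA = 10.0       # simulated tape noise
NUM_PIXELS = 2000
NUM_FRAMES = 8

# Eight noisy captures of the same unchanging scene.
frames = [
    [TRUE_VALUE + rng.gauss(0, NOISE_SIGMA) for _ in range(NUM_PIXELS)]
    for _ in range(NUM_FRAMES)
]

# Average each pixel position across all frames.
averaged = [sum(px) / NUM_FRAMES for px in zip(*frames)]

single_err = rms_error(frames[0], TRUE_VALUE)
averaged_err = rms_error(averaged, TRUE_VALUE)

print(f"one frame:      {single_err:.2f}")
print(f"8-frame average: {averaged_err:.2f}")
```

With independent noise, the error of the average shrinks by roughly the square root of the number of frames. Once the scene moves quickly, the frames no longer line up and this advantage fades, which matches the weaker results on fast pans.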
When it helps less
Heavy tape damage. If the source has constant dropouts, severe color loss or major signal degradation, AI doesn’t have enough clean information to reconstruct from. The output is cleaner than the input, but the underlying defects remain. Advanced capture first, then AI, is the right pairing for damaged tapes: a cleaner source gives AI more to work with.
Low-light or shadowy footage. Indoor scenes filmed in dim conditions are mostly noise to begin with. AI denoise helps, but the underlying detail isn’t there to enhance. Output looks smoother but not significantly sharper.
Fast camera motion and pans. Motion blur from a moving camera doesn’t reverse. AI may smooth out judder, but it can’t recover detail that was never sharply recorded.
Already-blurry source. Out-of-focus shots stay out of focus. Sharpening adds the appearance of detail but doesn’t reconstruct what wasn’t captured.
The plastic-skin problem
Bad AI upscaling has a recognizable look: overly smooth skin, plastic-doll faces, an uncanny “polished” quality that doesn’t match the era of the footage. This happens when:
- Denoising is too aggressive, eliminating natural grain and pore-level texture along with the noise.
- Sharpening is overdone, producing halos and artificial-looking edges.
- The model is generating fake detail in areas where there shouldn’t be any (smooth walls, blank backgrounds).
A good AI workflow tunes these to match the source. Period footage from the 1990s should look like enhanced 1990s footage, not like it was shot last week.
Realistic before-and-after expectations
For a typical 1990s VHS family video, well-stored, captured properly:
- Resolution visibly improves: text becomes readable, faces become recognizable rather than approximations.
- Color is more stable and saturated without going neon.
- Noise drops noticeably: fewer sparkles, less hiss-like grain.
- Aspect ratio stays 4:3 (we don’t artificially stretch to 16:9, since that’s distortion, not enhancement).
- Frame rate typically stays at the original (29.97 fps for NTSC tapes, 25 fps for PAL). Frame interpolation to 60 fps is optional; some people love it, some hate it.
The improvement is real but not transformative. A bad VHS doesn’t become great footage. A decent VHS becomes a noticeably better viewing experience on modern screens.
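The aspect-ratio point can be checked with a little arithmetic. The sketch below assumes a 1920×1080 output canvas and a 4:3 display aspect ratio; the function name and the choice of canvas size are illustrative assumptions, not a description of any specific export setting. It computes how wide the picture should be and how much black bar sits on each side (pillarboxing).

```python
from fractions import Fraction


def pillarbox(display_aspect, canvas_w, canvas_h):
    """Fit a picture into a wider canvas at full height without stretching.

    Returns (picture_width, picture_height, bar_width_per_side).
    Assumes the canvas is at least as wide as the picture.
    """
    picture_w = int(canvas_h * display_aspect)
    picture_w -= picture_w % 2          # keep the width even, as video encoders usually expect
    bar = (canvas_w - picture_w) // 2   # black bar on each side
    return picture_w, canvas_h, bar


picture_w, picture_h, bar = pillarbox(Fraction(4, 3), 1920, 1080)
print(picture_w, picture_h, bar)        # 1440 1080 240

# Stretching to fill the canvas instead would widen everything by this factor.
stretch = Fraction(1920, picture_w)
print(stretch)                          # 4/3
```

Filling a 16:9 frame would make every face about a third wider than it was recorded, which is why the picture stays 4:3 with bars at the sides.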
When to skip AI
AI Enhancement is $0.25 per minute of selected video. For a typical 30-minute clip, that’s $7.50. Worth it when:
- You want to actually watch the footage regularly.
- The original is good enough that AI can build on it.
- The content matters: graduations, weddings, lost relatives, irreplaceable moments.
Not worth it when:
- The footage is generic and you’re archiving rather than rewatching.
- The source is severely degraded (handle restoration and Advanced capture first, then see whether AI is worth it).
- You’re on a tight budget and want to preserve everything at base quality first. AI can always be added later.
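For budgeting, the pricing arithmetic is easy to sketch. The rate of $0.25 per minute comes from the text above; the function name, the example clip lengths and the rounding of partial minutes to the nearest cent are assumptions for illustration, and actual billing may round differently.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_MINUTE = Decimal("0.25")  # rate quoted in the article


def enhancement_cost(clip_minutes):
    """Estimate the cost of enhancing the selected clips.

    clip_minutes: lengths of the selected clips, in minutes.
    Assumption: partial minutes are pro-rated and the total is rounded to the cent.
    """
    total_minutes = sum(Decimal(str(m)) for m in clip_minutes)
    cost = total_minutes * RATE_PER_MINUTE
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(enhancement_cost([30]))          # 7.50, the 30-minute example from the text
print(enhancement_cost([12, 4.5, 8]))  # 6.13, three hypothetical clips (24.5 minutes)
```

Selecting only the clips that matter keeps the total small: in the second example, 24.5 minutes picked from a much longer tape costs about six dollars.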
How to decide
The honest workflow: digitize first, decide on AI later. Once your tapes are in your dashboard at Basic or Advanced quality, you can preview the footage and mark specific clips for AI Enhancement. You’re only billed for what you actually select. Pick the moments that matter and skip the rest.
Our AI Enhancement service works this way: you don’t have to commit upfront, and you don’t pay to enhance footage you’re not going to rewatch.
