Camera Fingerprints Survive EXIF Stripping

Most people assume that stripping EXIF data makes a photo anonymous. Remove the metadata, remove the trail. It is a reasonable assumption — and it is wrong.

Nearly 5 million images have passed through the snapWONDERS forensic pipeline. The files that arrive already stripped are the interesting ones. No make, no model, no GPS, no timestamp. Clean, on the surface. But the compression structure tells a different story every time.

What stripping actually removes

Standard metadata removal tools — ExifTool, ImageOptim, online strippers — target the APP segments in a JPEG file. APP1 carries EXIF: camera make and model, GPS coordinates, timestamp, orientation. APP12 and APP13 carry Photoshop and IPTC data. Strip those, and the named metadata is gone.

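What they do not touch is the compression data itself: the entropy-coded bitstream and the tables a decoder needs to read it. Two of those structures survive every standard strip: the quantisation tables (DQT) and the Huffman tables (DHT). Both are stored in their own marker segments, separate from the APP segments, and the image cannot be decoded without them.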
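A minimal Python sketch of this distinction, assuming a well-formed JPEG. The function names and the exact set of stripped markers are illustrative, not the behaviour of any particular tool. It walks the marker segments and drops only APP1, APP12 and APP13, leaving DQT (0xDB) and DHT (0xC4) in place.

```python
import struct

# Markers the article names as targets of standard strippers.
STRIPPED_MARKERS = {0xE1, 0xEC, 0xED}  # APP1, APP12, APP13
SOS = 0xDA  # start of scan: entropy-coded data follows this segment


def iter_segments(data: bytes):
    """Yield (marker, start, end) for each marker segment up to the first SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # padding byte before the real marker
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, pos, pos + 2 + length
        if marker == SOS:
            return
        pos += 2 + length


def strip_named_metadata(data: bytes) -> bytes:
    """Drop APP1/APP12/APP13 and copy every other byte unchanged."""
    out = bytearray(data[:2])  # keep the SOI marker
    for marker, start, end in iter_segments(data):
        if marker == SOS:
            out += data[start:]  # SOS header, scan data and EOI, untouched
            break
        if marker not in STRIPPED_MARKERS:
            out += data[start:end]
    return bytes(out)
```

Listing the markers of the output with `iter_segments` shows the APP segments gone and the 0xDB and 0xC4 segments still present.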

DQT — the encoder’s permanent signature

Every JPEG file contains one or more quantisation tables. These tables are central to JPEG compression — they determine how aggressively each frequency component of the image is reduced. The critical point: camera manufacturers hardcode these tables into firmware. Canon, Nikon, Sony, Apple, Samsung — each produces tables characteristic of specific models and firmware versions.

When snapWONDERS identifies a camera model from a stripped file, it is matching against these tables. DQT is not metadata. It is not in APP1. It is the compression recipe, baked into every JPEG the camera has ever produced.

Software encoders have their own signatures too. Photoshop uses a known table set. Lightroom uses another. GIMP has its own. A file with no EXIF, no ICC profile, no Photoshop marker — but carrying Lightroom’s quantisation tables — was processed through Lightroom. That is not a guess; it is a table match.

Huffman tables — did this file leave the camera as-is?

Alongside the quantisation tables, every JPEG contains Huffman coding tables (DHT). These define how the compressed data is encoded at the bit level.

Camera firmware almost universally uses the fixed tables defined in Annex K of the JPEG standard (ISO/IEC 10918-1). Annex K specifies a set of pre-computed reference Huffman tables, identical across every device that adopts them. They require no computation per image; they are simply built into the firmware and applied every time the shutter fires. Software re-encoders such as Lightroom, Photoshop, GIMP and FFmpeg typically do not rely on Annex K. They compute optimised Huffman tables for each individual image, tailored to that image’s content, producing a marginally smaller file. The resulting tables differ measurably from any fixed table set. One caveat: optimisation is usually an encoder setting. libjpeg, for example, writes the Annex K tables unless optimisation is switched on, so optimised tables are stronger evidence of re-encoding than standard tables are of originality.

This means: if a JPEG has optimised Huffman tables, it was re-encoded after it left the camera. The file you are looking at was re-processed — regardless of what any timestamp says, if there even is one.
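The check can be sketched in Python under two stated simplifications: it reads only the DHT segments that precede the first scan (enough for baseline files), and it compares only the 16 code-length counts of each table against the Annex K counts, not the full symbol lists. A complete comparison would check the symbols too. The function names and verdict strings are illustrative.

```python
import struct

# Code-length counts (BITS) of the four Annex K reference tables.
DC_LUM = (0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
DC_CHROMA = (0, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
AC_LUM = (0, 2, 1, 3, 3, 2, 4, 3, 5, 5, 4, 4, 0, 0, 1, 125)
AC_CHROMA = (0, 2, 1, 2, 4, 4, 3, 4, 7, 5, 4, 4, 0, 1, 2, 119)
ANNEX_K_BITS = {0: {DC_LUM, DC_CHROMA}, 1: {AC_LUM, AC_CHROMA}}  # keyed by table class


def read_dht_bits(data: bytes) -> dict[tuple[int, int], tuple[int, ...]]:
    """Return {(table_class, table_id): 16 code-length counts} before the first SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    found: dict[tuple[int, int], tuple[int, ...]] = {}
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xC4:  # DHT: one segment may hold several tables
            i = 0
            while i + 17 <= len(payload):
                table_class, table_id = payload[i] >> 4, payload[i] & 0x0F
                counts = tuple(payload[i + 1:i + 17])
                found[(table_class, table_id)] = counts
                i += 17 + sum(counts)  # skip the symbol bytes
        if marker == 0xDA:
            break
        pos += 2 + length
    return found


def huffman_verdict(data: bytes) -> str:
    """'standard' if every table has Annex K code lengths, else 'custom'."""
    tables = read_dht_bits(data)
    if not tables:
        return "missing"
    is_standard = all(
        counts in ANNEX_K_BITS.get(table_class, set())
        for (table_class, _), counts in tables.items()
    )
    return "standard" if is_standard else "custom"
```

A per-image optimised table is very unlikely to reproduce the Annex K counts, so `"custom"` is the informative result. `"standard"` only says the code lengths match the reference tables.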

A real example

A file came through the pipeline recently: no EXIF, no colour profile markers, no software identifier. Visually, an outdoor portrait. Completely clean on any standard metadata viewer.

DQT match: Nikon Z6 II firmware 4.x. The quantisation tables were a definitive match to that camera’s encoder — not a close match, an exact one.

Huffman tables: Annex K fixed — the standard set built into camera firmware. Unmodified. No re-encoding.

Combined read: this file came straight from a Nikon Z6 II and was never re-encoded by software. The only thing done to it was EXIF stripping. Both signals agreed: camera identified, originality confirmed. The EXIF said nothing. The compression tables identified the camera and showed that the image data had not been re-encoded since capture.

What re-encoding actually achieves — and what it gives away

A reasonable follow-up: what if you re-encode the image to destroy the DQT fingerprint?

Re-encoding does replace the original camera’s quantisation tables — they are gone, overwritten by whatever software performed the re-encoding. Run the file through Lightroom and Lightroom’s DQT is now in the file. Through Photoshop, Photoshop’s tables. Through GIMP, GIMP’s pattern. The original camera signature is lost.

Importantly, re-encoding does not selectively replace just the DQT. Both DQT and Huffman tables are outputs of the same encoding process — they are rebuilt together. A re-encode through Lightroom produces Lightroom’s quantisation tables and Lightroom’s optimised Huffman tables simultaneously. The two signals always travel as a pair.

This pairing is what makes partial tampering detectable. A natural re-encode always produces a consistent result: software DQT matched with optimised Huffman. If someone manually patches just the DQT tables to impersonate a camera model while leaving the Huffman tables in place — or alters only the Huffman tables — the mismatch is immediately visible. Forensic inconsistency between the two signals is itself a finding.
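The pairing logic in this section reduces to a small decision table. The Python sketch below encodes it as described here; the labels, the function name and the wording of the findings are illustrative, and the `"camera"`/`"software"` and `"standard"`/`"optimised"` inputs are assumed to come from separate DQT and Huffman checks.

```python
# (DQT source, Huffman type) -> finding, following the pairing described above.
FINDINGS = {
    ("camera", "standard"): "consistent: camera original, not re-encoded",
    ("software", "optimised"): "consistent: re-encoded by software",
    ("camera", "optimised"): "inconsistent: camera DQT with optimised Huffman",
    ("software", "standard"): "inconsistent: software DQT with standard Huffman",
}


def assess(dqt_source: str, huffman: str) -> str:
    """Combine the two signals; anything outside the table is left undecided."""
    return FINDINGS.get((dqt_source, huffman), "undetermined: unrecognised signal")


def is_tamper_candidate(dqt_source: str, huffman: str) -> bool:
    """True when the two signals disagree under the model in this article."""
    return assess(dqt_source, huffman).startswith("inconsistent")
```

For example, `is_tamper_candidate("camera", "optimised")` is true, while an unknown DQT source is reported as undetermined, not as tampering.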

Re-encoding also adds something that cannot be hidden: confirmation the file was deliberately processed. The claim that a photo is the original, unmodified capture is finished the moment either table set changes. A re-encoded file is not clean. It carries a different fingerprint, and evidence that the original compression structure was intentionally replaced.

What this means if you have already stripped

If you have stripped EXIF and assume the file is clean, it is not, unless the compression tables were also rewritten. Standard stripping tools do not do this. Many “privacy-focused” tools leave them untouched as well.

This connects to what I wrote previously about GPS in “Why GPS in photos is more dangerous than most people think”. Stripping GPS removes one data point: a single location at a single moment. DQT is a different kind of exposure. It is a persistent identifier of the camera model and firmware: the same tables appear in every JPEG that camera has ever produced, across every shoot, every upload, every platform. Two photos taken years apart, shared anonymously, stripped of all EXIF, but from the same camera, will match on DQT. They are linked. They form a trace that can be followed.

Build up enough images from the same source and you are no longer looking at metadata. You are looking at a profile.
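The linking step can be illustrated with a short Python sketch. It assumes each file already has a DQT fingerprint string, however that was computed, and the function name is hypothetical. It groups file names that share a fingerprint. A shared fingerprint points to the same encoder tables, so a group is a lead to follow up, not proof of a single physical device.

```python
from collections import defaultdict


def link_by_fingerprint(fingerprints: dict[str, str]) -> list[list[str]]:
    """Group file names whose DQT fingerprints are identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, fingerprint in fingerprints.items():
        groups[fingerprint].append(name)
    # Keep only fingerprints seen more than once: those are the links.
    return [sorted(names) for names in groups.values() if len(names) > 1]
```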

The logical next question: does converting to a different format help? PNG, WebP, and HEIC do not use JPEG quantisation tables — format conversion does destroy the DQT signature. But it is not a clean escape either. The conversion software leaves its own identifiable pattern in the new file’s structure, and JPEG compression artefacts baked into the original pixel data can persist across format conversion. Converting to lossless PNG and stripping all metadata is the most robust option currently available. It removes the DQT vector. It does not guarantee clean provenance, and it confirms the image was processed.

snapWONDERS’ forensic analysis checks DQT and Huffman on every JPEG, with or without EXIF. If you want to see what a stripped file still reveals, upload it now. No account required.

EXIF is the label. DQT and Huffman are the ink.

Kenneth Springer is the founder of snapWONDERS and the developer of Vaultify. Two forensic databases power the analysis described in this article: a DQT quantisation table database that identifies camera models and software encoders, and a Huffman fingerprint database that determines whether a file left the camera unmodified or was re-encoded. Both were built from scratch: empirically derived, not sourced from any existing tool or dataset.