You get two statics back from design. Version A has a big lifestyle photo, a model smiling at the camera, your offer in the corner. Version B is plainer: product, claim, price. You run both through a heatmap tool and version A lights up bright red on the model's face. Total attention. Looks like a winner.

Except none of that red is on your offer. The face won the glance and kept it, and the thing you are paying CPMs to communicate sits in a cold blue corner. The heatmap did not tell you which ad will convert. It told you something narrower and still useful: in version A, the eye never gets to the part that does the selling.

I run a saliency pass like this on every static in my own pipeline, so I have strong opinions about where these heatmaps help and where the vendors oversell them.

Short answer: An attention heatmap predicts where a first glance lands on a static ad. It is computed by a saliency model trained on human eye-tracking data, not measured on real viewers. It tells you whether attention reaches the elements carrying your message. It cannot tell you whether the ad converts.

The takeaways

Nobody's gaze was measured on your ad. An attention heatmap is a prediction from a saliency model (DeepGazeIIE in my pipeline) trained on human eye-tracking datasets. That distinction decides how much weight the output deserves.
Vendor accuracy claims of 90 to 96 percent describe fixation agreement. A model can be right about where the first glance lands and still tell you nothing about whether the ad converts.
Three checks earn their keep on statics: hook element in the hot zone, scan-path order reaching the CTA, and the share of attention your offer region gets. Video is a different problem, and a still-image model has no business judging it.

What does an attention heatmap measure on an ad?

Strictly speaking, nothing. A predictive attention heatmap is computed by a saliency model: a neural network trained on datasets of real human eye-tracking recordings, which then estimates, pixel by pixel, where a first glance at a new image will most likely land. The red zones are high predicted fixation probability. No person looked at your creative.
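
To make "predicted fixation probability" concrete, here is a minimal sketch in Python with NumPy. It assumes the model hands back a 2D array of log-densities, one value per pixel, which is how DeepGaze-family models usually report their output; check what your own tool returns. The names `to_probability_map` and `hottest_pixel` are illustrative, not part of any library, and the tiny array stands in for a real creative.

```python
import numpy as np

def to_probability_map(log_density: np.ndarray) -> np.ndarray:
    """Turn a 2D log-density map into probabilities that sum to 1."""
    shifted = log_density - log_density.max()  # keeps exp() numerically stable
    prob = np.exp(shifted)
    return prob / prob.sum()

def hottest_pixel(prob: np.ndarray) -> tuple[int, int]:
    """Return (row, col) of the highest predicted fixation probability."""
    row, col = np.unravel_index(np.argmax(prob), prob.shape)
    return int(row), int(col)

# Toy 4x6 "creative": flat background with one strong peak (assumed data)
log_density = np.full((4, 6), -5.0)
log_density[1, 4] = 0.0

prob = to_probability_map(log_density)
print(hottest_pixel(prob))  # (1, 4)
```

The red zone in a rendered heatmap is just the high end of `prob` mapped to a color scale. Nothing in the array records a person looking at anything.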

That puts it in a different category from two things it gets confused with. Live eye tracking measures real participants with cameras, costs more, and takes days per round. Click or scroll heatmaps (the Hotjar kind) record what visitors did on a live page, so they need traffic you have not spent yet. The saliency prediction needs neither people nor spend, which is the whole appeal: you get the read before the first euro leaves the account.

The trade is that you are reading a model's guess about the first seconds of pre-conscious attention. Neurons, one of the vendors in this space, frames its start-attention maps as the first two seconds of exposure. That is the window these models speak to.

How accurate are AI attention heatmaps?

Accurate enough at the narrow thing they do. Attention Insight, comparing its predictions against live eye-tracking studies, puts commercial saliency models at 90 to 96 percent accuracy, with academic models scoring a bit higher on public benchmarks. DeepGazeIIE, the model I use, comes from that academic line and has led the standard saliency benchmarks for static images.

But hold on to what "accurate" means here: agreement with where real fixations land in the opening moments on a still image. The model has never seen your offer, your audience, or your price point. It cannot tell a winning ad from a losing one. Two creatives can produce near-identical heatmaps and perform 3x apart, because performance lives in the message, the offer, and the match to the person seeing it.
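
To show what "fixation agreement" can look like as a number, here is a sketch of Normalized Scanpath Saliency (NSS), one standard metric from the saliency literature: z-score the predicted map, then average it at the pixels real viewers fixated. This is an illustration only. The vendors quoted above do not say which metric sits behind their percentages, and the map and fixation points below are made up.

```python
import numpy as np

def nss(saliency: np.ndarray, fixations: list[tuple[int, int]]) -> float:
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels."""
    std = saliency.std()
    if std == 0:
        raise ValueError("flat saliency map: NSS is undefined")
    z = (saliency - saliency.mean()) / std
    rows, cols = zip(*fixations)
    return float(z[list(rows), list(cols)].mean())

# Toy 5x5 prediction with a single hot pixel in the center (assumed data)
saliency = np.zeros((5, 5))
saliency[2, 2] = 1.0

on_target = nss(saliency, [(2, 2)])           # viewers looked where predicted
off_target = nss(saliency, [(0, 0), (4, 4)])  # viewers looked in the corners

print(on_target > 0, off_target < 0)  # True True
```

Notice what never enters the function: the offer, the audience, the price. A high score here says the model found the glance, nothing more.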

So the honest job description is small: the heatmap tells you whether attention even arrives at the elements that carry the message. Whether those elements persuade is outside its pay grade. It predicts attention, not persuasion.

How do I use a heatmap before launching a static?

Three checks, two minutes per creative:

Is the hook element in the hot zone? Whatever has to register first (the pattern interrupt, the headline claim, the product) should sit where the predicted attention pools. If the hottest region is a decorative element, the design is spending your glance on filler.
Does the scan-path order reach the CTA? Beyond the static heatmap, saliency models can rank the predicted order of fixations. The sequence you want is hook, then proof, then CTA. A path that wanders off the edge after fixation two means the layout leaks.
What share of attention does the offer region get? Region scoring puts a number on it. A claim that collects 4 percent of predicted attention is decoration, whatever the brief says. When the offer is starved because everything is crammed into a square, the upstream fix is often a taller canvas: the right Meta aspect ratio buys the composition room a heatmap then rewards.
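
The three checks above can be sketched in a few lines of Python with NumPy. Everything here is an assumption for illustration: the boxes, the toy map, and the function names are hypothetical, and the scan path uses a classic greedy shortcut (take the hottest pixel, suppress its neighborhood, repeat) rather than whatever a production model such as DeepGazeIIE-based tooling does to rank fixations.

```python
import numpy as np

Box = tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive

def region_share(prob: np.ndarray, box: Box) -> float:
    """Fraction of total predicted attention that falls inside the box."""
    top, left, bottom, right = box
    return float(prob[top:bottom, left:right].sum() / prob.sum())

def in_box(point: tuple[int, int], box: Box) -> bool:
    row, col = point
    top, left, bottom, right = box
    return top <= row < bottom and left <= col < right

def scan_path(prob: np.ndarray, n_fixations: int = 3, radius: int = 2):
    """Greedy fixation order: pick the peak, zero out its surroundings, repeat."""
    work = prob.astype(float).copy()
    rows, cols = np.indices(work.shape)
    path = []
    for _ in range(n_fixations):
        r, c = np.unravel_index(np.argmax(work), work.shape)
        path.append((int(r), int(c)))
        # Inhibition of return: the same spot cannot win twice
        work[(rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2] = 0.0
    return path

def label_fixations(path, regions: dict[str, Box]) -> list[str]:
    """Name the region each fixation lands in, or 'elsewhere'."""
    labels = []
    for point in path:
        hits = [name for name, box in regions.items() if in_box(point, box)]
        labels.append(hits[0] if hits else "elsewhere")
    return labels

# Toy 10x10 saliency map with three peaks (assumed data)
prob = np.full((10, 10), 0.0004)
prob[1, 1], prob[5, 5], prob[8, 8] = 0.50, 0.30, 0.16

regions = {"hook": (0, 0, 3, 3), "proof": (4, 4, 7, 7), "cta": (7, 7, 10, 10)}

path = scan_path(prob)
labels = label_fixations(path, regions)
cta_share = region_share(prob, regions["cta"])

print(in_box(path[0], regions["hook"]))  # check 1: is the hook in the hot zone?
print(labels)                            # check 2: ['hook', 'proof', 'cta']
print(round(cta_share, 3))               # check 3: share of attention on the CTA
```

In practice you would draw the boxes around the real hook, proof, and offer in your layout and read the same three outputs per creative.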

One caveat from the version A story above: a face glowing red is not automatically bad. Faces pull attention in nearly every saliency dataset, and a face gazing toward your headline can hand the glance onward. The fix is rarely "remove the face"; it is usually "make the face look at the thing you sell".

And keep the verdict advisory. In my pipeline the saliency pass flags problems; it never rejects a creative on its own, because a human can see context the model can't. The heatmap is a pre-flight check. The proof is still the live test, and reading that test without fooling yourself is its own discipline.
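
One way to keep the verdict advisory in code is to have the saliency pass return a list of notes and no pass/fail value at all. The sketch below is hypothetical: `SaliencyReport`, `advisory_flags`, and the 10 percent `min_offer_share` default are placeholders for illustration, not settings from my pipeline.

```python
from dataclasses import dataclass

@dataclass
class SaliencyReport:
    hook_in_hot_zone: bool
    path_reaches_cta: bool
    offer_share: float  # fraction of predicted attention on the offer region

def advisory_flags(report: SaliencyReport, min_offer_share: float = 0.10) -> list[str]:
    """Return human-readable warnings. Deliberately has no reject/approve output."""
    flags = []
    if not report.hook_in_hot_zone:
        flags.append("Hook element is outside the hottest region.")
    if not report.path_reaches_cta:
        flags.append("Predicted scan path does not reach the CTA.")
    if report.offer_share < min_offer_share:  # placeholder threshold, tune it yourself
        flags.append(f"Offer region gets only {report.offer_share:.0%} of predicted attention.")
    return flags

weak = SaliencyReport(hook_in_hot_zone=False, path_reaches_cta=False, offer_share=0.04)
solid = SaliencyReport(hook_in_hot_zone=True, path_reaches_cta=True, offer_share=0.22)

weak_flags = advisory_flags(weak)
solid_flags = advisory_flags(solid)

for note in weak_flags:
    print(note)
```

Because the function only ever returns notes, the decision to ship stays with whoever reads them next to the draft.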

Where heatmaps quietly stop working

Statics only. A saliency model trained on still images has nothing valid to say about video: motion, cuts, and audio rewrite where attention goes, and a frame-by-frame heatmap of a video is a still-image answer to a moving-image question. I deliberately do not run scan-path prediction on video in my own stack for that reason. When a tool sells you video heatmaps from a static model, ask what it was trained on.

The same restraint applies inside Adscalr's creative workflow: every generated static gets a DeepGazeIIE pass that produces the per-pixel heatmap, the scan-path order, and region scores. The flags land next to the draft, and a person decides what ships. If you want to see how that fits into the rest of the creative loop, that page walks through it.
