
Defining false positives: Why outcomes matter with Vision AI

21st October 2025

Insights from Andrew Smith, Principal AI Solutions Engineer, and Lazar Rankovic, Principal Machine Learning Engineer, SeeChange self-service solutions team.

As developers of computer vision AI solutions, we’re often asked by prospective customers:

“What’s your false positive rate?”

A false positive is synonymous with a false alert. So, asking for a system’s false positive rate seems to offer an alluringly simple metric that lets a retailer compare solutions like-for-like.

Yet a false positive rate only tells part of the story. It certainly helps you understand how often a system flags normal activity incorrectly, but it doesn’t show how often events of interest are identified correctly, or missed altogether. A system tuned to eliminate false positives may miss many real incidents, defeating the purpose of deploying AI in the first place.

When it comes to AI-driven systems, false positives are useful: they provide feedback that helps a system learn and improve over time. They also play a part in two important metrics:

  • Precision: Of all flagged incidents, how many were correct?
  • Recall: Of all real incidents, how many were detected?

Using precision and recall instead of just the false positive rate provides a more comprehensive evaluation of a system’s performance.

This blog explores the value of false positives, defines precision and recall in AI-driven systems, and highlights how to ensure your chosen solution helps optimize operations and improve both the shopper and employee experience.

A false positive rate is often calculated as a percentage of the alerts raised, using the formula FP / (FP + TP), i.e. the share of alerts that turn out to be false.

However, this method doesn’t allow for fair comparisons between different solution providers as it only reflects results from specific conditions in a particular store. To truly compare systems, they must be tested under the same conditions—identical stores, checkouts, and timeframes. Without this, metrics like the false positive rate can be misleading.

Formally, a false positive rate is calculated using the formula: FP / (FP + TN). While this provides a more universal measure of how often the system flags normal activity incorrectly, it still doesn’t show how well the system detects real incidents (true positives) or how many it misses (false negatives).
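To make the difference concrete, here is a minimal sketch in Python using illustrative counts we have chosen for the example (not figures from any real deployment), comparing the alert-based calculation FP / (FP + TP) with the formal false positive rate FP / (FP + TN):

```python
# Hypothetical counts for one store over one week of checkout events
tp = 40    # real incidents correctly flagged
fp = 10    # normal activity incorrectly flagged
tn = 950   # normal activity correctly ignored
fn = 20    # real incidents the system missed

# "False positive rate" as a share of the alerts raised
alert_based_rate = fp / (fp + tp)   # 10 / 50  = 0.20

# Formal false positive rate: share of normal activity that gets flagged
formal_fpr = fp / (fp + tn)         # 10 / 960 ≈ 0.01

print(f"Share of alerts that were false: {alert_based_rate:.0%}")
print(f"Formal false positive rate:      {formal_fpr:.1%}")
```

The same system can be described as having “20% false positives” or a “1% false positive rate” depending on which formula is used, which is why the formula, and the test conditions behind it, must be stated before comparing vendors.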

This is where precision and recall come in:

Precision measures how many of the alerts raised are correct.

  • High precision = alerts are mostly accurate (few false positives).
  • Low precision = lots of false alerts.

Recall measures how many actual theft or error events the system detects out of all that occur.

  • High recall = the system catches most real events.
  • Low recall = the system misses a lot (many thefts or mis-scans go unnoticed).

Takeaway: Precision and recall need to be considered together to understand the overall performance of a system. Only by measuring both can retailers see how well a solution actually protects against loss, supports employees, and maintains a smooth shopper experience.
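Using the same illustrative counts as above, precision and recall can be computed directly (or with a library such as scikit-learn if you have per-event labels). This is a sketch under assumed numbers, not a benchmark of any real system:

```python
tp, fp, fn = 40, 10, 20   # same hypothetical counts as the earlier example

precision = tp / (tp + fp)   # 40 / 50 = 0.80: most alerts raised are correct
recall    = tp / (tp + fn)   # 40 / 60 ≈ 0.67: a third of real incidents are missed

print(f"Precision: {precision:.0%}")
print(f"Recall:    {recall:.0%}")
```

A system tuned purely to eliminate false positives could push precision towards 100% while recall collapses, which is exactly the failure mode that a single false positive rate hides.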

Systems are structured. Human behavior is unstructured.

While store layouts are established, planograms guide displays, checkouts follow a designated order, and employees receive training, humans are inherently unstructured. In other words, we naturally introduce unknowns into any environment; the key is being able to embrace this complexity and adapt effectively.

Takeaway: False positive alerts may point to processes that need realigning; once fixed, both system performance and the shopper experience can improve.

Author:
SeeChange