5 Steps to Data Accuracy

Published by Ronny Max on

Accuracy is the single most important attribute in data. Many retailers and solution providers rely on a simplified Accuracy Rate. But with the variety of tracking technologies now available, we need to design a better way to measure it.

Here are the 5 steps to Good Enough Accuracy:

Step 1: Start with the Business Purpose

Accuracy consists of a variety of key performance indicators. For example, door/line counting has different criteria than queue management. InStore Analytics also differs depending on whether the retailer sells groceries, apparel, or electronics, and on whether the store is small or has a big-box layout.

In the context of value, we need to define:

  • Detection: the number of people counted in a particular location (e.g. zone, department, or store)
  • Recognition: tagging the object (e.g. a person or signal) with a set of attributes (Object ID)
  • Tracking: the behavior of people in motion (path analysis)
  • Location Accuracy: geo-location coordinates, which range from zones to pixels
  • Time Accuracy: how accurate the time segments are (e.g. seconds)

Step 2: Identify the Core Technology

Each tracking technology has natural pros and cons, regardless of the solution provider.

Below are topics to think about:

  • Sensors: Video, thermal, and others are sensor-based solutions. The sensors capture the behaviors of all people within the field of view.
  • Devices: Our smartphones are tracking devices. Google and Apple get data on where and how long we stay in a specific location. Device-based solutions provide information on the individual journey.
  • Attributes: Some technologies do better in identifying Buying Groups. Some filter children. Some define gender and age. And there are questions about cost, deployment, and support.

Step 3: Learn the Science of Accuracy

The science of accuracy is based on open standards. We refer to video here, but the standards are relevant to other technologies as well. Tests include 2D vehicle tracking, 2D face tracking, and 3D visual person tracking.

The following standards are for multiple objects tracking:

  • % Recall: The share of actual objects that were detected. Missed detections (under-counts) lower recall.
  • % Precision: The share of detections that are relevant. False detections (over-counts) lower precision.
  • Data Consistency: The trend in detection errors.
  • Threshold for Detection: The trade-off between recall and precision in counting objects.
  • #ID Switching: In object tracking (over a series of frames), the number of mismatched IDs
  • % Precision Rate: How well the tracking technology captured the locations of multiple objects.
  • % Tracking Accuracy: Total errors in tracking. It includes misses, false positives, mismatched IDs, failure to recover tracks, etc.
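
The metrics above can be sketched in a few lines of Python. This is only an illustration: the formulas follow the common definitions of recall and precision, and the tracking score is modeled on the multiple-object tracking accuracy (MOTA) of the CLEAR MOT metrics, which is an assumption about the standard the list refers to. The function names and audit numbers are hypothetical.

```python
def recall(true_positives: int, misses: int) -> float:
    """Share of real objects that were detected; misses (under-counts) lower it."""
    total = true_positives + misses
    return true_positives / total if total else 0.0


def precision(true_positives: int, false_positives: int) -> float:
    """Share of detections that were real objects; over-counts lower it."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def tracking_accuracy(misses: int, false_positives: int,
                      id_switches: int, ground_truth_objects: int) -> float:
    """MOTA-style score: 1 minus all tracking errors over ground-truth objects.

    In a frame-by-frame evaluation each argument is a total summed over
    every frame in the audited segment.
    """
    errors = misses + false_positives + id_switches
    return 1.0 - errors / ground_truth_objects


# Hypothetical audit: 200 ground-truth objects, 180 detected correctly,
# 20 missed, 10 false detections, 4 mismatched IDs.
tp, fn, fp, idsw = 180, 20, 10, 4
print(f"recall            = {recall(tp, fn):.1%}")
print(f"precision         = {precision(tp, fp):.1%}")
print(f"tracking accuracy = {tracking_accuracy(fn, fp, idsw, tp + fn):.1%}")
```

Note that the tracking score can be far lower than either detection metric, because every type of error counts against it.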

Step 4: Define the Audit Parameters

Not all audits are the same. Avoid real-time (manual) audits and test for accuracy in a scientific way. Repeat the test to validate results against the Ground Truth (manual sample).

The levels of difficulty define the audit:

  • Evaluation Zone: In sensor solutions, accuracy can be tested in the full frame or a limited zone. In device-based scenarios, we determine the size of the Evaluation Zone.
  • Audit Granularity: The length of the benchmark Ground Truth. In detection we aim for shorter time segments (e.g. 15 minutes for door counts or 5 minutes for queues). In tracking objects, we want longer time segments. The recommendation is 1 hour.
  • Traffic Volume: A valid test requires sufficient traffic volumes. A low-traffic audit (less than 100 people) should be repeated in different time periods.
  • Feeds Frequency: In video, the Frame Rate (e.g. 25 frames per second) has significant impact on tracking. In wireless solutions, we face the same challenges with Signal Frequency.
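
One way to see why Audit Granularity and Traffic Volume matter is to compare the system counts against the Ground Truth per time segment rather than in total. The sketch below assumes four 15-minute door-count segments. The numbers, the function names, and the error formula (absolute error relative to the Ground Truth) are illustrative choices, not a prescribed audit method.

```python
def segment_accuracy(system_count: int, truth_count: int) -> float:
    """Accuracy for one segment: 1 minus absolute error relative to Ground Truth."""
    if truth_count == 0:
        return 1.0 if system_count == 0 else 0.0
    return max(0.0, 1.0 - abs(system_count - truth_count) / truth_count)


def audit(system_counts, truth_counts, min_traffic=100):
    """Compare counts per segment and in aggregate.

    min_traffic reflects the 100-person guideline for a low-traffic audit.
    """
    per_segment = [segment_accuracy(s, t)
                   for s, t in zip(system_counts, truth_counts)]
    return {
        "aggregate_accuracy": segment_accuracy(sum(system_counts),
                                               sum(truth_counts)),
        "mean_segment_accuracy": sum(per_segment) / len(per_segment),
        "worst_segment_accuracy": min(per_segment),
        "enough_traffic": sum(truth_counts) >= min_traffic,
    }


# Hypothetical door counts for four 15-minute segments.
truth = [40, 50, 30, 40]    # manual Ground Truth
system = [44, 45, 33, 38]   # counts reported by the tracking solution

for name, value in audit(system, truth).items():
    print(name, value)
```

In this made-up sample the over-counts and under-counts cancel out, so the aggregate figure looks perfect while every individual segment is off by 5 to 10 percent. That is the weakness of a single simplified Accuracy Rate.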

Step 5: Know the Tracking Solution

Tracking solutions contain technology, infrastructure, and user interfaces. There are many attributes that define the value of a solution.

Here are some questions to ask:

  • Hardware vs. Software: Many solution providers are value-added resellers. They include various technologies, analytics, and services. Learn the components of the solution.
  • Empirical vs. Statistical: Empirical data is the data captured by the sensor or device. Often the raw data is modified to a trend. Find out where the solution incorporates predictive analytics.
  • Can we check? Good solution providers will give you the audit and monitoring tools. After all, best practices in auditing allow for repeated validation.
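
To make the difference between empirical and statistical data concrete, here is a minimal sketch of raw counts being turned into a trend. A trailing moving average is used purely as an assumed example of smoothing; actual solutions may apply other methods, and the counts are made up.

```python
def moving_average(counts, window=3):
    """Trailing moving average; early values use the samples available so far."""
    smoothed = []
    for i in range(len(counts)):
        start = max(0, i - window + 1)
        chunk = counts[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


raw = [30, 90, 30, 60]       # hypothetical raw counts per time segment
trend = moving_average(raw)  # what a dashboard might report instead

for r, t in zip(raw, trend):
    print(f"raw={r:3d}  trend={t:5.1f}")
```

An audit against the Ground Truth should use the raw values. In this sample the peak of 90 shows up as only 60 in the trend.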

Bringing It All Together

Data Quality is a function of accuracy in context. Hence we focus on value, technology, metrics, audits, and providers.

Ronny Max

Ronny Max is the author of Behavior Analytics in Retail and the founder of Silicon Waves, a consultancy specializing in training and educating retailers and solution providers on the business benefits of behavior analytics.