In the context of facial recognition technology, "detection" refers to the process of finding faces and other objects in video for subsequent processing. "Recognition" refers to the process of creating biometric signatures and matching faces. SAFR separates the detection and recognition processes into two distinct services. Although SAFR can run both processes on a single machine, it's helpful to discuss them separately in terms of their effect on load. This article describes the detection process, which occurs in VIRGO video feeds. The recognition process is discussed separately in Auto scaling and manual scaling of recognition loads in SAFR Server.

Video Recognition Gateway (VIRGO) is designed to dynamically scale loads by varying the number of frames per second that are processed. This is done automatically by SAFR across all feeds, but you can also manually control the scaling by adjusting frame rates, whether via video feed configuration settings or via input sources.

Factors that Affect Detection Loads

To understand how this works, it's best to review the factors that affect detection frequency, as described below.

  • Input video frame rate - This is the single most important factor affecting loads on CPUs and GPUs. Doubling the frame rate will also double the load on the client.
  • Input video frame size - Frame size primarily affects memory usage, although it also has a marginal effect on CPU usage.
  • Number of faces - The more faces that are being processed, the more load there is on both CPUs and GPUs.
  • The selected SAFR face detection service - You can select either the High Sensitivity service (best) or the Standard service on the Detection Preferences menu of the SAFR Desktop Client. 
    • The Standard service increases its load in proportion to the number of faces in view of the camera.
    • The High Sensitivity service retains approximately the same load, regardless of the number of faces in view of the camera.
  • Person detection - Enabling person detection has a significant impact on client GPU loads and a smaller (but still significant) impact on client CPU loads. The following person detection settings, all configurable in the Detection Preferences menu of the SAFR Desktop Client, affect load:
    • Detect persons every - Reduces the frame rate. Useful when the camera is sending video at more than 10 FPS.
    • Detection service - Applies different models for maximum accuracy, maximum speed, or a balanced approach.
    • Input Size - Resizes the input video. Reducing the size decreases accuracy but increases speed.
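As a rough illustration of how the factors above combine, the following Python sketch models relative detection load. It is not a SAFR API: the function name, the service labels, and the 15 FPS baseline are assumptions made for this example. The model encodes only the relationships stated above (load proportional to frame rate, Standard service load proportional to face count, High Sensitivity load roughly constant).

```python
def relative_detection_load(fps, faces, service, baseline_fps=15.0):
    """Return a unitless load estimate relative to one face at baseline_fps.

    Illustrative only: values are comparable within one service,
    not between the two services.
    """
    if fps < 0 or faces < 0:
        raise ValueError("fps and faces must be non-negative")
    # Load grows in proportion to the input frame rate.
    frame_factor = fps / baseline_fps
    if service == "standard":
        # Standard: load grows in proportion to the faces in view.
        # Assumption: an empty frame is costed like a single face.
        face_factor = max(faces, 1)
    elif service == "high_sensitivity":
        # High Sensitivity: load is roughly independent of face count.
        face_factor = 1
    else:
        raise ValueError(f"unknown service: {service!r}")
    return frame_factor * face_factor


# Doubling the frame rate doubles the estimate for the same scene.
print(relative_detection_load(30, 1, "standard"))
# Four faces cost four times as much under Standard ...
print(relative_detection_load(15, 4, "standard"))
# ... but leave the High Sensitivity estimate unchanged.
print(relative_detection_load(15, 4, "high_sensitivity"))
```

Frame size and person detection are left out of this sketch because the list above does not give a proportional relationship for them.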

Detection Behavior

To better understand the discussion below, it helps to understand the detection process.

  1. The SAFR client scans every frame of video to find faces or other objects.
  2. As soon as a face is detected, the image quality metrics (center pose, sharpness, contrast, and occlusion) are evaluated to determine if face quality meets the recognition threshold.
  3. For each face found that meets the recognition criteria, the client calls the SAFR Server to perform recognition.
  4. If processing of a frame is not complete when the next frame arrives, the incoming frame is discarded.
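Steps 2 and 3 above can be sketched in Python as follows. This is a minimal sketch, not SAFR client code: the class names, the 0 to 1 metric scales, the threshold values, and the `recognize` callback (which stands in for the call to the SAFR Server) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class DetectedFace:
    face_id: int
    center_pose: float  # 0..1, higher means more frontal (hypothetical scale)
    sharpness: float    # 0..1, higher is sharper
    contrast: float     # 0..1, higher is better
    occlusion: float    # 0..1, higher means more of the face is covered


@dataclass
class QualityThresholds:
    # Hypothetical defaults chosen for this example only.
    min_center_pose: float = 0.5
    min_sharpness: float = 0.4
    min_contrast: float = 0.4
    max_occlusion: float = 0.3


def meets_recognition_criteria(face: DetectedFace, t: QualityThresholds) -> bool:
    """Step 2: check every image quality metric against its threshold."""
    return (
        face.center_pose >= t.min_center_pose
        and face.sharpness >= t.min_sharpness
        and face.contrast >= t.min_contrast
        and face.occlusion <= t.max_occlusion
    )


def process_frame(
    faces: Iterable[DetectedFace],
    recognize: Callable[[DetectedFace], object],
    thresholds: Optional[QualityThresholds] = None,
) -> List[object]:
    """Step 3: request recognition only for faces that pass the quality check."""
    t = thresholds or QualityThresholds()
    return [recognize(face) for face in faces if meets_recognition_criteria(face, t)]


faces = [
    DetectedFace(1, center_pose=0.9, sharpness=0.8, contrast=0.7, occlusion=0.1),
    DetectedFace(2, center_pose=0.9, sharpness=0.1, contrast=0.7, occlusion=0.1),
]
# Only face 1 is sharp enough, so only one recognition request is made.
print(process_frame(faces, recognize=lambda f: f"recognize face {f.face_id}"))
```

Filtering on quality before calling the server means low-quality faces add to detection load but not to recognition load.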

Automatic Scaling

Detection load scales naturally by lowering the number of frames processed per second. We label the number of frames processed per second "Detections per second", or DPS. As noted above, SAFR scans the frames for faces and other objects. It also performs other analytics such as face tracking, badge detection, person detection, and direction of travel. If SAFR completes all processing for a frame before the next frame arrives, it processes the next frame as soon as it arrives. If SAFR is still busy when the next frame arrives, the incoming frame is discarded.
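The frame-dropping behavior described above can be simulated with a short Python sketch. It assumes a single detection thread and a fixed processing time per frame, both simplifications made for this example. The function name and parameters are hypothetical and not part of SAFR.

```python
from fractions import Fraction


def simulate_dps(fps: int, processing_ms, seconds: int = 10) -> float:
    """Estimate detections per second for a single-threaded detector.

    A frame is processed only if the detector is idle when it arrives;
    otherwise the incoming frame is discarded.
    """
    if fps <= 0 or seconds <= 0:
        raise ValueError("fps and seconds must be positive")
    # Exact rational arithmetic avoids rounding errors at frame boundaries.
    interval = Fraction(1, fps)
    processing = Fraction(str(processing_ms)) / 1000
    busy_until = Fraction(0)
    processed = 0
    for i in range(fps * seconds):
        arrival = i * interval
        if arrival >= busy_until:
            processed += 1
            busy_until = arrival + processing
        # else: the detector is still busy, so this frame is dropped
    return processed / seconds


# 8 ms per frame at 15 FPS leaves plenty of headroom: no frames are dropped.
print(simulate_dps(fps=15, processing_ms=8))
# 80 ms per frame exceeds the ~66.7 ms frame interval: every other frame is dropped.
print(simulate_dps(fps=15, processing_ms=80))
```

In this model DPS equals FPS whenever the processing time fits within the frame interval, and falls below FPS once it does not.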

To improve performance, SAFR can perform detections on multiple threads. 

You can tell whether SAFR is automatically scaling from the DPS value reported in the Video Feeds window in the SAFR Desktop Client. Below is a screenshot of the Video Feeds window showing both FPS (input frame rate) and DPS (detections per second).

DPS is the number of frames per second that SAFR was able to process. You can see above that most feeds processed all of their frames, while some of the feeds at the bottom of the list processed only 12 of the 15 frames received each second.
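To make the FPS and DPS comparison concrete, the following Python sketch flags feeds whose DPS has fallen below their FPS. The feed names and helper functions are hypothetical. SAFR shows these values in the Video Feeds window, and this sketch assumes you have copied them out by hand.

```python
def processed_fraction(fps: float, dps: float) -> float:
    """Fraction of incoming frames that were actually processed."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return min(dps / fps, 1.0)


def feeds_being_scaled(feeds):
    """Return the names of feeds where DPS is lower than FPS.

    `feeds` maps a feed name to an (fps, dps) pair.
    """
    return [name for name, (fps, dps) in feeds.items() if dps < fps]


# Hypothetical readings: one feed keeps up, one processes 12 of 15 frames.
readings = {"Lobby": (15, 15), "Parking": (15, 12)}
for name in feeds_being_scaled(readings):
    fps, dps = readings[name]
    print(f"{name}: processing {processed_fraction(fps, dps):.0%} of frames")
```

A feed that processes 12 of 15 frames per second is handling 80% of its input, which indicates that automatic scaling is taking place on that feed.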

SAFR also reports the detection time. This can be useful for determining whether there is significant latency as each frame is processed. The detector latency shown in the Camera Analyzer and Video Feeds windows (both found in the SAFR Desktop Client) is listed below.

  • Camera Feed Analyzer - dDt (detection time) is 7 milliseconds.
  • Video Feeds Preview - Detector / Latency is 8 milliseconds.

The display of performance metrics can be enabled at the following locations:

  • In the Camera Analyzer window, enable "Performance Metrics" in the View drop-down menu.
  • In the Video Feeds video preview, right-click in the video panel and choose "Performance Metrics".

Note that in both cases, the detection time is no more than about 8 milliseconds. This represents a system under very low load, so you would not expect any auto scaling to occur.

Nothing needs to be done to enable auto scaling on the SAFR Server; it's a natural behavior of the system.

Manual Scaling

There are a few ways to reduce load on the SAFR detection process, but one of the most impactful is to reduce the frame rate. For example, cutting the frame rate in half reduces the detection load by about half.
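The effect of a lower frame rate can be illustrated with a simple frame decimator in Python. This is a generic sketch of the idea, not how SAFR applies its frame rate settings. In SAFR you would change the frame rate in the video feed configuration or at the input source. The function name and `keep_every` parameter are hypothetical.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def decimate(frames: Iterable[T], keep_every: int) -> Iterator[T]:
    """Yield every `keep_every`-th frame, starting with the first."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    for index, frame in enumerate(frames):
        if index % keep_every == 0:
            yield frame


# One second of 30 FPS video, represented here by frame numbers.
one_second = range(30)
kept = list(decimate(one_second, keep_every=2))
# Half the frames reach the detector, so detection load is roughly halved.
print(len(kept))
```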

Although reducing the frame rate can improve performance, it's important not to reduce it so far that tracking is no longer possible. Loss of tracking reduces both recognition accuracy and efficiency. This is because tracking confidence is increased and server load is decreased when SAFR is able to track a face from frame to frame.