This article explains how to get face overlay data and images for a video feed. This is useful when developing an integration with SAFR to show overlay information in a VMS.
For information on the APIs described in this article, see the following:
https://virga.real.com/docs/index.html#/
https://cvos.real.com/docs/index.html#/
The process to get overlay data is as follows:
1. Start a video feed
2. Get Feed Info
Call the GET /config/worker API and obtain the client-id and feed name, as shown in the marked lines below.
{
  "worker-config": [
    {
      ...
      "feeds": {
        "Front Hall Axis 3255": {
          "mode": "Enrolled and Stranger Monitoring",
-->       "name": "Front Hall Axis 3255",
          ...
        }
      },
-->   "client-id": "Virgo-win-96FA137C"
    }
  ],
  ...
}
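The extraction of client-id and feed name pairs from the /config/worker response can be sketched in Python (stdlib only; the sample payload below is trimmed from the response shown above):

```python
import json

def extract_feeds(worker_config_response):
    """Return (client-id, feed name) pairs from a GET /config/worker response."""
    pairs = []
    for worker in worker_config_response.get("worker-config", []):
        client_id = worker.get("client-id")
        for feed_name in worker.get("feeds", {}):
            pairs.append((client_id, feed_name))
    return pairs

# Trimmed sample response from the article
sample = json.loads("""
{
  "worker-config": [
    {
      "feeds": {"Front Hall Axis 3255": {"name": "Front Hall Axis 3255"}},
      "client-id": "Virgo-win-96FA137C"
    }
  ]
}
""")
print(extract_feeds(sample))  # [('Virgo-win-96FA137C', 'Front Hall Axis 3255')]
```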
3. Start Overlay Feed
Call GET /image_stream/{clientId}/{feedId}
Where
- clientId is "client-id" from above
- feedId is the stream "name" from above. Make sure to URL-escape the value (e.g. "Front%20Hall%20Axis%203255").
- shared=true enables the GET /sharedStream functionality described below.
- maximum-frames - the number of frames to generate before the stream stops. See Renew Overlay Feed Lease below.
The response should look like the following:
{
  "tenant": "stevelocasus",
--> "url": "http://192.168.0.16:8086/sharedStream/video_3e268d01-0372-4b45-83d0-f60a3d8ebea2",
  "genericUrl": "cvos://sharedStream/video_3e268d01-0372-4b45-83d0-f60a3d8ebea2",
--> "trackingMetadataUrl": "http://192.168.0.16:8086/sharedStream/video_3e268d01-0372-4b45-83d0-f60a3d8ebea2_tracking_result",
  "genericTrackingMetadataUrl": "cvos://sharedStream/video_3e268d01-0372-4b45-83d0-f60a3d8ebea2_tracking_result",
--> "stream-id": "3e268d01-0372-4b45-83d0-f60a3d8ebea2"
}
Extract the following values from the response for use below:
- url - Used to get the scene image
- trackingMetadataUrl - Used to get tracking data in JSON format
- stream-id - Used to renew lease (See Renew Overlay Feed Lease)
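This step can be sketched in Python using only the standard library. The maximum-frames value of 900 is an illustrative assumption, and start_overlay_feed requires a live SAFR server; the path-building helper is pure and shows the URL escaping:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def image_stream_path(client_id, feed_name, maximum_frames=900):
    """Build the GET /image_stream request path; the feed name must be URL-escaped."""
    return (f"/image_stream/{client_id}/{quote(feed_name, safe='')}"
            f"?shared=true&maximum-frames={maximum_frames}")

def start_overlay_feed(base_url, client_id, feed_name):
    """Start the overlay feed and return the three values used below.

    Requires a live SAFR server at base_url (hypothetical in this sketch).
    """
    with urlopen(base_url + image_stream_path(client_id, feed_name)) as resp:
        body = json.load(resp)
    return body["url"], body["trackingMetadataUrl"], body["stream-id"]

print(image_stream_path("Virgo-win-96FA137C", "Front Hall Axis 3255"))
# /image_stream/Virgo-win-96FA137C/Front%20Hall%20Axis%203255?shared=true&maximum-frames=900
```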
4. Get Overlay Info (repeatedly)
Call GET /sharedStream/{id}
For example:
GET /sharedStream/video_3e268d01-0372-4b45-83d0-f60a3d8ebea2_tracking_result
The response will look something like the following.
{
  "timestamp": {
    "date": 1649727325054,
    "microseconds": 89484649
  },
  "updated": [
    {
      "allowRecognizerToLearn": null,
      "completedSuccessfulIdentificationAttempt": false,
      "completedSuccessfulRecognitionAttempt": true,
      "consecutiveFailedIdentityVerifications": 0,
      "detectedObject": {
        "centerPoseQuality": 0.44231163359289116,
        "clipRatio": 0.023518120869994164,
        "confidence": 0.9993407130241394,
        "imageContrastQuality": 0.71375000476837158,
        "imageSharpnessQuality": 0.83861190741989233,
        "localId": 0,
        "normalizedBounds": {
          "height": 0.46527779102325439,
          "width": 0.19218750298023224,
          "x": 0.46809372305870056,
          "y": 0.54566460847854614
        },
        "objectType": "Face",
        "pitch": -0.43985384837292452,
        "pixelBounds": {
          "height": 753.75,
          "width": 553.5,
          "x": 806.48992919921875,
          "y": 463.6927490234375
        },
        "roll": 0.025498561752424057,
        "thumbnailBoundsExpansionFactor": 0.5,
        "yaw": 0.52417801133801323
      },
      "identityRecognitionThresholdBoost": 0,
      "identityVerificationComplete": true,
      "isNew": false,
      "isZombie": false,
      "isolated": true,
      "lingeringCount": 0,
      "localId": 16833,
      "objectType": "face",
      "occluded": false,
      "person": {
        "confidence": 0.9996222,
        "faceConfirmed": true,
        "isMasked": false,
        "isOccluded": false,
        "maskConfidence": 0.0047501926,
        "occlusion": 0.06740895,
        "personId": "6804bb9a-f623-4082-87a7-bb578c62f2e4",
        "profilePoseConfidence": 0.3754626,
        "sentiment": -0.99363095,
        "similarityScore": 1.0877689,
        "smile": false,
        "updatableProperties": {
          "age": {
            "lowerBound": 50,
            "upperBound": 50
          },
          "externalId": "",
          "gender": "male",
          "name": "Steve McMillen",
          "personType": ""
        }
      },
      "receivedPositiveFaceConfirmation": true,
      "recognitionCount": 580,
      "state": "Recognizing",
      "timeOfInitialDetection": {
        "date": 1649727262094,
        "microseconds": 26523376
      }
    }
  ]
}
You can use the normalizedBounds (coordinates expressed as fractions of the frame size) or pixelBounds to determine the location of faces on the video frame.
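Scaling normalizedBounds to a known frame size can be sketched as below. This assumes x and y scale linearly with frame width and height; verify the result against the pixelBounds reported by your own feed, since the exact coordinate convention is not spelled out here:

```python
def normalized_to_pixels(bounds, frame_width, frame_height):
    """Scale a normalizedBounds dict (fractions of frame size) to pixels.

    Assumption: x/y scale with frame width/height the same way width/height do.
    """
    return {
        "x": bounds["x"] * frame_width,
        "y": bounds["y"] * frame_height,
        "width": bounds["width"] * frame_width,
        "height": bounds["height"] * frame_height,
    }

# Illustrative values, not taken from the sample response above
print(normalized_to_pixels(
    {"x": 0.5, "y": 0.5, "width": 0.25, "height": 0.25}, 1920, 1080))
# {'x': 960.0, 'y': 540.0, 'width': 480.0, 'height': 270.0}
```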
For performance reasons, on each subsequent call you should extract the date field under timestamp from the previous response and pass it as the value of the since parameter, as follows.
GET /sharedStream/{id}?since=<date value>
GET /sharedStream/video_3e268d01-0372-4b45-83d0-f60a3d8ebea2_tracking_result?since=1649727425085
Each subsequent call to GET /sharedStream should use the date value from the previous call for the since parameter.
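The polling loop above can be sketched in Python. The URL-building helper is pure; poll_forever requires a live SAFR server and is shown only to illustrate how since is threaded between requests:

```python
import json
from urllib.request import urlopen

def next_poll_url(tracking_url, since=None):
    """First poll has no `since`; later polls pass the previous timestamp.date."""
    return tracking_url if since is None else f"{tracking_url}?since={since}"

def poll_forever(tracking_url):
    """Generator yielding successive tracking payloads (requires a live server)."""
    since = None
    while True:
        with urlopen(next_poll_url(tracking_url, since)) as resp:
            payload = json.load(resp)
        since = payload["timestamp"]["date"]  # reuse on the next request
        yield payload

base = ("http://192.168.0.16:8086/sharedStream/"
        "video_3e268d01-0372-4b45-83d0-f60a3d8ebea2_tracking_result")
print(next_poll_url(base, 1649727425085))
```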
Renew Overlay Feed Lease
When calling GET /image_stream/{clientId}/{feedId}, you pass a maximum-frames parameter. This causes the video feed to generate that number of frames; after that, the feed stops sending overlay data.
To ensure overlay data continues uninterrupted, make another call to PUT /image_stream/{clientId}/{feedId}/{captureStreamId} to renew the lease. The captureStreamId is the stream-id obtained from the GET /image_stream response. This extends the stream by maximum-frames frames from that point. See PUT /image_stream/{clientId}/{feedId}/{captureStreamId} for information on when to renew the lease.
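Building and issuing the renewal request can be sketched as follows (stdlib only; renew_lease requires a live SAFR server and is hypothetical beyond the path format documented above):

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

def renew_path(client_id, feed_name, stream_id):
    """Build the PUT /image_stream/{clientId}/{feedId}/{captureStreamId} path."""
    return f"/image_stream/{client_id}/{quote(feed_name, safe='')}/{stream_id}"

def renew_lease(base_url, client_id, feed_name, stream_id):
    """Issue the lease-renewal PUT (requires a live SAFR server at base_url)."""
    req = Request(base_url + renew_path(client_id, feed_name, stream_id),
                  method="PUT")
    with urlopen(req) as resp:
        return resp.status

print(renew_path("Virgo-win-96FA137C", "Front Hall Axis 3255",
                 "3e268d01-0372-4b45-83d0-f60a3d8ebea2"))
```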
The VIRGA GET /status API may help determine when to renew the shared stream lease. In particular, the fps value combined with maximum-frames can be used to compute how long after the last lease renewal the shared stream will expire. The following shows the relevant portion of the GET /status response.
{
  "worker-status": [
    ...
    {
      "status": {
        ...
        "feeds": {
          "Front Hall Axis 3255": {
            "status": "OK",
            "statistics": {
              ...
              "fps": 30.0
            },
            ...
            "capturing": true
          },
          ...
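The expiry calculation described above reduces to dividing maximum-frames by fps. A minimal sketch (the 900-frame value and the safety margin are illustrative assumptions):

```python
def seconds_until_expiry(maximum_frames, fps):
    """Approximate seconds the stream lasts after a (re)lease at the given fps."""
    return maximum_frames / fps

# e.g. a 900-frame lease at 30 fps lasts about 30 seconds,
# so renew comfortably before that elapses
print(seconds_until_expiry(900, 30.0))  # 30.0
```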
Tracked Object Types
The GET /sharedStream API returns several different object types. These are not fully documented on the Swagger page, so the information below describes the response in more detail.
The API returns two top-level objects:
- timestamp - indicates the time of the last update; its date value supports long polling via the since parameter in subsequent GET /sharedStream calls.
- TRACKED_OBJECT
Where TRACKED_OBJECT can be one or more of the following:
- appeared - list of one or more faces that appeared for the first time in this frame
- updated - list of one or more faces that appeared in the previous frame and were updated in this frame
- lingering - list of one or more faces that appeared in the previous frame and are no longer found in this frame, but are still within the tracking delay. Some faces exit the video frame and go directly to disappeared without entering this state.
- disappeared - list of one or more faces that appeared in the previous frame and are no longer present in this frame
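A consumer typically walks all four lists in each payload. A minimal sketch (the sample payload is a trimmed, hypothetical stand-in for a real response):

```python
def iter_tracked(payload):
    """Yield (state, object) for every tracked face in a sharedStream payload.

    Any of the four state keys may be absent from a given response.
    """
    for state in ("appeared", "updated", "lingering", "disappeared"):
        for obj in payload.get(state, []):
            yield state, obj

# Trimmed, illustrative payload
sample = {"timestamp": {"date": 1649727325054}, "updated": [{"localId": 16833}]}
print(list(iter_tracked(sample)))  # [('updated', {'localId': 16833})]
```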