CrowdSight 3.7.0
The Face Analysis Toolkit

Introduction

CrowdSight SDK is a cross-platform software library for automated semantic analysis of people in video and images. It can analyze faces in real time using a simple webcam and communicate the resulting information to a third-party application.

Currently, CrowdSight SDK can estimate the eye locations, head pose (yaw and pitch), mood, age, gender, clothing colors, six general facial expressions, attention span and ethnicity of the subjects.

In addition to this, CrowdSight SDK can be used for face tracking and recognition under a wide range of imaging conditions. Face models can be created, updated, stored and (un)loaded. The SDK is based on state-of-the-art technologies developed in collaboration with the University of Amsterdam.

The SDK can be easily integrated into third-party software as a collection of C++ libraries for Windows, Mac and Linux platforms.

Usage

CrowdSight SDK operates on frame sequences. Each frame is processed, and the information about the detected subjects can then be retrieved through getter functions. The pseudo-code below shows a simple example of how to retrieve and display information for each subject:

// Initialize CrowdSight and the capturing device/video input.
// draw() and drawText() stand for the application's own rendering helpers.
CrowdSight crowdsight(data_dir);
crowdsight.authenticate(licence_key); // authentication with our server is necessary
for (;;)
{
    capture >> frame; // get a new frame from the video stream
    crowdsight.process(frame);
    crowdsight.getCurrentPeople(people); // extract current people in the frame
    for (size_t i = 0; i < people.size(); i++)
    {
        // Fresh streams per person, so text from earlier faces does not accumulate
        std::ostringstream idString, genderString, ageString, moodString;
        cv::Rect face = people[i].getFaceRect();
        draw(face);
        idString << "ID #" << people[i].getID();
        genderString << "Gender: " << ((people[i].getGender()) < 0 ? "male" : "female");
        ageString << "Age: " << people[i].getAge();
        moodString << "Mood: " << people[i].getMood();
        drawText(frame, face, idString);
        drawText(frame, face, genderString);
        drawText(frame, face, ageString);
        drawText(frame, face, moodString);
    }
    std::ostringstream counterString;
    counterString << "People Counter: " << crowdsight.getPeopleCount();
    // No face rectangle is in scope here; pass a position of your choice
    drawText(frame, cv::Rect(0, 0, 0, 0), counterString);
}

For the full code, refer to the example implementation provided with the SDK. Additional usage examples can be provided upon request.

Scenario

CrowdSight SDK is designed for a multi-person scenario. It is recommended that the face detections are at least 50 x 50 pixels in order to be processed. Camera zoom may be used to adhere to this requirement.

Settings

CrowdSight SDK offers flexibility by granting access to some internal parameter settings through the settings.ini file. This file will be generated in the CrowdSight data directory:

[ClothingStyle]
ClothingHistogramBins = 15
NumClothingColors = 3
[Detectors]
UseAge = 1
UseClothColors = 1
UseEmotions = 1
UseEthnicity = 1
UseGender = 1
UseHeadPose = 1
UseMood = 1
[FaceDetector]
AcceptanceThreshold = 0.5
AdaptiveThreshold = 0
DetectorType = 0
MinFaceSize = 50
MaxFaceSize = 0
MaxNumPeople = 15
UseFaceTracking = 0
UseFastDetection = 1
UseProfileFaceDetection = 0
[FaceRecognition]
EvidenceLevel = 3
MemorySize = 100
RecognitionLevel = 0.35
[General]
AsianMode = 0
NightMode = 0
Smoothing = 3
ReturningCustomer = 10
[ROI]
MarginBottom = 0
MarginLeft = 0
MarginRight = 0
MarginTop = 0
UseROI = 0

The settings are discussed below. NOTE: no parameter may be set to a negative value.
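
CrowdSight reads settings.ini itself. An application that wants to inspect or sanity-check the file before starting the SDK could do so along the lines of the sketch below. This is a minimal illustration only: IniFile, parseIni and hasNegativeValue are hypothetical names that are not part of the CrowdSight API, and the parser handles just the "[Section]" and "Key = Value" lines shown above.

```cpp
#include <cstdlib>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper types; not part of the CrowdSight API.
using IniSection = std::map<std::string, std::string>;
using IniFile = std::map<std::string, IniSection>;

// Strip leading and trailing whitespace (including a trailing '\r').
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse "[Section]" headers and "Key = Value" lines into nested maps.
IniFile parseIni(std::istream& in)
{
    IniFile ini;
    std::string line, section;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty()) continue;
        if (line.front() == '[' && line.back() == ']') {
            section = line.substr(1, line.size() - 2);
            continue;
        }
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue; // ignore malformed lines
        ini[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return ini;
}

// Return true if any value parses as a negative number.
bool hasNegativeValue(const IniFile& ini)
{
    for (const auto& section : ini)
        for (const auto& entry : section.second)
            if (std::strtod(entry.second.c_str(), nullptr) < 0.0)
                return true;
    return false;
}
```

A caller would open the file with std::ifstream, pass the stream to parseIni, and refuse to continue if hasNegativeValue returns true.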

[ClothingStyle] These flags allow the developer to modify the behavior of the ClothingStyle detection feature:

  • ClothingHistogramBins: Increasing this parameter makes it possible to distinguish between more colors at the cost of reduced processing speed. Distinguishing between more colors is not necessarily better, since slight variations in color are often produced by shadows in clothing folds. The exact number of colors is this parameter raised to the power of 3 (since the RGB color space has 3 channels). The default setting is 15, which equals 3375 colors (15^3).
  • NumClothingColors: ClothingStyle detection returns the 3 most dominant colors by default. This parameter can be used to increase the number of colors returned.
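
The relation between the number of bins and the number of colors can be sketched as follows. The code assumes uniform binning of 8-bit RGB channels, which is an assumption made for illustration; the binning scheme CrowdSight uses internally is not documented here. The function names colorCount, channelBin and colorBin are hypothetical.

```cpp
#include <cstdint>

// Number of distinguishable colors for a given number of bins per channel:
// one factor for each of the three RGB channels.
long colorCount(int binsPerChannel)
{
    return static_cast<long>(binsPerChannel) * binsPerChannel * binsPerChannel;
}

// Map an 8-bit channel value (0..255) to a bin index (0..bins-1),
// assuming bins of equal width.
int channelBin(std::uint8_t value, int bins)
{
    return static_cast<int>(value) * bins / 256;
}

// Combine the three channel bins into a single histogram index
// in the range 0..colorCount(bins)-1.
int colorBin(std::uint8_t r, std::uint8_t g, std::uint8_t b, int bins)
{
    return (channelBin(r, bins) * bins + channelBin(g, bins)) * bins
           + channelBin(b, bins);
}
```

With the default of 15 bins, colorCount(15) gives 3375, matching the figure above. Nearby shades fall into the same bin, which is why a lower bin count is less sensitive to shadows in clothing folds.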

[Detectors] These flags allow the developer to disable any of the listed features of CrowdSight. Disabling features that are not required can improve processing speed. It also makes it possible to remove the unused classifiers from the CrowdSight data folder, reducing the package size. By default all detectors are enabled (set to 1). As an exception, the emotion detector depends on the head pose detector and requires both the head pose and emotion resource files to be present in the CrowdSight data folder.

  • UseAge [0, 1]: Turn Age detection on or off.
  • UseClothColors [0, 1]: Turn ClothingStyle detection on or off.
  • UseEmotions [0, 1]: Turn Emotion detection on or off. Requires UseHeadPose to be set to 1.
  • UseEthnicity [0, 1]: Turn Ethnicity detection on or off.
  • UseGender [0, 1]: Turn Gender detection on or off.
  • UseHeadPose [0, 1]: Turn HeadPose detection on or off.
  • UseMood [0, 1]: Turn Mood detection on or off.
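
The dependency between UseEmotions and UseHeadPose can be checked before the settings are written. The sketch below is an illustration only; DetectorFlags and detectorFlagsValid are hypothetical names and do not belong to the CrowdSight API.

```cpp
// Hypothetical mirror of the [Detectors] section; all flags default to
// enabled, as in the generated settings.ini.
struct DetectorFlags {
    bool useAge = true;
    bool useClothColors = true;
    bool useEmotions = true;
    bool useEthnicity = true;
    bool useGender = true;
    bool useHeadPose = true;
    bool useMood = true;
};

// Emotion detection requires head pose detection; every other flag
// can be switched independently.
bool detectorFlagsValid(const DetectorFlags& flags)
{
    return !flags.useEmotions || flags.useHeadPose;
}
```

A configuration tool could call detectorFlagsValid and either reject the combination or switch UseEmotions off together with UseHeadPose.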

[FaceDetector] These flags allow the developer to modify the behavior of the FaceDetection component. This control is mainly needed to deal with false detections, because face detection can be heavily influenced by cluttered environments or illumination changes. The following parameters can be set for performance optimization:

  • AcceptanceThreshold [0.0 - 1.0] : This threshold is used to filter face detections in order to remove detections that are not actual faces. Decreasing this parameter towards 0.0 increases the effectiveness of the filter at the cost of an occasional actual face being discarded as being false. Increasing this parameter towards 1.0 allows for less strict filtering resulting in more faces being detected at the cost of an occasional false detection. The default value for this parameter is 0.5.
  • AdaptiveThreshold [0, 1] : When enabled, the face detection filter will automatically learn an AcceptanceThreshold from the examples passed to the CrowdSight process function. When disabled, the AcceptanceThreshold in settings.ini will be used. The default value for this parameter is 0.
  • DetectorType [0 - 5] : Determines which face detector is used. The face detector resources can be found in the data/haarcascades folder. The following can be selected: 0 - haarcascade_frontalface_default, 1 - haarcascade_frontalface_alt, 2 - haarcascade_frontalface_alt2, 3 - haarcascade_frontalface_alt_tree, 4 - lbpcascade_frontalface, 5 - hog_facedetector. The lbpcascade_frontalface detector is the fastest available, with the drawback of many false face detections, and should generally be used on mobile devices. The haarcascades are slight variations of a relatively fast face detector with a moderate number of false face detections. The hog face detector is an accurate but very slow detector that also works well for angled faces; it should only be used on a system with a lot of processing power.
  • MinFaceSize [20 - X] : Restricts the minimum detected face size in order to increase processing speed. The setting defines the minimum required width/height of a face in pixels. The absolute minimum size is 20 pixels; however, we advise a setting of at least 50 pixels for more robust estimation of other features such as age and gender. The optimal face width and height is 120 pixels. MinFaceSize must be smaller than the MaxFaceSize parameter, unless MaxFaceSize is 0. The default value for this parameter is 50.
  • MaxFaceSize [0 - X] : Restricts the maximum detected face size in order to increase processing speed. The setting defines the maximum allowed width/height of a face in pixels. 0 is a special case meaning arbitrary size. MaxFaceSize must be larger than MinFaceSize or equal to 0. The default value for this parameter is 0.
  • MaxNumPeople [1 - X]: The face detections are sorted by face size in descending order. This parameter determines how many of the largest face detections will be processed and returned. All other face detections will be discarded. Increasing this parameter makes it possible to detect more people simultaneously at the cost of decreased processing speed. The default setting for this parameter is 15.
  • UseFaceTracking [0, 1] : Used for processing video sequences, such as webcam input or video files. This setting enables a face tracker that reduces flickering of face detections throughout the video sequence. This setting should be disabled when processing single images or if the developer wants to increase the processing speed. The default value for this parameter is 0.
  • UseFastDetection [0, 1] : When enabled, this setting can provide a speed improvement at the cost of occasionally missing a face. The generated settings.ini shown above sets this parameter to 1.
  • UseProfileFaceDetection [0, 1] : This setting is deprecated and should be set to 0. If you require profile face detection, please use DetectorType 5. The default value for this parameter is 0.
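
The constraints on MinFaceSize and MaxFaceSize, and the selection rule described for MaxNumPeople, can be illustrated with the sketch below. It is not the SDK's implementation. FaceBox, sizeLimitsValid and keepLargest are hypothetical names, and ranking faces by area (width times height) is an assumption about what "face size" means.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical face rectangle in pixels; not a CrowdSight type.
struct FaceBox {
    int x;
    int y;
    int width;
    int height;
};

// MinFaceSize must be at least 20, and MaxFaceSize must be either 0
// (arbitrary size) or strictly larger than MinFaceSize.
bool sizeLimitsValid(int minFaceSize, int maxFaceSize)
{
    if (minFaceSize < 20) return false;
    return maxFaceSize == 0 || maxFaceSize > minFaceSize;
}

// Sort detections by area in descending order and keep at most
// maxNumPeople of them; the rest are discarded.
std::vector<FaceBox> keepLargest(std::vector<FaceBox> faces,
                                 std::size_t maxNumPeople)
{
    std::stable_sort(faces.begin(), faces.end(),
                     [](const FaceBox& a, const FaceBox& b) {
                         return a.width * a.height > b.width * b.height;
                     });
    if (faces.size() > maxNumPeople) faces.resize(maxNumPeople);
    return faces;
}
```

For example, with MaxNumPeople set to 2 and detections of 60, 120 and 30 pixels wide, only the 120-pixel and 60-pixel faces would be kept, in that order.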

[FaceRecognition] A light-weight algorithm for the task of recognizing and tracking people throughout a sequence of frames.

  • EvidenceLevel [0 - X]: The number of consecutive detections needed to assign an ID to an observation. When EvidenceLevel is X, X consecutive detections are needed. When EvidenceLevel is set to 0, every observation is assigned an ID instantly; no extra consecutive observations are needed. The default EvidenceLevel is 3.
  • MemorySize [1 - X]: the maximum duration (in number of frames) within which a person may be recognized. The default is set to 100. Depending on system specifications, this can be increased significantly.
  • RecognitionLevel [0.0 - 1.0]: a threshold that expresses the minimum required confidence for a match between a new observation and an existing model of a person (either in memory or from a loaded model). It is set to 0.35 by default. Increasing the parameter value leads to stricter recognition.
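
The effect of EvidenceLevel can be pictured as a counter of consecutive detections. The sketch below is an illustration under that reading, not the SDK's recognition algorithm; EvidenceCounter is a hypothetical class.

```cpp
// Hypothetical per-observation counter; not part of the CrowdSight API.
class EvidenceCounter {
public:
    explicit EvidenceCounter(int evidenceLevel) : level_(evidenceLevel) {}

    // Feed the outcome of one frame. Returns true once the observation
    // has been detected in enough consecutive frames to receive an ID.
    // A missed frame resets the count.
    bool update(bool detected)
    {
        consecutive_ = detected ? consecutive_ + 1 : 0;
        return detected && consecutive_ >= level_;
    }

private:
    int level_;
    int consecutive_ = 0;
};
```

With EvidenceLevel 3, update returns true on the third consecutive detection. With EvidenceLevel 0, the first detection already qualifies.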

[General] General settings for the SDK.

  • Smoothing [1 - X]: Smooths the results of the various detectors over a number of frames. Set it to 1 for no smoothing. The generated settings.ini shown above sets it to 3.
  • ReturningCustomer [0 - X]: The timespan, expressed in seconds, during which a person can reside outside the field of view before being recognized as a returning customer.
  • AsianMode [0, 1]: can be used for localisation in Asian countries to improve accuracies of the age/gender classifiers.
  • NightMode [0, 1]: allows the CrowdSight library to be used with infra-red cameras.
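
To give an idea of what smoothing over a number of frames means, the sketch below averages the most recent values of a detector output. A plain moving average is an assumption chosen for illustration; the smoothing method CrowdSight applies internally is not specified here, and FrameSmoother is a hypothetical class.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical moving-average smoother; not part of the CrowdSight API.
class FrameSmoother {
public:
    // window corresponds to the Smoothing setting; values below 1
    // are treated as 1 (no smoothing).
    explicit FrameSmoother(std::size_t window)
        : window_(window < 1 ? 1 : window) {}

    // Add the value measured in the current frame and return the
    // average over the last `window` frames seen so far.
    double add(double value)
    {
        values_.push_back(value);
        sum_ += value;
        if (values_.size() > window_) {
            sum_ -= values_.front();
            values_.pop_front();
        }
        return sum_ / static_cast<double>(values_.size());
    }

private:
    std::size_t window_;
    std::deque<double> values_;
    double sum_ = 0.0;
};
```

A window of 1 returns each value unchanged. Larger windows give steadier output, such as a less jumpy age estimate, at the cost of reacting more slowly to real changes.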

[ROI] A Region Of Interest can be specified in terms of offsets from the top, bottom, left and right of the frame. CrowdSight analysis will be applied only to this specified region. As a result, the computational needs are lower.

  • UseROI [0, 1]: (de)activates ROI usage.
  • MarginBottom [0 - X] : Y coordinate of the bottom side of the ROI
  • MarginLeft [0 - X] : X coordinate of the left side of the ROI
  • MarginRight [0 - X] : X coordinate of the right side of the ROI
  • MarginTop [0 - X] : Y coordinate of the top side of the ROI
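
The region that results from the four margins can be sketched as below. The code assumes that each margin is a pixel offset measured inwards from the corresponding frame edge, as described in the introduction to this section; Roi and roiFromMargins are hypothetical names, not CrowdSight API.

```cpp
// Hypothetical rectangle type in pixels; not part of the CrowdSight API.
struct Roi {
    int x;
    int y;
    int width;
    int height;
};

// Build the region of interest from margins measured inwards from each
// frame edge. Margins that overlap yield an empty (zero-sized) region.
Roi roiFromMargins(int frameWidth, int frameHeight,
                   int marginLeft, int marginTop,
                   int marginRight, int marginBottom)
{
    Roi roi{marginLeft, marginTop,
            frameWidth - marginLeft - marginRight,
            frameHeight - marginTop - marginBottom};
    if (roi.width < 0) roi.width = 0;
    if (roi.height < 0) roi.height = 0;
    return roi;
}
```

For a 640 x 480 frame with margins of 100 (left), 50 (top), 40 (right) and 30 (bottom), the region starts at (100, 50) and measures 500 x 400 pixels.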

Installation instructions

Windows instructions

Double-click the .exe installer and follow the instructions.

OSX and Linux

Open a shell and input the following commands:

$ chmod +x CrowdSight-SDK-<x>.<y>.<z>-<os>.bin
$ sudo ./CrowdSight-SDK-<x>.<y>.<z>-<os>.bin

where x is the major version number, y is the minor version number, z is the patch version number and os is the target operating system (OSX or Linux).

Minimum requirements

  • Intel Core 2 Duo 1.6 GHz or better
  • 2 GB RAM
  • Input frames with a resolution of at least 640 x 480
  • Active Internet connection