Multi-task face detection

Detecting faces in real crowd scenarios is no longer a major challenge in current research. Many frameworks exist that accurately localize faces in real time using efficient object detection methods. To enrich this information, one takes each extracted face and applies classifiers to obtain estimates of the person's age, gender and other personal attributes. This two-stage process, however, is expensive, because every detected face requires additional classifier passes on top of the detector.

Ideally, we would reuse the features computed by the face detection model to jointly learn classifiers for age, gender and other facial attributes. This research aims to incorporate these elements into a single real-time detection method. Challenges include bridging the gap between datasets with mutually exclusive task annotations, tackling domain shift between those datasets, and optimizing a single CNN for features of different sizes.
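To make the first challenge concrete, here is a minimal sketch of one common way to train on datasets with mutually exclusive annotations: compute each task's loss only over the samples that actually carry a label for that task. It is an illustration under stated assumptions, not the method this research will settle on. The function names, the task names "age" and "gender", and the use of None to mark a missing label are all hypothetical choices made for the example.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one sample: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def masked_multitask_loss(predictions, labels, task_weights=None):
    """Combine per-task losses, skipping samples that lack a label.

    predictions: list of dicts, task name -> list of logits
    labels:      list of dicts, task name -> class index, or None when
                 the sample's source dataset does not annotate that task
    task_weights: optional dict, task name -> weight (default 1.0)

    Returns (total_loss, per_task_loss).
    """
    totals, counts = {}, {}
    for pred, lab in zip(predictions, labels):
        for task, target in lab.items():
            if target is None:
                continue  # no annotation: this sample adds nothing to this task
            loss = softmax_cross_entropy(pred[task], target)
            totals[task] = totals.get(task, 0.0) + loss
            counts[task] = counts.get(task, 0) + 1

    # Average each task over its own labeled samples only.
    per_task = {task: totals[task] / counts[task] for task in totals}
    weights = task_weights or {}
    total = sum(weights.get(task, 1.0) * value for task, value in per_task.items())
    return total, per_task


# Hypothetical mixed batch: one face from an age-only dataset and one
# from a gender-only dataset, both scored by the same shared network.
predictions = [
    {"age": [0.0, 0.0, 0.0], "gender": [0.0, 0.0]},
    {"age": [0.0, 0.0, 0.0], "gender": [0.0, 0.0]},
]
labels = [
    {"age": 1, "gender": None},
    {"age": None, "gender": 0},
]
total, per_task = masked_multitask_loss(predictions, labels)
```

With uniform logits, the age loss equals log 3 and the gender loss equals log 2, each computed from the single sample that has the relevant label. Averaging per task rather than per batch keeps a task with few labeled samples in a batch from being diluted by the unlabeled ones; the per-task weights are a separate knob and are an assumption of this sketch.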

Research questions:

  • How does one design an architecture to jointly perform detection and classification tasks in real time?
  • How does one combine task-disjoint datasets when optimizing a multi-task architecture?