Artificial emotional intelligence or Emotion AI is also known as emotion recognition or emotion detection technology.
Humans use a lot of non-verbal cues, such as facial expressions, gesture, body language and tone of voice, to communicate their emotions. Our vision is to develop Emotion AI that can detect emotion just the way humans do, from multiple channels. Our long term goal is to develop “Multimodal Emotion AI”, that combines analysis of both face and speech as complementary signals to provide richer insight into the human expression of emotion. For several years now, Affectiva has been offering industry leading technology for the analysis of facial expressions of emotions. Most recently, Affectiva has added speech capabilities now available to select beta testers (learn more here).
Emotion detection – Face
Our Emotion AI unobtrusively measures unfiltered and unbiased facial expressions of emotion, using any optical sensor or just a standard webcam. Our technology first identifies a human face in real time or in an image or video. Computer vision algorithms identify key landmarks on the face – for example, the corners of your eyebrows, the tip of your nose, the corners of your mouth. Deep learning algorithms then analyze pixels in those regions to classify facial expressions. Combinations of these facial expressions are then mapped to emotions.
In our products, we measure 7 emotion metrics: anger, contempt, disgust, fear, joy, sadness and surprise. In addition, we provide 20 facial expression metrics. In our SDK and API we also provide emojis, gender, age, ethnicity and a number of other metrics. Learn more about our metrics here.
Emotion detection – Speech
Our speech capability analyzes not what is said, but how it is said, observing changes in speech paralinguistics, tone, loudness, tempo, and voice quality to distinguish speech events, emotions, and gender. The underlying low latency approach is key to enabling the development of real-time emotion-aware apps and devices.
Our first speech based product is a cloud-based API that analyzes a pre-recorded audio segment, such as an MP3 file. The output file provides the analysis on speech events occurring in the audio segment every few hundred milliseconds and not just at the end of the entire utterance. An Emotion SDK that analyzes speech in real-time will be available in the near future.
Data and accuracy
Our algorithms are trained using our emotion data repository, that has now grown to nearly 6 million faces analyzed in 87 countries. We continuously test our algorithms to provide the most reliable and accurate emotion metrics. Now, also using deep learning approaches, we can very quickly tune our algorithms for high performance and accuracy. Our key emotions achieve accuracy in the high 90th percentile. We sampled our test set, comprised of hundreds of thousands of emotion events, from our data repository. This data has been gathered representing real-world, spontaneous facial expressions and vocal utterances, made under challenging conditions such as changes in lighting and background noise, and variances due to ethnicity, age, and gender. You can find more information on how we measure our accuracy here.
How to get it
Our emotion recognition technology is available in several products. From an easy-to-use SDK and API for developers, to robust solutions for market research and advertising.