A market for emotions

With emotion-tracking software, Affectiva attracts big-name clients, aims for “mood-aware” Internet.

Emotions are powerful forces in individuals’ lives. But they’re also powerful tools for content creators such as advertisers, marketers, and filmmakers. By tracking people’s negative or positive reactions to ads — traditionally via surveys and focus groups — agencies can tweak and tailor their content to better satisfy consumers.

Over the past several years, companies developing emotion-recognition technology — which gauges subconscious emotions by analyzing facial cues — have increasingly aided agencies on that front.

Prominent among these companies is MIT spinout Affectiva, whose advanced emotion-tracking software, called Affdex, is based on years of MIT Media Lab research. Today, the startup is attracting some big-name clients, including Kellogg and Unilever.

Backed by more than $20 million in funding, the startup — which has amassed a vast facial-expression database — is also setting its sights on a “mood-aware” Internet that reads a user’s emotions to shape content. This could lead, for example, to more relevant online ads, as well as enhanced gaming and online-learning experiences.

“The broad goal is to become the emotion layer of the Internet,” says Affectiva co-founder Rana el Kaliouby, a former MIT postdoc who invented the technology. “We believe there’s an opportunity to sit between any human-to-computer, or human-to-human interaction point, capture data, and use it to enrich the user experience.”

Ads and apps

To use Affdex, Affectiva recruits participants to watch advertisements in front of webcams on their computers, tablets, and smartphones. Machine-learning algorithms track facial cues, focusing prominently on the eyes, eyebrows, and mouth. A smile, for instance, registers when the corners of the lips curl upward and outward, teeth flash, and the skin around the eyes wrinkles.

Affdex then infers the viewer’s emotions — such as enjoyment, surprise, anger, disgust, or confusion — and pushes the data to a cloud server, where Affdex aggregates the results from all the facial videos (sometimes hundreds), which it publishes on a dashboard.
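As a rough illustration of that flow — and only an illustration, since Affectiva has not published its code — the pipeline can be thought of as a per-frame classifier feeding a cross-viewer aggregation step. In the Python sketch below, the cue names, thresholds, and emotion labels are assumptions made for the example, not the company’s actual model.

```python
# Illustrative sketch (not Affectiva's code) of the pipeline described above:
# per-frame facial cues are mapped to an emotion label, then results from many
# viewers are pooled into the moment-by-moment view a dashboard would show.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class FrameCues:
    timestamp: float        # seconds into the ad
    lip_corner_pull: float  # 0..1, stronger values suggest a smile
    brow_furrow: float      # 0..1, stronger values suggest confusion
    lip_asymmetry: float    # 0..1, stronger values suggest a smirk

def infer_emotion(cues: FrameCues) -> str:
    """A coarse rule of thumb standing in for the trained classifier."""
    if cues.lip_corner_pull > 0.6 and cues.lip_asymmetry < 0.3:
        return "enjoyment"
    if cues.lip_asymmetry > 0.5:
        return "skepticism"
    if cues.brow_furrow > 0.6:
        return "confusion"
    return "neutral"

def aggregate(sessions: list[list[FrameCues]], bucket_s: float = 1.0) -> dict:
    """Pool per-viewer frames into per-second emotion counts, roughly what a
    results dashboard would plot over the course of the ad."""
    counts: dict[float, dict[str, int]] = {}
    for frames in sessions:
        for f in frames:
            t = round(f.timestamp / bucket_s) * bucket_s
            counts.setdefault(t, defaultdict(int))[infer_emotion(f)] += 1
    return counts

# Example: two viewers, a few frames each (in practice the cue scores would
# come from the face tracker, not be hand-written).
viewer_a = [FrameCues(0.4, 0.8, 0.1, 0.1), FrameCues(1.6, 0.2, 0.7, 0.1)]
viewer_b = [FrameCues(0.4, 0.1, 0.2, 0.6), FrameCues(1.6, 0.3, 0.3, 0.2)]
print(aggregate([viewer_a, viewer_b]))
```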

But determining whether a person “likes” or “dislikes” an advertisement takes advanced analytics. Importantly, the software looks for whether viewers are “hooked” in the first third of an advertisement, noting increased attention and focus, signaled in part by less fidgeting and a more fixed gaze.

Smiles can indicate that a commercial designed to be humorous is, indeed, funny. But if a smirk — a subtle, asymmetric lip curl, distinct from a smile — comes at a moment when information appears on the screen, it may indicate skepticism or doubt.

A furrowed brow may signal confusion or cognitive overload. “Sometimes that’s by design: You want people to be confused, before you resolve the problem. But if the furrowed brow persists throughout the ad, and is not resolved by the end, that’s a red flag,” el Kaliouby says.
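A hedged sketch of how such moment-by-moment checks could be expressed in code follows; the field names, thresholds, and flag wording are assumptions for illustration, not Affectiva’s actual analytics.

```python
# Hypothetical post-processing of an aggregated, per-second trace, sketching
# the kinds of checks described above. All names and cutoffs are assumed.
def flag_ad_issues(trace: list[dict], ad_length_s: float) -> list[str]:
    """Each trace entry covers one second, e.g.
    {"t": 3, "attention": 0.8, "smirk": 0.1, "brow_furrow": 0.2} (all in [0, 1])."""
    flags = []

    # "Hooking": viewers should show elevated attention in the first third.
    first_third = [s for s in trace if s["t"] <= ad_length_s / 3]
    if first_third and sum(s["attention"] for s in first_third) / len(first_third) < 0.5:
        flags.append("weak hook: low attention during the first third")

    # Skepticism: smirks spiking at seconds marked as information moments.
    if any(s["smirk"] > 0.5 and s.get("info_on_screen") for s in trace):
        flags.append("skepticism: smirks while on-screen information is shown")

    # Confusion that is never resolved: brow furrow stays high through the end.
    tail = [s for s in trace if s["t"] > ad_length_s * 0.75]
    if tail and all(s["brow_furrow"] > 0.5 for s in tail):
        flags.append("red flag: furrowed brows persist to the end of the ad")

    return flags

# Example: a 30-second ad in which confusion builds and never resolves.
trace = [{"t": t, "attention": 0.7, "smirk": 0.1,
          "brow_furrow": min(1.0, t / 20)} for t in range(30)]
print(flag_ad_issues(trace, 30))
```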

Affectiva has been working with advertisers to optimize their marketing content for a couple of years. In a recent case study with Mars, for example, Affectiva found that the client’s chocolate ads elicited the highest emotional engagement, while its food ads elicited the least, helping predict short-term sales of these products.

In that study, some 1,500 participants from the United States and Europe viewed more than 200 ads while the software tracked their emotional responses, which were then tied to sales volumes for different product lines. These results were combined with a survey to increase the accuracy of predicting sales volume.

“Clients usually take these responses and edit the ad, maybe make it shorter, maybe change around the brand reveal,” el Kaliouby says. “With Affdex, you see on a moment-by-moment basis who’s really engaged with the ad, and what’s working and what’s not.”

This year, the startup released a developer kit for mobile app designers. Still in their early stages, some of the apps are designed for entertainment, such as letting people submit “selfies” for mood analysis and share the results across social media.

Still others could help children with autism better interact, el Kaliouby says — such as games that ask players to match facial cues with emotions. “This would focus on pragmatic training, helping these kids understand the meaning of different facial expressions and how to express their own,” she says.

Entrenched in academia

While several companies are commercializing similar technology, Affectiva is unusual in that it is “entrenched in academia,” el Kaliouby says: Years of data-gathering have “trained” the algorithms to be very discerning.

As a PhD student at Cambridge University in the early 2000s, el Kaliouby began developing facial-coding software. She was inspired, in part, by her future collaborator and Affectiva co-founder, Rosalind Picard, an MIT professor who pioneered the field of affective computing — where machines can recognize, interpret, process, and simulate human affects.

Back then, the data that el Kaliouby had access to consisted of about 100 facial expressions gathered from photos — and those 100 expressions were fairly prototypical. “To recognize surprise, for example, we had this humongous surprise expression. This meant that if you showed the computer an expression of a person that’s somewhat surprised or subtly shocked, it wouldn’t recognize it,” el Kaliouby says.

In 2006, el Kaliouby came to the Media Lab to work with Picard to expand what the technology could do. Together, they quickly started applying the facial-coding technology to autism research and training the algorithms by collecting vast stores of data.

“Coming from a traditional research background, the Media Lab was completely different,” el Kaliouby says. “You prototype, prototype, prototype, and fail fast. It’s very startup-minded.”

Among their first prototypes was a Google Glass-type invention with a camera that could read facial expressions and provide real-time feedback to the wearer via a Bluetooth headset — auditory cues such as, “This person is bored” or, “This person is confused.”

However, inspired by increasing industry attention — and with a big push by Frank Moss, then the Media Lab’s director — they soon ditched the wearable prototype to build a cloud-based version of the software, founding Affectiva in 2009.

Early support from a group of about eight mentors at MIT’s Venture Mentoring Service helped the Affectiva team connect to industry and shape its pitch — by focusing on the value proposition, not the technology. “We learned to build a product story instead of a technology story — that was key,” el Kaliouby says.

To date, Affectiva has amassed a dataset of about 1.7 million facial expressions, roughly 2 billion data points, from people of all races, across 70 different countries — the largest facial-coding dataset in the world, el Kaliouby says — training its software’s algorithms to discern expressions from all different face types and skin colors. It can also track faces that are moving, in all types of lighting, and can avoid tracking any other movement on screen.

A “mood-aware” Internet

One of Affectiva’s long-term goals is to usher in a “mood-aware” Internet to improve users’ experiences. Imagine an Internet that’s like walking into a large outlet store with sales representatives, el Kaliouby says.

“At the store, the salespeople are reading your physical cues in real time, and assessing whether to approach you or not, and how to approach you,” she says. “Websites and connected devices of the future should be like this, very mood-aware.”

Sometime in the future, this could mean computer games that adapt their difficulty and other variables based on the player’s reactions. But more immediately, it could work for online learning.

Already, Affectiva has conducted pilot work in online learning, capturing data on facial engagement to predict learning outcomes. The software indicates, for instance, whether a student is bored, frustrated, or focused — which is especially valuable for prerecorded lectures, el Kaliouby says.

“To be able to capture that data, in real time, means educators can adapt that learning experience and change the content to better engage students — making it, say, more or less difficult — and change feedback to maximize learning outcomes,” el Kaliouby says. “That’s one application we’re really excited about.”
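As a minimal sketch of the adaptation loop described here — with thresholds and actions that are purely illustrative assumptions, not a real product’s behavior — a lecture player might map recent engagement scores to a next step along these lines:

```python
# Hypothetical sketch: a lecture player consumes per-segment engagement scores
# (like those from the pilot work above) and decides what to show next.
def next_action(engagement_history: list[float]) -> str:
    """engagement_history: recent per-segment scores in [0, 1], where low
    values suggest boredom or frustration and high values suggest focus."""
    recent = engagement_history[-3:]          # look at the last few segments
    avg = sum(recent) / len(recent)
    if avg < 0.3:
        return "insert_recap"        # learner seems lost: review the last concept
    if avg < 0.6:
        return "ask_quick_question"  # nudge attention with an interactive prompt
    return "advance_difficulty"      # learner is engaged: move to harder material

# Example: engagement sagging over the last three lecture segments.
print(next_action([0.8, 0.4, 0.2, 0.15]))  # -> "insert_recap"
```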

