At a recent CNRS conference organised on the theme “Artificial intelligence: the computer overcomes the language barrier”, the researcher Farah Benamara spoke about advances in automatic processing of evaluative language, a term that refers to the expression of opinions and positive or negative feelings.
What is evaluative language, and how is it studied?
Farah Benamara: Evaluative language is the language used to express feelings, opinions, wishes, expectations and future intentions. It relates to how people feel and their emotions, and is therefore frequently found in online content such as social networks. Although the term ‘evaluative language’ is widely used in linguistics, it has not yet been fully adopted in information technology. However, the concepts it encompasses go well beyond the mere analysis of feelings.
Research into the subject began at the end of the 1990s, with the automatic identification of adjectives that convey subjective language. Over the following ten years, developments in computational linguistics made it possible to draw up lexicons of words polarised according to whether they have positive or negative connotations. It then became clear that this was not sufficient, since the overall tone of a text is hugely dependent on context. The connections between words and sentences are determined by a great deal of non-written information. For example, the adjective ‘long’ can express a negative or positive opinion depending on the situation, e.g. ‘a long life’ as opposed to ‘a long wait’. Or one might say ‘lovely day!’ even though it’s pouring with rain. The use of irony and sarcasm means that words cannot be analysed in an overly simple and direct manner. Similarly, automatically detecting racist or sexist messages is difficult, as they are often tempered by humour.
What tools do researchers use?
F. B.: Our starting point is the language itself, which we study from a linguistic viewpoint. These approaches then have to be automated, for instance by training artificial intelligences with supervised learning methods of the ‘deep learning’ type. (Deep learning is a machine learning technique in artificial intelligence (AI) that uses neural networks (mathematical functions). These are able to extract, analyse and classify abstract characteristics from the data submitted to them, without explicit rule production. It is not known how the system arrives at the result: this is known as a “black box” model. By contrast, the symbolic approach, another important method in AI, relies on logical rules written by human programmers, which are therefore perfectly well understood: a “transparent box” model.) This requires gathering large amounts of tagged data, i.e. data that has been assigned a value or attribute. Such data is fairly widely available in English, but much less so in other languages like French.
How did you become interested in this field?
F. B.: I started off with the detection of evaluative language in film and restaurant reviews, as well as in online reader comments. It is fascinating to see how Internet users adapt to the website where they are writing, for instance by adopting a more formal style of language when commenting on press articles. Automatic recognition of evaluative language has to take these modifications into account. The traditional approach is to count positive and negative words and then calculate the difference. Yet this is not sufficient, since many reviewers will list the shortcomings of a film, only to conclude that they basically enjoyed it, which is essentially what readers will remember. Along the same lines, in 2013 I jointly supervised a PhD thesis on the identification of irony and sarcasm, since they both have a significant impact on the performance of systems that detect the polarity of a sentence. In collaboration with the University of Turin (Italy) and the Polytechnic University of Valencia (Spain), we studied the extent to which irony detection models could be adapted to a multilingual framework. In the main, we observed similar patterns in French, English, Spanish and Italian. We then left the field of Indo-European languages and turned to Arabic and its many dialects.
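Editor's illustration: the counting approach described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the method used by the researcher's team. The two word lists, the function name `lexicon_polarity` and the sample review are all hypothetical, and real polarity lexicons are far larger.

```python
import re

# Hypothetical toy lexicons, invented for this illustration only.
POSITIVE = {"enjoyed", "good", "great", "lovely", "excellent"}
NEGATIVE = {"slow", "weak", "clumsy", "boring", "bad"}


def lexicon_polarity(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    # Lowercase the text and keep only alphabetic tokens (and apostrophes).
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


# A review that lists shortcomings, then concludes on a positive note.
review = (
    "The pacing is slow, the dialogue is weak and the ending is clumsy. "
    "Still, I enjoyed it."
)
print(lexicon_polarity(review))  # three negative words, one positive: -2
```

The sample review scores -2 even though its conclusion is favourable, which is the weakness described in the answer above. For the same reason, an ironic "Lovely day!" written in the pouring rain would still be counted as positive.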
You are also working on hate speech. How do researchers deal with it?
F. B.: I’ve been working on automatic detection of these messages, and especially those targeting women. In fact, I am currently co-supervising a thesis on the subject. We are hoping to develop models that can moderate content semi-automatically. Together with linguists from the Institut Jean Nicod and researchers in communication science at the laboratory for studies and applied research in social sciences (LERASS) at Université Toulouse III – Paul Sabatier (southwestern France), we have drawn up a classification of messages as well as a model able to determine when a fact is glorified or, on the contrary, when it is denounced. People relating their own experiences tend to quote the hateful terms addressed to them and the situations they have been through. As this model is mainly designed to detect hatred of women, we wondered whether it could be applied to other types of problem. We wanted to see whether a model specialised in identifying racism could easily recognise sexism, or whether the identification of anti-immigrant hostility could be swiftly adapted to islamophobia. Our results are encouraging. In the long term, our goal is to reliably train artificial intelligences (AI) without having to systematically recreate corpora of hand-tagged data, an expensive and time-consuming process. As part of the European project STERHEOTYPES, we are going even further, in a bid to identify racism in a multilingual context.
How does this work with a language like Arabic, which is widely spoken throughout the world, but has many different forms?
F. B.: In practice, Internet users rarely communicate online in Classical Arabic. Although evaluative language processing is in high demand, we have very few resources for dealing with the various dialects. We began by working on the detection of irony in Arabic vernacular on Twitter. Initially, we chose not to take differences between regional languages into account, and used deep learning to train our algorithms without worrying about that. Together with colleagues from USTHB University in Algiers (Algeria), we undertook a more precise analysis of the performance of the models, while at the same time using several dialects. Despite significant variation, we showed that these AI models achieved respectable results even with a mixture of idioms. Yet again, the aim is to obtain high-quality detection without having to tag specific data for each and every one of them. This is a global challenge that transcends the context of Arabic.
What other possible applications are there?
F. B.: The work that I’ve described generally concerns online comments and messages that relate to more or less recent events. However, evaluative language can also help identify intentions and wishes. As part of the INTACT project, we are developing models adapted to environmental crises and disasters. We have observed that, in the aftermath of hurricanes and floods, many people rely on social networks to ask for help or report injuries instead of calling the emergency services, which are often overwhelmed. In France, there is no algorithm capable of surveying public messages and providing the best possible response.
Outside of my own work, the analysis of evaluative language can also serve to predict voting intentions, or to check whether people are satisfied following an online discussion, with a chatbot for example. Models will also be able to notice whether someone is getting angry or losing interest in a conversation, so as to help the other participant react accordingly. Marketing departments are also keen on this as a way of finding out what their customers think of their products. Similarly, such techniques can spot fake profiles used to make mass postings of positive comments, or negative ones if it is the competition that is being targeted.