Open PhD Position on « Abusive language detection in social media »
The increasing popularity of social media platforms has led to a rise in hateful and aggressive speech on these platforms. Despite the number of approaches recently proposed in Natural Language Processing research for detecting these forms of abusive language, identifying abusive language at scale remains an unsolved problem.
Social media platforms such as Facebook, Twitter and YouTube promised to remove abusive language from their platforms within 24 hours, in accordance with the European Commission's code of conduct, and face fines if they systematically fail to do so. However, human-driven processes cannot respond to this need so quickly and at scale.
This thesis proposal addresses the challenge of developing scalable computational methods that can reliably and efficiently detect and mitigate the use of abusive language online, which can often be extremely subtle and highly context-dependent.
More precisely, the goal is to define algorithms for automatically identifying abusive language in short text messages. Because harmful and abusive vocabulary is highly dynamic, adaptive strategies are required to detect such words in streaming text. To fully capture the semantics of abusive messages, the selected candidate will combine argument mining, sentiment analysis and emotion recognition methods, so that the identification of harmful and abusive words is coupled with the identification of more subtle messages in which harmful arguments are directed against the victim(s). The identified arguments will also be classified according to the topic of the abuse (e.g., sexism, racism). From a methodological point of view, both standard machine learning algorithms and deep learning methods will be explored.
This PhD thesis proposal is particularly relevant today, given the growing importance of the issue of abusive language on social networks.
Skills and profile:
• A Master's degree in Data Science, Computer Science or Computational Linguistics is required.
• Programming skills are required.
• Knowledge of Natural Language Processing and Machine Learning is preferred.
• Fluent English is required, both oral and written. French is appreciated but not mandatory.
WIMMICS (http://wimmics.inria.fr/) is a research team of Université Côte d’Azur (UCA). The research fields of the team are graph-oriented knowledge representation, reasoning and operationalization to model and support actors, actions and interactions in web-based epistemic communities.
Location: I3S laboratory, Sophia Antipolis, France.
Duration: 3 years.
Deadline for applications: May 10th, 2019.