Detecting gender-based hate speech in Spanish with Natural Language Processing

By: Alejandra Pedroza, Liliana Badillo, and Iván Galaviz

Figure 1. Hate Speech Illustration. Source: istock.com

Chances are that you have been involved in a situation of violence against women without even noticing it. You could have been the victim or the aggressor and, again, not noticed it.

How is it possible to be the victim or the aggressor without being aware of it? Violence against women takes different forms, and one of the most common manifestations is so normalized that it comes up unconsciously in our daily lives: verbal violence.

We materialize verbal violence in our daily speech and, nowadays, part of that speech has moved to social media. For this reason, we need effective methods to detect the hate speech that takes place online.

In this article, we explain how we used Natural Language Processing (NLP) techniques to detect online gender-based hate speech in Mexican Spanish.

We call this project Violentómetro Online. It was the result of the work we did as part of the Social Data Challenge contest, powered by Datalab Community in Guadalajara, México. With this effort, we won 3rd place in the contest.


Problem

In this section, we frame the problems of gender violence and online violence. We also present some theory that helps to better understand the phenomenon.

Gender Violence

Worldwide, three out of 10 women have suffered some type of violence (WHO, 2017). The situation in Mexico is more critical. Of the 46.5 million Mexican women aged 15 and over, six out of 10 (30.7 million) have faced some type of violence at some point in their lives (INEGI, 2017, p. 1).

Figure 2. Gender Violence Statistics

Violence against women manifests itself in different ways (INEGI, 2017):

● Emotional

● Physical

● Sexual

● Economic

● Discrimination

● Verbal

To differentiate gender-based violence from other forms of violence, we have to pay attention to the cause. In these cases, the abuse has no motivation other than gender: women are attacked in public or private life without any other crime (such as robbery or drug trafficking) acting as a trigger (WHO, 2017).

Online Verbal Violence

Verbal violence is a type of abuse that does not cause physical injury. Due to its nature, verbal violence is manifested through hate speech (Carmi-Iluz, Peleg, Freud, & Shvartzman, 2005; Mouradian, n.d.).

Nowadays, hate speech has moved to new forums because of the proliferation of information and communication technologies. It is now necessary to talk about “online violence”, which can go unnoticed behind a screen (Ruiz, 2014).


The Internet itself stimulates aggressiveness and creates an environment in which even non-aggressive people can behave aggressively in their interactions (Bañón Hernández, 2010). This is because online media have properties related to:

Anonymity: Provides a hidden identity to the victimizer.

Distance: Makes the victim invisible.

Ubiquity: Provides a greater exposure and availability to hate speech.

Disinhibition: Results from the absence of content regulation.

Figure 3. Screenshot of Gender-Based Hate Speech. Source: https://wecounterhate.com/

Solution

In this project, we aimed to develop an effective method to automatically detect online gender-based hate speech in Spanish. To achieve this goal, we worked within the field of Machine Learning and took advantage of Natural Language Processing techniques.

In this section, we describe the tools and data we used in our project. We also report the cleaning and preprocessing methods we applied to better process our data. We explain how we executed the experiments using different Machine Learning models and parameters, and how we obtained the best model. Finally, we present the web application prototype we developed to make Violentómetro Online a user-friendly product.

Tools

The following are the tools we used to obtain, preprocess, analyze, and model the data, as well as to develop the web application prototype:

Data

In this project, we used the following datasets:

  • Facebook Comments
  • MEX-A3T Train Aggressiveness
  • Final Dataset

Facebook Comments

Prior to the Social Data Challenge, one member of our team gathered this dataset using the Facebook API. It was obtained from comments published on news posts related to International Women’s Day during the week of March 6 to 13, 2018. The news items were published on the Facebook accounts of the following media:

We manually tagged the dataset to identify specific hate speech against women. The dataset ended up with the following characteristics:

  • 1,968 records
  • 2 columns
  • 2 categories:
    • 1 = Contains hate speech against women
    • 0 = Does not contain hate speech against women
Figure 4. Facebook Comments Dataset

Train Aggressiveness

MEX-A3T: Fake News and Aggressiveness Analysis is an event organized by the NLP community in Mexico to detect fake news and texts containing hate speech. The researchers who organize this event shared the training dataset with us. The dataset has the following characteristics:

  • 7,332 records
  • 2 columns
  • 2 categories:
    • 1 = Contains general hate speech
    • 0 = Does not contain general hate speech
Figure 5. Train Aggressiveness Dataset

Final Dataset

To perform the experiments, we joined the two datasets described before. The Facebook Comments dataset was modified so that category 1 was replaced by 2 (hate speech against women). We also performed some feature engineering and added more columns (for more details, see the Preprocessing section). The final dataset resulted in the following characteristics:

  • 9,300 records
  • 5 columns
  • 3 categories:
    • 2 = Contains hate speech against women
    • 1 = Contains general hate speech
    • 0 = Does not contain any type of hate speech
Figure 6. Final Dataset
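The join and relabeling described above can be sketched with pandas. This is a minimal illustration: the column names and sample rows are hypothetical, not the project's actual schema.

```python
import pandas as pd

# Hypothetical column names and rows standing in for the real datasets
facebook = pd.DataFrame({
    "text": ["comentario uno", "comentario dos"],
    "label": [1, 0],  # 1 = hate speech against women, 0 = none
})
mex_a3t = pd.DataFrame({
    "text": ["tuit uno", "tuit dos"],
    "label": [1, 0],  # 1 = general hate speech, 0 = none
})

# Relabel the Facebook dataset: category 1 becomes 2 (hate speech against women)
facebook["label"] = facebook["label"].replace({1: 2})

# Join both datasets into the final three-category dataset
final = pd.concat([facebook, mex_a3t], ignore_index=True)
print(final)
```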

Preprocessing

We performed some preprocessing with both cleaning and feature engineering techniques. The cleaning techniques were useful to prepare the dataset for the analysis, whereas the feature engineering techniques were useful to improve the performance of our model. The preprocessing included the following actions:

Table 1. Cleaning and Feature Engineering Techniques Used in this Project
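As an illustration of the two kinds of preprocessing, here is a minimal sketch of text cleaning plus the three engineered features used in our experiments (Length_Text, Number_Words_Text, and Number_Unique_Words). The specific cleaning rules shown are assumptions, not the exact contents of Table 1.

```python
import re

def clean_text(text):
    """Minimal cleaning sketch: lowercase, strip URLs, mentions, and punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)    # remove URLs
    text = re.sub(r"@\w+", " ", text)            # remove user mentions
    text = re.sub(r"[^\w\sáéíóúüñ]", " ", text)  # keep letters, incl. Spanish accents
    return re.sub(r"\s+", " ", text).strip()     # collapse extra whitespace

def engineer_features(text):
    """Compute the three additional features used in the experiments."""
    words = text.split()
    return {
        "Length_Text": len(text),
        "Number_Words_Text": len(words),
        "Number_Unique_Words": len(set(words)),
    }
```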

Exploratory Data Analysis

Using basic statistical tools, we performed an exploratory data analysis (EDA) to inspect the composition of the two original datasets. We examined characteristics such as the length of the messages (records), the number of words overall, and the number of unique words.

The EDA shows that the distributions of the Train Aggressiveness dataset and the Facebook Comments dataset are different. For example, the distribution of comment lengths differs by platform: comments on Facebook tend to be longer than tweets, and the choice of words also tends to differ.

Figure 6. Distribution of the Train Aggressiveness Dataset
Figure 7. Distribution of the Facebook Comments Dataset
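The message-level statistics behind these distributions can be computed with pandas. The two sample comments below are hypothetical; the real EDA ran over the full datasets.

```python
import pandas as pd

# Hypothetical sample standing in for one of the datasets
comments = pd.Series([
    "Un comentario de Facebook que suele ser bastante más largo que un tuit.",
    "Otro comentario con muchas palabras distintas.",
])

eda = pd.DataFrame({
    "length": comments.str.len(),                                   # characters per message
    "n_words": comments.str.split().str.len(),                      # words per message
    "n_unique": comments.str.split().apply(lambda w: len(set(w))),  # unique words per message
})
print(eda.describe())  # summary statistics of the distributions
```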

Regarding the type of words, in the Train Aggressiveness dataset we noticed that the content was similar for both the hate speech and no hate speech categories.

These similarities occur because a message can contain individual words related to vulgarity or obscenity and yet, in its complete structure, not be hate speech, given that it does not attack a specific group or individual. On the other hand, there are also messages that contain such words and do attack a specific group or individual. These similarities made classification challenging for our algorithms.

The following word clouds show how similar the most frequent words of the Train Aggressiveness dataset are in both the hate speech and no hate speech categories.

Figure 8. Word Clouds Showing the Most Frequent Words of the Train Aggressiveness Dataset
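Word clouds like these are driven by word frequencies per category. A minimal sketch using only the standard library, with hypothetical mini-corpora standing in for the two categories:

```python
from collections import Counter

# Hypothetical mini-corpora standing in for the two categories
hate = ["mensaje con insulto fuerte", "otro mensaje con insulto"]
no_hate = ["mensaje sobre igualdad y derecho", "otro mensaje sobre igualdad"]

def word_frequencies(messages):
    """Count how often each word appears across a list of messages."""
    counter = Counter()
    for message in messages:
        counter.update(message.lower().split())
    return counter

# The most common words per category are what the word clouds display
print(word_frequencies(hate).most_common(3))
print(word_frequencies(no_hate).most_common(3))
```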

In the Facebook Comments dataset, the content varies more between the hate speech and no hate speech categories. In the former, the most frequent words include terms with negative connotations such as “feminazi” and “pinche”, whereas in the latter, the most frequent words include terms related to gender-equality discussions such as “igualdad” and “derecho”.

The following word clouds show the most frequent words of the Facebook Comments dataset in both hate speech and no hate speech categories.

Figure 9. Word Clouds Showing the Most Frequent Words of the Facebook Comments Dataset

Experiments

In order to develop an effective method to automatically detect online gender-based hate speech in Mexican Spanish, we executed the following experiments:

  • Creation of base models to see which one performed best and then improve it. We used Logistic Regression, Naive Bayes, and Random Forest as base models. The following table shows the score results we got with this experiment:
Table 2. Results Experimenting with Different Algorithms
  • Model training using Random Forest with n-grams ranging from 1 to 3. The following table shows the score results we got with this experiment:
Table 3. Results Experimenting with N-Grams
  • Model training using Random Forest with additional features: Length_Text, Number_Words_Text, and Number_Unique_Words. The following table shows the score results we got with this experiment:
Table 4. Results Experimenting with Additional Features
  • Use of the RandomizedSearchCV technique to find the best parameters in Random Forest, using n-grams and the additional features. The following table shows the score results we got with this experiment:
Table 5. Results Experimenting with Additional Features and Best Parameters
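The first two experiments (base models, then n-grams) can be sketched with scikit-learn pipelines. The sample texts and labels below are hypothetical placeholders; the real experiments ran on the 9,300-record dataset.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier

# Tiny hypothetical sample; the real experiments used the full final dataset
texts = ["odio a ese grupo", "buen día a todos", "mensaje agresivo aquí", "saludos cordiales"] * 5
labels = [1, 0, 1, 0] * 5

base_models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
}

scores = {}
for name, model in base_models.items():
    # Bag of words with unigrams to trigrams, as in the n-gram experiment
    pipeline = make_pipeline(CountVectorizer(ngram_range=(1, 3)), model)
    pipeline.fit(texts, labels)
    scores[name] = pipeline.score(texts, labels)

print(scores)
```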

Results

Across all the experiments, the best result we obtained was an F1 score, precision, and recall of 0.83. The model with the best score has the following characteristics:

  • Algorithm: Random Forest
  • Preprocessing technique: Bag of words
  • New features: Length_Text, Number_Words_Text, and Number_Unique_Words
  • Parameter tuning: according to the best parameters found by RandomizedSearchCV
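Putting the pieces together, a model with these characteristics can be sketched as follows. The parameter grid and sample data are assumptions for illustration; only the algorithm, the bag-of-words preprocessing, the three feature names, and the use of RandomizedSearchCV come from our experiments.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical mini-sample with the three engineered features added
data = pd.DataFrame({"text": ["odio a ese grupo", "buen día", "mensaje agresivo", "saludos"] * 5})
data["Length_Text"] = data["text"].str.len()
data["Number_Words_Text"] = data["text"].str.split().str.len()
data["Number_Unique_Words"] = data["text"].str.split().apply(lambda w: len(set(w)))
labels = [1, 0, 1, 0] * 5

# Bag of words over the raw text, plus the numeric features passed through
features = ColumnTransformer([
    ("bow", CountVectorizer(), "text"),
    ("extra", "passthrough", ["Length_Text", "Number_Words_Text", "Number_Unique_Words"]),
])
pipeline = Pipeline([("features", features), ("rf", RandomForestClassifier(random_state=0))])

# Randomized search over a small, hypothetical parameter grid
search = RandomizedSearchCV(
    pipeline,
    {"rf__n_estimators": [50, 100, 200], "rf__max_depth": [None, 10, 20]},
    n_iter=3, cv=2, random_state=0,
)
search.fit(data, labels)
print(search.best_params_)
```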

Prototype of a Web Application

Additionally, we developed a prototype of a web application using the model that showed the best results. We used the Streamlit framework to develop the web application and GitHub Actions to deploy it (with continuous integration) on Heroku.

In this app, any user can enter a piece of text; the algorithm then analyzes whether the text contains general hate speech or hate speech against women and returns one of the following categories to the user:

  • 2 = Contains hate speech against women
  • 1 = Contains general hate speech
  • 0 = Does not contain any type of hate speech

The following image shows a screenshot of the application prototype:

Figure 10. Web Application Prototype of the Violentómetro Online

NOTE: To see the web application prototype, visit https://violentometro-online.herokuapp.com/.

Conclusion and Discussion

Our main goal was to develop an effective method to automatically detect online gender-based hate speech in Mexican Spanish. With this project, we were able to develop a detection method by using Natural Language Processing techniques and by experimenting with Machine Learning models for classification. We were also able to deploy our best model in a web application. However, we still need to validate the effectiveness of both the model and the web application.

So far, we have only used open-source tools to develop our solution. We have some extra experiments pending; these require more computing capacity but might improve the performance of our model.

There are also open possibilities to test this solution with Spanish variants other than Mexican Spanish.

For future work, it is possible to scale our solution and add more features to it, given the quality and organization of the code. As future steps, we plan to move from bag-of-words to more advanced techniques such as Deep Learning.

In addition, the model can be hosted independently to take advantage of additional storage capacity. With extra storage, we would like to gather feedback from users to improve our model and application.

Finally, we decided to approach the topic of gender-based violence with Natural Language Processing because we believe that the discipline of Data Science has the capacity to be a catalyst for great solutions to social problems. We are convinced that detection is the first step toward preventing any type of violence.

To see the code, experiments, and results of this project, visit this GitHub repository: violentometro-online.
