Abstract of Auto Detection of Offensive Language in Social media

Abstract of the “Auto Detection of Offensive Language in Social Media like Facebook and Twitter” final year project

Spamming and cyberbullying are becoming more common as the use of social media such as Facebook and Twitter increases.

Given this, there is a need for automated identification and analysis models that help detect offensive content on social media such as Facebook and Twitter.

The purpose of our Auto Detection of Offensive Language in Social Media project is to identify and categorize offensive language on social media such as Facebook and Twitter. The task is divided into two subtasks: subtask A is to detect whether a post is offensive, and subtask B is to categorize whether the offense is targeted or not. For this purpose, we used OLID, a dataset of tweets released in 2019 by SemEval.

As a baseline, we trained four machine learning classifiers:

  1. Random Forest
  2. Naïve Bayes
  3. SVM
  4. Logistic Regression
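
A comparison of these four baselines could be set up roughly as follows. This is only a minimal sketch, not the project's actual code: it assumes scikit-learn, TF-IDF features, a linear SVM, and macro-averaged F1, none of which are stated in the abstract. The tiny example texts are made-up placeholders and do not come from OLID.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder data (NOT from OLID): 1 = offensive, 0 = not offensive.
train_texts = [
    "you are a complete idiot",
    "what a stupid thing to say",
    "shut up you loser",
    "nobody likes you, moron",
    "have a great day everyone",
    "thanks for sharing this article",
    "the weather is lovely today",
    "congratulations on the new job",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = ["you stupid loser", "thanks, have a lovely day"]
test_labels = [1, 0]

# The four baseline classifiers; hyperparameters are assumptions.
classifiers = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": MultinomialNB(),
    "SVM": LinearSVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}


def evaluate_baselines():
    """Fit each classifier on TF-IDF features and return its macro F1."""
    scores = {}
    for name, clf in classifiers.items():
        # Each pipeline gets its own vectorizer, fitted on training data only.
        model = make_pipeline(TfidfVectorizer(lowercase=True), clf)
        model.fit(train_texts, train_labels)
        predictions = model.predict(test_texts)
        scores[name] = f1_score(
            test_labels, predictions, average="macro", zero_division=0
        )
    return scores


scores = evaluate_baselines()
for name, score in scores.items():
    print(f"{name}: macro F1 = {score:.2f}")
```

With real data, the placeholder lists would be replaced by the training and test splits of the tweet dataset. The scores printed for this toy example say nothing about the results reported in the abstract.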

Beyond these machine learning models, we also trained deep learning models, including the state-of-the-art BERT model, the newly introduced ELMo embeddings with SVM and Logistic Regression, a Convolutional Neural Network (CNN), LSTM, and BiLSTM with Word2Vec and GloVe embeddings. Results showed that the BERT model and ELMo embeddings with SVM performed well compared to the other models, each giving an F1-score of 0.84 in subtask A, whereas ELMo with Logistic Regression performed the best, giving an F1-score of 0.921.
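
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it from scratch. The abstract does not say which averaging was used, so the macro-averaged variant is shown purely as an assumption. The function names and the example labels are hypothetical.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical labels: 1 = offensive, 0 = not offensive.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

offensive_f1 = f1_for_class(y_true, y_pred, positive=1)
overall_macro_f1 = macro_f1(y_true, y_pred)
print(f"F1 (offensive class): {offensive_f1:.3f}")
print(f"Macro F1: {overall_macro_f1:.3f}")
```

Macro averaging weights both classes equally, which matters when offensive posts are much rarer than non-offensive ones.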
