Abstract
Detection of propaganda techniques is a research and development project that aims to build a high-quality machine learning model for detecting propaganda content in online news articles. Limited work has been done in this field, and active research on text classification of news content began only after 2016. In recent years, deep learning methods have outperformed traditional natural language processing methods, so this research focuses on state-of-the-art deep learning techniques. Each article in the training set is first converted into a vector representation of the whole article, which is used to identify the spans containing propaganda; each propaganda span is then processed to identify the underlying propaganda technique. A random statistical model shall first be used to establish a baseline, after which several variants of recurrent neural networks (such as LSTM and Bi-LSTM) shall be applied to both tasks. Finally, a pre-trained BERT model shall be fine-tuned on the training set to identify the spans containing propaganda and the propaganda technique used in each span. The best-performing model shall then be selected for use in the second half of the research. Owing to the growing importance of this research area, SemEval has announced it as one of the tasks for this year’s international workshop. SemEval is responsible for hosting the datasets and the final evaluation for this project.
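The two-stage pipeline described above (span identification followed by technique classification) can be illustrated with a minimal sketch of a random baseline. This is an assumption-laden illustration, not the project's implementation: the label set `TECHNIQUES`, the function names, the character-offset span format, and the `max_spans` parameter are all hypothetical, and the real technique inventory and span format are defined by the SemEval task data.

```python
import random

# Hypothetical label set for illustration only; the actual technique
# inventory comes from the SemEval task data, not from this abstract.
TECHNIQUES = ["loaded_language", "name_calling", "repetition"]


def random_span_baseline(article, rng, max_spans=3):
    """Stage 1 baseline: guess up to `max_spans` random character spans."""
    spans = []
    if len(article) < 2:
        return spans  # too short to contain a meaningful span
    for _ in range(rng.randint(0, max_spans)):
        start = rng.randrange(0, len(article) - 1)
        end = rng.randrange(start + 1, len(article) + 1)
        spans.append((start, end))
    return sorted(spans)


def random_technique_baseline(span_text, rng):
    """Stage 2 baseline: ignore the text and pick a technique uniformly."""
    return rng.choice(TECHNIQUES)


def detect(article, seed=0):
    """Run both stages: propose spans, then label each proposed span."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    results = []
    for start, end in random_span_baseline(article, rng):
        label = random_technique_baseline(article[start:end], rng)
        results.append((start, end, label))
    return results


if __name__ == "__main__":
    text = "They are traitors, plain and simple, and everyone knows it."
    for start, end, label in detect(text, seed=1):
        print(start, end, label, repr(text[start:end]))
```

Any learned model (an LSTM, a Bi-LSTM, or a fine-tuned BERT) would replace the two baseline functions while keeping the same two-stage interface, which makes scores directly comparable against the random baseline.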