I have finally handed in my master's thesis on Predicting Political Party Affiliation from Danish and Norwegian Parliamentary Speeches, and can now call myself an MSc in Computer Science. By comparing state-of-the-art NLP models with classical text classification methods, I show that predicting the party affiliation of a given speaker is indeed possible (best F1 score = 0.67). The quality of the predictions, however, depends on how diverse the political ideologies of the parties are, and on the method used. The state-of-the-art neural network method (BERT) performed on par with or better than Support Vector Machines (trained on ‘contextual’ features), despite being trained on fewer words.
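For readers unfamiliar with the F1 score, here is a minimal sketch of how it can be computed for a multi-class problem such as party prediction. It is an illustration only, not the evaluation code from the thesis: the party labels and predictions are made up, and macro-averaging (the unweighted mean of the per-class scores) is an assumption, since this post does not say which averaging was used.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for parallel lists of true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrongly assigned
            fn[truth] += 1   # true class was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical party labels, for illustration only
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]

print(f1_per_class(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 3))
```

A score of 1.0 would mean every speech was assigned to the right party, so the score reflects both how many predictions for a party are correct and how many of that party's speeches are found.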
Thanks to Analyse & Tal for providing me with the Danish parliamentary dataset and for helping me run the initial experiments. Check out their website kendditfolketing.dk, which is based on the same dataset, for a great overview of the intricacies of, and statistics on, the Danish Parliament and its politicians.
Please contact me if you are interested or if you want to read the thesis.