Analyzing Sentiments of German Job References
Filling a vacancy is time-consuming and therefore costly. Automated preprocessing of applications, e.g., by analyzing them with machine learning algorithms, can help to save time. We investigate whether such systems are potentially biased in terms of gender, origin, and nobility. Using a corpus of common German reference letter sentences, we investigate two research questions. First, we test the sentiment analysis systems offered by Amazon, Google, IBM, and Microsoft. All tested services rate the sentiment of the same template sentences very inconsistently, and their ratings are biased at least with regard to gender. Second, we examine the impact of balanced and imbalanced training data sets on classifiers trained to estimate the sentiment of sentences from our corpus. This experiment shows that imbalanced data lead to biased results but, under certain conditions, can also lead to fair results.
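The template-based test described above can be illustrated with a minimal sketch. It is not the authors' implementation. The English template sentences, the name "Schmidt", the function names `toy_score` and `gender_gaps`, and the word lists are hypothetical stand-ins. The corpus used in the study is German, and the commercial services are not called here. The idea is only that sentences differing in nothing but the gendered tokens should receive the same score, so any non-zero gap points to bias.

```python
from statistics import mean

# Hypothetical English stand-ins for reference-letter templates;
# the corpus in the study consists of German sentences.
TEMPLATES = [
    "{title} {name} always completed {poss} tasks to our full satisfaction.",
    "{title} {name} made an effort to meet our expectations.",
]

# Hypothetical subject variants: only the gendered tokens differ.
VARIANTS = {
    "female": {"title": "Ms.", "name": "Schmidt", "poss": "her"},
    "male": {"title": "Mr.", "name": "Schmidt", "poss": "his"},
}


def toy_score(sentence: str) -> float:
    """Placeholder lexicon scorer in [-1, 1]; a real sentiment service would go here."""
    positive = {"always", "full", "satisfaction"}
    negative = {"effort"}
    words = [w.strip(".,").lower() for w in sentence.split()]
    hits = sum(w in positive for w in words) - sum(w in negative for w in words)
    return max(-1.0, min(1.0, hits / 3))


def gender_gaps(score=toy_score):
    """Return, per template, the female score minus the male score."""
    gaps = []
    for template in TEMPLATES:
        scores = {g: score(template.format(**v)) for g, v in VARIANTS.items()}
        gaps.append(scores["female"] - scores["male"])
    return gaps


if __name__ == "__main__":
    gaps = gender_gaps()
    print("per-template gap:", gaps)
    print("mean gap:", mean(gaps))
```

Passing a different `score` callable, for example a thin wrapper around one of the tested services, would reuse the same comparison. The toy scorer ignores gendered tokens by construction, so it only demonstrates the mechanics of the test.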