Neural network hate deletion: Developing a machine learning model to eliminate hate from online comments - UTU Research Information System

A4 Peer-reviewed article in conference proceedings

Neural network hate deletion: Developing a machine learning model to eliminate hate from online comments

Authors: Joni Salminen, Juhani Luotolahti, Hind Almerekhi, Bernard J. Jansen, Soon-gyo Jung

Editor: Svetlana S. Bodrunova

Established conference name: International Conference on Internet Science

Publisher: Springer Verlag

Publication year: 2018

Journal: Lecture Notes in Computer Science

Book title: Internet Science: 5th International Conference, INSCI 2018, St. Petersburg, Russia, October 24–26, 2018, Proceedings

Journal name in the database: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Series name: Lecture Notes in Computer Science

Volume: 11193

First page: 25

Last page: 39

ISBN: 978-3-030-01436-0

eISBN: 978-3-030-01437-7

ISSN: 0302-9743

DOI: https://doi.org/10.1007/978-3-030-01437-7_3

Web address: https://link.springer.com/chapter/10.1007/978-3-030-01437-7_3

Self-archived copy address: https://research.utu.fi/converis/portal/detail/Publication/36541856

Abstract

We propose a method for modifying hateful online comments into non-hateful
comments without losing the understandability and original meaning of
the comments. To accomplish this, we retrieve and classify 301,153
hateful and 1,041,490 non-hateful comments from Facebook and YouTube
channels of a large international media organization that is a target of
considerable online hate. We supplement this dataset with 10,000 Reddit
comments manually labeled for hatefulness. Using these two datasets, we
train a neural network to distinguish linguistic patterns. The model we
develop, Neural Network Hate Deletion (NNHD), computes how hateful the
sentences of a social media comment are and, if they are above a given
threshold, deletes them using a language dependency tree. We evaluate
the results by comparing crowd workers’ perceptions of hatefulness and
understandability before and after transformation and find that our
method reduces hatefulness without resulting in a significant loss of
understandability. In some cases, removing hateful elements improves
understandability by reducing the linguistic complexity of the comment.
In addition, we find that NNHD can satisfactorily retain the original
meaning on average but is not perfect in this regard. In terms of
practical implications, NNHD could be used on social media platforms to
suggest more neutral use of language to agitated online users.
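
The score-and-delete step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scoring function `toy_hate_score`, the word list `TOY_LEXICON`, the regex sentence splitter, and the threshold value of 0.2 are all hypothetical stand-ins. NNHD uses a trained neural network for scoring and a language dependency tree for deletion, neither of which is reproduced here.

```python
import re
from typing import Callable, List

# Hypothetical stand-in lexicon; NNHD uses a trained neural network instead.
TOY_LEXICON = {"idiots", "scum"}


def toy_hate_score(sentence: str) -> float:
    """Return the fraction of tokens found in TOY_LEXICON (illustrative only)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(token in TOY_LEXICON for token in tokens) / len(tokens)


def split_sentences(comment: str) -> List[str]:
    """Naively split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", comment.strip()) if s]


def delete_hateful_sentences(
    comment: str,
    score: Callable[[str], float] = toy_hate_score,
    threshold: float = 0.2,  # assumed value, not taken from the paper
) -> str:
    """Keep only the sentences whose score does not exceed the threshold."""
    kept = [s for s in split_sentences(comment) if score(s) <= threshold]
    return " ".join(kept)


if __name__ == "__main__":
    example = "The report was late. These people are idiots. Please fix the schedule."
    print(delete_hateful_sentences(example))
```

Passing a different `score` callable, for example a wrapper around a trained classifier, would leave the thresholding logic unchanged. Note that this sketch removes whole sentences, whereas the abstract describes deletion guided by a dependency tree.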

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.

Neural network hate deletion.pdf