Lecture Summaries

Offensive Language Detection in Hebrew: Can Other Languages Help?

Dr. Marina Litvak, SCE

Session II: March 30th, 14:05-14:25

Offensive language is now common on social media, where it harms many individuals and vulnerable groups. Automated detection of offensive language is therefore in high demand, and it is especially challenging in multilingual domains.
Various machine learning approaches, combined with natural language processing techniques, have recently been applied to this task.
This paper contributes to the area in several ways:

  1. It introduces a new dataset of annotated Facebook comments in Hebrew;
  2. It describes a case study with multiple supervised models and text representations for the task of offensive language detection in three languages, including two Semitic languages (Hebrew and Arabic);
  3. It reports evaluation results of cross-lingual and multilingual learning for detection of offensive content in Semitic languages; and
  4. It discusses the limitations of these settings.
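
To make the difference between the cross-lingual and multilingual settings concrete, here is a minimal sketch of how the two kinds of train/test split are commonly set up. It is an illustration under stated assumptions, not the paper's actual protocol: the function names, the `held_out` parameter, the placeholder comments, and the language code `xx` (standing in for the third language, which the abstract does not name) are all hypothetical.

```python
from typing import Dict, List, Tuple

# (comment text, label), where 1 = offensive and 0 = not offensive.
Example = Tuple[str, int]


def cross_lingual_split(
    data: Dict[str, List[Example]], source: str, target: str
) -> Tuple[List[Example], List[Example]]:
    """Train on the source language only; evaluate on the target language."""
    return list(data[source]), list(data[target])


def multilingual_split(
    data: Dict[str, List[Example]], target: str, held_out: int = 1
) -> Tuple[List[Example], List[Example]]:
    """Train on all languages pooled together, holding out the last
    `held_out` target-language examples for evaluation."""
    if held_out < 1:
        raise ValueError("held_out must be at least 1")
    test = list(data[target][-held_out:])
    train = [
        example
        for lang, examples in data.items()
        # Drop the held-out examples from the target language only.
        for example in (examples[:-held_out] if lang == target else examples)
    ]
    return train, test


# Placeholder data (hypothetical, not taken from the paper's dataset).
data: Dict[str, List[Example]] = {
    "he": [("he_comment_1", 1), ("he_comment_2", 0), ("he_comment_3", 1)],
    "ar": [("ar_comment_1", 0), ("ar_comment_2", 1)],
    "xx": [("xx_comment_1", 1), ("xx_comment_2", 0)],
}

# Cross-lingual: train on Arabic, test on Hebrew.
cl_train, cl_test = cross_lingual_split(data, source="ar", target="he")

# Multilingual: train on all three languages, test on held-out Hebrew.
ml_train, ml_test = multilingual_split(data, target="he", held_out=1)
```

In the cross-lingual split the model never sees target-language training data, whereas in the multilingual split it sees some target-language examples alongside the other languages. Any classifier and text representation can then be fitted on the resulting training set.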