Automatic detection of (potential) factors in the source text leading to gender bias in machine translation

Start - End
2023 - 2023 (ongoing)



With the growing use of and interest in machine translation (MT) and a growing demand for gender-inclusiveness, research on social biases (e.g., gender bias) in MT is increasing. Existing research predominantly relies on top-down methodologies for predefined categories of parts of speech. This project proposes a novel bottom-up methodology that broadens the scope of gender bias research by focussing on source text analysis. The goal is to create a detection system that can automatically analyse source data and detect features that influence gender inflection in the target translation and thereby lead to gender bias in MT. This detection system will be a machine learning model trained on a taxonomy created as part of this research proposal, based on manually annotated data extended with morpho-syntactic information from dependency trees. The aim is to develop a comprehensive methodology to help make AI-powered technologies (i.e., MT) more gender-inclusive for society.