TY - GEN
T1 - Cross-lingual visual pre-training for multimodal machine translation
AU - Caglayan, Ozan
AU - Kuyu, Menekse
AU - Amac, Mustafa Sercan
AU - Madhyastha, Pranava
AU - Erdem, Erkut
AU - Erdem, Aykut
AU - Specia, Lucia
N1 - Publisher Copyright: © 2021 Association for Computational Linguistics
PY - 2021
Y1 - 2021
AB - Pre-trained language models have been shown to improve performance in many natural language tasks substantially. Although the early focus of such models was single language pre-training, recent advances have resulted in cross-lingual and visual pre-training methods. In this paper, we combine these two approaches to learn visually-grounded cross-lingual representations. Specifically, we extend the translation language modelling (Lample and Conneau, 2019) with masked region classification and perform pre-training with three-way parallel vision & language corpora. We show that when fine-tuned for multimodal machine translation, these models obtain state-of-the-art performance. We also provide qualitative insights into the usefulness of the learned grounded representations.
UR - https://www.scopus.com/pages/publications/85107296187
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=performanshacettepe&SrcAuth=WosAPI&KeyUT=WOS:000863557001034&DestLinkType=FullRecord&DestApp=WOS_CPL
U2 - 10.18653/v1/2021.eacl-main.112
DO - 10.18653/v1/2021.eacl-main.112
M3 - Conference contribution
AN - SCOPUS:85107296187
T3 - EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
SP - 1317
EP - 1324
BT - EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021
Y2 - 19 April 2021 through 23 April 2021
ER -