Abstract: As multilingual artificial intelligence systems proliferate, achieving robust cross-lingual understanding remains an open challenge. Recent works have made progress on visual question ...