Publications
- Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages (ACL)
- LLMs Are Few-Shot In-Context Low-Resource Language Learners (NAACL)
- An empirical study of multilingual reasoning distillation for question answering (EMNLP)
- Efficient Overshadowed Entity Disambiguation by Mitigating Shortcut Learning (EMNLP)
- McCrolin: Multi-consistency Cross-lingual Training for Retrieval Question Answering (EMNLP)
- SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages (EMNLP)
- Kalahi: A handcrafted, grassroots cultural LLM evaluation suite for Filipino (PACLIC)
- Batayan: A Filipino NLP benchmark for evaluating Large Language Models (ACL)
- Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia (ACL)
- Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation (ACL)
- Towards better understanding of program-of-thought reasoning in cross-lingual and multilingual environments (ACL)
- SEA-HELM: Southeast Asian Holistic Evaluation of Language Models (ACL)
- ThaiInstruct: An instruction-following Dataset for Culturally-Aware, Multitask, and Multi-domain Evaluation in Thai (EMNLP)
- Shortcut Learning in Safety: The Impact of Keyword Bias in Safeguards (LLMSEC)
- WorldCuisines: A massive-scale benchmark for multilingual and multicultural visual question answering on global cuisines (NAACL)
- Language Surgery in Multilingual Large Language Models (MRL)
- SEA-LION: Southeast Asian Languages in One Network (AACL)
