Beiduo Chen

 

I’m currently an ELLIS Ph.D. student at the Center for Information and Language Processing (CIS) at Ludwig-Maximilians-Universität München (LMU Munich), supervised by Prof. Dr. Barbara Plank, who leads the MaiNLP Lab, and co-supervised by Prof. Dr. Anna Korhonen of the Language Technology Laboratory (LTL) at the University of Cambridge.

Before that, I received my Master’s and Bachelor’s degrees, with an outstanding graduate award, from the University of Science and Technology of China (USTC), where I was supervised by Prof. Dr. Wu Guo and co-advised by Prof. Dr. Zhen-Hua Ling in the USTC NLP Group.

 

 


Research Interests

My research focuses on evaluating and interpreting LLM behavior beyond single accuracy numbers. More broadly, I work on Natural Language Processing (NLP) and Large Language Models (LLMs), with a particular focus on LLM evaluation, reasoning and uncertainty, human-centered NLP, and multilingual/cultural evaluation. I am interested in developing evaluation and interpretation frameworks that better capture how models reason, express uncertainty, reflect human variation, and generalize across languages, cultures, and tasks.

  • LLM Reasoning, Uncertainty & Benchmark Evaluation:
    I study how LLMs reason, how Chain-of-Thought (CoT) affects their predictions, and how reasoning traces can be evaluated beyond final-answer accuracy. Recent work investigates CoT mechanisms and uncertainty through human judgment distributions (EMNLP 2025 Oral; ACL 2026 findings-a), as well as how CoT traces transfer across LLMs (Cheng and Chen et al., 2026). I am also interested in robust benchmark design, aggregation-based metrics, LLM-as-a-judge evaluation, and trustworthy comparison of frontier models.

  • Human-centered NLP, Human Label Variation & Explanation-based LLM Alignment:
    I study human label variation as a meaningful signal rather than noise, focusing on how annotators and LLMs differ in labels, explanations, and reasoning patterns. My work explores explanation-based LLM alignment and within-label variation (EMNLP 2025 SAC Highlights Award; ACL 2025 findings; EMNLP 2024 findings), decomposes annotation disagreement through labels and explanations (ACL 2026 findings-b), and post-trains LLMs to simulate annotator-specific label-explanation behavior via reinforcement learning (Chen et al., 2026). Earlier work also explored human-inspired language model pre-training strategies (ACL 2023 findings).

  • Multilingualism, Cross-lingual Transfer & Cultural Evaluation:
    I work on multilingual NLP and cross-lingual generalization, including cross-lingual transfer (EMNLP 2022; ICPR 2022), cross-lingual post-training (ICASSP 2022), and multilingual adaptation (SemEval@NAACL 2022 Champions). More recently, I have contributed to scalable evaluation frameworks for multilingual and cultural awareness in LLMs (EMNLP 2025 findings).

Feel free to reach out if you are interested in topics related to NLP, LLM evaluation, reasoning, human-centered modeling, or multilingual and cultural evaluation.

 


News

Apr 06, 2026 :blue_book: Two papers accepted to ACL 2026 (2 findings)!
Nov 08, 2025 :microphone: Invited Panelist at the 4th Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ EMNLP 2025, Suzhou, China.
Nov 07, 2025 :trophy: Our EMNLP 2025 Main paper LiTEx has received the SAC Highlights Award (<1% acceptance)! Huge thanks to all my wonderful co-authors.
Nov 04, 2025 :mahjong: I’ll be in Suzhou to present our papers “Threading the Needle” (Oral), “LiTEx” (Poster), and “MAKI” (Poster) at EMNLP 2025.
Oct 28, 2025 :microphone: Invited Talk on From “Explanation Lines” to a “Comprehensive Ellipse” — Modeling Rational Variation with LLMs at Dealing with Meaning Variation in NLP - 3rd Yearly Workshop at Utrecht University, Netherlands.
Oct 10, 2025 :microphone: Invited Talk on Explanation-based Human Label Variation Modeling at the Language Technology Lab Seminars, University of Cambridge, United Kingdom.
Sep 22, 2025 :sparkles: Starting my one-month ELLIS Ph.D. exchange at LTL, University of Cambridge!
Sep 15, 2025 :tada: Our EMNLP 2025 Main paper Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation has been selected for an Oral Presentation (<3% acceptance).
Aug 25, 2025 :microphone: I’ll be in Warsaw to attend the ELLIS Doctoral Symposium 2025 on Robust AI and present our recent work on human label variation with LLMs.
Aug 20, 2025 :blue_book: Three papers accepted to EMNLP 2025 (2 main, 1 findings)!