Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Publications
Multi-Level Contrastive Learning for Cross-Lingual Alignment
Published in 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), 2022
Cross-language pre-trained models such as multilingual BERT (mBERT) have achieved strong performance on various cross-lingual downstream NLP tasks. This paper proposes a multi-level contrastive learning (ML-CTL) framework to further improve the cross-lingual ability of pre-trained models. The proposed method uses translated parallel data to encourage the model to generate similar semantic embeddings for different languages. However, unlike the sentence-level alignment used in most previous studies, we explicitly integrate the word-level information of each pair of parallel sentences into contrastive learning. Moreover, a cross-zero noise contrastive estimation (CZ-NCE) loss is proposed to alleviate the impact of floating-point error when training with a small batch size. The proposed method significantly improves the cross-lingual transfer ability of our base model (mBERT) and outperforms same-size models on multiple zero-shot cross-lingual downstream tasks in the XTREME benchmark. A rough sketch of the multi-level contrastive idea follows the citation below.
Recommended citation: B. Chen, W. Guo, B. Gu, Q. Liu and Y. Wang, "Multi-Level Contrastive Learning for Cross-Lingual Alignment," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 2022, pp. 7947-7951, doi: 10.1109/ICASSP43922.2022.9747720. https://ieeexplore.ieee.org/document/9747720
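The core alignment idea can be sketched in a few lines. The snippet below is an illustration under assumptions, not the authors' implementation: a standard InfoNCE loss stands in for the paper's CZ-NCE, token embeddings are assumed to come from a shared multilingual encoder (e.g. mBERT), and word alignments are assumed to be given.

```python
# Minimal sketch of multi-level (sentence + word) contrastive alignment.
# Assumptions: embeddings come from a shared multilingual encoder, word
# alignments are given, and standard InfoNCE stands in for CZ-NCE.
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.1):
    """InfoNCE over a batch: anchor[i] should match positive[i], mismatch the rest."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature            # (B, B) similarities
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)

def multi_level_loss(src_tok, tgt_tok, word_align, alpha=0.5):
    """
    src_tok, tgt_tok: (B, T, H) token embeddings of B parallel sentence pairs.
    word_align: one (i, j) aligned-token index pair per sentence pair (assumed given).
    """
    # Sentence level: mean-pooled representations of each parallel pair.
    sent_loss = info_nce(src_tok.mean(dim=1), tgt_tok.mean(dim=1))
    # Word level: aligned token embeddings across the batch.
    src_words = torch.stack([src_tok[b, i] for b, (i, _) in enumerate(word_align)])
    tgt_words = torch.stack([tgt_tok[b, j] for b, (_, j) in enumerate(word_align)])
    word_loss = info_nce(src_words, tgt_words)
    return alpha * sent_loss + (1 - alpha) * word_loss
```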
USTC-NELSLIP at SemEval-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition
Published in Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), 2022
This paper describes the system developed by the USTC-NELSLIP team for SemEval-2022 Task 11, Multilingual Complex Named Entity Recognition (MultiCoNER). We propose a gazetteer-adapted integration network (GAIN) to improve the performance of language models at recognizing complex named entities. The method first adapts the representations of the gazetteer network to those of the language model by minimizing the KL divergence between them. The two networks are then integrated for supervised named entity recognition (NER) training. The proposed method is applied to several state-of-the-art Transformer-based NER models with a gazetteer built from Wikidata, and shows strong generalization ability across them. The final predictions are derived from an ensemble of these trained models. Experimental results and detailed analysis verify the effectiveness of the proposed method. The official results show that our system ranked 1st on three tracks (Chinese, Code-mixed and Bangla) and 2nd on the other ten tracks in this task. A rough sketch of the adaptation step follows the citation below.
Recommended citation: Beiduo Chen, Jun-Yu Ma, Jiajun Qi, Wu Guo, Zhen-Hua Ling, and Quan Liu. 2022. USTC-NELSLIP at SemEval-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1613–1622, Seattle, United States. Association for Computational Linguistics. https://aclanthology.org/2022.semeval-1.223/
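As a rough illustration of the adaptation step, the snippet below minimizes a temperature-scaled KL divergence that pulls a gazetteer network's per-token outputs toward those of a frozen language model. The shapes, temperature, and use of label logits are assumptions made for the sketch, not the released USTC-NELSLIP system.

```python
# Rough sketch of gazetteer-to-LM adaptation via KL minimization (assumed shapes).
import torch
import torch.nn.functional as F

def gazetteer_adaptation_loss(gaz_logits, lm_logits, temperature=2.0):
    """
    gaz_logits, lm_logits: (B, T, C) per-token logits from the gazetteer network
    and the (frozen) language model. The KL term pulls the gazetteer network's
    softened distribution toward the language model's.
    """
    t = temperature
    lm_probs = F.softmax(lm_logits.detach() / t, dim=-1)      # teacher side, no grad
    gaz_log_probs = F.log_softmax(gaz_logits / t, dim=-1)     # gazetteer side
    return F.kl_div(gaz_log_probs.flatten(0, 1), lm_probs.flatten(0, 1),
                    reduction="batchmean") * t ** 2

# Example usage with random tensors standing in for real model outputs:
gaz = torch.randn(4, 16, 9, requires_grad=True)   # batch of 4, 16 tokens, 9 labels
lm = torch.randn(4, 16, 9)
gazetteer_adaptation_loss(gaz, lm).backward()
```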
Feature Aggregation in Zero-Shot Cross-Lingual Transfer Using Multilingual BERT
Published in 2022 26th International Conference on Pattern Recognition (ICPR), 2022
Multilingual BERT (mBERT), a language model pre-trained on large multilingual corpora, has impressive zero-shot cross-lingual transfer capabilities and performs surprisingly well on zero-shot POS tagging and Named Entity Recognition (NER), as well as on cross-lingual model transfer. At present, mainstream methods for cross-lingual downstream tasks use the output of mBERT's last transformer layer as the representation of linguistic information. In this work, we explore how the lower layers complement the last transformer layer of mBERT. A feature aggregation module based on an attention mechanism is proposed to fuse the information contained in different layers of mBERT. The experiments are conducted on four zero-shot cross-lingual transfer datasets, and the proposed method obtains performance improvements on key multilingual benchmark tasks: XNLI (+1.5%), PAWS-X (+2.4%), NER (+1.2 F1), and POS (+1.5 F1). Through analysis of the experimental results, we show that the layers before the last layer of mBERT can provide extra useful information for cross-lingual downstream tasks, and we explore the interpretability of mBERT empirically. A minimal sketch of the layer-fusion idea follows the citation below.
Recommended citation: B. Chen, W. Guo, Q. Liu and K. Tao, "Feature Aggregation in Zero-Shot Cross-Lingual Transfer Using Multilingual BERT," 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 2022, pp. 1428-1435, doi: 10.1109/ICPR56361.2022.9956721. https://ieeexplore.ieee.org/document/9956721
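A minimal sketch of the layer-fusion idea, assuming a per-token attention scorer over all of mBERT's hidden states; the paper's exact aggregation module may differ.

```python
# Sketch: fuse all mBERT hidden layers with learned per-token attention weights,
# instead of taking only the last transformer layer. Illustrative only.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class LayerAttentionFusion(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)   # scores each layer, per token

    def forward(self, hidden_states):
        # hidden_states: tuple of (B, T, H) tensors, one per layer (incl. embeddings)
        stacked = torch.stack(hidden_states, dim=2)                        # (B, T, L, H)
        weights = torch.softmax(self.scorer(stacked).squeeze(-1), dim=-1)  # (B, T, L)
        return (weights.unsqueeze(-1) * stacked).sum(dim=2)                # (B, T, H)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
fusion = LayerAttentionFusion(encoder.config.hidden_size)

batch = tokenizer("Feature aggregation example", return_tensors="pt")
hidden = encoder(**batch, output_hidden_states=True).hidden_states
fused = fusion(hidden)   # feed this to the task head instead of the last layer alone
```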
Wider & Closer: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition
Published in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Zero-shot cross-lingual named entity recognition (NER) aims to transfer knowledge from annotated, resource-rich data in source languages to unlabeled, resource-lean data in target languages. Existing mainstream methods based on the teacher-student distillation framework ignore the rich and complementary information lying in the intermediate layers of pre-trained language models, and domain-invariant information is easily lost during transfer. In this study, a mixture of short-channel distillers (MSD) method is proposed to fully exploit the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently. Concretely, a multi-channel distillation framework is designed for sufficient information transfer by aggregating multiple distillers as a mixture. In addition, an unsupervised method adopting parallel domain adaptation is proposed to shorten the channels between the teacher and student models and preserve domain-invariant features. Experiments on four datasets across nine languages demonstrate that the proposed method achieves new state-of-the-art performance on zero-shot cross-lingual NER and shows strong generalization and compatibility across languages and fields. A loose sketch of the multi-channel distillation idea follows the citation below.
Recommended citation: Jun-Yu Ma, Beiduo Chen, Jia-Chen Gu, Zhenhua Ling, Wu Guo, Quan Liu, Zhigang Chen, and Cong Liu. 2022. Wider & Closer: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 5171–5183, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics. https://aclanthology.org/2022.emnlp-main.345/
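The multi-channel distillation idea can be sketched as several teacher layers each distilling into the student, with the per-channel losses combined by a learned mixture. The snippet below is an assumption-level illustration (shapes and per-layer heads are invented for the example), not the authors' code.

```python
# Sketch of a mixture of distillation channels: several teacher layers each
# distill into the student, and per-channel losses are mixed with learned weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureDistillationLoss(nn.Module):
    def __init__(self, num_channels: int, temperature: float = 2.0):
        super().__init__()
        self.mixture_logits = nn.Parameter(torch.zeros(num_channels))
        self.temperature = temperature

    def forward(self, teacher_channel_logits, student_logits):
        """
        teacher_channel_logits: list of (B, T, C) NER logits, one per teacher
        layer routed through its own classification head (the "channels").
        student_logits: (B, T, C) logits from the student model.
        """
        t = self.temperature
        student_logp = F.log_softmax(student_logits / t, dim=-1).flatten(0, 1)
        losses = []
        for ch in teacher_channel_logits:
            teacher_probs = F.softmax(ch.detach() / t, dim=-1).flatten(0, 1)
            losses.append(F.kl_div(student_logp, teacher_probs,
                                   reduction="batchmean") * t ** 2)
        weights = torch.softmax(self.mixture_logits, dim=0)
        return (weights * torch.stack(losses)).sum()

# Example usage with random stand-ins for real model outputs:
criterion = MixtureDistillationLoss(num_channels=3)
teacher = [torch.randn(2, 8, 5) for _ in range(3)]
student = torch.randn(2, 8, 5, requires_grad=True)
criterion(teacher, student).backward()
```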
Pre-training Language Model as a Multi-perspective Course Learner
Published in Findings of the Association for Computational Linguistics: ACL 2023, 2023
ELECTRA, the generator-discriminator pre-training framework, has achieved impressive semantic construction capability across various downstream tasks. Despite its convincing performance, ELECTRA still faces the challenges of monotonous training and deficient interaction: a generator trained with only masked language modeling (MLM) leads to biased learning and label imbalance for the discriminator, decreasing learning efficiency, and the lack of an explicit feedback loop from discriminator to generator leaves a chasm between the two components, underutilizing course learning. In this study, a multi-perspective course learning (MCL) method is proposed to provide multiple views and angles for sample-efficient pre-training and to fully leverage the relationship between generator and discriminator. Concretely, three self-supervision courses are designed to alleviate the inherent flaws of MLM and balance the labels in a multi-perspective way. In addition, two self-correction courses are proposed to bridge the chasm between the two encoders by creating a "correction notebook" for secondary supervision. Moreover, a course soups trial is conducted to address the "tug-of-war" dynamics of MCL, yielding a stronger pre-trained model. Experimental results show that our method significantly improves ELECTRA's average performance by 2.8% and 3.2% absolute points on the GLUE and SQuAD 2.0 benchmarks, respectively, and surpasses recent advanced ELECTRA-style models under the same settings. The pre-trained MCL model is available at https://huggingface.co/McmanusChen/MCL-base. A rough, assumption-level sketch of the "course soups" step follows the citation below.
Recommended citation: Beiduo Chen, Shaohan Huang, Zihan Zhang, Wu Guo, Zhenhua Ling, Haizhen Huang, Furu Wei, Weiwei Deng, and Qi Zhang. 2023. Pre-training Language Model as a Multi-perspective Course Learner. In Findings of the Association for Computational Linguistics: ACL 2023, pages 114–128, Toronto, Canada. Association for Computational Linguistics. https://aclanthology.org/2023.findings-acl.9/
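The "course soups" trial is reminiscent of model soups, i.e. weight averaging across checkpoints. The sketch below illustrates uniform parameter averaging over same-architecture checkpoints; it is an assumption about the general idea, not the released recipe.

```python
# Sketch of a uniform "soup": average the parameters of same-architecture
# checkpoints trained under different courses. Assumption-level illustration.
import copy
import torch
import torch.nn as nn

def course_soup(models: list[nn.Module]) -> nn.Module:
    soup = copy.deepcopy(models[0])
    state_dicts = [m.state_dict() for m in models]
    averaged = {}
    for key, ref in state_dicts[0].items():
        if ref.is_floating_point():
            averaged[key] = torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
        else:
            averaged[key] = ref.clone()   # integer buffers (e.g. position ids): keep as-is
    soup.load_state_dict(averaged)
    return soup

# Example usage with small stand-in models:
models = [nn.Linear(4, 2) for _ in range(3)]
souped = course_soup(models)
```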
Talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be formatted like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk; note the different value in the type field. You can put anything in this field.
Teaching
Electromagnetism C, PHYS1004C, 022503.03
Undergraduate theory course, University of Science and Technology of China, Department of Physics, 2018
Serve as: Teaching Assistant
Computer Programming A, CS1001A, 210522.02
Undergraduate theory and experiment course, University of Science and Technology of China, Department of Information Science and Technology, 2019
Serve as: Teaching Assistant
Signals and Systems 210049, 210049.05
Undergraduate theory course, University of Science and Technology of China, Department of Information Science and Technology, 2020
Serve as: Teaching Assistant
NLP for Climate Change (Algorithmic and Formal Aspects), Summer Semester (SoSe) 2024
Graduate course, Ludwig-Maximilians-Universität München, Fakultät 13 für Sprach- und Literaturwissenschaften, Department II, Centrum für Informations- und Sprachverarbeitung, 2024
Serve as: Teaching Assistant
Multi-modal NLP, Exercise Course (Übung), Computational Linguistics Extension Module, Summer Semester 2024
Graduate course, Ludwig-Maximilians-Universität München, Fakultät 13 für Sprach- und Literaturwissenschaften, Department II, Centrum für Informations- und Sprachverarbeitung, 2024
Serve as: Teacher