I am Shuzheng Si (司书正 in Chinese), a first-year Ph.D. student in the Department of Computer Science and Technology at Tsinghua University. I am lucky to be advised by Prof. Maosong Sun and affiliated with the TsinghuaNLP Lab. I obtained my master's degree from Peking University, where I was fortunate to be part of the PKU-NLP Group under the supervision of Prof. Baobao Chang at the Institute of Computational Linguistics. I spent my sweet undergraduate days at the School of Software (rank: 1/307), Yunnan University, a beautiful university.
My research interests lie in Natural Language Processing (NLP) and Large Language Models (LLMs), with a specific focus on Data-centric Methods, including Data Selection, Data Synthesis, and Learning from Noisy Data. My long-term research goal is to understand how data influences LLMs and to use these insights to guide the organization, selection, and synthesis of high-quality data, thereby enhancing the foundational capabilities of LLMs (e.g., instruction following, factuality, and faithfulness). Find my up-to-date publication list on Google Scholar.
Feel free to drop me an email if you are interested in connecting.