  • Tsinghua University

Organizations

@pkunlp-icler

S1s-Z/README.md

Hi 🧑🏻‍💻👋🏻

I am Shuzheng Si (司书正 in Chinese ✍🏻), a first-year Ph.D. student in the Department of Computer Science and Technology at Tsinghua University. I am lucky to be advised by Prof. Maosong Sun and to be affiliated with the TsinghuaNLP Lab. I obtained my master’s degree from Peking University, where I was fortunate to be part of the PKU-NLP Group under the supervision of Prof. Baobao Chang at the Institute of Computational Linguistics. I spent my sweet undergraduate days at the School of Software (rank: 1/307), Yunnan University, a beautiful university 🍂.

My research interests lie in Natural Language Processing (NLP) and Large Language Models (LLMs), with a focus on data-centric methods such as Data Selection, Data Synthesis, and Learning from Noisy Data. My long-term research goal is to elucidate the influence of data on LLMs and to use these insights to guide the organization, selection, and synthesis of high-quality data, thereby enhancing the foundational capabilities of LLMs (e.g., instruction following, factuality, and faithfulness). My up-to-date publication list is on 🔗 Google Scholar.

Feel free to drop me an email if you are interested in connecting 🧑🏻‍🤝‍🧑🏻.

Pinned

  1. HaozheZhao/MIC Public

    MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU

    Python 345 15

  2. SCL-RAI Public

    [COLING'22] Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER"

    Python 44 3

  3. SANTA Public

    [ACL'23] Code for "SANTA: Separate Strategies for Inaccurate and Incomplete Annotation Noise in Distantly-Supervised Named Entity Recognition"

    Python 40 1

  4. GATEAU Public

    Code for "GATEAU: Selecting Influential Samples for Long Context Alignment"

    Python 34

  5. CENSOR Public

    [ACL'24] Code for "Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning"

    Python 4

  6. NOVA Public

    Code for "Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering"

    15