Hi, I'm Zheyu!
I am a first-year PhD student at the Technical University of Munich, advised by Gjergji Kasneci.
Prior to that, I studied at LMU Munich, where I obtained a bachelor's degree in computer science and mathematics and a master's degree in computational linguistics.
I am broadly interested in natural language generation as well as model robustness and generalization. Currently, my research focuses on synthetic data generation and LLM post-training.
Email / GitHub / Google Scholar / LinkedIn / Twitter / CV
News
Aug '25 🦀 Three papers accepted to EMNLP 2025! See you in Suzhou!
May '25 🎻 One paper accepted to ACL 2025! See you in Vienna!
Dec '24 🥳 I started my PhD journey!!!
Jan '24 🦞 One paper accepted to EACL 2024, and I will be in Malta as a student volunteer!
Oct '23 🍼 One paper accepted to the BabyLM Challenge @CoNLL 2023!
Not All Features Deserve Attention: Graph-Guided Dependency Learning for Tabular Data Generation with Language Models
Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
EMNLP 2025 Findings
paper / code
We propose a minimally intrusive, structure-aware approach to tabular data synthesis that guides LLMs' attention with sparse feature dependencies.
Doubling Your Data in Minutes: Ultra-fast Tabular Data Generation via LLM-Induced Dependency Graphs
Shuo Yang, Zheyu Zhang, Bardh Prenkaj, Gjergji Kasneci
EMNLP 2025
paper / code
We propose a lightweight framework for tabular data augmentation that models sparse dependencies, accelerating generation by over 9,500× compared with LLM-based approaches while reducing constraint violations.
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Chenchen Yuan, Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
ACL 2025 Findings
paper
We propose a framework that aligns different LLMs by aggregating their moral judgments and optimizing embeddings to improve consistency and fidelity.
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Peiqin Lin, Chengzhi Hu, Zheyu Zhang, André FT Martins, Hinrich Schütze
EACL 2024 Findings
paper / code
We evaluate how well various mPLMs measure language similarity and use the results to improve zero-shot cross-lingual transfer performance.
Baby’s CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models
Zheyu Zhang, Han Yang, Bolei Ma, David Rügamer, Ercong Nie
BabyLM Challenge, CoNLL-CMCL Shared Task @ EMNLP 2023
paper / code
We propose using LLMs to restructure existing data into NLU examples for training compact LMs, demonstrating the effectiveness of synthetic data in small-model training.
Academic Service
- Volunteering: ACL 2025, EACL 2024
- Reviewing: ECML-PKDD 2025, ACL ARR 2025, BabyLM Challenge 2023