Yuxuan Wang 王雨轩

I am currently an undergraduate student at the School of Computer Science and Technology at Beijing Institute of Technology. I am also an intern at Mμ Lab at the Institute of Artificial Intelligence, Peking University, where I focus on model architecture under the guidance of Professor Muhan Zhang. I will begin my master's studies at the School of Software and Microelectronics at Peking University in September 2026, under the supervision of Professor Muhan Zhang and Professor Zhonghai Wu.


News

2025.11 Gave a presentation on MoE expert balancing and training stability. PPT / Video
2025.09 One paper accepted to the NeurIPS 2025 Datasets and Benchmarks Track! Welcome to evaluate your models with LooGLE-v2!
2024.11 Glad to have joined Mμ Lab at IAI, PKU as a research intern!
2024.10 Honored to receive the National Scholarship!
2023.10 Honored to receive the National Scholarship!

Research

My research interests focus on foundation models, particularly model compression, model acceleration, and related areas.

TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill and Decode Inference


Xiaojuan Tang, Fanxu Meng, Pingzhi Tang, Yuxuan Wang, Di Yin, Xing Sun, Muhan Zhang
arXiv preprint, 2025

We propose Tensor-Parallel Latent Attention (TPLA): a scheme that partitions both the latent representation and each head’s input dimension across devices, performs attention independently per shard, and then combines results with an all-reduce. TPLA preserves the benefits of a compressed KV cache while unlocking TP efficiency.
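The shard-and-reduce idea rests on a simple linear-algebra fact: a dot product over a partitioned feature dimension is the sum of per-shard partial dot products, so pre-softmax attention logits computed on slices of the latent/head dimension can be combined exactly with an all-reduce. A minimal NumPy sketch of this decomposition (illustrative only; the names, shapes, and shard count are my own, not from the paper):

```python
import numpy as np

# Sketch: pre-softmax logits q @ k^T decompose over a partitioned
# feature dimension, so each shard can compute partial logits on its
# slice and a sum (emulating an all-reduce) recovers the full logits.
rng = np.random.default_rng(0)
d, n_shards, seq = 8, 2, 4           # feature dim, shards, sequence length
q = rng.standard_normal((seq, d))    # queries
k = rng.standard_normal((seq, d))    # keys

# Full (single-device) logits.
full_logits = q @ k.T

# Shard the feature dimension, compute per-shard partial logits,
# then sum across shards to emulate the all-reduce.
shards = np.split(np.arange(d), n_shards)
partial = [q[:, idx] @ k[:, idx].T for idx in shards]
reduced = np.sum(partial, axis=0)    # all-reduce (sum)

assert np.allclose(full_logits, reduced)
```

This only illustrates the exactness of the partition-then-reduce step; the actual method also handles the compressed latent KV cache and per-head partitioning described above.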

LooGLE v2: Are LLMs Ready for Real World Long Dependency Challenges?


Ziyuan He*, Yuxuan Wang*, Jiaqi Li*, Kexin Liang, Muhan Zhang (*: equal contribution)
NeurIPS Datasets and Benchmarks Track, 2025

LooGLE v2 is a benchmark designed to evaluate the long-context and long-dependency capabilities of large language models. Its key features are ultra-long texts, a strong emphasis on long-dependency tasks, and an exclusive focus on real-world, real-task scenarios.

Education

2026.09 (expected)
Graduate Student, School of Software and Microelectronics, Peking University
Research Advisor: Muhan Zhang
Academic Advisor: Zhonghai Wu

Experience

2025.05 — Present
Research Intern
Research Advisor: Fanxu Meng
2024.11 — Present
Research Intern, Mμ Lab, Institute of Artificial Intelligence, Peking University
Research Advisor: Muhan Zhang
