Dacheng Li



I am a second-year CS PhD student in EECS at UC Berkeley, fortunate to be advised by Prof. Ion Stoica and Prof. Joseph Gonzalez. I also have the good fortune to work closely with Prof. Song Han (MIT). I am affiliated with the Sky Computing Lab and Berkeley AI Research (BAIR). I am co-leading NovaSky and was the first new member of lmsys.

Previously, I obtained my master's degree in Machine Learning at CMU with Prof. Eric Xing and Prof. Hao Zhang. I obtained my undergraduate degree, with a double major in Computer Science and Mathematics, at UC San Diego with Prof. Zhuowen Tu. I interned at Nvidia Research and Google.

Extended acknowledgment to Rulin Shao (my girlfriend, PhD at UW), Lianmin Zheng and Ying Sheng (Professors at UCLA).

I am open to collaborations and opportunities. Please contact me about presentations, research 1-1s, and collaborations.

Google Scholar / GitHub / Resume / PhD SoP / Twitter / LinkedIn

Research Interests
My research goal is to develop artificial intelligence and artificial worlds efficiently. I have been working at the intersection of visual and textual generative models and distributed systems.

Machine Learning Systems (MLSys)
Distributed training systems: AMP (NeurIPS'22), DistFlashAttn (COLM'24), LongVILA (ICLR'25), and MCBench (MLSys'24) are distributed training systems for LLMs, addressing heterogeneous hardware, long context, visual-language models, and model-parallelism compression, respectively.
Distributed serving systems: MPCFormer (ICLR'23) and Marill (arXiv'24) accelerate private and secure inference for LLMs. VTC (OSDI'24) and DLPM (arXiv'25) are distributed systems for fair LLM scheduling. S-LoRA (MLSys'24) is a distributed serving system for LoRA.

Machine Learning (ML)
Large Language Models: LLM-as-a-judge (NeurIPS'23) is one of the foundations of LLM applications. LongChat (NeurIPS'23 workshop) is a long-context LLM.
Large Reasoning Models: Sky-T1 (arXiv'25) is an open large reasoning model with o1-level performance, trained within an academic budget. S* (arXiv'25) is a simple and extensible test-time scaling framework for code generation. Overthinking (arXiv'25) reveals the Reasoning-Action Dilemma in agentic tasks.
Computer Vision: DC-VAE (CVPR'21) is the first VAE that can perform image synthesis at GAN-level quality. Efficient-Vdit (arXiv'25) and SparseVideoGen (arXiv'25) are efficient video generation models.
Visual Language Models: VILA-U (ICLR'25) and NVILA (arXiv'25) are efficient visual language models.
Evaluation: Chatbot Arena (ICML'24) is one of the most influential LLM chatbot evaluation platforms. LongEval (NeurIPS'23 workshop) is a long-context LLM evaluation platform. Sorry-Bench (ICLR'25) is a systematic evaluation of LLM refusal.

Awards

Nvidia Graduate Fellowship Finalist, 2025-2026
AMD AI & HPC Cluster Award, 2025
Amazon Research Awards, 2022

Media Press

Sky-T1 was covered by The New York Times, The Wall Street Journal, and The Information.

Mentorship

Beyond research collaboration, I am seriously committed to mentoring junior students. If you are serious about working with me, please fill out this form. We will have regular meetings, and we will succeed.

News

2025-2 Released S*, a simple and extensible test-time scaling framework for code generation.
2025-1 Released Sky-T1-32B-Preview, a fully open-sourced reasoning model.
2025-1 Started NovaSky, which targets the next generation of open AI models.
2025-1 LongVILA, VILA-U, and Sorry-Bench are accepted to ICLR'25.
2024-12 Honored to receive the finalist award from Nvidia 2025-2026 Fellowship.
2024-08 Released LongVILA, a series of long-context VLMs for videos.
2024-08 Released Marill, an efficient MPC framework for LLMs, extending the idea in MPCFormer.
2024-07 DistFlashAttn is accepted to COLM'2024.
2024-06 Joined Nvidia as a research intern, on multi-modal foundation models with Prof. Song Han.
2024-05 Chatbot Arena is accepted to ICML'2024.
2024-03 VTC is accepted to OSDI'2024.
2024-02 S-LoRA and MCBench are accepted to MLSys'2024.
2023-09 The official paper of Vicuna (LLM-as-a-judge) is accepted to NeurIPS'2023.
2023-08 Joined Google as a student researcher, working on LLMs evaluation.
2023-06 Released LongChat, a series of long-context models and evaluation toolkits.
2023-04 Released FastChat-T5, a compact open-source chatbot.
2023-01 MPCFormer is accepted at ICLR'2023 as a spotlight.
2022-12 A proposal on secure LLM serving is accepted by Amazon Research Awards.
2022-10 AMP is accepted at NeurIPS'2022.
2021-03 DC-VAE is accepted at CVPR'2021.