CV

Education

Renmin University of China, PhD in Natural Science, 2022 - Present

  • Research focus: LLM Reasoning and LLM Alignment
  • Supervisor: Prof. Yankai Lin

Dalian University of Technology, BSc in Engineering in Digital Media Technology, 2017 - 2022

  • GPA: 4.02/5.0 (90.02/100) | Rank: 1/99 (Comprehensive), 2/99 (Academic)
  • Awards: National Scholarship (x2, Top 0.2%), Outstanding Graduate of Dalian City, Technology Innovation Scholarship (x3), First-Class Academic Scholarship (x3)

Research Experience

Learning to Focus (LeaF) Framework (NeurIPS 2025)

Research Intern @ ModelBest (面壁智能) | Supervisor: Prof. Yankai Lin | May 2024 - May 2025

  • Introduced Learning to Focus (LeaF), a two-stage framework that treats distracting patterns as spurious confounders in LLM reasoning.
  • Confounding Token Detection: Identifies confounding tokens via teacher–student gradient-based comparisons and constructs counterfactual samples by pruning these tokens.
  • Causal Attention Distillation: Captures causal dependencies through a hybrid distillation loss that aligns the student with the teacher on both original and counterfactual samples.
  • Results: Yielded an average accuracy gain of 2.41% on math benchmarks and 2.48% on code benchmarks for LLaMA-1B/3B and Qwen2.5-Math.

Controllable Preference Optimization (EMNLP 2024)

Research Intern @ Tencent (腾讯) | Supervisor: Prof. Yankai Lin | Oct. 2023 - Apr. 2024

  • Proposed Controllable Preference Optimization (CPO), which explicitly specifies preference scores for different objectives to guide the model in generating responses that meet specific requirements.
  • Compared with DPO, achieved superior single-objective controllability while maintaining alignment performance.
  • Surpassed baselines (SFT, PPO, DPO, Curry-DPO) across all three objectives (helpfulness, honesty, and harmlessness) by explicitly grounding preference conditions.
  • Demonstrated effective mitigation of conflicts between objectives (the alignment tax) in multi-objective alignment.

Skills

  • Programming: Python, C++, LaTeX
  • Data Analysis: Pandas, NumPy, SciPy
  • Languages: Chinese (Native), English (Fluent), Japanese (Fluent)

Awards & Honors

  • National Scholarship (Top 0.2%) - Ministry of Education of China (2020, 2021)
  • Outstanding Graduate of Dalian City - Dalian Municipal Education Bureau (2022)
  • First-Class Academic Scholarship - Dalian University of Technology (2019 - 2021)
  • Technology Innovation Scholarship - Dalian University of Technology (2019 - 2021)