Weiran Yao

Senior Research Scientist (Multi-Agent + Alignment)

I design and develop AI systems for multi-agent reasoning, software engineering agents, and web agents. I lead a team of 4 scientists developing a high-quality synthetic data pipeline for SWE agents in production, and I communicate the resulting insights to support executive decision-making. I conducted post-training research to align models that specialize in reflecting on task executions. I develop Salesforce's in-house xLAM-series agentic models by aligning them for function calling.

Work Experience

Nov 2024 — Present
Salesforce, Palo Alto, CA
Research Manager, AI Research
Drove cross-functional initiatives on agentic AI systems for software engineering and multi-agent orchestration, leading a team of 4 scientists/engineers to develop a high-quality, diverse data pipeline for code LLMs in production, and communicated complex insights to support executive decision-making.
Aug 2024 — Present
Salesforce, Palo Alto, CA
Senior Research Scientist, AI Research
Developed Multi-Agent AI systems including CodeGenie Agent, SlackAgents, Ensemble Agents, and CRM WebAgent.
Jan 2023 — Aug 2024
Salesforce, Palo Alto, CA
Research Scientist, AI Research
Developed the post-training method for Retroformer, a general critic model for agent self-reflection. For engineering products, developed automatic root cause analysis algorithms (an SRE/AIOps agent).

Education

Aug 2017 — Aug 2023
Ph.D. in Machine Learning
Carnegie Mellon University, Pittsburgh, PA
Advisor: Kun Zhang
Committee: Yuejie Chi, Sean Qian, Matteo Pozzi, Kun Zhang
Aug 2017 — May 2021
M.S. in Machine Learning
Carnegie Mellon University, Pittsburgh, PA
GPA: 4.00/4.00

Tech Stack

Programming Languages: Python, JavaScript, HTML, CSS, Bash, SQL

Tools and Frameworks: PyTorch, Triton, Spark, Docker, Kubernetes, Streamlit, FastAPI, LaTeX

Publications

Selected: Latest & Greatest

Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
arXiv. 2024.
PRAct: Optimizing Principled Reasoning and Acting of LLM Agent
Conference on Computational Natural Language Learning (CoNLL). 2024.
*Authors contributed equally
xLAM: A Family of Large Action Models to Empower AI Agent Systems
arXiv:2409.03215 (arXiv). 2024.
Deployed at Salesforce. *Authors contributed equally
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
International Conference on Learning Representations (ICLR). Singapore, 2025.
55% resolve rate on SWE-Bench Lite. *Authors contributed equally
APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets
Neural Information Processing Systems (NeurIPS). Vancouver, Canada, 2024.
Invited VentureBeat Article
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
International Conference on Learning Representations (ICLR). Vienna, Austria, 2024.
Spotlight Presentation, Top 5% Paper
AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System
arXiv:2402.15538 (arXiv). 2024.
Presented at Dreamforce 2024 and the Databricks Data + AI Summit. *Authors contributed equally
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
International Conference on Learning Representations (ICLR). 2024.
PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
International Conference on Learning Representations (ICLR). Kigali, Rwanda, 2023.
Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data
arXiv:2301.10859 (arXiv). 2023.
Deployed at Salesforce for Incident Causation Analysis
Distribution-aware Goal Prediction and Conformant Model-based Planning for Safe Autonomous Driving
Jonathan Francis, Bingqing Chen, Weiran Yao, Eric Nyberg, Jean Oh
International Conference on Machine Learning (ICML). Baltimore, Maryland USA, 2022.
Learning Temporally Causal Latent Processes from General Temporal Data
International Conference on Learning Representations (ICLR). 2022.
*Authors contributed equally

Conference

C26
SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs
arXiv. 2024.
C25
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
arXiv. 2024.
C24
PRAct: Optimizing Principled Reasoning and Acting of LLM Agent
Conference on Computational Natural Language Learning (CoNLL). 2024.
*Authors contributed equally
C23
xLAM: A Family of Large Action Models to Empower AI Agent Systems
arXiv:2409.03215 (arXiv). 2024.
Deployed at Salesforce. *Authors contributed equally
C22
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
International Conference on Learning Representations (ICLR). Singapore, 2025.
55% resolve rate on SWE-Bench Lite. *Authors contributed equally
C21
APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets
Neural Information Processing Systems (NeurIPS). Vancouver, Canada, 2024.
Invited VentureBeat Article
C20
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
International Conference on Learning Representations (ICLR). Vienna, Austria, 2024.
Spotlight Presentation, Top 5% Paper
C19
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
arXiv:2402.15506 (arXiv). 2024.
C18
AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System
arXiv:2402.15538 (arXiv). 2024.
Presented at Dreamforce 2024 and the Databricks Data + AI Summit. *Authors contributed equally
C17
Editing Arbitrary Propositions in LLMs without Subject Labels
arXiv:2401.07526 (arXiv). 2024.
C16
Causal Layering via Conditional Entropy
Proceedings of the Third Conference on Causal Learning and Reasoning (CLeaR). 2024.
C15
CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
Guangyi Chen, Yifan Shen, Zhenhao Chen, Xiangchen Song, Yuewen Sun, Weiran Yao, Xiao Liu, Kun Zhang
International Conference on Machine Learning (ICML). 2024.
C14
DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation
arXiv:2312.11336 (arXiv). 2023.
C13
Temporally Disentangled Representation Learning under Unknown Nonstationarity
Neural Information Processing Systems (NeurIPS). New Orleans, Louisiana, USA, 2023.
C12
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
International Conference on Learning Representations (ICLR). 2024.
C11
REX: Rapid Exploration and eXploitation for AI Agents
International Conference on Learning Representations (ICLR). 2024.
C10
On the Unlikelihood of D-Separation
International Conference on Probabilistic Graphical Models (PGM). 2024.
C9
Non-Parametric State-Space Models: Identifiability, Estimation and Forecasting
OpenReview. 2023.
C8
PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
International Conference on Learning Representations (ICLR). Kigali, Rwanda, 2023.
C7
Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data
arXiv:2301.10859 (arXiv). 2023.
Deployed at Salesforce for Incident Causation Analysis
C6
Temporally Disentangled Representation Learning
Neural Information Processing Systems (NeurIPS). New Orleans, Louisiana, USA, 2022.
C5
Distribution-aware Goal Prediction and Conformant Model-based Planning for Safe Autonomous Driving
Jonathan Francis, Bingqing Chen, Weiran Yao, Eric Nyberg, Jean Oh
International Conference on Machine Learning (ICML). Baltimore, Maryland USA, 2022.
C4
Partial Disentanglement for Domain Adaptation
Lingjing Kong, Shaoan Xie, Weiran Yao, Yujia Zheng, Guangyi Chen, Petar Stojanov, Victor Akinwande, Kun Zhang
International Conference on Machine Learning (ICML). Baltimore, Maryland USA, 2022.
C3
Learning Temporally Causal Latent Processes from General Temporal Data
International Conference on Learning Representations (ICLR). 2022.
*Authors contributed equally
C2
Learning a Distributed Control Scheme for Demand Flexibility in Thermostatically Controlled Loads
Bingqing Chen, Weiran Yao, Jonathan Francis, Mario Berges
IEEE International Conference on Smart Grid Communications (SmartGridComm). 2020.
C1
Condition Monitoring of Wheel Wear for High-Speed Trains: A Data-Driven Approach
Peiwen Xu, Weiran Yao, Yang Zhao, Cai Yi, Lishuai Li, Jianhui Lin, Kwok Leung Tsui
International Conference on Prognostics and Health Management (PHM). 2018.

Journal

J5
Precise and Fast Safety Risk Classification of Lithium-ion Batteries Based on Machine Learning Methodology
Yikai Jia, Jiani Li, Weiran Yao, Yangxing Li, Jun Xu
Journal of Power Sources. 2022.
J4
How Do New Transit Stations Affect People's Sentiment and Activity? A Case Study Based on Social Media Data in Hong Kong
Haoliang Chang, Jianxiang Huang, Weiran Yao, Weizun Zhao, Lishuai Li
Transport Policy. 2022.
J3
Data-Driven Safety Risk Prediction of Lithium-Ion Battery
Yikai Jia, Jiani Li, Chunhao Yuan, Xiang Gao, Weiran Yao, Minwoo Lee, Jun Xu
Advanced Energy Materials. 2021.
J2
From Twitter to Traffic Predictor: Next-Day Morning Traffic Prediction Using Social Media Data
Transportation Research Part C: Emerging Technologies (TRC). 2021.
J1
Learning to Recommend Signal Plans under Incidents with Real-Time Traffic Prediction
Transportation Research Record: Journal of the Transportation Research Board (TRR). 2020.

Engineering Projects

Jan 2024 – Present
SlackAgents: Scalable Collaboration of Multiple AI Agents in Workspaces
Scalable Collaboration for Multiple AI Agents in Workspaces
Aug 2024 – Present
CodeGenie Plan and Execute Agent
Enhancing IDE Productivity through AI Code Planning, Editing, and Execution
Jan 2023 – Dec 2023
AI for IT Operations: Automatic Cloud Incident Causation Analysis
AIOps Augments SREs' Capabilities for Automating Operations

Talks

Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
Sep. 2024
CAMEL-AI Workshop, Cupertino
PRAct: Optimizing Principled Reasoning and Acting of LLM Agent
Jun. 2024
Databricks Data + AI Summit, San Francisco
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
Sep. 2023
Moveworks Workshop, Mountain View
Large Action Models in a Multi-Agent World
Sep. 2024
Dreamforce Breakout Session, San Francisco

Press

Oct. 2024
"Salesforce upgraded to Outperform by Northland Capital Markets – xLAM powers the Agentforce platform," Investing.com
Sep. 2024
"Is AI the future of sales? Salesforce’s new models could change the game," VentureBeat
Aug. 2024
"Salesforce DEI: How Diversity Is Driving AI Innovation in Software Engineering," Times of AI
Aug. 2024
"Salesforce AI Research Proposes DEI: AI Software Engineering Agents Org, Achieving a 34.3% Resolve Rate on SWE-Bench Lite, Crushing Closed-Source Systems," MarkTechPost
Jul. 2024
"Salesforce proves less is more: xLAM-1B ‘Tiny Giant’ beats bigger AI Models," VentureBeat
Jul. 2024
"On-device agentic AI is here! Salesforce makes big claims about its ‘Tiny Giant’ LLM," The Stack
Mar. 2024
"Salesforce Research Introduces AgentOhana: A Comprehensive Agent Data Collection and Training Pipeline for Large Language Model," MarkTechPost
Mar. 2024
"AgentLite by Salesforce AI Research: Transforming LLM Agent Development with an Open-Source, Lightweight, Task-Oriented Library for Enhanced Innovation," MarkTechPost
Aug. 2023
"Salesforce AI Researchers Introduce the Evolution of LLM-Augmented Autonomous Agents and the Innovative BOLAA Strategy," MarkTechPost
Aug. 2023
"Meet Retroformer: An Elegant AI Framework for Iteratively Improving Large Language Agents by Learning a Plug-in Retrospective Model," MarkTechPost
May 2021
"From Twitter to Traffic Predictor," Carnegie Mellon University

Mentoring

Summer 2024
Dr. Kexun Zhang at Salesforce AI Research
Ph.D. in Computer Science, Carnegie Mellon University
AI Software Engineer
2022 — 2023
Dr. Xiangchen Song at Carnegie Mellon University
Ph.D. in Machine Learning, Carnegie Mellon University
Disentangled Representation Learning
2022 — 2023
Dr. Zemian Ke at Carnegie Mellon University
M.S. in Machine Learning, Carnegie Mellon University
Low-Rank Approximation
Now: Machine Learning Engineer at Google
2022 — 2023
Dr. Lingjing Kong at Carnegie Mellon University
Ph.D. in Computer Science, Carnegie Mellon University
Multi-Source Domain Adaptation
2022 — 2023
Dr. Yuewen Sun at Carnegie Mellon University
Ph.D. in Computer Science, Southeast University
Disentangled Representation Learning