I am a Postdoctoral Research Fellow at the Department of Statistics and Data Science, the Wharton School, University of Pennsylvania, advised by Professor Dylan S. Small. Previously, I obtained my Ph.D. from the Department of Operations Research and Financial Engineering at Princeton University in May 2023, advised by Professor Jianqing Fan. Prior to Princeton, I obtained my B.S. summa cum laude (Guo Moruo Scholarship, <1%) from the University of Science and Technology of China (USTC) in 2018.

My research interests lie broadly at the interface of estimation, inference, and decision-making, with applications spanning social science, operations research, and public health. Specifically, I am interested in developing novel methods and investigating applications in robust high-dimensional statistical estimation and inference, human preference learning, causal inference, machine learning safety, and data-driven decision-making.

I am on the 2024-2025 academic job market and would be happy to discuss any opportunities!

Misc: I am also an amateur photographer; here is a Gallery with some photos that I took over the past few years.

  • Test-Negative Designs with Various Reasons for Testing: Statistical Bias and Solution.
    Mengxin(Maxine) Yu, Kendrick Qijun Li, Nicholas P. Jewell, Eric J. Tchetgen Tchetgen, Dylan S. Small, Xu Shi, Bingkai Wang. arXiv preprint arXiv:2312.03967, 2024+.

  • SymmPI: Predictive Inference for Data with Group Symmetries.
    Edgar Dobriban, Mengxin(Maxine) Yu (Alphabetical);
    Under Major Revision at Journal of the Royal Statistical Society: Series B (JRSSB).

  • Spectral Ranking Inferences based on General Multiway Comparisons.
    Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin(Maxine) Yu (Alphabetical);
    Accepted to Operations Research (OR), Subject to Minor Revision.

  • Uncertainty Quantification of MLE for Ranking with Covariates.
    Jianqing Fan, Jikai Hou, Mengxin(Maxine) Yu (Alphabetical);
    Journal of Machine Learning Research (JMLR), Accepted, 2024+, to appear.

  • Robust High-dimensional Tuning Free Multiple Testing.
    Jianqing Fan, Zhipeng Lou, Mengxin(Maxine) Yu (Alphabetical);
    Annals of Statistics (AOS), 2023, 51(5), 2093-2115.

  • Are Latent Factor Regression and Sparse Regression Adequate?
    Jianqing Fan, Zhipeng Lou, Mengxin(Maxine) Yu (Alphabetical).
    Journal of the American Statistical Association (JASA), 2024, 119(546), 1076-1088.

  • Policy Optimization Using Semiparametric Models for Dynamic Pricing.
    Jianqing Fan, Yongyi Guo, Mengxin(Maxine) Yu (Alphabetical).
    Journal of the American Statistical Association (JASA), 2023, 119(545), 552–564.

  • Our paper Uncertainty Quantification of MLE for Entity Ranking with Covariates is accepted to the Journal of Machine Learning Research (JMLR)!

  • I am happy to give a talk at the Stanford Statistics Seminar on the recent work SymmPI: Predictive Inference for Data with Group Symmetries (July 2024).

  • I am happy to announce that I have won the IMS New Researcher Travel Award, 2024.

  • Our paper Ranking Inferences Based on the Top Choice of Multiway Comparisons is accepted to the Journal of the American Statistical Association (JASA)!

  • New Paper Out: SymmPI: Predictive Inference for Data with Group Symmetries. Quantifying the uncertainty of predictions is a core problem in modern statistics. Various methods for predictive inference, such as conformal inference, rely on the assumption that the data distribution is invariant under arbitrary permutations. However, this assumption can be violated due to the complex structure of real-world datasets, such as networks, rotationally invariant data, and coordinate-independent observations in physics. In this work, we propose SymmPI, a unified methodology for predictive inference when data distributions have general group symmetries, in arbitrary observation models. We show that SymmPI has valid coverage and characterize its performance under distribution shift, admitting many existing results as special cases. For more details, feel free to check out our paper!

  • New Paper Out: Test-Negative Designs with Multiple Testing Sources. Previously, Test-Negative Design (TND) studies focused primarily on individuals who sought testing due to symptoms consistent with the disease, aiming to reduce confounding associated with healthcare-seeking behavior. However, in scenarios like COVID-19 and Ebola, testing was employed for various reasons, including contact tracing, irrespective of symptoms. In such settings, the traditional odds ratio estimator is biased when assessing intervention efficacy. To address this, we propose an approach to study the vaccine effect by combining both symptomatic and asymptomatic individuals.

  • Our paper Robust High-dimensional Tuning Free Multiple Testing is accepted to the Annals of Statistics (AOS)!

  • New Paper Out: Previous studies on parametric ranking models mostly assume comparison graphs are derived from a specific random graph, using either the spectral method or the MLE, without quantifying the uncertainty of ranks. In this paper, (1) we study a very general and more realistic setup in which the comparison graph consists of hyper-edges of possibly heterogeneous sizes; (2) we build a connection between the spectral method and the MLE: a two-step spectral method, applying the optimal weighting estimated from the equal-weighting vanilla spectral method, can achieve the same asymptotic efficiency as the MLE; (3) we provide a comprehensive framework for both one-sample and two-sample ranking inferences, applicable to both fixed and random graph settings. For more details, please check out our new paper: Spectral Ranking Inferences based on General Multiway Comparisons.

  • I finished my Ph.D. Thesis Defense and will join the Statistics and Data Science Department at the Wharton School as a Postdoc Researcher.

  • I am happy to announce that I won the Institute of Mathematical Statistics Hannan Graduate Student Travel Award.

  • I will give an invited talk at the Pattern Theory Seminar organized by the Department of Applied Math at Brown University.

  • I am happy to announce that our paper "Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments" won the ASA Best Student Paper Award from the Business and Economic Statistics Section.

  • Our paper Are Latent Factor Regression and Sparse Regression Adequate? is accepted to the Journal of the American Statistical Association (JASA)!

  • Despite extensive studies on ranking, few prior works investigate this problem under the more realistic setting where covariate information exists. To tackle this issue, we propose a novel model, the Covariate-Assisted Ranking Estimation (CARE) model, which extends the well-known Bradley-Terry-Luce (BTL) model by incorporating covariate information. For more details, please check out our new paper: Uncertainty Quantification of MLE for Entity Ranking with Covariates.

  • Existing robust methods for large-scale experiments, such as Winsorization, Huberization, and median of means, require bounded second moments and involve variable-dependent tuning parameters, which hamper their fidelity in applications to large-scale problems. To solve this issue, we propose tuning-free and moment-free high-dimensional inference procedures based on the Hodges-Lehmann estimator for testing the global null and for large-scale multiple testing with false discovery proportion control. For more details, please check out our new paper, which has been submitted to the Annals of Statistics (AOS): Robust High-dimensional Tuning Free Multiple Testing.

  • I have presented a talk on Policy Optimization for Dynamic Pricing via Semiparametric Models at the INFORMS Annual Meeting (Oct, 2022).

  • I have presented a talk on Reinforcement Learning in the presence of strategic agents at the Young Researchers Workshop held by the Department of ORIE at Cornell University. (Oct, 2022)

  • I am happy to announce that I have won the School of Engineering and Applied Science Award for Excellence (<5%, Sep. 2022).

  • Most current practical uses of ranks only involve estimating preference scores and displaying the estimated ranks. A crucial question is how to build confidence intervals for ranks, especially from data arising from multiway comparisons. To see the answers, please check out our new paper: Ranking Inferences Based on the Top Choice of Multiway Comparisons, which has been submitted to the Journal of the American Statistical Association (JASA)! (Sep. 2022)

  • Our paper Policy Optimization Using Semi-parametric Models for Dynamic Pricing is accepted to the Journal of the American Statistical Association (JASA)! (Sep, 2022)

  • New paper out: Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments. (Aug, 2022)

  • I have presented a talk on Are Latent Factor Regression and Sparse Regression Adequate? at JSM 2022 in Washington DC. (Aug, 2022)

  • I have presented a talk on Policy Optimization Using Semi-parametric Models for Dynamic Pricing at ICSA, China 2022. (Jun, 2022)

  • I have won the best poster award at the Applied Statistics Symposium, ICSA, FL 2022. (Jun, 2022)

  • Office: 304 Academic Research Building, Philadelphia, PA 19104
    Primary Email: mengxiny@wharton.upenn.edu