##### My research is supported in part by NSF CAREER-1944904, NSF DMS-1811868, and NIH R01 GM131399.

#### Tensor Data Analysis

- Tensor SVD: Statistical and computational limits (with Dong Xia),
*IEEE Transactions on Information Theory*, 64, 1-28, 2018. [An R implementation]
- Cross: Efficient low-rank tensor completion (single author),
*The Annals of Statistics*, 47, 936-964, 2019. [R package]
- Optimal sparse singular value decomposition for high-dimensional high-order data (with Rungang Han),
*Journal of the American Statistical Association*, 114, 1708-1725, 2019. [R package]
- An optimal statistical and computational framework for generalized tensor estimation (with Rungang Han and Rebecca Willett),
*The Annals of Statistics*, to appear, 2021.
- Open problem: Average-case hardness of hypergraphic planted clique detection (with Yuetian Luo),
*Conference on Learning Theory (COLT)*, 125, 3852-3856, 2020. [talk and slides]
- A sharp blockwise tensor perturbation bound for orthogonal iteration (with Yuetian Luo, Garvesh Raskutti, and Ming Yuan),
*Journal of Machine Learning Research*, 22, 1-48, 2021.
- ISLET: fast and optimal low-rank tensor regression via importance sketchings (with Yuetian Luo, Garvesh Raskutti, and Ming Yuan),
*SIAM Journal on Mathematics of Data Science*, 2, 444-479, 2020. [R package]
- Sparse and low-rank tensor estimation via cubic sketchings (with Botao Hao and Guang Cheng),
*IEEE Transactions on Information Theory*, 66, 9, 2020.

- Denoising Atomic Resolution 4D Scanning Transmission Electron Microscopy Data with Tensor Singular Value Decomposition (with Chenyu Zhang, Rungang Han, and Paul Voyles),
*Ultramicroscopy*, 219, 113123, 2020.
- Learning good state and action representations via tractable tensor decomposition (with Chengzhuo Ni, Yaqi Duan and Mengdi Wang),
*International Symposium on Information Theory (ISIT)*, 2021.
- Inference for Low-rank Tensors -- No Need to Debias (with Dong Xia and Yuchen Zhou),
*The Annals of Statistics*, to appear, 2021.
- Exact clustering in tensor block model: Statistical optimality and computational limit (with Rungang Han, Yuetian Luo, and Miaoyan Wang).
(Rungang Han received the *Student's Paper Award* from the Statistical Learning and Data Science Section of the American Statistical Association, 2021, through this paper)
- Tensor clustering with planted structures: Statistical optimality and computational limits (with Yuetian Luo),
*The Annals of Statistics*, to appear, 2021.
- Optimal high-order tensor SVD via tensor-train orthogonal iteration (with Yuchen Zhou, Lili Zheng, and Yazhen Wang),
*IEEE Transactions on Information Theory*, to appear. [R Package]

- Low-rank tensor estimation via Riemannian Gauss-Newton: Statistical optimality and second-order convergence (with Yuetian Luo).
- Guaranteed functional tensor singular value decomposition (with Rungang Han and Pixu Shi).
- Learning polynomial transformations (with Sitan Chen, Jerry Li, and Yuanzhi Li)

- Optimal singular value decomposition for high-dimensional tensor data
- Optimal and efficient tensor regression analysis via importance sketching

#### Theory of Deep Learning

- Learning Polynomial Transformations (with Sitan Chen, Jerry Li, and Yuanzhi Li)

#### Nonconvex/Manifold Optimization

- Nonconvex Factorization and Manifold Formulations are Almost Equivalent in Low-rank Matrix Optimization (with Yuetian Luo and Xudong Li).
- On geometric connections of embedded and quotient geometries in Riemannian fixed-rank matrix optimization (with Yuetian Luo and Xudong Li).
- Recursive Importance Sketching for Rank Constrained Least Squares: Algorithms and High-order Convergence (with Yuetian Luo, Xudong Li, and Wen Huang).
- Low-rank tensor estimation via Riemannian Gauss-Newton: Statistical optimality and second-order convergence (with Yuetian Luo).

#### Markov Process State Aggregation, Information-based Reinforcement Learning

- Learning Markov models via low-rank optimization (with Ziwei Zhu, Xudong Li and Mengdi Wang),
*Operations Research*, to appear.
- Spectral state compression of Markov processes (with Mengdi Wang),
*IEEE Transactions on Information Theory*, 66, 3202-3231, 2020.
- Estimation of Markov chain via rank-constrained likelihood (with Mengdi Wang and Xudong Li),
*International Conference on Machine Learning (ICML)*, PMLR 80:3033-3042, 2018.
- Learning good state and action representations via tractable tensor decomposition (with Chengzhuo Ni, Yaqi Duan and Mengdi Wang),
*2021 International Symposium on Information Theory (ISIT)*, 2021.

#### Microbiome Data Analysis

- High-dimensional log-error-in-variable regression with applications to microbial compositional data analysis (with Pixu Shi and Yuchen Zhou),
*Biometrika*, to appear, 2021.
- Multi-sample estimation of bacterial composition matrix in metagenomics data (with Yuanpei Cao and Hongzhe Li),
*Biometrika*, 107, 75-92, 2020.
(This paper received the *Biometrics Early-Stage Investigator Award* from the Biometrics Section of the American Statistical Association, 2019)
- Regression Analysis for Microbiome Compositional Data (with Pixu Shi and Hongzhe Li),
*The Annals of Applied Statistics*, 10, 1019-1040, 2016. [Matlab package]

#### Matrix Estimation, Matrix Completion, Phase Retrieval

- Recursive Importance Sketching for Rank Constrained Least Squares: Algorithms and High-order Convergence (with Yuetian Luo, Xudong Li, and Wen Huang).
- Structured matrix completion with applications in genomic data integration (with Tianxi Cai and Tony Cai),
*Journal of the American Statistical Association*, 111, 621-633, 2016. [R package]
- ROP: matrix recovery via rank-one projections (with Tony Cai),
*The Annals of Statistics*, 43, 102-138, 2015.

#### Matrix PCA/SVD

- Nonparametric covariance estimation for mixed longitudinal studies, with applications in midlife women's health (with Kehui Chen),
*Statistica Sinica*, 32, 345-365, 2022.
- A Schatten-q low-rank matrix perturbation analysis via perturbation projection error bound (with Yuetian Luo and Rungang Han),
*Linear Algebra and Its Applications*, to appear, 2021.
- Heteroskedastic PCA: Algorithm, optimality, and applications (with Tony Cai and Yihong Wu),
*The Annals of Statistics*, to appear.
- Rate-optimal perturbation bounds for singular subspaces with applications to high-dimensional statistics (with Tony Cai),
*The Annals of Statistics*, 46, 60-89, 2018.

#### Semisupervised Inference

- Semi-supervised inference: General theory and estimation of means (with Lawrence Brown and Tony Cai),
*The Annals of Statistics*, 47, 2538-2566, 2019.

#### Compressed Sensing and High-dimensional Regression

- Sparse Group Lasso: Optimal Sample Complexity, Convergence Rate, and Statistical Inference (with Tony Cai and Yuchen Zhou),
*IEEE Transactions on Information Theory*, to appear.
- Sparse representation of a polytope and recovery of sparse signals and low-rank matrices (with Tony Cai),
*IEEE Transactions on Information Theory*, 60, 122-132, 2014.
- Sharp RIP bound for sparse signal and low-rank matrix recovery (with Tony Cai),
*Applied and Computational Harmonic Analysis*, 35, 74-93, 2013.
- Compressed sensing and affine rank minimization under restricted isometry (with Tony Cai),
*IEEE Transactions on Signal Processing*, 61, 3279-3290, 2013.

#### High-dimensional Covariance Matrix Estimation

- Minimax rate-optimal estimation of high-dimensional covariance matrices with incomplete data (with Tony Cai),
*Journal of Multivariate Analysis*, 150, 55-74, 2016.
- Inference for high-dimensional differential correlation matrices (with Tony Cai),
*Journal of Multivariate Analysis*, 143, 107-126, 2016.

#### Applied Probability

- On the non-asymptotic concentration of heteroskedastic Wishart-type matrix (with Tony Cai and Rungang Han),
*Electronic Journal of Probability*, to appear.
- On the non-asymptotic and sharp tail bounds of random variables (with Yuchen Zhou),
*Stat*, 9, e314, 2020.

#### Miscellaneous

- Instrumental variables estimation with some invalid instruments and its application to Mendelian randomization (with Hyunseung Kang, Tony Cai and Dylan Small),
*Journal of the American Statistical Association*, 111, 132-144, 2016. [R Package]
- Sequential rerandomization (with Quan Zhou, Philip Ernst, Kari Lock Morgan, and Donald Rubin),
*Biometrika*, 105, 745-752, 2018.

#### Collaborative Research and Other Manuscripts

- Ventriculomegaly and postoperative intraventricular blood predict cerebrospinal fluid diversion following posterior fossa tumor resection (Park, C., Liu, B., Harward, S., Zhang, A. R. et al.),
*Journal of Neurosurgery: Pediatrics*, to appear.
- Denoising Atomic Resolution 4D Scanning Transmission Electron Microscopy Data with Tensor Singular Value Decomposition (with Chenyu Zhang, Rungang Han, and Paul Voyles),
*Ultramicroscopy*, 219, 113123, 2020.
- LTMG: a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data (Wan et al.),
*Nucleic Acids Research*, 2019.
- High-dimensional statistical inference: from vector to matrix,
*PhD Thesis*, 2015.
- Methods to calculate the upper bound of Gini coefficient based on grouped data and the result for China (with Pixu Shi),
*preprint of Institute of Mathematics, Peking University*, 2010-20.