Slides of recent work
Preprints awaiting publication :-)
- Zihan Zhu, Xin Gai, Anru R. Zhang (2024+), Functional post-clustering selective inference with applications to EHR.
- Brett W. Larsen, Tamara G. Kolda, Anru R. Zhang, and Alex H. Williams (2024+), Tensor decomposition meets RKHS: Efficient algorithms for smooth and misaligned data.
- Shounak Chattopadhyay, Anru R. Zhang, David Dunson (2024+), Blessing of dimension in Bayesian inference on covariance matrices.
- Pixu Shi, Cameron Martino, Rungang Han, Stefan Janssen, Gregory Buck, Myrna Serrano, Kouros Owzar, Rob Knight, Liat Shenhav, Anru R Zhang (2024+), Time-informed dimensionality reduction for longitudinal microbiome studies.
- Joshua Agterberg and Anru R. Zhang (2024+), Statistical inference for low-rank tensors: Heteroskedasticity, subgaussianity, and applications.
Publications
- Muhang Tian, Bernie Chen, Allan Guo, Shiyi Jiang, Anru R. Zhang (2024+), Fast and reliable generation of EHR time series via diffusion models, Journal of the American Medical Informatics Association, to appear.
- Yuetian Luo, Xudong Li, and Anru R. Zhang (2024+), Nonconvex factorization and manifold formulations are almost equivalent in low-rank matrix optimization, INFORMS Journal on Optimization, to appear.
- Joshua Agterberg and Anru R. Zhang (2024+), Estimating higher-order mixed memberships via the l_{2,\infty} tensor perturbation bound, Journal of the American Statistical Association, to appear.
- Jing Lei, Anru R. Zhang, Zihan Zhu (2024+), Computational and statistical thresholds in multi-layer stochastic block models, The Annals of Statistics, to appear.
- Runshi Tang, Ming Yuan, and Anru R. Zhang (2024+), Mode-wise principal subspace pursuit and matrix spiked covariance model, Journal of the Royal Statistical Society, Series B, to appear.
- Yuetian Luo and Anru R. Zhang (2024+), Tensor-on-tensor regression: Riemannian optimization, over-parameterization, statistical-computational gap, and their interplay, The Annals of Statistics, to appear. [R Package]
- Shiyi Jiang, Xin Gai, Miriam Treggiari, William Stead, Yuankang Zhao, David Page, Anru R. Zhang (2024), Soft phenotyping for sepsis via EHR time-aware soft clustering, Journal of Biomedical Informatics, 152, 104615.
- Yuetian Luo, Xudong Li, and Anru R. Zhang (2024). On geometric connections of embedded and quotient geometries in Riemannian fixed-rank matrix optimization, Mathematics of Operations Research, 49, 782-825.
- Yuetian Luo, Xudong Li, Wen Huang, and Anru R. Zhang (2024). Recursive importance sketching for rank constrained least squares, Operations Research, 72, 237-256.
- Ziang Chen, Jianfeng Lu, and Anru R. Zhang (2024), One-dimensional tensor network recovery, SIAM Journal on Matrix Analysis and Applications, 45, 1217-1244.
- Rungang Han, Pixu Shi, and Anru R. Zhang (2024). Guaranteed functional tensor singular value decomposition, Journal of the American Statistical Association, 119, 995-1007.
- Ilias Diakonikolas, Daniel Kane, Yuetian Luo, and Anru R. Zhang (2023). Statistical and computational limits for tensor-on-tensor association detection, Proceedings of Thirty Sixth Conference on Learning Theory (COLT), 195, 5260-5310.
- Shiyi Jiang, Rungang Han, Krishnendu Chakrabarty, David Page, William Stead, and Anru R. Zhang (2023). Timeline registration for electronic health records, AMIA Summits on Translational Science Proceedings, 291-299.
(This paper won the Data Science Distinguished Paper Award from the 2023 AMIA Informatics Summit. Only one paper receives this award.)
- Yuetian Luo and Anru R. Zhang (2023). Low-rank tensor estimation via Riemannian Gauss-Newton: Statistical optimality and second-order convergence, Journal of Machine Learning Research, 24, 1-48.
- Peter Hoff, Andrew McCormack, and Anru R. Zhang (2023). Core shrinkage covariance estimation for matrix-variate data, Journal of the Royal Statistical Society, Series B, 85, 1659-1679.
- Rungang Han and Anru R. Zhang (2023). Discussion of "Vintage factor analysis with Varimax performs statistical inference", Journal of the Royal Statistical Society, Series B, 85, 1069-1070.
- Sitan Chen, Jerry Li, Yuanzhi Li, and Anru R. Zhang (2023). Learning polynomial transformations, 2023 Annual ACM Symposium on Theory of Computing (STOC).
- Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, and Anru R. Zhang (2023). Sampling is as easy as learning the score: Theory for diffusion models with minimal data assumptions, International Conference on Learning Representations (ICLR), accept: notable-top-5%.
- Jiashun Jin, Tracy Ke, Paxton Turner, and Anru R. Zhang (2023). Phase transition for detecting a small community in a large network, 2023 International Conference on Learning Representations (ICLR), accepted.
- Chengzhuo Ni, Yaqi Duan, Munther Dahleh, Mengdi Wang, and Anru R. Zhang (2023). Learning good state and action representations for Markov decision process via tensor decomposition, Journal of Machine Learning Research, 24, 1-53.
A short version was presented at International Symposium on Information Theory (ISIT), 2021.
- Rungang Han, Yuetian Luo, Miaoyan Wang, and Anru R. Zhang (2022). Exact clustering in tensor block model: Statistical optimality and computational limit, Journal of the Royal Statistical Society, Series B, 84, 1666-1698. [R Package]
(This paper received the Student's Paper Award from the Statistical Learning and Data Science Section of the American Statistical Association, 2021)
- Dong Xia, Anru R. Zhang, and Yuchen Zhou (2022). Inference for low-rank tensors -- No need to debias, The Annals of Statistics, 50, 1220-1245.
- Yuetian Luo and Anru R. Zhang (2022). Tensor clustering with planted structures: Statistical optimality and computational limits, The Annals of Statistics, 50, 584-613.
- Tony Cai, Yihong Wu, and Anru R. Zhang (2022). Heteroskedastic PCA: Algorithm, optimality, and applications, The Annals of Statistics, 50, 53-80.
- Rungang Han, Rebecca Willett, and Anru R. Zhang (2022). An optimal statistical and computational framework for generalized tensor estimation, The Annals of Statistics, 50, 1-29.
- Tony Cai, Rungang Han, and Anru R. Zhang (2022). On the non-asymptotic concentration of heteroskedastic Wishart-type matrix, Electronic Journal of Probability, 27, 1-40.
- Yuchen Zhou, Lili Zheng, Yazhen Wang, and Anru R. Zhang (2022). Optimal high-order tensor SVD via tensor-train orthogonal iteration, IEEE Transactions on Information Theory, 66, 5927-5964. [R Package]
- Pixu Shi, Yuchen Zhou, Anru R. Zhang (2022). High-dimensional log-error-in-variable regression with applications to microbial compositional data analysis, Biometrika, 109, 405-420.
- Ziwei Zhu, Xudong Li, Mengdi Wang, and Anru Zhang (2022). Learning Markov models via low-rank optimization, Operations Research, 70, 2384-2398.
- Tony Cai, Yuchen Zhou, and Anru R. Zhang (2022). Sparse Group Lasso: Optimal Sample Complexity, Convergence Rate, and Statistical Inference, IEEE Transactions on Information Theory, 68, 5975-6002.
- Anru R. Zhang and Kehui Chen (2022). Nonparametric covariance estimation for mixed longitudinal studies, with applications in midlife women's health, Statistica Sinica, 32, 345-365.
- Yuetian Luo, Rungang Han, and Anru R. Zhang (2021). A Schatten-q low-rank matrix perturbation analysis via perturbation projection error bound, Linear Algebra and Its Applications, 630, 225-240.
- Anru R. Zhang, Yuetian Luo, Garvesh Raskutti, and Ming Yuan (2021). A sharp blockwise tensor perturbation bound for orthogonal iteration, Journal of Machine Learning Research, 22, 1-48.
(This paper received the Biometrics Early-Stage Investigator Award from the Biometrics Section of the American Statistical Association, 2019)
- Anru R. Zhang, Yuetian Luo, Garvesh Raskutti, and Ming Yuan (2020). ISLET: fast and optimal low-rank tensor regression via importance sketchings, SIAM Journal on Mathematics of Data Science, 2, 444-479. [R package]
- Yuetian Luo and Anru R. Zhang (2020). Open problem: Average-case hardness of hypergraphic planted clique detection, Conference on Learning Theory (COLT), 125, 3852-3856. [talk and slides]
- Anru R. Zhang and Yuchen Zhou (2020). On the non-asymptotic and sharp tail bounds of random variables, Stat, 9, e314.
- Botao Hao, Anru R. Zhang, and Guang Cheng (2020). Sparse and low-rank tensor estimation via cubic sketchings, IEEE Transactions on Information Theory, 66, 9.
A short version published in Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020.
- Yuanpei Cao, Anru R. Zhang, and Hongzhe Li (2020). Multi-sample estimation of bacterial composition matrix in metagenomics data, Biometrika, 107, 75-92.
- Anru Zhang and Mengdi Wang (2020). Spectral state compression of Markov processes, IEEE Transactions on Information Theory, 66, 3202-3231.
- Anru Zhang (2019). Cross: Efficient low-rank tensor completion, The Annals of Statistics, 47, 936-964. [R package]
- Anru Zhang and Rungang Han (2019). Optimal sparse singular value decomposition for high-dimensional high-order data, Journal of the American Statistical Association, 114, 1708-1725. [R package]
- Anru Zhang, Lawrence Brown, and Tony Cai (2019). Semi-supervised inference: General theory and estimation of means, The Annals of Statistics, 47, 2538-2566.
- Anru Zhang and Dong Xia (2018). Tensor SVD: Statistical and computational limits, IEEE Transactions on Information Theory, 64, 1-28. [An R implementation]
- Quan Zhou, Philip Ernst, Kari Lock Morgan, Donald Rubin, and Anru Zhang (2018). Sequential rerandomization, Biometrika, 105, 745-752.
- Tony Cai and Anru Zhang (2018). Rate-optimal perturbation bounds for singular subspaces with applications to high-dimensional statistics, The Annals of Statistics, 46, 60-89.
- Mengdi Wang, Xudong Li, and Anru Zhang (2018). Estimation of Markov chain via rank-constrained likelihood, International Conference on Machine Learning (ICML), PMLR 80:3033-3042.
- Tianxi Cai, Tony Cai, and Anru Zhang (2016). Structured matrix completion with applications in genomic data integration, Journal of the American Statistical Association, 111, 621-633. [R package]
- Pixu Shi, Anru Zhang, and Hongzhe Li (2016). Regression Analysis for Microbiome Compositional Data, The Annals of Applied Statistics, 10, 1019-1040. [Matlab package]
- Hyunseung Kang, Anru Zhang, Tony Cai, and Dylan Small (2016). Instrumental variables estimation with some invalid instruments and its application to Mendelian randomization, Journal of the American Statistical Association, 111, 132-144. [R Package]
- Tony Cai and Anru Zhang (2016). Minimax rate-optimal estimation of high-dimensional covariance matrices with incomplete data, Journal of Multivariate Analysis, 150, 55-74.
- Tony Cai and Anru Zhang (2016). Inference for high-dimensional differential correlation matrices, Journal of Multivariate Analysis, 143, 107-126.
- Tony Cai and Anru Zhang (2015). ROP: matrix recovery via rank-one projections, The Annals of Statistics, 43, 102-138.
- Tony Cai and Anru Zhang (2014). Sparse representation of a polytope and recovery of sparse signals and low-rank matrices, IEEE Transactions on Information Theory, 60, 122-132.
- Tony Cai and Anru Zhang (2013). Sharp RIP bound for sparse signal and low-rank matrix recovery, Applied and Computational Harmonic Analysis, 35, 74-93.
- Tony Cai and Anru Zhang (2013). Compressed sensing and affine rank minimization under restricted isometry, IEEE Transactions on Signal Processing, 61, 3279-3290.
Collaborative Research
- Chenyin Gao, Shu Yang, and Anru R. Zhang (2024). Self-supervised imaging denoising via low-rank tensor approximated convolutional neural network, IET Image Processing, to appear.
- Annalise Schweickart, Richa Batra, Bryan J. Neth, Cameron Martino, Liat Shenhav, Anru R. Zhang, et al. (2024). Serum and CSF metabolomics analysis shows Mediterranean Ketogenic Diet mitigates risk factors of Alzheimer's disease, npj Metabolic Health and Disease, 2, 15.
- Zachary M. Burcham, Aeriel D. Belk, Bridget B. McGivern, Amina Bouslimani, Parsa Ghadermazi, Cameron Martino, Liat Shenhav, Anru R. Zhang, et al. (2024). Universal interkingdom microbial network decomposes mammals despite varied climate, location, and seasonal influence, Nature Microbiology, 9, 595-613.
- Anru R. Zhang, Ryan Bell, Chen An, Runshi Tang, Shana Hall, Cliburn Chan, Kareem Al-Khalil, Christina Meade (2023). Cocaine use prediction with tensor-based machine learning on multimodal MRI connectome data, Neural Computation, 36, 107-127.
- Aditi U. Gurkar, Akos A. Gerencser, Ana L. Mora, Andrew C. Nelson, Anru R. Zhang, et al. Spatial mapping of cellular senescence: emerging challenges and opportunities, Nature Aging, 3, 776-790, 2023.
- Sheri Towe, Runshi Tang, Matt Gibson, Anru R. Zhang, Christina Meade, Longitudinal changes in neurocognitive performance related to drug use intensity in a sample of persons with and without HIV who use stimulants, Drug and Alcohol Dependence, to appear.
- Christine Park, Beiyu Liu, Stephen C. Harward, Anru R. Zhang, et al., Ventriculomegaly and postoperative intraventricular blood predict cerebrospinal fluid diversion following posterior fossa tumor resection, Journal of Neurosurgery: Pediatrics, 28, 533-543, 2021.
- Chenyu Zhang, Rungang Han, Anru Zhang, and Paul Voyles, Denoising Atomic Resolution 4D Scanning Transmission Electron Microscopy Data with Tensor Singular Value Decomposition, Ultramicroscopy, 219, 113123, 2020.
- Wan, C. et al., LTMG: a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data, Nucleic Acids Research, 2019.
Other Manuscripts