Subharup Guha

Associate Professor

Department: PHHP-COM BIOSTATISTICS
Business Phone: (352) 294-5921
Business Email: s.guha@ufl.edu

Accomplishments

Top Faculty Achiever
2016 · University of Missouri
Faculty Fellowship Award
2014 · Department of Statistics, University of Missouri
Craig Cooley Memorial Prize for scholarly excellence and leadership qualities
2004 · Department of Statistics, Ohio State University
Thomas and Jean Powers Teaching Award
2003 · Department of Statistics, Ohio State University
Student Paper Competition Award
2002 · ASA Statistical Computing and Graphics Sections

Research Profile

Subha Guha is an expert in Bayesian biostatistical modeling for cancer genomics and computing for high-dimensional datasets. As PI or co-Investigator of research grants from NIH and NSF, he has developed novel Bayesian models for multi-domain, high-throughput biomedical studies. He has extensive experience with statistical computing to efficiently implement these methodologies for Big Data.

The primary focus of his research has been the development of broadly applicable, nonparametric statistical methodologies that are flexible, because they avoid unrealistic assumptions about the data features and permit nonlinear dependencies; scalable, because the procedures can accommodate ever-expanding, massive, multiple-domain datasets even on a modest computing budget; and, most importantly, scientifically interpretable, because they are based on models that incorporate domain knowledge and provide meaningful answers to the key scientific questions that motivate the research.

Areas of Interest
  • Bayesian modeling
  • Biostatistics
  • Causal inference
  • Clustering and classification
  • Cognitive Neuroscience
  • Computational methods for Big Data
  • Connectomics
  • Development and testing of novel medical devices
  • Generalized linear models
  • Health disparities and vulnerable populations
  • Hidden Markov models
  • High-dimensional inference
  • MCMC simulation
  • Microbiome data analysis
  • Nonparametric Bayesian methods
  • Observational studies
  • Process monitoring
  • Quality control
  • Spatial statistics
  • Statistical computing
  • Statistical methods
  • Survival Analysis
  • Cancer genomics
  • Electronic health records

Publications

2021
Metagenomic Geolocation Prediction Using an Adaptive Ensemble Classifier.
Frontiers in genetics. 12 [DOI] 10.3389/fgene.2021.642282. [PMID] 33959149.
2020
A Bayesian Restoration of the Duality between Principal Components of a Distance Matrix and Operational Taxonomic Units in Microbiome Analyses
2020
Probabilistic Detection and Estimation of Conic Sections from Noisy Data
Journal of Computational and Graphical Statistics.
2020
Semiparametric Bayesian Markov Analysis of Personalized Benefit-Risk Assessment
Annals of Applied Statistics.
2019
A genomics-informed computational biology platform prospectively predicts treatment responses in AML and MDS patients.
Blood advances. 3(12):1837-1847 [DOI] 10.1182/bloodadvances.2018028316. [PMID] 31208955.
2017
Semiparametric Bayesian Analysis of High-Dimensional Censored Outcome Data
194-204
2016
A Nonparametric Bayesian Technique for High-Dimensional Regression
10:3374-3424
2015
A Hidden Markov Model for Detecting Differentially Expressed Genes from RNA-Seq Data
Annals of Applied Statistics. 9:901-925
2015
Nonparametric Variable Selection, Clustering and Prediction for Large Biological Datasets
2014
Bayesian disease classification using copy number data.
Cancer informatics. 13(Suppl 2):83-91 [DOI] 10.4137/CIN.S13785. [PMID] 25336897.
2011
Discussion of Sampling schemes for generalized linear Dirichlet process random effects models by Kyung, Gill and Casella
Statistical Methods and Applications. 20:291-293
2010
Bayesian Hidden Markov Modeling of Array CGH Data
103(482):485-497 [DOI] 10.1198/016214507000000923. [PMID] 22375091.
2010
Parametric and Semiparametric Hypotheses in the Linear Model
39:165-180
2010
Posterior Simulation in Countable Mixture Models for Large Datasets
Journal of the American Statistical Association. 105:775-786
2009
Gauss-Seidel Estimation of Generalized Linear Mixed Models with Application to Poisson Modeling of Spatially Varying Disease Rates
Journal of Computational and Graphical Statistics. 18:818-837
2008
Bayesian Hidden Markov Modeling of Array CGH Data.
Journal of the American Statistical Association. 103(482):485-497 [PMID] 22375091.
2008
Posterior Simulation in the Generalized Linear Mixed Model with Semiparametric Random Effects
Journal of Computational and Graphical Statistics. 17:410-425
2006
Generalized Post-stratification and Importance Sampling for Subsampled Markov Chain Monte Carlo Estimation
Journal of the American Statistical Association. 101:1175-1184
2006
Mixture Cure Survival Models with Dependent Censoring
Journal of the Royal Statistical Society Series B-Statistical Methodology. 69:285-306
2005
Spatio-temporal Analysis of Ischemic Heart Disease in NSW, Australia
Environmental and Ecological Statistics. 12:427-448
2004
Benchmark Estimation for Markov Chain Monte Carlo Samples
Journal of Computational and Graphical Statistics. 13:683-701
2003
Discussion of A theory of statistical models for Monte Carlo integration by Kong, McCullagh, Nicolae, Tan and Meng
Journal of the Royal Statistical Society Series B-Statistical Methodology. 65

Grants

Sep 2018 – Aug 2020
New Bayesian Nonparametric Paradigms of Personalized Medicine for Lung Cancer
Role: Principal Investigator
Funding: NATL SCIENCE FOU

Education

Ph.D.
2004 · Ohio State University
M.Sc.
1997 · Indian Institute of Technology, Kanpur, India

Teaching Profile

Courses Taught
2018-2021
PHC6937 Special Topics in Public Health
2020-2021
PHC7090 Advanced Biostatistical Methods I
2018-2020
PHC7979 Advanced Research
2021
PHC6905 Independent Study

Contact Details

Business Phone: (352) 294-5921
Business Email: s.guha@ufl.edu