Subharup Guha

Associate Professor

Department: PHHP-COM BIOSTATISTICS
Business Phone: (352) 294-5921
Business Email: s.guha@ufl.edu

Teaching Profile

Courses Taught
2018-2021
PHC6937 Special Topics in Public Health
2020-2021
PHC7090 Advanced Biostatistical Methods I
2018-2020, 2022
PHC7979 Advanced Research
2021
PHC6905 Independent Study
2021-2023
PHC7980 Research for Doctoral Dissertation

Research Profile

Subha Guha is an expert in Bayesian biostatistical modeling for cancer genomics and computing for high-dimensional datasets. As PI or co-Investigator of research grants from NIH and NSF, he has developed novel Bayesian models for multi-domain, high-throughput biomedical studies. He has extensive experience with statistical computing to efficiently implement these methodologies for Big Data.

The primary focus of his research has been the development of broadly applicable, nonparametric statistical methodologies that are flexible, because they avoid unrealistic assumptions about the data features and permit nonlinear dependencies; scalable, because they accommodate ever-expanding, massive, multiple-domain datasets even on a modest computing budget; and, most importantly, scientifically interpretable, because they are based on models that incorporate domain knowledge and provide meaningful answers to the key scientific questions that motivate the research.

Areas of Interest
  • Bayesian modeling
  • Biostatistics
  • Causal inference
  • Clustering and classification
  • Cognitive neuroscience
  • Computational methods for Big Data
  • Connectomics
  • Development and testing of novel medical devices
  • Generalized linear models
  • Health disparities and vulnerable populations
  • Hidden Markov models
  • High-dimensional inference
  • MCMC simulation
  • Microbiome data analysis
  • Nonparametric Bayesian methods
  • Observational studies
  • Process monitoring
  • Quality control
  • Spatial statistics
  • Statistical computing
  • Statistical methods
  • Survival analysis
  • Cancer genomics
  • Electronic health records

Publications

2022
A novel approach to augment single-arm clinical studies with real-world data.
Journal of biopharmaceutical statistics. [DOI] 10.1080/10543406.2021.2011902. [PMID] 34958629.
2021
Metagenomic Geolocation Prediction Using an Adaptive Ensemble Classifier.
Frontiers in genetics. [DOI] 10.3389/fgene.2021.642282. [PMID] 33959149.
2020
A Bayesian Restoration of the Duality between Principal Components of a Distance Matrix and Operational Taxonomic Units in Microbiome Analyses
2020
Probabilistic Detection and Estimation of Conic Sections from Noisy Data
Journal of Computational and Graphical Statistics.
2020
Semiparametric Bayesian Markov Analysis of Personalized Benefit-Risk Assessment
Annals of Applied Statistics.
2019
A genomics-informed computational biology platform prospectively predicts treatment responses in AML and MDS patients.
Blood advances. [DOI] 10.1182/bloodadvances.2018028316. [PMID] 31208955.
2017
Semiparametric Bayesian Analysis of High-Dimensional Censored Outcome Data
2016
A Nonparametric Bayesian Technique for High-Dimensional Regression
2015
A Hidden Markov Model for Detecting Differentially Expressed Genes from RNA-Seq Data
Annals of Applied Statistics.
2015
Nonparametric Variable Selection, Clustering and Prediction for Large Biological Datasets
2014
Bayesian disease classification using copy number data.
Cancer informatics. [DOI] 10.4137/CIN.S13785. [PMID] 25336897.
2011
Discussion of Sampling schemes for generalized linear Dirichlet process random effects models by Kyung, Gill and Casella
Statistical Methods and Applications.
2010
Bayesian Hidden Markov Modeling of Array CGH Data
Journal of the American Statistical Association. [DOI] 10.1198/016214507000000923. [PMID] 22375091.
2010
Parametric and Semiparametric Hypotheses in the Linear Model
2010
Posterior Simulation in Countable Mixture Models for Large Datasets
Journal of the American Statistical Association.
2009
Gauss-Seidel Estimation of Generalized Linear Mixed Models with Application to Poisson Modeling of Spatially Varying Disease Rates
Journal of Computational and Graphical Statistics.
2008
Bayesian Hidden Markov Modeling of Array CGH Data.
Journal of the American Statistical Association. [PMID] 22375091.
2008
Posterior Simulation in the Generalized Linear Mixed Model with Semiparametric Random Effects
Journal of Computational and Graphical Statistics.
2006
Generalized Post-stratification and Importance Sampling for Subsampled Markov Chain Monte Carlo Estimation
Journal of the American Statistical Association.
2006
Mixture Cure Survival Models with Dependent Censoring
Journal of the Royal Statistical Society Series B-Statistical Methodology.
2005
Spatio-temporal Analysis of Ischemic Heart Disease in NSW, Australia
Environmental and Ecological Statistics.
2004
Benchmark Estimation for Markov Chain Monte Carlo Samples
Journal of Computational and Graphical Statistics.
2003
Discussion of A theory of statistical models for Monte Carlo integration by Kong, McCullagh, Nicolae, Tan and Meng
Journal of the Royal Statistical Society Series B-Statistical Methodology.

Grants

Sep 2020 ACTIVE
Improving Sexually Transmitted Infection Screening and Treatment among People Living with or at Risk for HIV
Role: Project Manager
Funding: RUTGERS STATE UNIV via US HLTH RESOURCES AND SERV ADMN HIV/AIDS
Sep 2018 – Aug 2020
New Bayesian Nonparametric Paradigms of Personalized Medicine for Lung Cancer
Role: Principal Investigator
Funding: NATL SCIENCE FOU

Contact Details

Phones:
Business:
(352) 294-5921
Emails:
Business:
s.guha@ufl.edu
Addresses:
Business Mailing:
PO BOX 117450
GAINESVILLE FL 32611-0001
Business Street:
2004 MOWRY RD., CTRB, RM. 5225
GAINESVILLE FL 32610