Local Risk Modeling of Lead in Drinking Water

Map of Pittsburgh with lead levels indicated

Lead contamination of drinking water is a prominent public health challenge in the United States. Yet it is often difficult to determine precisely how exposure varies at the local level, especially in old cities. A recent study by Raanan Gurewitsch with Saumyadipta Pyne and coauthors introduced the concept of “infrastructural complexity” of a neighborhood to address this key issue. Their work on “Spatial modeling of lead water contamination risk in local communities of Pittsburgh, PA” was accepted for presentation in the student poster award and excellence in environmental justice track of the American Public Health Association annual meeting (APHA 2020).

A Novel Application of Augmented Reality to Statistical Inference

"The Anatomy Of a Natural Hazard" - image describing radon gas process

In 1984, the “Watras incident” drew media and congressional attention in the U.S. when radon, a carcinogenic gas, at the Watras family home on the Reading Prong in Pennsylvania was recorded at almost 700 times the safe level, a lung cancer risk equivalent to smoking 250 packs of cigarettes a day! Combining synthetic data with real data, i.e., Augmented Reality (AR), can provide key insights into different phenomena. In a new study, Saumyadipta Pyne and Prof. Benjamin Kedem of the University of Maryland developed an AR approach for estimating tail probabilities of rare events, such as unusual environmental exposures, weather extremes, or disease outbreaks, from a moderate number of observations. (Image courtesy: NY Times)

AICov: A New Deep Learning Framework for COVID-19

AICov Framework

A new long short-term memory (LSTM)-based recurrent neural network architecture called AICov was developed as an integrative deep learning framework for COVID-19 forecasting with population covariates. Saumyadipta Pyne and his collaborator Prof. Geoffrey Fox at Indiana University, Bloomington, and coworkers integrated several LSTM-based strategies into AICov to include not only data on the disease but also socioeconomic covariates and various risk factors at a local level. The compiled data are fed into AICov, leading to a powerful deep learning framework for improved outcome prediction.

Modeling COVID-19 Death Rates in Populations with Comorbidities


Current evidence shows that the prevalence of certain comorbidities in a given population could make it more vulnerable to serious outcomes of COVID-19, including fatality. A new mixture of polynomial-time series (MoPTS) model was developed to simultaneously identify (a) clusters of U.S. cities in terms of their COVID-19 death rates, and (b) the different associations of those rates with some key comorbidities among the populations represented in the clusters. The study was conducted by Saumyadipta Pyne and collaborators (M. Maleki, R. Gurewitsch, M. Aruru, and G.J. McLachlan, University of Queensland).

Identification of patterns linking human mobility and COVID-19 dynamics


The 18th-century French mathematician Gaspard Monge, also considered the father of differential geometry, proposed Optimal Transport (OT) theory to determine the minimum effort needed to move or morph one distribution (say, a sand pile) into another (a fort for Napoleon’s army!). By using OT to measure the minimum cost of transforming a city’s distribution of human mobility measures during the pandemic into that of its COVID-19 incidence, the study analyzed temporal patterns of such dependency across more than 150 U.S. cities. The overall pattern in each of the identified clusters was summarized in the form of Wasserstein barycenters. The study (synopsis) was conducted by Saumyadipta Pyne with Frank Nielsen of École Polytechnique, Gautier Marti, and Sumanta Ray. Image courtesy: Wikipedia.
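To make the optimal transport idea above concrete, here is a minimal sketch, not the study's actual method or data. It assumes two histograms over the same evenly spaced time bins and computes the one-dimensional Wasserstein-1 distance, which in this special case reduces to the summed absolute difference between the two cumulative distributions. The function names and the toy `mobility` and `incidence` values are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def wasserstein_1d(a, b, spacing=1.0):
    """Wasserstein-1 distance between two histograms on a shared, evenly spaced grid.

    In one dimension the minimum transport cost equals the area between
    the two cumulative distribution functions.
    """
    if len(a) != len(b):
        raise ValueError("histograms must use the same bins")
    cdf_a = accumulate(normalize(a))
    cdf_b = accumulate(normalize(b))
    return spacing * sum(abs(ca - cb) for ca, cb in zip(cdf_a, cdf_b))


# Hypothetical toy data: mass per weekly bin (not from the study).
mobility = [0, 4, 3, 2, 1, 0, 0, 0]
incidence = [0, 0, 0, 4, 3, 2, 1, 0]  # same shape, shifted two bins later

cost = wasserstein_1d(mobility, incidence)
print(f"Transport cost between the two profiles: {cost:.3f} bin widths")
```

Because the second profile is the first one shifted by two bins, the cost comes out to two bin widths: every unit of mass has to travel two bins. Real mobility and incidence profiles differ in shape as well as timing, and summarizing clusters with Wasserstein barycenters requires more machinery than this sketch shows.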

COVID-19 model for strategic lockdown policy

CO19 Outbreak lockdowns map

In 1957, M.S. Bartlett, FRS, introduced the concept of critical community size (CCS), below which an infectious disease does not persist in a closed population. With a subaward from an NIH Fogarty grant (PI: D. Burke, Co-PIs: C. Bunker, S. Pyne) for training disease modelers in India, Prof. Indranil Mukhopadhyay and Sarmistha Das of the Indian Statistical Institute, and their collaborators, used CCS to develop a model of strategic and focused lockdown policy for COVID-19 in a given population. Saumyadipta Pyne is a co-author of the study, which was published in ‘Statistics and Applications’ in June 2020. Image courtesy: Wikipedia.

Probabilistic Event Detection using Data Fusion

PA Radon EPA

Dr. Saumyadipta Pyne and collaborators (Prof. Benjamin Kedem and Xuze Zhang, University of Maryland) have developed a new statistical framework for real and synthetic data fusion to estimate exceedance probabilities in an observed stream of events with only a few observations. Starting with a baseline distribution, this method can model a dynamic distortion of that original template and thus be used for modeling environmental exposures. The study was published in ‘Applied Stochastic Models in Business and Industry’ in June 2020 and covered in the press. A follow-up study addressed the problem of model selection in such data fusion. Image courtesy: US EPA.

A Computational Model to Identify Rare Events in Big Data


Whether it is a forgotten shelf of classics in a large library, or a tiny collection of cells with special properties in our immune system, the presence of rare events in a large sample is often very hard to detect without precise guidance. The problem gets computationally even harder if the search space has many dimensions. Dr. Saumyadipta Pyne of PHDL led an interdisciplinary team of researchers from Europe, Asia, and the United States to develop an efficient solution using a Bayesian hierarchical model and powerful parallel inference.

PHDL Scientific Director Gives Prof. C.R. Rao Centenary Lecture

Rao and Pyne

Dr. Saumyadipta Pyne, Scientific Director of PHDL and faculty member in Biostatistics, delivered the Prof. C.R. Rao Birth Centenary Lecture on January 2, 2020, at the Department of Statistics, University of Pune (formally, Savitribai Phule Pune University). Born in 1920, C.R. Rao, F.R.S., is known for his pioneering work that laid the foundations of many branches of statistics. These include the topic of the lecture, “On weighted distributions and applications”, which featured Pyne’s recent work on environmental data fusion. Weighted distributions allow inference when it is difficult to observe random samples from a population under study. Rao was University Professor at the University of Pittsburgh in the 1980s, when he established a unique Center for Multivariate Analysis at Pitt.
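As a small illustration of how weighted distributions support inference from non-random samples, the sketch below simulates size-biased sampling. A weighted version of a density f(x) has the form w(x) f(x) / E[w(X)]. The weight w(x) = x used here is an assumed textbook case, not an example taken from the lecture, and the function names and simulated population are hypothetical. Under this weight, the expected value of 1/X in the biased sample equals 1/E[X], so the harmonic mean of the biased sample recovers the population mean.

```python
import random


def size_biased_sample(population, k, rng):
    """Draw k values with probability proportional to the value itself, w(x) = x."""
    return rng.choices(population, weights=population, k=k)


def corrected_mean(sample):
    """Harmonic mean of the sample; under w(x) = x it estimates the population mean."""
    return len(sample) / sum(1.0 / x for x in sample)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of positive measurements (illustration only).
population = [rng.uniform(1.0, 10.0) for _ in range(20000)]
true_mean = sum(population) / len(population)

# Larger values are more likely to be observed, so the sample is not random.
sample = size_biased_sample(population, 20000, rng)
naive_mean = sum(sample) / len(sample)
adjusted_mean = corrected_mean(sample)

print(f"population mean:           {true_mean:.2f}")
print(f"naive mean of biased data: {naive_mean:.2f}")
print(f"weight-corrected mean:     {adjusted_mean:.2f}")
```

The naive average overstates the population mean because large values are over-represented, while the weight-corrected estimate lands close to it. The correction works only because the weight function is known; when it must be estimated, as in data fusion settings, more elaborate models are needed.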