A Computational Model to Identify Rare Events in Big Data

Whether it is a forgotten shelf of classics in a large library or a tiny collection of cells with special properties in our immune system, rare events in a large sample are often very hard to detect without precise guidance. The problem becomes even harder computationally if the search space has many dimensions. Dr. Saumyadipta Pyne of PHDL led an interdisciplinary team of researchers from Europe, Asia and the United States to develop an efficient solution using a Bayesian hierarchical model and powerful parallel inference.

PHDL Scientific Director Gives Prof. C.R. Rao Centenary Lecture

Dr. Saumyadipta Pyne, Scientific Director of PHDL and faculty member in Biostatistics, delivered the Prof. C.R. Rao Birth Centenary Lecture on January 2, 2020, at the Department of Statistics, University of Pune (formally, Savitribai Phule Pune University). Born in 1920, C.R. Rao, F.R.S., is known for his pioneering work that laid the foundations of many branches of statistics. That work includes the topic of the lecture, "On weighted distributions and applications", which featured Pyne's recent work on environmental data fusion. Weighted distributions allow inference when it is difficult to observe random samples from a population under study. Rao was University Professor at the University of Pittsburgh in the 1980s, when he established a unique Center for Multivariate Analysis at Pitt.

New PHDL members with geoinformatics background

Dr. Pedram Gharani completed his PhD in Information Science at the University of Pittsburgh in July 2019. His dissertation focused on sensor fusion for indoor positioning and obstacle detection. He joined PHDL as a postdoctoral associate, where he is working with Dr. S. Pyne on data fusion and population synthesis for public health.

Raanan Gurewitsch graduated with a BPhil in Information Science from the University of Pittsburgh in April 2019. His thesis involved geospatial data analysis to predict lead plumbing in Pittsburgh's water systems. Following a Civic Digital data science fellowship with the US Census Bureau, he joined PHDL to work with Dr. S. Pyne on mortality and environmental data analytics.

Lower Vaccination Rates Make Texas Cities Susceptible to Measles Outbreaks

A University of Pittsburgh Graduate School of Public Health study found that large and small cities in Texas are becoming increasingly vulnerable to measles outbreaks because more parents are exempting their children from required vaccinations. Texas is the largest state by population that allows parents to opt their children out of vaccinations for nonmedical reasons, making it an interesting place to study measles outbreaks. If the vaccination rate among students in Texas continues to decrease in schools with undervaccinated populations, the potential number of cases associated with measles outbreaks is estimated to increase exponentially. The 2018 vaccination rates in multiple metropolitan areas may permit large measles outbreaks, which could infect not only vaccine refusers but also other members of the population. These findings, published in JAMA Network Open on August 21, 2019, indicate that an additional 5% decrease in vaccination rates, which have been on a downward trend since 2003, would increase the size of a potential measles outbreak by up to 4,000% in some communities.

The Texas Pediatric Society asked the University of Pittsburgh Graduate School of Public Health to model Texas using FRED (Framework for Reconstructing Epidemiological Dynamics), a software platform developed at the Pitt Graduate School of Public Health as a tool for creating simulation models of dynamic processes in human social systems. FRED allows researchers to see how measles could spread from person to person. According to lead author David Sinclair, "If policy stays as it is and there is no change in the public's perspective of vaccinations and the importance of vaccinating their children, then the potential measles outbreaks will only get worse." Potential outbreaks also get worse when there is "geographic clustering" of unvaccinated people.

International Biometric Conference (IBC2020) in Seoul

Dr. Saumyadipta Pyne, Scientific Director of PHDL and member of the International Program Committee of the 30th International Biometric Conference (IBC 2020), will organize an invited session on data fusion and statistical matching at IBC 2020 in Seoul, July 5-10, 2020. These topics are key to the practice of sharing and integrating data, which is becoming increasingly popular in science and medicine due to improved data sharing infrastructures, distributed and interdisciplinary collaborative efforts, and cost sharing initiatives.

While data-related issues such as confounding, causal inference, data security and missing data are not new, extending the classical methodologies for them to the modern settings of Big Data is crucial to addressing some of the emerging challenges in data science. How do we draw joint inferences from very different survey designs, or protect individual privacy when combining data across multiple sources, or account for data that are systematically missing across studies that measured distinct sets of variables? This IBC 2020 session will feature cutting-edge methodologies for data fusion and statistical matching presented by experts from across the globe.

The session will be held in honor of the birth centenary year (2020) of the legendary statistician C. R. Rao, FRS, who is also a former President of the International Biometric Society, which organizes IBC. In the US, he was formerly University Professor at the University of Pittsburgh and is currently Eberly Professor of Statistics at Pennsylvania State University. Professor Rao's pioneering work (e.g., Rao score test, 1948) laid the foundations of many branches of statistics, including data fusion. He also played a key mentoring role in Dr. Pyne's career.

New Health Data Science Course

Dr. Saumyadipta Pyne, the Scientific Director of PHDL, will be teaching a new Biostatistics course in Fall 2019, BIOSTAT 2036: Introduction to Health Data Science. This course will teach students methods and concepts in data science that are motivated by real-life problems in public health. Students will become familiar with data science terms and will learn the concepts of exploratory data analysis, data cleaning, data wrangling, and visualization. Students will learn the necessary skills to tidy, manage, and visualize data and communicate results. This course will mainly use the R programming language but will also teach certain concepts in SQL and Python. The course lectures will cover the following general themes: data structures and representation, data wrangling and processing, computational tools and techniques, and case studies illustrating steps of analysis of real data, including examples from public health.

BIOSTAT 2036 is a newly designed introduction to data science that will be one of the required courses in the soon-to-be-formalized data science concentration of the MS degree in Biostatistics at Pitt Public Health, though it will be open to any student. The prerequisite (co-requisite) is BIOST 2039 or permission of the instructor. It will be offered in Fall 2019 on Tuesdays, 2:30-4:20 pm, in Room A425 Public Health.
