Biostatistics Course Descriptions : School of Public Health & Health Sciences : UMass Amherst

PUBHLTH 223: Introduction to Biostatistics for Public Health

Primary Audience: Undergraduate

This introductory course is designed to give students the basic skills to organize and summarize data, along with an introduction to the fundamental principles of statistical inference. The course emphasizes an understanding of statistical concepts and interpretation of numeric data summaries along with basic analysis methods, using examples and exercises from medical and public health studies. The course does not require a high-level mathematics background, and will highlight the use and integration of statistical software, spreadsheets and word processing software in conducting and presenting data summaries and analyses.

PUBHLTH 390R: Introduction to Data Science Using R

Primary Audience: Undergraduate

This course focuses on data visualization and data transformation, followed by other topics including exploratory data analysis and programming. You will learn the most important tools in R to do data science and gain hands-on experience through in-class coding activities and homework assignments. Along with the introduction of tools in R, you will also learn about basic concepts in data science. Throughout the course, you will practice communicating your results with others.

PUBHLTH 460: Telling Stories with Data: Statistics, Modeling, and Data Visualization

Primary Audience: Undergraduate

The aim of this course is to provide students with the skills necessary to tell interesting and useful stories in real-world encounters with data. Specifically, they will develop the statistical and programming expertise necessary to analyze datasets with complex relationships between variables. Students will gain hands-on experience summarizing, visualizing, modeling, and analyzing data. Students will learn how to build statistical models that can be used to describe and evaluate multidimensional relationships that exist in the real world. Specific methods covered will include linear and logistic regression. Students will work with the R statistical computing language, and by the end of the course the work will require substantial independent programming. The course will not provide explicit or detailed training in R programming. To the extent possible, the course will draw on real datasets from biological and biomedical applications. This course is designed for students who are looking for a second course in applied statistics/biostatistics (e.g., beyond PUBHLTH 223, 390B or STAT 240), or an accelerated introduction to statistics and modern statistical computing.

PUBHLTH 490Z: Statistical Modeling for Health Data Science

Primary Audience: Undergraduate

This course is aimed at developing a broad understanding of statistical models with application to real data. Specifically, students will gain hands-on, in-depth experience analyzing data using simple and multiple linear regression, logistic regression, and multinomial and Poisson regression, along with an introduction to machine learning. This course is designed for students who are looking for a second course in applied statistics/biostatistics beyond PUBHLTH 460, but it can also be taken after PUBHLTH 223, 390B, STAT 240, or an equivalent introduction to statistics and modern statistical computing.

540: Intro Biostatistics

Primary Audience: Foundational pre-requisite for MS Biostatistics

Principles of statistics applied to the analysis of biological and health data and to the evaluation of public health and clinical programs. Gen Ed: R2 (Analytical Reasoning).

640: Intermediate Biostatistics

Primary Audience: Foundational pre-requisite for MS Biostatistics

Principles of statistics applied to analysis of biological and health data. Continuation of Bioepi 540 including analysis of variance, regression, nonparametric statistics, sampling, and categorical data analysis.

597D-E-A: R for data science (Levels 1-3)

Primary Audience: Undergraduates, MS/PhD in Biostatistics, Epidemiology

R has emerged as a preferred programming language in data science. This sequence of three courses covers topics in R programming to develop powerful, robust, and reusable data science tools. Main topics in Part I include data wrangling, visualization, and reporting using R Markdown. Part II focuses on programming, modeling, iteration, and the development of web apps using R Shiny. In Part III, we learn how to collaborate on code using GitHub and write R packages.

691F: Data Mgmt & Analysis/SAS

Primary Audience: Undergraduates, MS/PhD in Biostatistics, Epidemiology

SAS software is used widely outside academia. Many graduates find it a useful skill in job hunting. This course covers using SAS for basic data manipulation and analysis. It also reinforces understanding of key statistical concepts. You will use SAS to: read data from many formats; generate univariate statistics and histograms; define new variables using logic and functions; merge and subset data sets; make bivariate tables and scatterplots; test for association; perform linear and logistic regression.

690C: Data Mgmt & Analysis/Stata

Primary Audience: Undergraduates, MS/PhD in Biostatistics, Epidemiology

This course is an introduction to the design, management, and use of data management systems for the collection and analysis of research data, especially epidemiologic research data on humans. MS Excel, MS Access, and Stata are emphasized. Topics include database development, Health Insurance Portability and Accountability Act (HIPAA) compliance, data manipulation and cleaning, data summarization, and selected topics in statistical analysis programming.

690A: Fundamentals of Probability and Statistical Inference

Primary Audience: MS Biostatistics, MS/PhD Epidemiology

The goal of this 3-credit course is to introduce the fundamentals of probability theory, statistical inference tools, and their application to biostatistics. The course is intended for first-year graduate students in the Biostatistics MS program. The topics in this course include basic concepts of probability, random variables, important probability distributions (e.g., normal, exponential, binomial and Poisson), marginal distribution, conditional distribution, joint distribution, expectation and variance, conditional expectation, law of large numbers, central limit theorem, sampling distributions, point estimation, maximum likelihood estimation, method of moments and estimating equations, interval estimation, and hypothesis testing. Examples from biostatistical applications will be used whenever possible. Simple simulations with R software will be used to illustrate some concepts in probability and statistical inference.

690Z: Health Data Science and Statistical Modeling

Primary Audience: MS Biostatistics, MS/PhD Epidemiology

This course is for students who want to learn essential statistical and computational skills for health data science. Students will obtain hands-on experience in implementing a wide range of commonly used statistical methods with real data from public health and biomedical research using the statistical programming language R. The course motivates statistical reasoning and methods through real health data. The focus of the course is to train students in refining a scientific question into a statistical framework, choosing proper regression models, writing scripts and executing them in R, and interpreting scientifically meaningful findings.

690P: Topics in Biostatistics and Data Science

Primary Audience: MS Biostatistics, MS/PhD Epidemiology

The course introduces advanced central topics in biostatistics and health data science including maximum likelihood inference, survival analysis, design and analysis of clinical trials, models for correlated data, Bayesian modeling, and causal inference. The course motivates statistical reasoning and methods through substantive research questions and features of data typically available in public health and biomedical research. Students will obtain hands-on experience in applying selected methods to real data using the statistical programming language R.