Prior Classes 2021

WUSS Online Classes 2021 – PRIOR CLASSES

In lieu of an annual conference, WUSS is offering an extensive menu of online training classes taught by seasoned industry experts throughout 2021. The classes listed below have already taken place. Please click on each class title for a detailed description of the course and information about the instructors.

These are past classes – for information about UPCOMING CLASSES, please click here!

Date | Course Title (click for description) | Instructor(s)
Apr 26 | SAS + R Part 1: Connecting SAS and R in Your Data Science Workflow | Hunter Glanz
Apr 28 | SAS + R Part 2: Using R Shiny to Make Your Data Wrangling and Visualization Interactive | Hunter Glanz
Jun 17 | Why You Are Using PROC GLM Too Much (and What You Should Be Using Instead) Part 1 | Deanna Schreiber-Gregory and Peter Flom
Jun 18 | Why You Are Using PROC GLM Too Much (and What You Should Be Using Instead) Part 2 | Deanna Schreiber-Gregory and Peter Flom
Jun 22 | How Sick is my Cohort of Patients? A General Approach to Identifying Chronic Conditions | Patricia Ferido
Jun 25 | Elementary Logistic Regression with Predictive Modeling | Bruce Lund

Course Descriptions

SAS + R Part 1: Connecting SAS and R in Your Data Science Workflow

Hunter Glanz
Monday, April 26, 2021, 10:00am-2:00pm Pacific Daylight Time

As robust statistical software packages, SAS and R boast a great number of tools for addressing all of your data-related needs. While there is considerable overlap in what they provide, today’s statistical and data science problems increasingly involve multiple software packages. After all, if you have access to all of these tools, why not explore how they can improve your workflow? In this class we will explore the complete workflow of cleaning a dataset, exploring it, and visualizing it using a combination of SAS and R.

SAS + R Part 2: Using R Shiny to Make Your Data Wrangling and Visualization Interactive

Hunter Glanz
Wednesday, April 28, 2021, 10:00am-2:00pm Pacific Daylight Time

While both SAS and R include a rich suite of tools for working with your data, there is often a collection of tasks and activities that get repeated with every new dataset. Traditionally, such repetition could be addressed by building macros or functions. R Shiny enhances this process by making your data work interactive! Not only can this save you some code and work, but it also provides a way for consumers of your work to do all of your cool data science-y things without needing to know how to program. In this class we will build our very own basic Shiny applications using R.

Why You Are Using PROC GLM Too Much (and What You Should Be Using Instead) Part 1

Deanna Schreiber-Gregory and Peter Flom
Thursday, June 17, 2021, 10:00am-2:00pm Pacific Daylight Time

The general linear model (linear regression and ANOVA) is one of the most commonly used statistical methods. However, the GLM makes assumptions and sometimes these assumptions are violated. There are many techniques that can be used to deal with various violations, and there are SAS PROCs to implement these. These include: Quantile regression, Robust regression, Cubic splines and other forms of splines, Multivariate adaptive regression splines (MARS), Regression trees, Multilevel models, Ridge Regression, LASSO, and Elastic Nets, among other methods. Covered PROCs include QUANTREG, ROBUSTREG, ADAPTIVEREG and MIXED.

Part 1: Intro, assumptions, diagnosing violations, quantile regression (QUANTREG), MARS (ADAPTIVEREG) and splines (TRANSREG).

Why You Are Using PROC GLM Too Much (and What You Should Be Using Instead) Part 2

Deanna Schreiber-Gregory and Peter Flom
Friday, June 18, 2021, 10:00am-2:00pm Pacific Daylight Time

The general linear model (linear regression and ANOVA) is one of the most commonly used statistical methods. However, the GLM makes assumptions and sometimes these assumptions are violated. There are many techniques that can be used to deal with various violations, and there are SAS PROCs to implement these. These include: Quantile regression, Robust regression, Cubic splines and other forms of splines, Multivariate adaptive regression splines (MARS), Regression trees, Multilevel models, Ridge Regression, LASSO, and Elastic Nets, among other methods. Covered PROCs include QUANTREG, ROBUSTREG, ADAPTIVEREG and MIXED.

Part 2: Ridge regression (REG), Lasso and elastic nets (GLMSELECT), and multilevel models (MIXED and GLIMMIX).
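
As a rough illustration of the shrinkage idea behind the ridge regression topic listed for Part 2, the Python sketch below fits a one-predictor ridge regression on centered data. It is not course material (the class itself works with SAS PROCs); the function name `ridge_slope`, the toy data, and the penalty values are assumptions made for this example.

```python
def ridge_slope(x, y, penalty):
    """Slope of a one-predictor ridge regression on centered data.

    Minimizes sum((y_i - b * x_i)^2) + penalty * b^2 after centering x and y,
    which gives b = Sxy / (Sxx + penalty).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / (sxx + penalty)


# Toy data (hypothetical): y is exactly 2 * x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

ols_slope = ridge_slope(x, y, penalty=0.0)      # no penalty: ordinary least squares
shrunk_slope = ridge_slope(x, y, penalty=10.0)  # penalty pulls the slope toward 0
```

With a penalty of zero the function reduces to the ordinary least squares slope; a larger penalty shrinks the estimate toward zero, which is the trade-off ridge regression makes to stabilize estimates.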

How Sick is my Cohort of Patients? A General Approach to Identifying Chronic Conditions

Patricia Ferido
Tuesday, June 22, 2021, 10:00am-2:00pm Pacific Daylight Time

With the COVID-19 pandemic, the need for evidence-based healthcare research has become increasingly apparent. Even before the recent health crisis, the volume of available data on healthcare had been growing exponentially. Claims data and electronic health records provide rich insight into the health status of patients and the care provided by health care systems. Successfully uncovering these insights, however, requires an understanding of the data, as well as standardized and validated methods of analysis.

This class will provide an overview of best practices when working with claims data, specifically Medicare claims data. Topics covered will include: the general structure of claims data, how best to use that information to identify disease cohorts, different approaches for measuring the health status of patients (e.g., the Charlson Comorbidity Index, the Elixhauser Comorbidity Index, and Hierarchical Condition Category coding), and a deep dive into the Chronic Conditions Data Warehouse (CCW) algorithms.

The class will conclude by workshopping a SAS macro that applies CCW-like rules to any dataset that resembles insurance claims or electronic health records with a full picture of diagnoses and procedures from patient medical visits. The macro package includes the validated CCW algorithms (the default option), but also gives the user the flexibility to apply the algorithm to a different set of diagnoses and procedures. The user can either implement variations of the CCW definitions or identify entirely new conditions, so long as they can be implemented using diagnosis or procedure codes, claim types, and CCW-like rules. After taking the class, students will understand the key factors to consider in disease cohort analysis and will have direct experience using this package to identify diseases in simulated data.
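
To give a sense of what a rule built from diagnosis codes, claim types, and a lookback window might look like, here is a minimal Python sketch. It is not the SAS macro used in the class and does not reproduce any actual CCW definition: the rule structure, the placeholder codes (`DX_A`, `DX_B`), the two-year window, and the claim-count thresholds are all assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical rule: every name, code, and threshold here is illustrative only.
RULE = {
    "codes": {"DX_A", "DX_B"},                        # placeholder diagnosis codes
    "lookback_days": 730,                             # assumed two-year reference period
    "min_claims": {"inpatient": 1, "outpatient": 2},  # meeting either threshold qualifies
}


def has_condition(claims, rule, as_of):
    """Return True if a patient's claims satisfy the rule as of a given date.

    Each claim is a dict with 'date', 'claim_type', and 'dx_codes' keys.
    """
    counts = {claim_type: 0 for claim_type in rule["min_claims"]}
    for claim in claims:
        age_days = (as_of - claim["date"]).days
        in_window = 0 <= age_days <= rule["lookback_days"]
        has_code = bool(rule["codes"] & set(claim["dx_codes"]))
        if in_window and has_code and claim["claim_type"] in counts:
            counts[claim["claim_type"]] += 1
    return any(counts[t] >= needed for t, needed in rule["min_claims"].items())


# Simulated claims for one patient (made-up data).
patient_claims = [
    {"date": date(2020, 3, 1), "claim_type": "outpatient", "dx_codes": ["DX_A"]},
    {"date": date(2020, 9, 15), "claim_type": "outpatient", "dx_codes": ["DX_B", "DX_Z"]},
    {"date": date(2017, 1, 10), "claim_type": "inpatient", "dx_codes": ["DX_A"]},
]

flagged = has_condition(patient_claims, RULE, as_of=date(2020, 12, 31))
```

In this toy example the patient is flagged because two qualifying outpatient claims fall inside the window, while the older inpatient claim is ignored. Swapping in a different code set or different thresholds is the kind of variation the description above refers to.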

Elementary Logistic Regression with Predictive Modeling

Bruce Lund
Friday, June 25, 2021, 10:00am-2:00pm Pacific Daylight Time

This class presents light theory, supported by simulations, for understanding binary logistic regression models using SAS®. This discussion of logistic regression begins at the beginning. No prior experience is assumed.

Once the basics of logistic regression are introduced, the class focuses on using logistic models in predictive modeling on large datasets. Examples from credit risk and automotive marketing are given. The class will be less focused on explanatory models as would arise in the bio-sciences.

Topics include: Logistic regression versus other methods; Likelihood function and maximum likelihood estimators; Statistics for predictor and overall model fit; Screening, binning, and transforming of predictors (including weight of evidence coding); Discussion of multicollinearity; Predictor selection methods using PROC LOGISTIC, HPLOGISTIC, and HPGENSELECT, including best subsets, stepwise with SBC/AIC, and Lasso; Model validation and assessment, including the c statistic, R-squares, classification error, and lift charts in the context of training, cross-validation, and validation samples.

Class uses BASE SAS and SAS/STAT. No usage of Viya or Enterprise Miner.
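
Two of the topics listed above, weight of evidence (WOE) coding and the c statistic, can be sketched in a few lines. The Python below is an illustration only and is not taken from the class, which uses SAS; the function names, the toy counts and scores, and the WOE sign convention (log of the bin's share of events over its share of non-events) are assumptions, and other sources define WOE with the opposite sign.

```python
import math


def weight_of_evidence(bins):
    """WOE per bin from (events, non_events) counts.

    Convention assumed here: ln(share of all events that fall in the bin /
    share of all non-events that fall in the bin). Bins with a zero count
    need special handling and are not covered by this sketch.
    """
    total_events = sum(e for e, _ in bins.values())
    total_non_events = sum(n for _, n in bins.values())
    return {
        label: math.log((e / total_events) / (n / total_non_events))
        for label, (e, n) in bins.items()
    }


def c_statistic(scores, outcomes):
    """Share of (event, non-event) pairs ranked correctly; ties count one half."""
    event_scores = [s for s, y in zip(scores, outcomes) if y == 1]
    non_event_scores = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = 0.0
    for es in event_scores:
        for ns in non_event_scores:
            if es > ns:
                concordant += 1.0
            elif es == ns:
                concordant += 0.5
    return concordant / (len(event_scores) * len(non_event_scores))


# Made-up binned predictor: bin label -> (events, non_events).
toy_bins = {"low": (10, 90), "mid": (30, 70), "high": (60, 40)}
woe = weight_of_evidence(toy_bins)

# Made-up model scores and observed outcomes (1 = event).
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
toy_outcomes = [1, 0, 1, 0, 0]
c = c_statistic(toy_scores, toy_outcomes)
```

Under this convention, bins where events are over-represented get a positive WOE and bins where they are under-represented get a negative one. The pairwise loop for the c statistic is quadratic in sample size, so it is only meant to show the definition, not to scale to large datasets.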

Meet the Instructors

Patricia Ferido is a Senior Research Programmer at the Leonard D. Schaeffer Center for Health Policy and Economics where she analyzes medical data for research on dementia care and treatment. Prior to joining the Schaeffer Center, she worked as an economics litigation consultant specializing in the analysis of labor data for wage and hour litigation. She holds a BA in both Economics and International Development Studies from UCLA and is pursuing a Master's in Public Policy Data Science at USC.

Peter Flom is a retired independent statistical consultant who worked with graduate students and researchers in the social, medical and behavioral sciences. He has been using SAS for over 20 years and has given talks at SAS Global Forum and many local and regional SAS user groups.

Hunter Glanz is an Associate Professor of Statistics and Data Science at California Polytechnic State University (Cal Poly, San Luis Obispo). He received a BS in Mathematics and a BS in Statistics from Cal Poly, San Luis Obispo followed by an MA and PhD in Statistics from Boston University. He maintains a passion for machine learning and statistical computing, and enjoys advancing education efforts in these areas. In particular, Cal Poly’s courses in R, SAS, and Python give him the opportunity to connect students with exciting data science topics amidst a firm grounding in communication of statistical ideas. Hunter serves on numerous committees and organizations dedicated to delivering cutting edge statistical and data science content to students and professionals alike. In particular, the ASA’s DataFest event at UCLA has been an extremely rewarding experience for the teams of Cal Poly students Hunter has had the pleasure of advising.

Bruce Lund is a statistical modeling consultant and trainer. For 15 years he was a statistical and modeling consultant for OneMagnify of Detroit. Before OneMagnify, he was the customer database manager at Ford Motor Company and a mathematics professor at University of New Brunswick, Canada. He has a mathematics PhD from Stanford University. Bruce Lund has presented at SAS Global Forum, SAS AnalyticsX, ASA CSP, and at regional SAS user group conferences.

Deanna Schreiber-Gregory is a Lead Research Statistician and Data Manager on contract through the Henry M. Jackson Foundation for the Advancement of Military Medicine to the Department of Defense in Bethesda, MD. She is also an Independent Consultant for Statistics, Research Methods, and Data Management in the private sector through Juxdapoze, LLC. Deanna has an MS in Health and Life Science Analytics, a BS in Statistics, and a BS in Psychology. Deanna has presented as a contributed and invited speaker at over 50 local, regional, national, and global SAS user group conferences since 2011.