SAS Workshops
SAS experience is not required
An introduction to the SAS system. This eight-hour workshop concentrates on the SAS DATA STEP, with emphasis on data input, manipulation, output and summary.
General Considerations
Environment
Modes of Operation
- Interactive
- SAS — Windowed
- Non-Interactive (batch)
- SAS fn (output files)
- Requires Editor (write, execute and examine)
SAS Program Structure
Types of Blocks
- DATA Steps
- PROC Steps
Path of Execution
- Scan Statements
- Delimit Blocks (keywords)
- Execute Blocks Sequentially (any number, any order)
Writing Programs to Reflect SAS Structure
- Separation of Blocks
- Indentation within Blocks
- Use of Comments (human factor)
- Use of Capitalization
DATA STEP
Description of a SAS Data Set
What is in a Data Set
Difference Between a Data Set and a Data File
Getting Data In
Input Statement
- List (Free Field) Input
- Column Input
- Formatted Input (Pointers)
- Mixed Input
- Importance of Missing Values
Program Data Vector (PDV)
- Status During Data Step Iterations
- Maintaining Rectangular Shape (missing values)
Where Data Sets Come From (sources)
- Internal Data (Cards)
- External Data (Infile, Import)
- Other Datasets
- SET Statement
- MERGE Statement
- PROCs
- OUTPUT Statement
Manipulation of Data (Programming)
IF Statements
- Subsetting IF with Retention
- Subsetting IF with Deletion
- IF - THEN - ELSE (Efficiency)
DO - END Statements
- DO Groups (Multiple Statements)
- DO Loops (Iterative Calculations)
- Use of the ARRAY Statement
Controlling Output
- Implied Output
- Explicit Output (OUTPUT Statement)
- Data Set Options (KEEP, DROP, RENAME)
SAS Functions and Operators
- Handling of Missing Values
- Examples of SAS Operators & Functions
Getting Data Out
Permanent Datasets
- Libraries
- Datasets and Members
Exporting ASCII Files
- FILE Statement
- PUT Statement
Exporting Other Data Types
SAS experience is required
This two- to three-hour workshop covers SAS procedures for estimating sample size and statistical power. It covers both simple designs and advanced designs, including generalized linear mixed models.
Background Material
Basic sample size and power estimation
- Sample size for one and two sample means
- Sample size for one and two sample proportions
Advanced sample size and power estimation
- PROC GLMPOWER for fixed RCB model
- PROC GLMPOWER for literature example
- Article for literature example
- PROC GLIMMIX for mixed effects normal RCB model
- PROC GLIMMIX for mixed effects binomial RCB model
SAS and DATA STEP experience is required
Using the SAS system to set up and run simple generalized linear models and to compute their associated statistical power.
- Handout
- Binomial data
- Count data
- Proportional data
- Binomial power
- Negative binomial power
- Mixed model techniques and warnings in SAS
- Stroup Agronomy journal article (2014). Rethinking the analysis of non-normal data in plant and soil science.
This workshop covers the analysis of designed experiments with PROC MIXED. Topics include mixed model ANOVA, repeated measures, regression and spatial correlation. Contrasts, mean estimation and pairwise comparisons using mixed models are also discussed.
Introduction
Analysis of Variance (ANOVA)
Requirements
- Data Structure
- Response numeric
- Class variables numeric or alpha
- Yield Dataset
- Statements
- CLASS Statement
- MODEL Statement
- Single terms
- Crossed terms (*)
Additional Statements and Options
- RANDOM Statement
- REPEATED Statement
- LSMEANS Statement
- DIFF Option
- CONTRAST Statement
Common Experimental Designs
- RCB
- Example 1 — RCB
- additional term
- Split Plot and Split Block
- Example 2 — Split Plot
- Example 3 — Split Block
- Repeated Measures
- Example 4 — Repeated measures RCB
Advanced Analyses
- Multiple Regression
- Example 1 — Meta-analysis of dairy data
- Spatial Correlation ANOVA
- Example 2 — CRD with subsampling
SAS multivariate procedures are covered. The included procedures are PROC PRINCOMP, PROC CLUSTER, PROC DISCRIM, MANOVA (PROC GLM) and PROC CANCORR.
Introduction
Example Data
Excel spreadsheet
PROC PRINCOMP
PROC CLUSTER
PROC DISCRIM
MANOVA (PROC GLM)
PROC CANCORR
This workshop covers the analysis of designed experiments with PROC ANOVA and PROC GLM. Topics include ANOVA, ANCOVA, regression, contrasts and mean estimation.
Introduction
Analysis of Variance (ANOVA)
Requirements
- Data Structure
- Response numeric
- Class variables numeric or alpha
- Statements
- CLASS Statement
- MODEL Statement
- Single terms
- Crossed terms (*)
- Handling of Missing Values
Additional Statements and Options
- MEANS Statement
- Multiple Comparison Options
- LSMEANS Statement
- PDIFF and STDERR Options
- CONTRAST and ESTIMATE Statements
- OUTPUT Statement
- Residual analysis
Common Experimental Designs
- CRD and RCB
- Example 1 — CRD
- ANOVA table layout
- SS types
- Example 2 — RCB
- Additional term
- Split Plot and Split Block (Multiple Error Terms)
- Example 3 — Split Plot
- TEST Statement
- Example 4 — Split Block
- More than one TEST Statement
Regression
Requirements
- Data Structure
- Response Numeric
- Independents Numeric
- Statements
- No CLASS Statement
- MODEL Statement
- Single terms
- Crossed terms (*)
- Handling of Missing Values
Additional Statements and Options
- OUTPUT Statement
- Regression Diagnostics
- Predicted Values
- WEIGHT Statement
- Variance Stabilization
- MODEL Statement Options
- Predicted values (P) and CIs (CLI, CLM)
- INT and NOINT
Regression Examples
- Simple Linear Regression (SLR)
- Example 5
- Multiple Linear Regression
- Example 6
Analysis of Covariance/Dummy Variable Regression
Requirements
- Data Structure
- Response Numeric
- Covariate Numeric
- Class Variables Numeric or Alpha
- Statements
- CLASS Statement
- MODEL Statement
- Single terms
- Crossed terms (*)
- Handling of Missing Values
Additional Statements and Options
- LSMEANS Statement
- Adjusted Means in ANCOVA
- CONTRAST Statement
- Comparison of means in ANCOVA
- Comparison of slopes in D. V. R.
- MODEL Statement Options
- SOLUTIONS for parameter estimates in D. V. R.
- INT and NOINT in D. V. R.
Analysis of Covariance (ANCOVA)
- Example 7
- ANOVA Table
Dummy Variable Regression
- Example 8 — with INT
- SOLUTIONS Option
- Estimated Parameters for differential effect
- Example 9 — with NOINT
- SOLUTIONS Option
- Parameter estimates
- CONTRAST and ESTIMATE Statements
- Comparing estimated parameters and predicted values
- SOLUTIONS Option
Analysis of data utilizing regression techniques in SAS. Topics include linear and multiple regression, regression diagnostics and model comparison using dummy variables.
Introduction
Regression
Requirements
- Data Structure
- Response Variables — numeric
- Regressors — numeric
- Statements
- MODEL Statement
- MODEL Statement Options
- Printing
- Diagnostic
- Model Specification
Regression Diagnostics
- Residual Analysis
- Options: P, R, etc.
- Plotting
- PLOT and PAINT Statements
- Residual Analysis (cont)
- OUTPUT Statement
- Summary Procedures: UNIVARIATE
- External Plotting
- Influence and Collinearity
Additional Topics and Techniques
- Weighted Regression
- Model Comparison
- PROC REG
- DATA STEP => Dummy Variables
- TEST Statement
- PROC REG
- PROC GLM
- Crossed terms (*)
- CLASS and CONTRAST Statements
Analysis of discrete categorical data using SAS. Topics include one-way, two-way and multi-way tables, and contrasts using PROC FREQ and PROC CATMOD.
Introduction
Data
Categorical Variable Types
- Nominal
- Ordinal
- Interval
Categorical Responses
- Binary
- Summary Counts
Example Data
Categorical Procedures
PROC FREQ
- One-way contingency tables
- Two-way contingency tables
- Statements
- TABLES
- Options
- WEIGHT
- Other Features
- Output
PROC CATMOD
- Multi-dimensional tables
- Response Functions
- Model Types
- Linear
- Log-Linear
- Statements
- MODEL
- Options
- WEIGHT
- DIRECT
- LOGLIN
- Examples
- Multi-way structure
- Linear
- Log-Linear
- One-way and two-way structures
- Contrast Statement
Analysis of discrete data using SAS. Topics include binomial, multinomial and count response types.
Introduction
PROC GENMOD
Experimental vs Observational data
- Data Types
- Binary
- Recorded Counts
Statements for Binary Data
- CLASS
- MODEL
- Distribution: Binomial and Multinomial
- Link: Logit
- Interpretation of results
- LSMEANS
- DIFF
- CONTRAST
Statements for Count Data
- CLASS
- MODEL
- Distribution: Poisson
- Link: Log
- LSMEANS
- DIFF
- CONTRAST
Analysis of data through nonlinear regression models in SAS. Topics include model form, parameterization, estimation, diagnostics and model comparison.
Introduction
Data Structure
Response Variable — numeric
Regressor (independent) Variable — numeric
Statements
Required
- PROC NLIN
- METHODS
- PARMS
- Parameter Definition
- Starting Values
- MODEL
- DER (Derivatives)
- Naming Convention
Optional
- BOUNDS
- OUTPUT
- Programming
Examples
Special Topics
Diagnostics
- Residual Analysis
- Univariate summaries
- Residual plots
- Nonlinearity
- Profile t-plots
- Profile Pair Sketches
Linearization
- Transformations
Parameterization
- Model Behavior
- Parameter Interpretation
Contrasts and Model Comparisons
- Hypothesis Testing
- Single df tests
- Multiple df tests
Using the SAS system for graphical display and output. Topics include charts, scatterplots, 3D surface plots and contour plots. Output of graphics to external file formats for presentation is also covered.
Introduction
Procedures
GPLOT
- Required Statements
- PLOT
- Options
- OVERLAY
- VREF, HREF
GCHART
- Required Statements
- VBAR, HBAR
- PIE
- General Options
- FREQ
- VBAR/HBAR Options
- FREQ
- PERCENT
- CFREQ
- CPERCENT
- MEAN, SUM and SUMVAR
- PIE Options
- FREQ
- FILL
G3D
- Required Statements
- PLOT
- SCATTER
- General Options
- ROTATE
- TILT
- PLOT Options
- CTOP / CBOTTOM
- XYTYPE
- SCATTER Options
- NONEEDLE
- SHAPE
GCONTOUR
- Required Statements
- PLOT
- Options
- LEVELS
- VREF / HREF
Other Statements
- SYMBOLx
- GOPTIONS
- CBACK
Applied SAS/Graph
Regression Analysis
- Initial Graphics
- GPLOT
- Linear Regression Output
- GPLOT
- Residual Plots
- Observed and Predicted
- Prediction Bounds
- G3D
- Surface plots in Multiple Regression
- GCONTOUR
- Surface contours in multiple regression
- GPLOT
- Nonlinear Regression Output
- GPLOT
- Residual Plots
- Observed and Predicted
- Prediction Bounds
- GPLOT
- DVR Output
- GPLOT
- Overlaid lines
Analysis of Variance
- Outputting Means
- Mean Bar Charts
- GCHART
- Interaction Plots
- GPLOT
- Categorical Analysis
- Distributional Plots
- GCHART
- Histograms
- GCHART
- Pie Charts
- GCHART
Saving Graphic Output
File Formats
- Bitmap
- Vector
Cleaning Up Output
- Importing and Exporting
Reference books we have found useful for researchers in the college (purchase not required). Those in italics are, we feel, essential for the particular topic.
General
Useful for the base SAS system
- SAS Language
- SAS Procedures Guide
- SAS Language and Procedures
Specifics for operating systems
- SAS Companion for the Microsoft Windows Environment
- SAS Companion for the UNIX Environment and Derivatives
Specialized
Statistics related
- SAS/STAT User's Guide (Two Volumes)
- Applied Statistics for the SAS Programming Language (good for beginners)
- SAS System for Linear Models
- SAS System for Regression
- SAS/STAT Changes and Enhancements
SAS/GRAPH related
- SAS/GRAPH Software (Two Volumes)
- Quick Results with SAS/GRAPH Software (good for beginners)
All of these books may be ordered through the VandalStore or directly from the SAS Books website.
The VandalStore has a free CD containing all of the SAS manuals for U of I SAS users.