Go to File -> Save As to save your program. Close SAS. Open the folder where you saved lab1.sas7bdat. Is it still there? Where are the datasets original and restricted?

When you close SAS, it deletes any temporary datasets (i.e., those in the WORK library). To get back to where we were, open SAS and open your saved editor program.

Create permanent datasets

Replace

    data original;
       set EPI204.lab1;
    run;

by

    data EPI204.original;
       set EPI204.lab1;
    run;

You should see a file named original.sas7bdat on your P drive, or wherever you saved the lab1.sas7bdat file.

1. Data Exploration

1.1 Summary statistics

Open your saved editor program and re-run it. Let’s examine the summary statistics for FVC.

    proc means data = restricted n mean std min median q1 q3 max nmiss;
       tables fvc;
    run;

What’s wrong with this code? Where did you find this information? Correct the code and describe the distribution of FVC in this cohort.

    proc means data = restricted n mean std min median q1 q3 max nmiss;
       var fvc;
    run;
EPI204 – Lab Session 1

1.2 Histograms and box plots

Create histograms and box plots for forced vital capacity (FVC) and height using the newly created dataset restricted to children. Describe the distributions of FVC and height in children.

    proc univariate data = restricted plot;
       var fvc height;
       histogram;
    run;

    proc sgplot data = restricted;
       title "Forced Vital Capacity distribution";
       vbox fvc;
    run;

Note that once you set a title, it remains the title of all subsequent output until you replace it with something else.

1.3 Scatter plot

Describe the relationship between FVC and height as shown by a scatter plot.

    proc gplot data = restricted;
       plot fvc*height;
    run;

2. Linear Regression Model

A statistical model equation expresses the relation between a response variable and a set of other variables. The model predicts the outcome variable (also called the dependent or response variable) from a function of regressor variables (also called independent variables, predictors, explanatory variables, or factors) and parameters. In a linear regression model, the predictor function is linear in the parameters. For example, a simple linear regression model equation has the following form:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \qquad (1)$$

where $Y_i$ is the response variable, $X_i$ is a regressor variable, $\beta_0$ and $\beta_1$ are unknown parameters to be estimated, and $\varepsilon_i$ is the error term, for observations $i = 1, 2, \ldots, n$.

Model (1) can be fit with different procedures in SAS, including PROC REG and PROC GLM.

3. Linear Models: The REG procedure (PROC REG)
The REG procedure is a general-purpose procedure for regression. It is used when the outcome variable is continuous; the predictor variables may be continuous or categorical (categorical variables divide the observations into discrete groups). Because PROC REG has no CLASS statement, categorical predictors must be coded as indicator variables before they enter the model. PROC REG offers numerous statements and options, which you can look up in the SAS online documentation. In this lab session, we focus on the options you will use during the class.
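As a starting point, a minimal sketch of fitting Model (1) with PROC REG might look like the following. It assumes that the temporary dataset restricted exists in the current session and contains the variables fvc and height; using FVC as the response and height as the regressor is an illustrative choice here, not necessarily the model fitted later in the lab.

```sas
/* Minimal sketch: simple linear regression of FVC on height.           */
/* Assumes WORK.restricted exists and contains variables fvc and height. */
proc reg data = restricted;
   model fvc = height;   /* response = regressor; an intercept is included by default */
run;
quit;                    /* PROC REG stays active after RUN; QUIT ends the procedure */
```

In the parameter estimates table, the row labelled Intercept corresponds to the estimate of β0 and the row labelled height to the estimate of β1.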