STAT1303A Data Management
10. Data Cleaning
Exercise 10

1. The data set NH contains the following variables for 112 elderly people.

   #   Variable   Type   Len   Label
   1   a02        Char   30    Case ID
   2   a05        Num    8     Marital status
   9   admit      Char   2     Admit from
   5   b04        Num    8     Daily cognitive skill
   3   b02a       Num    8     Short-term memory
   4   b02b       Num    8     Long-term memory
   6   b05a       Num    8     Easily distracted
   7   b05b       Num    8     Altered perception
   8   yob        Char   4     Year of birth

   The variables are coded as follows.

   A05
      1  = Never married
      2  = Married
      3  = Widowed
      4  = Separated
      5  = Divorced

   ADMIT
      PH = Private home/apt. with no home health services
      NP = Private home/apt. with home health services
      GS = Board and care/assisted living/group home
      HO = Nursing home
      EH = Acute care hospital
      RH = Rehabilitation hospital
      PS = Psychiatric hospital, MR/DD facility
      O  = Other

   B04
      0  = Independent
      1  = Modified independence
      2  = Moderately impaired
      3  = Severely impaired

   B02A, B02B
      0  = Memory OK
      1  = Memory problem

   B05A, B05B
      0  = Behavior not present
      1  = Behavior present, not of recent onset
      2  = Behavior present, over last 7 days appears different from
           resident's usual functioning

   (a) For variables ADMIT and A05,
       (i)  list all observations with invalid values.
       (ii) list all observations with missing values.
       ...
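As a rough illustration of what part (a) asks for, the following Python sketch separates missing values from invalid ones by checking each value against the code lists above. It is not the course's intended solution (which is not shown in this preview). The helper names `classify` and `list_problems` are hypothetical, the sample records are invented for illustration and are not rows of NH, and treating `None` or a blank string as "missing" is an assumption.

```python
# Valid codes copied from the code lists in the exercise.
VALID_A05 = {1, 2, 3, 4, 5}
VALID_ADMIT = {"PH", "NP", "GS", "HO", "EH", "RH", "PS", "O"}


def classify(value, valid_codes):
    """Return 'missing', 'invalid', or 'valid' for a single value."""
    # Assumption: None and blank strings both count as missing.
    if value is None or (isinstance(value, str) and value.strip() == ""):
        return "missing"
    if isinstance(value, str):
        # Ignore padding around character codes; matching stays case-sensitive.
        value = value.strip()
    return "valid" if value in valid_codes else "invalid"


def list_problems(records, variable, valid_codes, status):
    """Return the records whose `variable` has the requested status."""
    return [
        r for r in records
        if classify(r.get(variable), valid_codes) == status
    ]


# Hypothetical records for illustration only; these are NOT rows of NH.
sample = [
    {"a02": "C001", "a05": 2, "admit": "PH"},
    {"a02": "C002", "a05": 7, "admit": "EH"},     # a05 outside 1-5
    {"a02": "C003", "a05": None, "admit": "XX"},  # a05 missing, admit not a listed code
    {"a02": "C004", "a05": 3, "admit": ""},       # admit blank
]

invalid_a05 = list_problems(sample, "a05", VALID_A05, "invalid")
missing_a05 = list_problems(sample, "a05", VALID_A05, "missing")
invalid_admit = list_problems(sample, "admit", VALID_ADMIT, "invalid")
missing_admit = list_problems(sample, "admit", VALID_ADMIT, "missing")

print("Invalid A05:  ", [r["a02"] for r in invalid_a05])
print("Missing A05:  ", [r["a02"] for r in missing_a05])
print("Invalid ADMIT:", [r["a02"] for r in invalid_admit])
print("Missing ADMIT:", [r["a02"] for r in missing_admit])
```

Keeping "missing" and "invalid" as separate categories mirrors the split between (i) and (ii): a blank value is reported once as missing and is not also counted as an invalid code.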
- Spring '11