# USMLE Epidemiology and Biostatistics Flashcards

| Term | Definition |
|---|---|
| Incidence rate | A measure of risk: number of new events / number of persons exposed to risk |
| Prevalence rate | A measure of extent: all cases of a disease / total population at risk |
| Relationship between incidence and prevalence | Prevalence = Incidence × Duration |
| What happens to incidence and prevalence if: New effective treatment is initiated | Prevalence decreases; incidence is unchanged |
| What happens to incidence and prevalence if: New effective vaccine gains widespread use | Both incidence and prevalence decrease |
| What happens to incidence and prevalence if: Number of persons dying from the condition increases | Prevalence decreases; incidence is unchanged |
| What happens to incidence and prevalence if: Additional research funds are added | No change in either incidence or prevalence |
| What happens to incidence and prevalence if: Behavioral risk factors are reduced in the population | Both incidence and prevalence decrease |
| What happens to incidence and prevalence if: Contacts between infected and noninfected persons are reduced | Both incidence and prevalence decrease |
| What happens to incidence and prevalence if: Recovery from the disease is more rapid | Prevalence decreases; incidence is unchanged |
| What happens to incidence and prevalence if: Long-term survival rates for the disease increase | Prevalence increases; incidence is unchanged |
| Morbidity rate | Rate of disease in a population at risk. Includes both incident and prevalent cases. |
| Mortality rate | Rate of death in a population at risk. Incident cases only. |
| Attack rate | A type of incidence in which the denominator is further reduced for some known exposure |
| Point prevalence | Prevalence at a specified point in time |
| Period prevalence | Prevalence during a span of time |
| Crude rate | Measured rate for the whole population |
| Specific rate | Measured rate for a subgroup of the population |
| Standardized rate | Rate adjusted to make groups equal on some factor |
| Number needed to treat (NNT) | Inverse of the absolute risk reduction (ARR), not of the incidence rate: NNT = 1/ARR, where ARR = event rate in control group - event rate in treated group. It is the number of people who must be treated to prevent one case. |
| Crude mortality rate | Deaths / population |
| Cause-specific mortality rate | Deaths from cause / population |
| Case-fatality rate | Deaths from cause / number of people with the disease |
| Proportionate mortality rate (PMR) | Deaths from cause / all deaths |
| Sensitivity | The percentage of sick people for whom the test was positive: TP / (TP + FN), or a/(a+c), or 1 - false negative rate |
| False negative rate | 1 - sensitivity |
| Specificity | The percentage of healthy people identified as not having the disease: TN / (TN + FP), or d/(d+b), or 1 - false positive rate |
| False positive rate | 1 - specificity |
| Positive predictive value (PPV) | The probability that a person with a positive test truly has the disease: TP / (TP + FP), or a/(a+b) |
| Negative predictive value (NPV) | The probability that a person with a negative test does not have the disease: TN / (TN + FN), or d/(c+d) |
| Accuracy | (TP + TN) / total screened patients |
| What is the relationship between positive and negative predictive values and prevalence? | As prevalence rises, PPV increases and NPV decreases |
| Selection bias | The sample is not representative of the population |
| Measurement bias | Information is gathered in a manner that distorts it |
| Berkson bias | Selection bias in which hospital records are used to estimate population prevalence |
| Nonrespondent bias | Selection bias in which people included in the study are different from those who are not |
| Hawthorne effect | Subjects' behavior is altered because they are being studied. Only a factor when there is no control group in a prospective study. |
| Solution to selection bias | Use a random, independent sample |
| Solution to measurement bias | Set up a control/placebo group |
| Experimenter expectancy bias | Experimenter's expectations are passed on to subjects, producing the desired effects |
| Solution to experimenter expectancy bias | Double-blind design: neither the experimenter nor the subject knows who receives the intervention |
| Lead-time bias | Gives a false estimate of survival rates. Confuses improved screening (earlier detection) with improved survival. |
| Solution to lead-time bias | Measure back-end survival: measure increased life expectancy |
| Recall bias | Subjects fail to accurately recall events in the past. A problem in retrospective studies. |
| Solution to recall bias | Use multiple sources to confirm information |
| Late-look bias | Individuals with severe disease are less likely to be uncovered in a survey because they die first |
| Solution to late-look bias | Stratify by severity |
| Confounding bias | The factor being examined is related to or influenced by other factors of less interest |
| Solution to confounding | Do multiple studies and use good research design |
| Case report | Clinical characteristic or outcome from a single clinical subject or event |
| Case series report | Clinical characteristic or outcome from a group of clinical subjects. Diseased subjects only, no control group. |
| Cross-sectional study | The presence or absence of disease and other variables in a representative sample at a particular time. Measures prevalence, not incidence. Cause and effect cannot be determined. |
| Case-control study | People with disease are compared to a control group. Almost always retrospective. Does not measure incidence or prevalence but can help determine causal relationships. Qualities of the healthy are compared to qualities of the sick to identify risk factors. Use odds ratio. |
| Cohort study | Group with a risk factor is compared to a group without it; prospective. Opposite of case-control. Measures incidence in each group and can help determine causality. Most reliable and valid of the observational designs. Use relative risk or attributable risk. |
| Tools used to analyze cohort studies and incidence data | Relative risk and attributable risk |
| Relative risk | Incidence rate of exposed group / incidence rate of unexposed group. How many times more likely one group is to develop the disease than the other. Used for cohort studies. |
| Attributable risk | Incidence rate of exposed group - incidence rate of unexposed group. How many more cases occur in one group. Used for cohort studies. |

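
The test-performance cards above (sensitivity, specificity, predictive values, accuracy) reduce to simple arithmetic on a 2x2 table. The following is a minimal sketch, not part of the original flashcards: the function names `diagnostic_metrics` and `ppv_from_prevalence` and all counts are hypothetical, chosen only for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute test-performance measures from 2x2 counts.

    tp/fp/fn/tn correspond to cells a/b/c/d on the flashcards.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # a / (a + c)
        "specificity": tn / (tn + fp),   # d / (d + b)
        "ppv": tp / (tp + fp),           # a / (a + b)
        "npv": tn / (tn + fn),           # d / (c + d)
        "accuracy": (tp + tn) / total,
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV of a test at a given prevalence, via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical screening results: 100 diseased and 900 healthy people.
m = diagnostic_metrics(tp=90, fp=90, fn=10, tn=810)
for name, value in m.items():
    print(f"{name}: {value:.3f}")

# Same test (90% sensitive, 90% specific) at three assumed prevalences.
for prev in (0.01, 0.10, 0.50):
    print(prev, round(ppv_from_prevalence(0.9, 0.9, prev), 3))
```

With these made-up numbers the test is 90% sensitive and 90% specific, yet its PPV is only 0.5 at 10% prevalence. The loop shows PPV climbing as prevalence rises, which is the relationship the predictive-value card describes.
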

| Term | Definition |
|---|---|
| Odds ratio | AD/BC, where A is the table cell of the object of study and D is diagonally across from it. Odds of having the risk factor given disease. Used for case-control studies. |
| Observational studies | Case report, case series, cross-sectional, case-control, cohort |
| Phase 1 clinical trial | Testing safety of the drug in healthy volunteers |
| Phase 2 clinical trial | Testing protocol and dose levels in a small group of patient volunteers |
| Phase 3 clinical trial | Efficacy and occurrence of side effects in a large group of patient volunteers |
| Intervention studies | Randomized controlled clinical trial, community trial, cross-over study |
| Randomized controlled clinical trial | Subjects are randomly allocated into intervention and control groups. Most rigorous study design. Double-blind means neither patients nor doctors know which group a patient is in. Least subject to bias, but expensive. |
| Community trial | Entire community is tested |
| Cross-over study | All subjects receive the intervention, but at different times |
| Combine probabilities for independent events | By multiplication |
| Combine probabilities for nonindependent events | Multiply the probability of one event by the probability of the second, assuming the first event occurred |
| Combine probabilities for mutually exclusive events | By addition |
| Combine probabilities for events that are not mutually exclusive | Add the two probabilities and subtract the probability of both occurring (their product, if the events are independent) |
| Central tendency values | Mean, median, mode |
| Mean | Average = sum of the values / number of values |
| Median | The 50th percentile. The value that divides the set into an upper half and a lower half. |

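
The relative risk, attributable risk, odds ratio and number-needed-to-treat cards can all be worked from the same kind of 2x2 table. The sketch below is illustrative only: the function names and the counts are hypothetical, and the cell layout (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without) is an assumption stated here rather than something the flashcards spell out.

```python
def risk_measures(a, b, c, d):
    """Risk measures from a 2x2 table.

    a = exposed, diseased      b = exposed, not diseased
    c = unexposed, diseased    d = unexposed, not diseased
    """
    risk_exposed = a / (a + b)      # incidence in the exposed group
    risk_unexposed = c / (c + d)    # incidence in the unexposed group
    return {
        "relative_risk": risk_exposed / risk_unexposed,
        "attributable_risk": risk_exposed - risk_unexposed,
        "odds_ratio": (a * d) / (b * c),   # AD / BC
    }


def number_needed_to_treat(control_event_rate, treated_event_rate):
    """NNT = 1 / ARR, with ARR = control event rate - treated event rate."""
    arr = control_event_rate - treated_event_rate
    return 1 / arr


# Hypothetical cohort: 100 exposed (30 diseased), 100 unexposed (10 diseased).
r = risk_measures(a=30, b=70, c=10, d=90)
print(r)

# Hypothetical trial: 10% of controls and 5% of treated patients have the event.
print(number_needed_to_treat(0.10, 0.05))
```

In this made-up cohort the exposed group has three times the risk (relative risk 3) and 20 extra cases per 100 people (attributable risk 0.2), while the odds ratio is about 3.86. The odds ratio only approximates the relative risk when the disease is rare.
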

| Term | Definition |
|---|---|
| Mode | The most frequent value encountered |
| Positive skew of the distribution curve | Tail is to the right; mean is greater than median |
| Negative skew of the distribution curve | Tail is to the left; median is greater than mean |
| Best central tendency measure for skewed distributions | Median |
| Best central tendency measure for normal distribution | Mean, median and mode are all the same |
| Within 1 standard deviation of the mean | 68% of cases |
| Within 2 standard deviations of the mean | 95.5% of cases |
| Within 3 standard deviations of the mean | 99.7% of cases |
| Between the mean and 1 standard deviation | 34% of cases |
| Between 1 standard deviation and 2 standard deviations | 13.5% of cases |
| Between 2 standard deviations and 3 standard deviations | About 2.1% of cases |
| Above 3 standard deviations | About 0.15% of cases |
| Confidence interval | A range around the sample estimate that is expected to contain the true population value at a stated level of confidence |
| 95% confidence | Z = 2 (more precisely 1.96) |
| 99% confidence | Z = 2.5 (more precisely 2.58) |
| Confidence interval of the mean | Mean ± Z × (S / √n), where S is the sample standard deviation and n is the sample size |
| Confidence interval for relative risk and odds ratio | If the CI range excludes 1, the result is significant: a range entirely above 1 means increased risk, a range entirely below 1 means decreased risk. If the CI range includes 1, the result is not significant. |
| Null hypothesis | The opposite of what the study is trying to prove. E.g., hypothesis: the drug works; null hypothesis: the drug doesn't work. |
| p-value < 0.05 | Reject the null hypothesis: statistical significance reached |
| p-value > 0.05 | Do not reject the null hypothesis: statistical significance not reached |
| Type I error or alpha error | Rejecting the null hypothesis when it is really true: asserting the drug works when it really doesn't. The significance cutoff (alpha) sets the accepted chance of a type I error: with a cutoff of p = 0.05, that chance is 5%. |
| Type II error or beta error | Failing to reject the null hypothesis when it is really false: asserting the drug doesn't work when it does. Cannot be estimated from the p-value. |

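
The confidence-interval cards above can be tried out on a small sample. This is a minimal sketch under stated assumptions: the function names and the blood-pressure readings are hypothetical, and it uses Z = 1.96 and 2.58 where the flashcards round to 2 and 2.5.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (lower, upper) for mean +/- z * (s / sqrt(n))."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)         # sample standard deviation
    margin = z * s / math.sqrt(n)
    return mean - margin, mean + margin


def ratio_ci_is_significant(lower, upper):
    """A CI for a relative risk or odds ratio is significant if it excludes 1."""
    return not (lower <= 1 <= upper)


# Hypothetical systolic blood pressure readings (mmHg).
readings = [118, 122, 125, 130, 135, 120, 128, 132, 126]

ci95 = mean_confidence_interval(readings, z=1.96)
ci99 = mean_confidence_interval(readings, z=2.58)
print("95% CI:", ci95)
print("99% CI:", ci99)

print(ratio_ci_is_significant(1.2, 2.5))   # entirely above 1
print(ratio_ci_is_significant(0.8, 1.4))   # includes 1
```

The 99% interval comes out wider than the 95% interval, since more confidence costs a wider range. The last two calls apply the rule for relative risk and odds ratio intervals: only the range that excludes 1 counts as significant.
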

| Term | Definition |
|---|---|
| Statistical power | Power = 1 - beta, where beta is the probability of a type II error |
| How to increase power | Increase the sample size, which increases power and decreases type II errors |
| Types of scales in statistics | Nominal, ordinal, interval, ratio |
| Nominal scale | Puts objects into different groups or categories. Gender, drug vs. placebo group, etc. |
| Ordinal scale | Puts groups into a sequence, ranks, or different states of quality. Olympic medals, class rank, etc. |
| Interval scale | A scale ordered in such a way that we can tell not just that values differ in quality but also in quantity (how much they differ). Height, weight, blood pressure, drug dosage, etc. |
| Ratio scale | Like an interval scale, but with a true zero point below which it can't go. Kelvin temperature scale, etc. |
| Pearson correlation | All interval data |
| Chi-square | All nominal data |
| t-test | 2 groups with interval and nominal data |
| ANOVA | More than 2 groups with nominal and interval data |
| All interval data: which statistical test? | Pearson correlation |
| All nominal data: which statistical test? | Chi-square |
| Combined interval and nominal data: which statistical test? | If two groups: t-test; if more than two groups: ANOVA |
| Meta-analysis | Statistical combination of the results of many studies, yielding a single p-value that represents the combined evidence |
| What is the range of correlation analysis values? | -1 to +1 |
| What can be inferred from a correlation analysis value of -1? | Perfect negative correlation: the variables are inversely related. The scatterplot shows points tightly clustered along a line with a negative slope. |
| What can be inferred from a correlation analysis value of +1? | Perfect positive correlation: the variables are directly related. The scatterplot shows points tightly clustered along a line with a positive slope. |
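
The correlation cards can be made concrete by computing a Pearson coefficient by hand. The sketch below is illustrative only: `pearson_r` is a hypothetical helper written for this example, and the data points are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
r_pos = pearson_r(xs, [2, 4, 6, 8, 10])    # points on a line, positive slope
r_neg = pearson_r(xs, [10, 8, 6, 4, 2])    # points on a line, negative slope
r_mixed = pearson_r(xs, [3, 1, 4, 1, 5])   # scattered points

print(r_pos, r_neg, r_mixed)
```

Points lying exactly on a rising line give +1, points on a falling line give -1, and scattered data land somewhere in between.
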