Terms  Definitions 

AB Design 
A single-subject research design that includes a single baseline (A) phase and a single treatment (B) phase. Shortcoming: does not adequately control for "history", which can threaten internal validity

Alpha (level of significance) 
probability of rejecting the null hypothesis when it is true (i.e., probability of making a Type I error). Set by experimenter prior to collecting or analyzing data

ANCOVA (Analysis of Covariance) 
A version of ANOVA used to increase the efficiency of the analysis by statistically removing variability in the DV that is due to an extraneous variable. Each person's score on the DV is adjusted on the basis of his or her score on the extraneous variable

AREAS UNDER THE NORMAL CURVE 
when scores are normally distributed, it is possible to conclude that a specific percentage of observations falls within certain areas of the distribution that are defined by the SD: about 68% fall within plus and minus 1 SD of the mean, about 95% within plus and minus 2 SD, and about 99.7% (often rounded to 99%) within plus and minus 3 SD
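
The areas can be computed directly rather than memorized. A minimal sketch using only the Python standard library; the function name `area_within` is illustrative and not part of the source.

```python
import math

def area_within(k: float) -> float:
    """Proportion of a normal distribution within k SDs of the mean (both sides)."""
    # For a standard normal variable Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {area_within(k):.4f}")
```

The three printed proportions round to 0.6827, 0.9545, and 0.9973.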

BETWEEN GROUPS DESIGN 
studies in which the effects of the different levels of one or more IVs are compared by administering each level or combination of levels to a different group of subjects

BISERIAL CORRELATION COEFFICIENT 
a bivariate correlation coefficient used when one variable is an artificial dichotomy (i.e., a continuous variable that has been artificially dichotomized) and the other is a continuous variable

BLOCKING 
used to control an extraneous variable when an investigator wants to statistically analyze its main and interaction effects on the DV. Involves grouping (blocking) subjects with regard to their status on the extraneous variable and then randomly assigning subjects in each block to one of the treatment groups

CENTRAL LIMIT THEOREM 
theorem derived from probability theory that predicts that the sampling distribution of the mean (1) will approach a normal shape as the sample size increases, regardless of the shape of the population distribution of scores; (2) has a mean equal to the population mean; and (3) has an SD equal to the population SD divided by the square root of the sample size

CLUSTER SAMPLING 
entails selecting units or groups (clusters) of individuals from the population (vs. sampling methods that involve selecting individuals from the population)

COEFFICIENT OF DETERMINATION 
name given to r when it is squared; indicates the amount of variability in Y that is accounted for by the variability in X (the amount of variability shared by the two variables). r is squared only when it is the correlation between two different variables
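
A small worked example may help. The sketch below computes Pearson r from made-up scores and squares it; the function name `pearson_r` and the data are illustrative assumptions, not from the source.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores on X and Y for five people.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.3f}, r squared = {r ** 2:.3f}")
```

For these scores r is about .77, so roughly 60% of the variability in Y is shared with X.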

CORRELATION COEFFICIENT 
index of the relationship between two or more variables. indicates strength of the relationship; can be positive or negative

COUNTERBALANCED DESIGN 
research design used to control carryover (order) effects; involves administering the different levels of the IV to different subjects or groups of subjects in a different order. e.g., Latin square design

CROSS-VALIDATION
validating a correlation coefficient (e.g., a criterion-related validity coefficient) on a new sample. B/c the same chance factors operating in the original sample are not operating in the subsequent sample, the correlation coefficient tends to "shrink" on cross-validation. In terms of the multiple correlation coefficient (R), shrinkage is greatest when the original sample is small and the number of predictors is large

DEMAND CHARACTERISTICS 
cues in the experimental situation that inform research participants of how they are expected to behave during the course of the study. threaten internal and external validity

DEPENDENT VARIABLE 
variable that is observed and measured in a study and is believed to be affected by the IV

DISCRIMINANT FUNCTION ANALYSIS 
the multivariate technique used when there are 2 or more continuous predictors and one discrete (nominal) criterion. Called multiple discriminant function analysis when the criterion has more than 2 categories

ETA 
bivariate correlation coefficient used when both variables are continuous and the relationship between them is nonlinear

EVENT SAMPLING 
method of behavioral sampling that is useful for behaviors that are rare or that leave a permanent product. involves recording each occurrence of a behavior during a predefined or preselected event

EXPERIMENTWISE ERROR RATE
probability of making at least one Type I error across all statistical comparisons in a study; as the number of statistical comparisons in a study increases, the experimentwise error rate also increases
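
To see how quickly the rate grows, here is a minimal sketch. It assumes the comparisons are independent and each is tested at the same alpha, which real studies often only approximate; the function name is illustrative.

```python
def experimentwise_error_rate(alpha: float, comparisons: int) -> float:
    """Probability of at least one Type I error across all comparisons."""
    # Assumption: comparisons are independent and each uses the same alpha.
    return 1 - (1 - alpha) ** comparisons

for c in (1, 3, 10):
    print(c, round(experimentwise_error_rate(0.05, c), 3))
```

With alpha at .05, three comparisons already push the rate to about .14, and ten comparisons push it to about .40.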

EXTERNAL VALIDITY 
degree to which a study's results can be generalized to other people, settings, conditions, etc

F-RATIO
test statistic yielded by the analysis of variance; represents a measure of treatment effects plus error divided by a measure of error only (MSB/MSW); when treatment has an effect, the ratio is larger than 1.0
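
The MSB/MSW ratio can be sketched for a one-way design as follows. The function name `f_ratio` and the three small groups are made up for illustration.

```python
def f_ratio(groups):
    """One-way ANOVA F = MSB / MSW for a list of groups of scores."""
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Between-groups: how far each group mean is from the grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Within-groups: how far each score is from its own group mean.
        ss_within += sum((s - group_mean) ** 2 for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three treatment groups.
print(f_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

Here MSB is 13 and MSW is 1, so F is approximately 13, well above 1.0.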

FACTORIAL ANOVA 
type of ANOVA used when a study includes two or more IVs (i.e., when the study has used a factorial design). Also referred to as two-way ANOVA, three-way ANOVA, etc., with the name referring to the number of IVs

INTERACTION 
occurs when the impact of one IV differs at different levels of another variable

FACTORIAL DESIGN 
name given to any research design that includes two or more "factors" (IVs); permit analysis of main and interaction effects

HISTORY 
event that is external to a research study and that is not relevant to the research hypothesis but that affects subjects' performance on the DV in a systematic way and thereby confounds the results of the study; threatens internal validity

INDEPENDENT VARIABLE 
variable that is manipulated in research for the purpose of determining its effects on the DV; each IV must have at least 2 levels

INTERNAL VALIDITY 
the degree to which a research study allows the investigator to conclude that observed variability in a dependent variable is due to the IV rather than other factors

INTERVAL RECORDING 
method of behavioral sampling that involves dividing a period of time into discrete intervals and recording whether the behavior occurs in each interval; useful in behaviors that have no clear beginning or end

LISREL 
a causal (structural equation) modeling technique used to verify a predefined causal model or theory. More complex than path analysis; allows two-way (nonrecursive) paths and takes into account observed variables, the latent traits they are believed to measure, and the effects of measurement error

MANOVA (MULTIVARIATE ANALYSIS OF VARIANCE) 
form of ANOVA used when a study includes one or more IVs and 2 or more DVs, each of which is measured on an interval or ratio scale; helps reduce the experimentwise error rate and increases power by analyzing the effects of the IV(s) on all DVs simultaneously

MATCHING 
method of controlling an extraneous variable or other source of systematic error; involves pairing or grouping subjects on the basis of their status on the extraneous variable and randomly assigning members of each pair or group to a different treatment group so that groups are initially equivalent with regard to the extraneous variable

MATURATION 
any physical or psychological process or event that occurs as the result of the passage of time (e.g., fatigue, motivation) and that has a systematic effect on subjects' status on the DV. Acts as a threat to internal validity

MEAN 
measure of central tendency that is the arithmetic average of a set of scores; used when scores are measured on an interval or ratio scale

MEDIAN 
measure of central tendency that is the middle score in a distribution of scores when scores have been ordered from lowest to highest; used with ordinal data (and often with interval and ratio data when the distribution is skewed or contains one or few outliers)

MIXED DESIGN 
research designs in which both between-groups and within-subjects comparisons are made

MODE 
measure of central tendency that represents the most frequently occurring category or score in a distribution

MULTICOLLINEARITY 
an undesirable condition in multiple regression and other multivariate (prediction) techniques; occurs when predictors are highly correlated with one another

MULTIPLE BASELINE DESIGN 
a single-subject design that involves sequentially applying a treatment to different "baselines" (e.g., to different behaviors, settings, or subjects); useful when a reversal design would be impractical or unethical

MULTIPLE CORRELATION COEFFICIENT (R) 
correlation coefficient that indicates the degree of association between 3 or more variables. R can be squared to obtain a measure of shared variability

MULTIPLE REGRESSION 
multivariate technique used for predicting a score on a continuous criterion based on performance on 2 or more continuous and/or discrete predictors; ideally, predictors have low correlations with each other and high correlations with the criterion; output is a multiple correlation coefficient and a multiple regression equation

MULTIPLE SAMPLE CHI SQUARE TEST 
nonparametric inferential statistical test used when a study includes 2 or more variables and the data to be analyzed are reported in terms of frequencies in each category; involves comparing observed frequencies to expected frequencies to determine if the 2 distributions of frequencies differ (df = (C - 1)(R - 1), where C is the number of columns and R is the number of rows)
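
The observed-versus-expected comparison can be sketched as below, where expected frequencies come from the row and column totals and df = (C - 1)(R - 1). The function name `chi_square` and the 2 x 2 table are illustrative assumptions.

```python
def chi_square(observed):
    """Chi-square statistic and df for a table of observed frequencies (rows x columns)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return stat, df

# Hypothetical 2 x 2 table of frequencies.
stat, df = chi_square([[10, 20], [20, 10]])
print(round(stat, 2), df)
```

Every expected frequency in this table is 15, so the statistic is about 6.67 with 1 degree of freedom.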

NONPARAMETRIC TESTS 
inferential statistical tests used when the data to be analyzed represent either an ordinal or nominal scale or when the assumptions for a parametric test have not been met; do not make the same assumptions about the population distribution as parametric tests and are therefore also known as "distribution-free tests"; include the chi-square, Mann-Whitney U, and Wilcoxon matched-pairs tests

NORMAL DISTRIBUTION (CURVE) 
symmetrical bell-shaped distribution that is defined by a specific mathematical formula; also called the "Gaussian distribution"

ONE WAY ANOVA (ANALYSIS OF VARIANCE) 
parametric statistical test used to compare the means of 2 or more groups when a study includes 1 IV and 1 DV that is measured on an interval or ratio scale; yields an F-ratio that indicates if any group means are significantly different; preferable to multiple t-tests when a study involves 3 or more groups b/c it helps control the experimentwise error rate

PARAMETRIC TESTS 
inferential statistical tests that are used when the data to be analyzed represent an interval or ratio scale and when certain assumptions about the pop distributions have been met (i.e., when scores on the variable of interest are normally distributed and when there is homoscedasticity); more powerful than nonparametric tests

PATH ANALYSIS 
a causal modeling technique used to verify a predefined causal model or theory; involves translating the theory into a path diagram, collecting data on the variables of interest and calculating and interpreting path coefficients

PEARSON R 
correlation coefficient that can be used when both variables have been measured on an interval or ratio scale; requires linearity, unrestricted range of scores, and homoscedasticity

POINT BISERIAL CORRELATION COEFFICIENT 
bivariate correlation coefficient used when one variable is a true dichotomy (male/female) and the other variable is continuous

POWER 
probability of rejecting a false null hypothesis; cannot be directly controlled but can be increased by including a large sample, maximizing the effects of the IV, increasing the size of alpha, and reducing error

PROTOCOL ANALYSIS 
technique used by cognitive psychologists to identify cognitions underlying problem solving and decision making; involves having the individual think aloud while working and then analyzing the record (protocol) of the individual's verbalizations

QUASI-EXPERIMENTAL RESEARCH
experimental research in which experimental control is limited, especially the ability to randomly assign subjects to groups, b/c intact groups must be used, the variable of interest is an organismic variable, or the study includes only one group; limitation: does not allow a causal relationship to be concluded

RANDOM ASSIGNMENT 
method of assigning subjects to treatment groups using a random method; hallmark of true experimental research; makes it possible to conclude that any observed effect on the DV is due to the IV rather than error

RANDOM BLOCK FACTORIAL ANOVA 
version of ANOVA that is appropriate when blocking has been used as a method of controlling an extraneous variable; can statistically analyze main and interaction effects of the extraneous variable (which is being treated as an additional IV)

REGRESSION ANALYSIS 
statistical technique used to predict a score on a criterion based on the person's obtained score on a predictor; involves identification of a regression line (line of best fit) and use of equation for that line

REJECTION REGION 
region of the sampling distribution that contains those sample values (e.g., means) that are unlikely to be obtained simply as the result of sampling error. when an inferential statistical test indicates that the obtained sample value falls in the rejection region, the null hypothesis is rejected and the alternative hypothesis is retained; size of the region is determined by alpha

RETENTION REGION 
region of the sampling distribution that contains those values that are likely to be obtained simply as the result of sampling error; when an inferential statistical test indicates that an obtained sample value is in the retention region, the null hypothesis is retained and the alternative hypothesis is rejected; size of the region is equal to one minus alpha

REVERSAL (WITHDRAWAL) DESIGN 
type of single-subject design that includes, at a minimum, two baseline phases and one treatment phase (e.g., an ABA or ABAB design); treatment is withdrawn during the second and subsequent baseline phases

SAMPLING DISTRIBUTION OF THE MEAN 
distribution of sample means that would be obtained if an infinite number of equal-size samples were randomly selected from the population and the mean for each sample calculated; normally shaped, mean is equal to the pop mean, SD (standard error of the mean) is equal to the pop SD divided by the square root of the sample size; used in inferential statistics to determine how unlikely it is to obtain a particular sample mean given the pop mean, the pop SD, sample size, and level of significance

SAMPLING ERROR 
type of random error that is due to uncontrolled factors and that is responsible for the fluctuations found between sample values and the corresponding value for the population from which the samples were randomly drawn

SCALES OF MEASUREMENT 
method of categorizing the various ways to measure variables; four types that differ in terms of mathematical sophistication: nominal, ordinal, interval, and ratio; nominal yields frequency of observations; ordinal, interval, and ratio yield scale values or scores

SCATTERGRAM (SCATTERPLOT) 
summary in graphic (pictorial) form of the degree of association between two variables; wide scatter in a scattergram indicates a low correlation between variables

SELECTION 
a potential threat to both the internal and external validity of research when subjects are not randomly assigned or selected; threatens internal validity when subjects in different treatment groups are initially different and would differ at the end of the study; threatens external validity when the characteristics of the subjects in different groups cause them to react in an idiosyncratic way to the treatment

SINGLE-SAMPLE CHI-SQUARE TEST
a nonparametric inferential statistical test used when a study includes 1 variable and the data therefore are reported in terms of frequencies in each category (level) of that variable; involves comparing the observed frequencies to expected frequencies to determine if the two distributions of frequencies differ (df = C - 1, where C is the number of categories)

SKEWED DISTRIBUTION 
asymmetrical distributions in which the majority of scores are located on one side of the distribution; in a positively skewed distribution, most scores are on the low side but a few scores are on the high (positive) side; in a negatively skewed distribution, the majority of scores are on the high side but a few are on the low side

STANDARD DEVIATION 
a measure of dispersion (variability) of scores around the mean of the distribution; calculated by dividing the sum of the squared deviations from the mean by N (or N - 1) and taking the square root of the result; the square root of the variance
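
The calculation can be sketched as follows. The function name and the `sample` flag (which switches the divisor from N to N - 1) are illustrative choices, and the scores are made up.

```python
import math

def standard_deviation(scores, sample=False):
    """SD of a list of scores; divides by N, or by N - 1 when sample=True."""
    mean = sum(scores) / len(scores)
    sum_squared_deviations = sum((x - mean) ** 2 for x in scores)
    divisor = len(scores) - 1 if sample else len(scores)
    return math.sqrt(sum_squared_deviations / divisor)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores with a mean of 5
print(standard_deviation(scores))               # divisor N
print(standard_deviation(scores, sample=True))  # divisor N - 1
```

The sum of squared deviations here is 32, so the SD is 2 with divisor N and slightly larger with divisor N - 1.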

STANDARD ERROR OF THE MEAN 
SD of the sampling distribution of the mean; calculated by dividing the pop SD by the square root of the sample size
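
As a quick illustration, the sketch below applies that formula; the function name and the example values (a population SD of 15 and a sample of 25) are assumptions made for the example.

```python
import math

def standard_error_of_mean(pop_sd: float, n: int) -> float:
    """Population SD divided by the square root of the sample size."""
    return pop_sd / math.sqrt(n)

print(standard_error_of_mean(15, 25))   # 15 / 5
print(standard_error_of_mean(15, 100))  # a larger sample gives a smaller standard error
```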

T-TEST FOR A SINGLE SAMPLE
version of the t-test used to compare a single obtained sample mean to a known or hypothesized pop mean (df = N - 1)
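
A minimal sketch of the computation, assuming the usual formula t = (sample mean - pop mean) / (sample SD / square root of N) with the sample SD based on N - 1. The function name and scores are illustrative.

```python
import math

def one_sample_t(scores, pop_mean):
    """Return (t, df) for a single-sample t-test."""
    n = len(scores)
    mean = sum(scores) / n
    # Sample SD uses N - 1 in the denominator.
    sample_sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    t = (mean - pop_mean) / (sample_sd / math.sqrt(n))
    return t, n - 1

# Hypothetical sample compared with a hypothesized population mean of 4.
t, df = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], 4)
print(round(t, 3), df)
```

The obtained t is then compared with the critical value for the chosen alpha and df to decide whether it falls in the rejection region.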

T-TEST FOR CORRELATED SAMPLES
version of the t-test used to compare two sample means when subjects in the 2 groups are related in some way (e.g., b/c they were matched on an extraneous variable or b/c a single-group pretest/posttest design was used); df = number of pairs of scores - 1

T-TEST FOR INDEPENDENT SAMPLES
version of the t-test used to compare two sample means when subjects in the two groups are independent (unrelated); df = N - 2, where N is the total number of subjects

TREND ANALYSIS 
type of analysis of variance used to assess linear and nonlinear trends when the IV is quantitative

TRUE EXPERIMENTAL RESEARCH 
experimental research that provides the investigator with maximal experimental control; randomly assign subjects

TYPE I ERROR 
decision error that occurs when a true null hypothesis is rejected; its probability is equal to alpha

TYPE II ERROR 
decision error that occurs when a false null hypothesis is retained; its probability is equal to beta

WITHIN-SUBJECTS DESIGN
experimental design in which each subject receives, at different times, each level of the IV (or combination of IVs) so that comparisons on the DV are made within subjects rather than between groups

CLASSICAL TEST THEORY 
theory of measurement that regards observed variability in test scores as reflecting two components: true differences between examinees on the attribute measured by the test and the effects of measurement (random) error

COEFFICIENT ALPHA 
method of assessing internal consistency reliability that provides an index of average inter-item consistency; primary sources of error are content (item) sampling differences and heterogeneity of the content domain

KUDER-RICHARDSON FORMULA 20 (KR-20)
used to substitute for coefficient alpha when test items are scored dichotomously
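
The sketch below uses the standard coefficient alpha formula, alpha = (k / (k - 1)) x (1 - sum of item variances / variance of total scores), which gives the KR-20 value when items are scored 0 or 1. The function name and the tiny data set are illustrative assumptions.

```python
def coefficient_alpha(item_scores):
    """item_scores: one row per examinee, one column per item."""
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    k = len(item_scores[0])  # number of items
    item_variances = [variance(col) for col in zip(*item_scores)]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical dichotomous (0/1) responses: 4 examinees by 3 items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(coefficient_alpha(responses))
```

For these responses the item variances sum to .625 and the total-score variance is 1.25, giving a coefficient of about .75.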

CONSTRUCT VALIDITY 
extent to which a test measures the hypothetical trait (construct) it is intended to measure

CONTENT VALIDITY 
extent to which test adequately samples the domain of information, knowledge, skill that it purports to measure

CORRECTION FOR GUESSING 
method used to ensure that examinees do not benefit from "wild" guessing; involves using formula to adjust scores; the resulting distribution has lower mean and larger SD than original distribution
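
The source does not give the formula. One commonly taught version is corrected score = R - W / (k - 1), where R is the number right, W the number wrong (omitted items not counted), and k the number of answer choices per item. The sketch assumes that version; the function name is illustrative.

```python
def corrected_for_guessing(right: int, wrong: int, choices: int) -> float:
    """Corrected score = R - W / (k - 1); omitted items are not counted as wrong."""
    return right - wrong / (choices - 1)

# Hypothetical examinee: 40 right and 12 wrong on four-choice items.
print(corrected_for_guessing(40, 12, 4))
```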

CRITERION CONTAMINATION 
bias introduced into a person's criterion score as a result of the scorer's knowledge of his/her performance on the predictor; tends to artificially inflate the relationship between the predictor and criterion

CRITERION-REFERENCED INTERPRETATION
interpretation of a test score in terms of a prespecified standard (i.e., in terms of percent of content correct or predicted performance on an external criterion, e.g., using a regression equation or expectancy table)

CRITERION-RELATED VALIDITY
validity involving determining the relationship (correlation) between the predictor and the criterion; can be either concurrent or predictive

CROSS VALIDATION 
process of reassessing a test's criterion-related validity on a new sample to check the generalizability of the original validity coefficient; ordinarily the validity coefficient "shrinks" on cross validation b/c the chance factors operating in the original sample are not all present in the new sample

FACTOR ANALYSIS 
multivariate statistical technique used to determine how many factors (constructs) are needed to account for the intercorrelations among a set of tests, subtests, or test items; used to assess a test's construct validity by indicating the extent to which the test correlates with factors that it would and would not be expected to correlate with

INCREMENTAL VALIDITY 
extent to which a predictor increases decision-making accuracy; calculated by subtracting the base rate from the positive hit rate; related terms are the predictor and criterion cutoff scores
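
A small worked example, assuming the usual definitions: the base rate is the proportion of all people who are successful on the criterion (true positives plus false negatives), and the positive hit rate is the proportion of people selected by the predictor who are successful. The function name and counts are illustrative.

```python
def incremental_validity(true_pos, false_pos, false_neg, true_neg):
    """Positive hit rate minus base rate, from the four decision outcomes."""
    total = true_pos + false_pos + false_neg + true_neg
    # Base rate: proportion successful on the criterion without using the predictor.
    base_rate = (true_pos + false_neg) / total
    # Positive hit rate: proportion of those selected by the predictor who succeed.
    positive_hit_rate = true_pos / (true_pos + false_pos)
    return positive_hit_rate - base_rate

# Hypothetical outcomes for 100 people.
print(incremental_validity(true_pos=30, false_pos=10, false_neg=20, true_neg=40))
```

Here the base rate is .50 and the positive hit rate is .75, so using the predictor improves decision accuracy by about .25.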

INTERRATER RELIABILITY 
measure of the degree of consistency of scores assigned by 2 or more raters; can be measured either by a correlation coefficient or by percent agreement

ITEM DIFFICULTY INDEX 
measure of an item's difficulty level; calculated by dividing the number of individuals who answered the item correctly by the total number of individuals; ranges in value from 0 (very difficult) to 1.0 (very easy); .50 is preferred
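
For example, with responses coded 1 for correct and 0 for incorrect, the index is simply the proportion of 1s. The function name and responses below are made up for illustration.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

# Hypothetical responses of ten examinees to one item.
print(item_difficulty([1, 1, 0, 1, 0, 1, 1, 0, 1, 1]))
```

Seven of the ten examinees answered correctly, so the item is on the easy side of .50.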

ITEM RESPONSE THEORY (IRT) 
approach to test construction that involves identifying an item characteristic curve for each test item; in contrast to classical test theory, has the advantage that the identified item parameters are sample invariant; scores from different tests or different sets of test items can be equated

MULTITRAIT-MULTIMETHOD MATRIX
systematic way to organize the correlation coefficients obtained when assessing a measure's convergent and discriminant validity; requires measuring at least 2 different traits using two different methods for each trait

NORM-REFERENCED INTERPRETATION
interpretation of an examinee's test performance relative to the performance of examinees in a normative (standardization) sample; percentile rank, standard score, and age and grade equivalent scores are examples

ORTHOGONAL ROTATION 
factor analysis; produces uncorrelated factors; rotation done to simplify the interpretation of identified factors

OBLIQUE ROTATION 
factor analysis; produces correlated factors

PERCENTAGE SCORE 
type of criterion-referenced (aka content-referenced) score that indicates the percent of test content (items) endorsed or answered correctly

PERCENTILE RANK 
type of norm-referenced score that indicates the percent of examinees in the normative group who obtained a lower score; e.g., a percentile rank of 60 means that 60% of the examinees in the norm group obtained lower raw scores

PRINCIPAL COMPONENT ANALYSIS 
multivariate technique similar to factor analysis that is used to identify the fewest dimensions (components) that can explain the variability in a set of tests; amount of variability explained by each identified component is provided by an eigenvalue

RELATIONSHIP BETWEEN RELIABILITY AND VALIDITY 
reliability is necessary but not sufficient for validity; in terms of criterion-related validity, the validity coefficient can be no greater than the square root of the product of the reliabilities of the predictor and criterion
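
The ceiling on the validity coefficient can be illustrated with a short sketch; the function name and the two reliability values are assumptions chosen for the example.

```python
import math

def max_validity(predictor_reliability: float, criterion_reliability: float) -> float:
    """Upper limit on a criterion-related validity coefficient."""
    return math.sqrt(predictor_reliability * criterion_reliability)

# Hypothetical reliabilities of .81 (predictor) and .64 (criterion).
print(round(max_validity(0.81, 0.64), 2))
```

With these reliabilities the validity coefficient cannot exceed about .72, however strong the true relationship is.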

RELIABILITY 
consistency of test scores (i.e., extent to which a test measures an attribute without being affected by random fluctuations, or measurement errors, that produce inconsistencies over time)

TYPES OF RELIABILITY 
test-retest, alternate forms, split-half, coefficient alpha, interrater

SPEARMAN-BROWN FORMULA
used to estimate the reliability of a test if it were lengthened or shortened with items from the same content domain or measuring the same construct; used with split-half reliability to determine what the reliability coefficient would have been had it been based on the full-length test
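
The formula itself is not given in the source. The standard form is new reliability = n x r / (1 + (n - 1) x r), where r is the current reliability and n is the factor by which the test length changes. The sketch assumes that form; the function name is illustrative.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Estimated reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Correcting a hypothetical split-half coefficient of .60 to full length (factor of 2).
print(round(spearman_brown(0.60, 2), 2))
# Halving a hypothetical test with a reliability of .80 (factor of 0.5).
print(round(spearman_brown(0.80, 0.5), 2))
```

Doubling the length raises the estimate from .60 to about .75, while halving a test lowers its estimated reliability.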

SPLIT-HALF RELIABILITY
method of assessing internal consistency reliability that involves "splitting" the test in half; content sampling is the primary source of error; usually corrected with the Spearman-Brown formula

STANDARD ERROR OF ESTIMATE 
an index of error when predicting criterion scores from predictor scores; used to construct a confidence interval around an examinee's predicted criterion score; magnitude depends on 1. criterion's SD, 2. predictor's validity coefficient
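
The source names the two inputs but not the formula. The standard form is criterion SD x square root of (1 - validity coefficient squared), which the sketch below assumes; the function name and values are illustrative.

```python
import math

def standard_error_of_estimate(criterion_sd: float, validity: float) -> float:
    """Criterion SD times the square root of (1 - validity coefficient squared)."""
    return criterion_sd * math.sqrt(1 - validity ** 2)

# Hypothetical criterion SD of 10 and validity coefficient of .60.
print(round(standard_error_of_estimate(10, 0.60), 2))
```

The result here is about 8; with a validity coefficient of 0 the standard error of estimate equals the criterion's SD, and with a coefficient of 1.0 it is 0.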

STANDARD ERROR OF MEASUREMENT 
an index of measurement error; used to construct confidence interval around an examinee's obtained test score; magnitude depends on 1. test's SD, 2. test's reliability coefficient
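
The source names the two inputs but not the formula. The standard form is test SD x square root of (1 - reliability coefficient), which the sketch assumes. The function names, the example values, and the use of 1.96 for an approximate 95% interval are illustrative assumptions.

```python
import math

def standard_error_of_measurement(test_sd: float, reliability: float) -> float:
    """Test SD times the square root of (1 - reliability coefficient)."""
    return test_sd * math.sqrt(1 - reliability)

def confidence_interval(obtained_score: float, sem: float, z: float = 1.96):
    """Interval around an obtained score; z = 1.96 gives roughly 95% confidence."""
    return obtained_score - z * sem, obtained_score + z * sem

# Hypothetical test with an SD of 15 and a reliability coefficient of .91.
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_interval(110, sem)
print(round(sem, 2), round(low, 1), round(high, 1))
```

The standard error of measurement is about 4.5 here, so the interval around an obtained score of 110 runs from roughly 101 to 119.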

STANDARD SCORE 
transformed score that reports test performance in terms of SD units from the mean achieved by the normative sample; T-scores, z-scores, and deviation IQs are examples
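
The conversions can be sketched as below, assuming the conventional scales: z-scores with a mean of 0 and SD of 1, T-scores with a mean of 50 and SD of 10, and deviation IQs with a mean of 100 and SD of 15. Function names and the raw score are illustrative.

```python
def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Distance of a raw score from the normative mean in SD units."""
    return (raw - norm_mean) / norm_sd

def t_score(z: float) -> float:
    """T-score scale: mean 50, SD 10."""
    return 50 + 10 * z

def deviation_iq(z: float) -> float:
    """Deviation IQ scale: mean 100, SD 15 (assumed convention)."""
    return 100 + 15 * z

# Hypothetical raw score of 65 in a normative sample with mean 50 and SD 10.
z = z_score(65, 50, 10)
print(z, t_score(z), deviation_iq(z))
```

A raw score 1.5 SDs above the normative mean corresponds to a T-score of 65 and a deviation IQ of 122.5.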

TEST-RETEST RELIABILITY
method for assessing reliability that involves administering the same test to the same group on 2 different occasions and correlating the 2 sets of scores; time sampling factors are the primary sources of error; yields a coefficient of stability

VALIDITY 
extent to which a test accurately measures what it is intended to measure; three types: content, construct, and criterion-related
