EPPP Testing

Complete list of Terms and Definitions for EPPP Testing

Terms Definitions
Effect on the floor of adding easy questions to a test Will raise the floor
Def: criterion related validity coefficient Pearson r correlation between predictor and criterion; acceptable range is +/- .3 to .6
test theory ttest theory
Def: computer adaptive assessment Computerized selection of test items based on periodic estimates of ability
Def: dynamic assessment Variety of procedures following on from standardized testing to get further information; usually used with learning disability or intellectual disability (formerly termed mental retardation)
Target values for item difficulty by objective .5 for most tests; .25 for a high cutoff (matching the selection %); .8 or .9 for mastery; halfway between chance and 1, e.g. t/f exams would be .75
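The "halfway between chance and 1" rule on this card can be sketched in Python. This is a minimal illustration, assuming every answer option is equally likely under pure guessing; the function name `target_difficulty` is hypothetical and not part of the source.

```python
def target_difficulty(n_options: int) -> float:
    """Return the difficulty index halfway between chance and 1.0.

    Assumption: chance = 1 / n_options (equally likely guesses).
    """
    if n_options < 1:
        raise ValueError("n_options must be at least 1")
    chance = 1.0 / n_options
    return (chance + 1.0) / 2.0


print(target_difficulty(2))  # true/false item -> 0.75
print(target_difficulty(4))  # four-option multiple choice -> 0.625
```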
Def: communality (factor analysis) The proportion of variance of a test accounted for by the factors; sum of the squared factor loadings; interpreted directly, i.e. .4 = 40%; only valid when factors are orthogonal
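As a rough sketch of the communality calculation, the snippet below sums the squared loadings of one test across orthogonal factors and derives the unique variance as 1 minus that sum. The function names and the example loadings are made up for illustration.

```python
def communality(loadings):
    """Sum of squared factor loadings for one test (orthogonal factors assumed)."""
    return sum(loading * loading for loading in loadings)


def unique_variance(loadings):
    """Variance of the test not accounted for by the factors: 1 - communality."""
    return 1.0 - communality(loadings)


# Hypothetical loadings of a single test on two orthogonal factors.
example_loadings = [0.6, 0.2]
print(f"communality     = {communality(example_loadings):.2f}")      # 0.40
print(f"unique variance = {unique_variance(example_loadings):.2f}")  # 0.60
```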
Relationship between item difficulty and discriminability Difficulty creates a ceiling for discriminability; difficulty of .5 creates maximum discriminability; the greater the mean discriminability, the greater the reliability
Differences between principal components analysis and factor analysis In principal components analysis: factors are always uncorrelated; variance = explained + error. In factor analysis: variance = common + specific + error
Def: correction for attenuation Estimate of how much more valid a predictor would be if it and the criterion were perfectly reliable
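One common way to compute this estimate is the standard correction-for-attenuation formula, which divides the observed validity coefficient by the square root of the product of the two reliabilities. The sketch below assumes that formula; the function and parameter names are illustrative only.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the validity coefficient if predictor and criterion were perfectly reliable.

    r_xy: observed predictor-criterion correlation
    r_xx: reliability of the predictor
    r_yy: reliability of the criterion
    """
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must be in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed validity .30, reliabilities .81 and .64.
print(f"{correct_for_attenuation(0.30, 0.81, 0.64):.3f}")
```

With perfectly reliable measures (both reliabilities equal to 1) the function returns the observed coefficient unchanged.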
Def: interval recording All behavior within a specified period of time
Factors affecting reliability coefficient Anything reducing the range of obtained scores (e.g. a homogeneous population); anything increasing measurement error; short (vs long) tests; presence of floor or ceiling effects; high probability of guessing a correct answer
Def: eigenvalue Explained variance = sum of the squares of the loadings; sum of the eigenvalues <= number of tests; applied to unrotated factors only
Measure of inter-rater reliability Kappa coefficient
Def: power test Assesses the attainable level of difficulty; no time limit; graduated difficulty; Qs that everyone can do; Qs that no one can do; e.g. WAIS information subtest
Measures of internal consistency Split-half: divide test in 2 and correlate scores on the subtests; sensitive to selection strategy. Coefficient alpha: used with multiple choice questions. Kuder-Richardson Formula 20 (KR-20): used for questions with dichotomous answers. Reliability increases with item homogeneity
Def: face validity Appearance of validity to test takers, administrators and other untrained people
Use: standard error of measurement Construction of a confidence interval
What can you determine from an item response (aka item characteristic) curve? Difficulty: point where p(correct response) = .5; Discriminability: slope of the curve, steeper is more discriminating; Probability of a correct guess: intersection with the y axis
Def: ipsative measures Scores reported in terms of relative strength within the individual; preference is expressed for one item over another
Relationship between reliability and validity The criterion-related validity coefficient cannot exceed the square root of the predictor's reliability coefficient; the reliability coefficient sets a ceiling on the validity coefficient
Factors improving inter-rater reliability Well trained raters; explicit observation of the raters; mutually exclusive and exhaustive scoring categories
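The ceiling described on this card is a one-line calculation. A minimal sketch, with the hypothetical function name `max_validity`:

```python
import math


def max_validity(predictor_reliability: float) -> float:
    """Upper bound on the criterion-related validity coefficient."""
    if not 0.0 <= predictor_reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return math.sqrt(predictor_reliability)


# A predictor with reliability .81 cannot have validity above about .90.
print(f"{max_validity(0.81):.2f}")
```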
Def: mastery test Cutoff for predetermined level of performance
Def: content validity Adequate sampling of relevant content domain
Def: validity Measures what it says it does
Factors affecting test-retest reliability Maturation; difference in conditions; practice effects
Techniques for assessing an item's discriminability Correlation with: total score; an external criterion
Def: construct validity Extent to which a test successfully measures an unobservable, abstract concept such as IQ
Def: item difficulty or difficulty index % of examinees answering correctly; an ordinal value, because an item with an index of .2 is not necessarily half the difficulty of an item with an index of .4
Def: criterion contamination Occurs when person assessing criterion knows predictor for an individual
Def: factor loading Correlation between a given test and a factor derived from a factor analysis; can be squared to give % of variance that the test accounts for in the factor
Def: normative measures Absolute strength measured; all items answered; comparison among people possible
What are the mean and std deviation for the following standard scores: z, t, stanine and deviation IQ? z: mean 0, SD 1; T: mean 50, SD 10; stanine: mean 5, SD ~2; deviation IQ: mean 100, SD 15
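These scales are all linear transforms of z, which the sketch below illustrates. It assumes the conventional stanine mean of 5 and uses the common approximation of rounding 2z + 5 and clipping to the 1 to 9 range; the function names are illustrative only.

```python
import math


def z_to_t(z: float) -> float:
    """T score: mean 50, SD 10."""
    return 50.0 + 10.0 * z


def z_to_deviation_iq(z: float) -> float:
    """Deviation IQ: mean 100, SD 15."""
    return 100.0 + 15.0 * z


def z_to_stanine(z: float) -> int:
    """Approximate stanine: round(2z + 5), clipped to 1..9 (halves round up)."""
    raw = math.floor(2.0 * z + 5.5)
    return max(1, min(9, raw))


print(z_to_t(1.0))             # 60.0
print(z_to_deviation_iq(-2.0)) # 70.0
print(z_to_stanine(0.0))       # 5
```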
Range and interpretation of a reliability coefficient 0 (unreliable) to 1 (perfectly reliable); .9 means 90% of the variance accounted for; you do NOT square a reliability coefficient
Factors affecting shrinkage Small original validation sample; large original item pool; relative number of items retained is small; items not sensibly chosen
Def: false negative Predicted not to meet a criterion but in reality does
Types of rotation (factor analysis) Orthogonal: uncorrelated; Oblique: correlated; choice depends on what you believe the relationship is among the factors
What are the advantages of a test item of moderate difficulty (p = .5) Increases variability, which increases reliability and validity; maximally differentiates between low and high scorers
Factors affecting criterion related validity Restricted range of scores; unreliability of predictor or criterion; regression; criterion contamination
Use: cluster analysis Categorize or taxonomize a set of objects
Techniques for assessing construct validity Convergent validity techniques: high correlation on a trait even with different methods; Divergent / discriminant validity techniques: low correlation on different traits even with the same method; Factor analysis
Probability of scores falling within a specified confidence interval 68%: +/- 1 SE; 95%: +/- 1.96 SE; 99%: +/- 2.58 SE
Formula to convert eigenvalue to % = eigenvalue * 100 / number of tests
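A small Python sketch of this conversion, with a made-up example; the function name `eigenvalue_to_percent` is hypothetical.

```python
def eigenvalue_to_percent(eigenvalue: float, n_tests: int) -> float:
    """Percent of total variance explained by a factor with the given eigenvalue."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return eigenvalue * 100.0 / n_tests


# Hypothetical: a factor with eigenvalue 2.5 in a battery of 10 tests.
print(eigenvalue_to_percent(2.5, 10))  # 25.0
```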
Contents of the Mental Measurements Yearbook Author; publisher; target population; administrative time; critical reviews
Def: unique variance (factor analysis) Variance not accounted for by the factors; u^2 = 1 - h^2, where h^2 is the communality
Def: moderator variable Variables affecting validity of a test; a moderator variable confers differential validity on the test
To reduce the number of false positives... Raise the predictor cutoff and/or lower the criterion cutoff
Def: shrinkage Reduction in validity coefficient on cross-validation (revalidation with a second sample); a result of noise in the original sample
Def: reliability Repeatable and consistent; free from error; reflects 'true score'
Def: types of criterion related validity Concurrent: scores collected at the same time, useful for diagnostic tests; Predictive: scores tested before and later, useful for e.g. job selection tests
Def: 'testing the limits' in dynamic assessment Following a standardized test, using hints to elicit correct performance. The more hints necessary, the more severe the learning disability
Differences between cluster analysis and factor analysis Cluster analysis: all types of data, clusters interpreted as categories; Factor analysis: interval or ratio data only, factors interpreted as underlying constructs
Characteristics of alternate forms reliability coefficient Best, because to be high it must be consistent across time and content; likely to have a lower magnitude than other coefficients
Use: eta Correlation of continuous non-linear variables
Def: convergent/divergent analysis Convergent validity is high correlation between different measures of the same construct; divergent validity is low correlation between measures of different constructs
The difference between norm-referenced and criterion referenced scores Norm referenced is a comparison to others in a sample; criterion referenced is measured against an external criterion
Utility of internal consistency measures Measurement of unstable traits; not good for speed tests; sensitive to item content / sampling
Def: standard error of measurement How much error is expected from an individual test score
Appropriate measure of speed test reliability Test-retest; alternate forms
Def: item discriminability Degree to which an item differentiates between low and high scorers; D = difference between high and low scorers' % answering correctly; ranges from 100 to -100; moderate difficulty is optimal
Differences between standard error of measurement and standard error of estimate Standard error of measurement: related to the reliability coefficient, used to estimate the true score on a given test; Standard error of estimate: determines where a criterion will fall given a predictor
Formula: standard error of measurement SE = SD * square root of (1 - r), where r = the reliability coefficient, which ranges from 0 to 1
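The index D on this card can be sketched as a simple difference of percentages. How the high and low scoring groups are formed is left to the caller and is not specified by the source; the function name and example counts are illustrative.

```python
def discrimination_index(high_correct: int, high_n: int,
                         low_correct: int, low_n: int) -> float:
    """D = % correct in the high-scoring group minus % correct in the low-scoring group."""
    if high_n <= 0 or low_n <= 0:
        raise ValueError("group sizes must be positive")
    pct_high = 100.0 * high_correct / high_n
    pct_low = 100.0 * low_correct / low_n
    return pct_high - pct_low


# Hypothetical item: 18 of 20 high scorers and 8 of 20 low scorers answer correctly.
print(discrimination_index(18, 20, 8, 20))  # 50.0
```

A negative D would indicate that low scorers answer the item correctly more often than high scorers.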
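The formula on this card, combined with the confidence-interval multipliers listed earlier (1, 1.96, 2.58), can be sketched as follows. The function names and the example values (SD 15, reliability .91, observed score 100) are assumptions for illustration.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SE = SD * sqrt(1 - r), where r is the reliability coefficient (0 to 1)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(score: float, sem: float, multiplier: float = 1.96):
    """Band of +/- multiplier * SE around an observed score (1.96 gives 95%)."""
    return (score - multiplier * sem, score + multiplier * sem)


sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = confidence_interval(100.0, sem)
print(f"SE = {sem:.2f}, 95% interval = {low:.2f} to {high:.2f}")
```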