| Terms |
Definitions |
|
Achievement
|
previous learning
|
|
Homogeneous
|
One factor/topic/content area
|
|
Gesell Developmental Schedules
type
|
infant scale
|
|
Reliability Coefficient
|
true score variance / total variance
|
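As a worked illustration of the ratio above, here is a minimal Python sketch. The function name and the variance figures are made up for the example; in practice true score variance is not observed directly and has to be estimated.

```python
def reliability_coefficient(true_variance, total_variance):
    """Return true score variance divided by total (observed) variance."""
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical figures: 80 units of true score variance out of 100 total.
r = reliability_coefficient(80.0, 100.0)
print(r)
```

A value near 1 means most of the observed variance is true score variance; a value near 0 means most of it is error.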
|
Selected response format
|
Multiple choice, true/false, matching
|
|
Gesell Developmental Schedules
score/scale
|
gross motor
fine motor
adaptive
language
personal social
|
|
WAIS-III
Standardization Sample
|
Good stratified sample
|
|
Test bias
|
Systematic variance: something in the test favors one group over another
|
|
The Big Three
trinitarian view
|
Content, criterion, construct
|
|
Benton Visual Retention Test
type
|
Visiographic Test
|
|
Brazelton Neonatal Assessment Scale
type
|
infant scale test
|
|
WPPSI-III
|
Wechsler Preschool and Primary Scale of Intelligence
ages 2 years 6 months to 7 years 3 months
|
|
Scaling
|
The process of setting rules for assigning numbers in measurement
|
|
Types of Scales
|
Nominal, Ordinal, Interval, Ratio
Age scale
Grade scale
Stanine Scale
|
|
Kuder-Richardson 20
|
Statistic of choice for determining inter-item consistency
Good for dichotomous items (right or wrong)
|
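For reference, KR-20 is commonly written as (k / (k - 1)) * (1 - sum(p * q) / variance of total scores), where k is the number of items, p is the proportion passing an item, and q = 1 - p. The sketch below is one way to compute it in Python. The function name and sample data are illustrative, and it assumes the population variance of total scores (some texts use the sample variance instead).

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for dichotomous items.

    responses: one row per test taker, each row a list of 0/1 item scores.
    Assumption: uses the population variance of total scores.
    """
    n_people = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if k < 2 or total_var == 0:
        raise ValueError("need at least 2 items and non-zero score variance")
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n_people  # proportion correct
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Made-up data: 5 test takers, 4 right/wrong items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(data), 2))  # 0.8
```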
|
Concurrent validity
|
The relationship between test scores and a criterion measured at the same time
Useful for measures of current states such as depression
|
|
Test Tryout
|
5-10 people per item
Use target population
|
|
Bayley Scales of Infant Development
scales
|
motor and mental
|
|
Woodcock-Johnson III
purpose
|
Assesses both achievement and cognition
|
|
Brazelton Neonatal Assessment Scale
purpose
|
measures newborn competence
|
|
SAT
Scholastic Assessment Test
|
aptitude test
predicts 1st year college gpa
psychometrically sound
m = 500 sd = 100
measures verbal and math reasoning
|
|
Equivalent forms reliability
|
Gives the coefficient of equivalence; also known as parallel forms reliability. Accomplished by administering one form, waiting, and giving the alternate form to the same group.
Advantages: less time between administrations; fewer carryover effects, because the same questions are not reused.
Disadvantages: must develop two tests instead of one; still requires two test administrations.
|
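The coefficient of equivalence is typically the correlation between scores on the two forms. A minimal sketch, assuming a Pearson correlation and using invented scores for five test takers (`pearson_r`, `form_a`, and `form_b` are names chosen for this example):

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people on Form A and Form B.
form_a = [10, 12, 14, 16, 18]
form_b = [11, 13, 13, 17, 19]

coefficient_of_equivalence = pearson_r(form_a, form_b)
print(round(coefficient_of_equivalence, 2))
```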
|
Coefficient of Equivalence
|
Used with alternative or parallel forms
|
|
Content Validity
|
How adequately a test samples behavior and represents the universe of behavior it was designed to measure
|
|
Factor Analysis
|
Statistical procedure that groups together items or variables that correlate with one another
|
|
Confidence Interval
|
The plus-or-minus part of a reported score (e.g., the "plus or minus 3"): the range expected to contain the true score
|
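One common way to build the plus-or-minus band is observed score plus or minus z times the standard error of measurement. The sketch below assumes normally distributed error and z = 1.96 for a 95% interval; the function name and numbers are illustrative only.

```python
def confidence_interval(observed_score, sem, z=1.96):
    """Return (low, high) bounds: observed score +/- z * SEM.

    Assumption: error is normally distributed; z=1.96 gives about 95%.
    """
    margin = z * sem
    return observed_score - margin, observed_score + margin


# Hypothetical score of 100 with a standard error of measurement of 3.
low, high = confidence_interval(100, 3)
print(round(low, 2), round(high, 2))  # 94.12 105.88
```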
|
Does test-retest work with dynamic or static characteristics?
|
Static (unchanging) characteristics
|
|
Illinois Test of Psycholinguistic Abilities
type
|
Special Population: Learning Disabilities
|
|
Bender Visual Motor Gestalt Test
age
|
5 - 8 years
|
|
Armed Services Vocational Aptitude Battery
composites
|
4 occupational composites:
mechanical / crafts
business / clerical
electronics / electrical
health / social
|
|
WAIS-III
Validity
|
The revised edition does not provide new validity data compared with the earlier version
|
|
Stanford Binet V
Background
|
initiated the modern field of intelligence testing
Binet was commissioned by the French government, which needed a method of identifying intellectually deficient children for placement in special education programs
|
|
Sampling techniques- stratified random sampling
|
population is divided into subgroups/strata based on demographics
|
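A minimal sketch of the idea in Python: divide the population into strata, then draw a random sample from each stratum in proportion to its size. The function name, the `region` field, and the 10% fraction are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Randomly sample the same fraction from every stratum.

    stratum_of: function returning the stratum label for one member.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical population: 60 urban and 40 rural members.
population = [
    {"id": i, "region": "urban" if i < 60 else "rural"} for i in range(100)
]
sample = stratified_sample(population, lambda p: p["region"], fraction=0.1)
print(len(sample))  # 10
```

The sample keeps the 60/40 split of the population, which is the point of stratifying.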
|
Developmental norms
|
age norms, grade norms, etc. This allows us to determine whether an individual's test score is similar to, below, or above the average of others at the same age or grade level
|
|
Standard Error of Difference
|
Scores can change from one administration to another for reasons other than error
Use this statistic to determine whether the difference between two scores is significant
|
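A common textbook form is SE_diff = sqrt(SEM1^2 + SEM2^2), with the score difference divided by SE_diff and compared against a critical value. The sketch below assumes that form and a two-tailed critical z of 1.96; the function names and scores are hypothetical.

```python
from math import sqrt


def standard_error_of_difference(sem_1, sem_2):
    """Combine the standard errors of measurement of two scores."""
    return sqrt(sem_1 ** 2 + sem_2 ** 2)


def is_significant_difference(score_1, score_2, sem_1, sem_2, z_critical=1.96):
    """True if the score difference exceeds z_critical standard errors."""
    se_diff = standard_error_of_difference(sem_1, sem_2)
    return abs(score_1 - score_2) / se_diff > z_critical


# Hypothetical scores of 110 and 98 with SEMs of 3 and 4.
print(standard_error_of_difference(3, 4))        # 5.0
print(is_significant_difference(110, 98, 3, 4))  # True (12 / 5 = 2.4)
```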
|
Criterion-referenced test
|
Compares a test taker to an objective or standard
Traditional reliability is not appropriate for this type of test
|
|
Messick
|
Argues that the big three (content, criterion, construct) is incomplete; wants to include:
societal values
consequences
how the test is being used
|
|
LSAT
Law School Admission Test
facts
|
Requires no specific knowledge; may be taken by students of any major
one of the most difficult tests
|
|
Stanford Binet V
Purpose
|
Aimed to identify students who could benefit from extra help in school. Binet's assumption was that a lower IQ indicated the need for more teaching, not an inability to learn.
|
|
Sampling techniques- cluster sampling
|
Used when the target population is large and it is not feasible to list all individuals. Clusters are selected, and participants are then selected from each chosen cluster. Unlike stratified sampling, the grouping is regional rather than demographic.
|
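The two-step selection can be sketched in Python as follows: first pick clusters at random, then pick participants within each chosen cluster. The function name, region names, and sample sizes are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster, seed=0):
    """Pick clusters at random, then pick participants within each one.

    clusters: dict mapping a cluster name (e.g. a region) to its members.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }


# Hypothetical population: 5 regions with 20 people each.
regions = {
    f"region_{r}": [f"r{r}_person_{i}" for i in range(20)] for r in range(5)
}
selected = cluster_sample(regions, n_clusters=2, n_per_cluster=5)
for name, people in selected.items():
    print(name, len(people))
```

Only the members of the chosen clusters ever need to be listed, which is why this design suits large populations.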
|
Item-Difficulty Index
|
p = number of test takers who answered the item correctly / total number of test takers
|
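As a quick worked example of the index, here is a short Python sketch; the function name and the scores are invented for illustration.

```python
def item_difficulty(item_scores):
    """Proportion of test takers who answered the item correctly.

    item_scores: list of 0/1 values, one per test taker, for a single item.
    """
    if not item_scores:
        raise ValueError("need at least one test taker")
    return sum(item_scores) / len(item_scores)


# Hypothetical item: 3 of 4 test takers answered correctly.
print(item_difficulty([1, 1, 0, 1]))  # 0.75
```

A higher p means an easier item, since more test takers got it right.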
|
McCarthy Scale of Children's Ability
age
|
2 1/2 - 8 1/2 years
|
|
Gesell Developmental Schedules
uniqueness
|
no longer widely used
one of the first tests
developmental quotient
|
|
Nonverbal group ability test:
Goodenough Harris Drawing Test
facts
|
only used with other tests
quickest, easiest ability test
|
|
Stanford Binet V
Age Scale
|
Items are grouped according to age level
|
|
Sampling techniques- simple random
|
every member of the population has an equal chance of being sampled
|
|
Standard Error of Measurement
|
Provides a measure of the precision of an observed test score
Higher SEM = lower reliability
Reported as a score plus or minus error
e.g., a poll that is accurate to plus or minus 3%
|
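The link between the standard error of measurement and reliability is usually written as SEM = SD * sqrt(1 - r). A minimal sketch under that formula, with an illustrative standard deviation of 15 and made-up reliability values:

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


# Hypothetical test with SD = 15 at two reliability levels.
sem_high_reliability = standard_error_of_measurement(15, 0.91)
sem_low_reliability = standard_error_of_measurement(15, 0.75)
print(round(sem_high_reliability, 2))  # 4.5
print(round(sem_low_reliability, 2))   # 7.5
```

The less reliable test has the larger SEM, matching the rule that a higher SEM goes with lower reliability.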
|
Peabody Picture Vocabulary Test
purpose
|
participant does not have to read or write
measures receptive (hearing) vocabulary
|
|
Stanford Binet V
Format: Basal
(Adaptive Testing)
|
Level at which a minimum number of correct responses is obtained
|
|
Internal consistency reliability- split halves
|
Give the test to one group, then split the items into two halves and correlate scores on the halves.
Advantages: only one administration, which solves the carryover problem.
Disadvantages: cannot be confident that even and odd items are equivalent; splitting the test in half loses information; does not tell us about the stability of test scores over time.
The reliability coefficient is underestimated here, so we correct it with the Spearman-Brown formula.
|
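The Spearman-Brown correction for a test split into two halves is r_full = 2 * r_half / (1 + r_half). The sketch below assumes an odd/even item split and a Pearson correlation between half scores; the function names and response data are invented for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_brown(half_r):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * half_r / (1 + half_r)


def split_half_reliability(responses):
    """Return (half-test correlation, Spearman-Brown corrected estimate)."""
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    half_r = pearson_r(odd_half, even_half)
    return half_r, spearman_brown(half_r)


# Made-up data: 5 test takers, 6 right/wrong items.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
half_r, corrected_r = split_half_reliability(responses)
print(round(half_r, 2), round(corrected_r, 2))
```

The corrected value is higher than the half-test correlation, which reflects the underestimation noted above.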
|
Nonverbal group ability test:
Raven Progressive Matrices Test
age
|
5 - adult, for groups or individuals
|