Lecture 7 (Jan 28th): Reliability and Validity, Part 2

 Quiz 5 due NOW
 Quiz 6 will be available after class. Due date is next THURSDAY, before class.
 Exam scores will be available by the end of the week.
 We'll discuss results next Tuesday.
 The problem of artifacts
 Reliability
 Validity
The problem of artifacts
 Uncontrolled human aspects of the research situation that CONFOUND the researcher's conclusions
 Participant-related artifacts (aka Demand Characteristics)
 Cooperative
 ▪ Tries to give the 'best performance' that matches the presumed hypothesis
 Non-cooperative
 ▪ Doesn't care about the study, or tries to sabotage results
 Defensive
 ▪ Wants to be portrayed in a good light
 Researcher-related artifacts
 Observer bias
 ▪ Over- or under-estimation of what was observed
 Expectancy bias
 ▪ 'Self-fulfilling prophecy'
Ways to reduce artifacts:
 Blind experiments
 Deception
 'Double blind'
 Automation (Standardization)
 Computers
 Recording instructions
 Question participants
 The problem of artifacts
 Reliability: Are our measurements precise?
 Validity: Are we really measuring what we think we are measuring?
Reliability
 Extent to which measurements are free of random errors
 Random error: nonsystematic mistakes in measurement
 ▪ misreading a questionnaire item
 ▪ observer looks away when coding behavior
 ▪ nonsystematic misinterpretations of a behavior
 What are the implications of random measurement errors for the quality of our measurements?

O = T + E + S
 O = a measured score (e.g., performance on an exam)
 T = true score (e.g., the value we want)
 E = random error
 S = systematic error
O = T + E (we'll ignore S for now, but we'll return to it later)
 The error becomes a part of what we're measuring!
 Do random errors accumulate?
 Answer: No. If E is truly random, we are just as likely to overestimate T as we are to underestimate T.
 Note: In the slide's example of seven measurements, the average of the seven O's is equal to T.
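A minimal numeric sketch of this point (the true score T = 50, the error spread, and the count of seven measurements are all hypothetical choices; with a finite sample the average lands near T rather than exactly on it):

    # Sketch: averaging repeated measurements cancels random error (O = T + E).
    # T and the error spread are hypothetical values for illustration.
    import random

    random.seed(1)                # fixed seed so the sketch is repeatable
    T = 50.0                      # assumed true score
    observations = [T + random.gauss(0, 5) for _ in range(7)]  # seven O's
    mean_O = sum(observations) / len(observations)
    print(round(mean_O, 1))       # near 50: the E's roughly cancel out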
 An important way to reduce the influence of random errors of measurement is to use multiple measurements:
 ▪ Operationally define latent variables via multiple indicators
 ▪ Use more than one observer when quantifying behaviors
 ▪ Multiple observations
 How do we assess reliability?
 (a) test-retest reliability
 (b) alternate-forms reliability
 (c) internal consistency reliability
 All rely on correlating two variables.
 Reliability is measured by a correlation coefficient
 ▪ Always positive; thus the reliability index varies from 0 to 1.
 Test-retest reliability:
 ▪ Measure something at least twice, at different time points.
 ▪ If errors of measurement are truly random, then the same errors are unlikely to be made more than once.
 ▪ If two measurements of the same thing agree, it is unlikely that they contain random error.
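A minimal sketch of how a test-retest coefficient is computed, assuming hypothetical scores for five people measured on two occasions (statistics.correlation requires Python 3.10+):

    # Sketch: test-retest reliability as the Pearson correlation between
    # time-1 and time-2 scores. The scores are hypothetical.
    import statistics

    time1 = [12, 15, 9, 20, 17]   # hypothetical scores at time 1
    time2 = [13, 14, 10, 19, 18]  # the same people, measured again later

    r = statistics.correlation(time1, time2)  # Pearson r (Python 3.10+)
    print(round(r, 2))            # close to 1.0: high test-retest reliability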
 Test-retest reliability:
 ▪ IMPORTANT: we are assuming that what we are measuring does NOT vary over time!
 ▪ Sometimes people remember previous answers, which gives an INFLATED reliability coefficient.
 Other methods?
 Alternate-forms reliability
 ▪ Use two equivalent tests
 ▪ Correlation should be high
 Internal consistency
 ▪ Extent to which items in a questionnaire correlate with each other
 ▪ If they measure the same thing, the correlation should be high
 Split-half: based on an arbitrary split (e.g., comparing odd and even items, or first half and second half)
 Cronbach's alpha (α): based on the average of all possible split-halves
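A minimal sketch of Cronbach's alpha via its standard variance form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the 4-item, 5-respondent data below are hypothetical:

    # Sketch: Cronbach's alpha from an items-by-respondents score matrix.
    # The data are hypothetical, chosen only to illustrate the computation.
    from statistics import pvariance

    items = [                      # each row = one item, columns = respondents
        [3, 4, 2, 5, 4],
        [2, 4, 3, 5, 3],
        [3, 5, 2, 4, 4],
        [2, 4, 2, 5, 3],
    ]
    k = len(items)
    totals = [sum(col) for col in zip(*items)]   # each respondent's total score
    alpha = (k / (k - 1)) * (1 - sum(pvariance(i) for i in items) / pvariance(totals))
    print(round(alpha, 2))         # ~0.93 here: items hang together well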
 Inter-rater reliability
 ▪ Percentage of time agreed
 ▪ Correlation of ratings
 ▪ Kappa coefficient
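A minimal sketch of the first and third indices, assuming two hypothetical raters coding six behaviors; Cohen's kappa discounts the agreement expected by chance:

    # Sketch: inter-rater agreement for two raters coding the same behaviors.
    # Categories and ratings are hypothetical.
    rater1 = ["aggressive", "neutral", "neutral", "aggressive", "neutral", "neutral"]
    rater2 = ["aggressive", "neutral", "aggressive", "aggressive", "neutral", "neutral"]

    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n   # observed agreement

    # Chance agreement: product of each rater's marginal proportions per category
    categories = set(rater1) | set(rater2)
    p_e = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in categories)

    kappa = (p_o - p_e) / (1 - p_e)     # Cohen's kappa corrects for chance
    print(round(p_o, 2), round(kappa, 2))   # 0.83 raw agreement, kappa ~0.67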
 ASSUMPTION: The entity being measured is not changing.
 IMPLICATIONS: As you increase the number of indicators, the amount of random error in the averaged measurement decreases.
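A small sketch of this implication (true score and error spread are hypothetical): the typical error of the averaged measurement shrinks roughly as 1/sqrt(n) as the number of indicators n grows:

    # Sketch: more indicators -> less random error in the average.
    import random
    random.seed(2)

    T = 10.0                               # assumed true score
    for n in (1, 4, 16, 64):
        errs = []
        for _ in range(500):               # repeat to estimate the typical error
            avg = sum(T + random.gauss(0, 2) for _ in range(n)) / n
            errs.append(abs(avg - T))
        print(n, round(sum(errs) / len(errs), 3))  # mean |error| falls with n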

 NOTE: Common indices of reliability range from 0 to 1; higher numbers indicate better reliability (i.e., less random error).
 The problem of artifacts
 Reliability: Are our measurements precise?
 Validity: Are we really measuring what we think we are measuring?
O = T + E + S
 O = a measured score (e.g., performance on an exam)
 T = true score (e.g., the value we want)
 E = random error
 S = systematic error
Validity
 Degree to which measurements are free of both random error, E, and systematic error, S.
 Systematic errors reflect the influence of any non-random factor beyond what we're attempting to measure.
 Do systematic errors accumulate?
 YES! Systematic errors exert a constant source of influence on measurements.
 We will always overestimate (or underestimate) T if systematic error is present!
 Note: In the slide's example, each measurement is 2 points higher than the true value of 10; the errors do not average out.
 Note: Even when random error is present, E averages to 0 but S does not. Thus, we can have reliable measures that have validity problems.
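A minimal sketch mirroring the slide's example (true value 10, each reading about 2 points too high; the random-error spread is an added assumption): averaging drives E toward 0 but leaves S untouched:

    # Sketch: a constant systematic error S survives averaging (O = T + E + S).
    import random
    random.seed(3)

    T, S = 10.0, 2.0              # true score and constant bias, per the slide's example
    observations = [T + random.gauss(0, 1) + S for _ in range(1000)]
    mean_O = sum(observations) / len(observations)
    print(round(mean_O, 2))       # ~12.0: E has averaged out, S has not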
How do we ensure validity?
 3 questions:
 Are we measuring what we think we're measuring? (Construct Validity)
 Is the cause-effect relationship really there? (Internal Validity)
 Are our results generalizable? (External Validity)
 How well did we measure what we intended to?
 Especially important when we are interested in the theoretical construct per se
 Nomological network: represents the interrelations among variables involving the construct of interest
 Nomological validity: degree to which our measure behaves in the way assumed by the theoretical network
 ▪ Example: a self-esteem measure should predict grades in school.
 ▪ It should fail to be related to variables unrelated to self-esteem.
[Figure: nomological network for self-esteem, with positive paths to 'achieve in school' and 'ability to cope', a negative path to 'distrust friends', and no path to 'like coffee']
 How well did we measure what we intended to?
 Nomological validity
 ▪ Degree to which the measure behaves in the way assumed by the theoretical network
 Face validity
 ▪ Does it look like it's measuring the construct?
 Content validity
 ▪ Does it include all relevant components of the construct and exclude irrelevant ones?
 Convergent validity
 ▪ Does it correlate with measures that assess the same construct?
 Discriminant validity
 ▪ Whether it FAILS to correlate with measures that assess a different construct
Internal validity
 Concerns the cause-and-effect relationship
 Low internal validity for predictive designs
 ▪ Correlation does not imply causation
 High internal validity for explanatory designs
 ▪ Manipulation changes the outcome
 Need to rule out effects of extraneous variables
 ▪ Extraneous vs. confounding variable
External validity
 Do findings generalize?
 Representative sample
 ▪ Report sample characteristics
 Representative setting
 ▪ Difficult in experimental designs
 ▪ Report setting characteristics
 Explanatory designs: high internal validity (causality) but low external validity
 Predictive designs: high external validity (generalizability) but low internal validity
 Cannot have validity if there is no reliability.
 Reliability does NOT guarantee validity.