Cognitive Neuroscience

Recognizing Objects: The Computational Problems

The problem:
  The visual recognition processes need to be general enough to allow us to recognize objects under variable conditions (to achieve object constancy),
  yet specific enough to allow us to detect differences between objects and between exemplars of objects.

Object constancy
  We can recognize objects across changes in:
    Illumination
    Size
    Occlusion
    Viewing position
  These changes create very different signals at the retina; the brain must be able to determine their constancy despite these differences.
  Connors, E.

How do we do this?
  The evidence that we don't know the answer to this question is that we can't build a system to do these things.
  Machine vision works well only with:
    simplified artificial stimuli (e.g., bar codes)
    a model world with very few objects

How do we do this? Evidence from cognitive psychology and neuropsychology
  Pieces of the puzzle:
  (1) Independent visual feature dimensions
      Illusory conjunctions
      Feature and conjunction search
      Neuropsychological dissociations
  (2) Vision is "nonveridical": not simply stimulus-driven but also knowledge-driven
      Illusions
      Gestalt grouping principles

Independence of color and shape: Illusory conjunctions
  Brief presentation, 200 msec
  Report the 2 digits
  Report position, colors and names of any letters seen
  5 T S N 8

Illusory conjunctions
  In brief displays, subjects may correctly perceive the colors and the shapes, but may "misconjoin" them
  Target: 5 T S N 8
  "5 8, T S N"
  "5 8, N S T"

Illusory conjunctions
  The misperceptions reveal that color and shape were independent at some point in processing
  Only at a subsequent point are they brought together, or "conjoined" (assigned to the same location)
  Spatial attention is thought to be required for feature conjunction

Independence of color and form
  Feature and conjunction search (Treisman et al.)
  What are the basic visual features used by the visual system?
  Visual search methods can be used to reveal the features that are independently processed by the visual system, e.g., color/form
  If features are independent: characteristic feature/conjunction search pattern

Neuropsychological Cases
  achromatopsia
  akinetopsia

Achromatopsia

Akinetopsia (Zihl et al., 1983)
  % correct    control    LM
  Color        100        100
  Shape        100        100
  Motion       100        60
  RT (control, LM; Color, Shape, Motion): 285 296 275 315 306 max

Perception
  Data (stimulus) driven: bottom-up
  +
  Knowledge driven: top-down

Knowledge-driven perception: Illusory contours
  If the edge is not in the stimulus, where does it come from?
  From processes searching for edges and "generating" them where they are "likely"
  To do so, the visual material must be appropriately "grouped"

Knowledge-driven perception
  Knowledge of familiar shapes
  Grouping principles

What direction is the triangle pointing?
  Perceived pointing of ambiguous triangles
  Effects of grouping

How does the visual system group?
  Grouping (Gestalt) principles
    Color
    Size
    Orientation

A functional architecture of object recognition and naming

Perceptual grouping affects perception of form: Illusory conjunctions within vs. across perceptual groups
  Misperception of a +
    Within group: 22%
    Across group: 15%

Edges, surfaces

Information must be stored in long-term memory in a format that will support both general and specific object recognition across a wide range of viewing conditions and viewpoints.
How is shape information stored in memory so as to allow for viewpoint-independent object recognition?
There needs to be a "match" between how the stimulus is represented and the representation in long-term memory.
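To make the matching problem concrete, here is a minimal sketch of why a literal, image-based match struggles with viewpoint changes. It is an illustration only, not a model from the lecture: the 5x5 binary "images", the stored shapes "L" and "F", the pixel-overlap score, and the function names (pixels, overlap, rotate_cw, best_match) are all assumptions made for this example.

```python
# Hypothetical 5x5 binary "images"; '#' marks a pixel that belongs to the object.
# These stand in for shape representations held in long-term memory.
STORED = {
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
    "F": ["#####",
          "#....",
          "###..",
          "#....",
          "#...."],
}


def pixels(image):
    """Return the set of (row, col) positions that are 'on' in the image."""
    return {(r, c)
            for r, row in enumerate(image)
            for c, ch in enumerate(row)
            if ch == "#"}


def overlap(a, b):
    """Intersection-over-union of two images: 1.0 = identical, 0.0 = disjoint."""
    pa, pb = pixels(a), pixels(b)
    return len(pa & pb) / len(pa | pb)


def rotate_cw(image):
    """Rotate a square image 90 degrees clockwise (a crude viewpoint change)."""
    n = len(image)
    return ["".join(image[n - 1 - r][c] for r in range(n)) for c in range(n)]


def best_match(stimulus):
    """Return the best-scoring stored shape name and the full score table."""
    scores = {name: overlap(stimulus, stored) for name, stored in STORED.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    upright = STORED["L"]
    rotated = rotate_cw(upright)
    print(best_match(upright)[0])   # L
    print(best_match(rotated)[0])   # F
```

In this toy setup the upright "L" matches its stored shape perfectly, but the same object rotated by 90 degrees overlaps more with the stored "F" than with the stored "L". A purely image-based match therefore confuses a change of viewpoint with a change of object, which is the problem that any account of stored shape representations has to solve.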
Possible hypotheses:
  Templates
  3D, view-normalized structural descriptions

Stimulus representation: Extracting invariance
Long-term memory: Templates
  Stored templates: the prototypical shape
  Match the stimulus representation to the stored template

Templates
  Multiple views
  Match image against stored representations of multiple views.
  Not flexible enough
  How to deal with the variability of instances?
  What about?

3D view-normalized representations
  Construct 3D representation
  View normalization: identify the major axes of symmetry or elongation of the object itself, so that the description is based on the object rather than on the viewpoint of the viewer
  Compare normalized 3D rep to stored 3D reps

3D, object-centered, structural descriptions?
  based on principal axes of elongation and symmetry
  major features represented relative to these axes
  Can these be determined from an image?

3D object-centered structural descriptions

Vuilleumier et al. (2002): fMRI priming
  Exp. 1
    40 different kinds of man-made objects, 2 exemplars each
    40 nonobjects
    Each object shown twice
  Exp. 2
    Each real object shown again, under size x viewpoint manipulation
    40 new real + 40 new nonreal objects

Repetition-induced reductions in BOLD signal (figure labels)
  Exp. 1: Real = Nonsense, Real > Nonsense, View-specific, Real, Across exemplars of object
  Exp. 2:
  View-invariant, Real, Size-invariant

Predictions for breakdown?
  Category-specific knowledge and processes

Visual Agnosia: JGE
  74-year-old male
  Master's degree, high-school business teacher
  Left CVA affecting left occipitoparietal region and a small right occipital infarct

rack of trays, mule, bowl, cabinet

Picture Naming
  Accuracy: 75% (349/464) correct
  Errors:
    Visual+semantic    45%    (sheep > cow)
    Semantic           26%    (shoe > sweater)
    Visual             17%    (baseball bat > cigarette)
    Circumlocutions    10%    (helicopter > flies about, flying ambulance)
    Don't know          1%

Where is the breakdown? Naming?

A naming problem?
  Naming from other modalities of input:
  Tactile: 50 objects to name from tactile presentation with eyes closed (including 21 items previously named incorrectly with visual input)
    results: 96% correct (100% correct on subset)
  To definition: asked to provide a name when given a definition of 42 objects misnamed from vision
    results: 98% correct

Where is the breakdown? Knowledge of the visual attributes of objects?

Knowledge of visual attributes?
  Drawing from long-term memory ("please draw a ____"): 36 objects he had previously misnamed
    results: 92% correct (inclusion of object parts and spatial configuration)

Knowledge of visual attributes?
  Verbal definitions: asked to define 42 objects previously misnamed
    results: 100% correct, including info about shape and appearance
    Shoe: "made of leather, opens up, may be laced, has a heel, maybe a two inch heel and a sole. You wear them on your feet"
    Chair: "four legs, various kinds, folding chairs, Used for sitting.... Have upholstery, has a back made of wood, has a back"

Knowledge of visual attributes?
  Color, size and function matching to auditory stimuli:
    color: Is an apple or a pineapple the color of a strawberry?
    size: Is a harp or a football the size of a toaster?
    function: Does a pen or a stapler have a similar function to a typewriter?
  Auditory
    Function: 100%
    Size:     100%
    Color:    100%

Knowledge of visual attributes?
  Color, size and function matching
                 auditory    visual
    Function:    100%        76%
    Size:        100%        65%
    Color:       100%        57%
  How could this come about?

Where is the breakdown? Can he see forms, edges, surfaces?

Perception?
  Copying: asked to copy 30 objects previously misnamed
    results: 100% correct
  Overlapping figures: asked to outline the objects
    results: 100% (35/35), although he named only 27/35 correctly

Where is the breakdown? Extracting invariance?
  Matching across orientation
    Results: 71% correct (control Ss: 97%)
  Accessing stored structural descriptions: Object Decision Task
    Results: 80% (control Ss: 94%)

Where is the breakdown? Extracting invariance
  Working hypothesis: the breakdown is in constructing or manipulating 3D representations of object shapes (extracting invariance)
  This disrupts access to subsequent processes and components from visual input
  ...
- Spring '08