Computation with Neurons

Overview
  Representational Issues
    Sparse/Distributed
    Rate/Temporal Code
    Combining/Separating
  Computational Approaches
    Rate models
    Spiking models
    Abstract models

Neural Constraints
  Spiking neurons
  Substrate influences computation!
  XIII MLIII

Issues: Local or Distributed?
  Distributed
    Neuron participates in representing many different things.
  Local
    Neuron represents one thing.
    Derisively called "Grandmother Cell".
    But what looks like an Angelina Jolie cell has been found!
  Sparse
    Neuron represents a few things.

Distributed Representations: Population Code
  One quantity represented by activity pattern across set of cells
  Each cell tuned to preferred value
  Cell's firing rate reflects difference from preferred value
  Color
    Three types of cone cells
    Wavelength encoded by pattern across cone cells

Distributed Representations: Feature Code
  Each neuron represents different attribute
  E.g., semantic features

               Mammal  Swims  Stripes
    Tiger        1       0       1
    Zebrafish    0       1       1
    Dolphin      1       1       0

Distributed Representations: Abstract Pattern
  Tiger:   1 0 0 1 1 0 1 0 0 1 0 1 0 ...
  Dolphin: 1 1 0 0 1 0 1 1 0 1 1 0 0 ...

Issues: Rate or Temporal Code?
  Same or not? (figure: spike trains on a time axis, 50 ms scale)
  Or: How long is the integration window?

Issues: Combining and Separating
  How to bind?
  (figure labels: Green Square, Orange Circle)

Binding: Local
  (figure labels: Green Triangle, Orange Circle)
  Problem: Combinatorial Explosion

Binding: Distributed
  Combine patterns
    Green:          1 0 1 1 0 1 1 1 1 0 1 1 0 ...
    Triangle:       1 0 0 0 1 1 0 1 0 1 1 0 0 ...
    Green Triangle: 0 0 1 1 1 0 1 0 1 1 0 1 0 ...
  Problem: Suitable Combinatory Functions

Binding: Temporal
  (figure labels: Green Triangle, Orange Circle, on a time axis)
  Problem: Long-term Memory, Hierarchy

But it gets worse ...
  (Top = (Green + Triangle), Bottom = (Red + Square))
  (Top = (Orange + Circle), Bottom = (Blue + Hexagon))

Language
  John knows that Bill thinks that Sue loves Jane.
  (Knower = John
   What = (Thinker = Bill
           What = (Lover = Sue
                   Lovee = Jane)))

Neural Networks
  (figure label: Hidden Layer)

Two Types of Networks
  Graded activation
    Activation level between 0.0 and 1.0
    Models firing rate
    Activation level does not model membrane potential.
    You can think of activation level as:
      (# spikes over 20 ms interval) / (max possible # spikes)
  Spiking
    Simulate individual spikes
    Activation level models membrane potential
    Activation propagates only if a spike occurred

  Worked example (unit 4 receives from units 1 to 3)
    Sender activations: $A_1 = 0.4$, $A_2 = 0.2$, $A_3 = 0.9$
    Weights into unit 4: 0.3, 0.9, 0.3
    $I_i = \sum_s w_{si} A_s$
    $A_i = F(I_i)$, with $F = S(I_i - 0.5)$
    $I_4 = 0.3 \cdot 0.4 + 0.9 \cdot 0.2 + 0.3 \cdot 0.9 = 0.57$
    $A_4 = S(0.57 - 0.5) = 0.60$
    (figure: plot of S, axis ticks at 0.5)

Network Determined By:
  Connectivity
    Number of Layers
    Types of Connections
      Feed Forward
      Feed Back
      Lateral
  Input Function
  Activation Function
  Weights
    Set
    Learned

Adding Time
  Now have $A_i(t)$
  Programmer sets values for $A_i(0)$
  Equations give $A(t+1)$ as function of $A(t)$.
    $I_i(t+1) = \sum_s w_{si} A_s(t)$
    $A_i(t+1) = F(I_i(t+1), A_i(t))$

Example: Interactive Activation
  McClelland & Rumelhart (1981)
  (figure: a Words layer above four Letters groups above four Features groups)
  Weights
    Small, do not change
    Weights on same type of connections are all equal.
      E.g., all excitatory letter-to-word connections have the same weight.
      E.g., feature-to-letter weights may be different than letter-to-word weights.
    Weights are parameters.

McClelland & Rumelhart (1981)
  $A_i(0) = R_i \sim \text{Freq}$
  $A_i(0) = R_i = 0$
  $A_i(t)$ encodes input
  $A_i(t+1) = A_i(t) - d \cdot (A_i(t) - R_i) + I_i(t+1) \cdot (1.0 - A_i(t))$
    (terms, in order: Old - Decay + Effect of Input)

What Can Model Be Used For?
  Simulate reaction times
    Settling criterion
      E.g., one word node > 0.8 and others < 0.2
    Number of cycles until settling criterion reached
    Do settling times correlate with reaction times?
  Simulate degraded stimulus / brain lesion
    Add noise at feature or word level
    Do 'responses' look like experimental results?

Example
  Suppose LYNX and SANE have same resting value. Which will settle faster?
  SANE
    Large neighborhood (High-N)
    More lateral inhibition at word level
    More top-down (feedback) excitation of letters
  LYNX
    Small neighborhood (Low-N)
    Less lateral inhibition at word level
    Less feedback excitation of letters
  Answer depends on whether lateral inhibition or feedback excitation is stronger.

Effect of Neighborhood Size?
  Can run lexical-decision experiment to find out.
  In English: high-N faster than low-N.
  In French, Spanish: variable results.

What are Feedback Connections For?
  Expectation
    When input matches expectation, faster processing.
  Output
    Could be used for spelling.
  Is language difference in effect of neighborhood size related to regularity?
    Regular languages: rules are effective, so word-to-letter connections less important???
    English: rules less effective, so word-to-letter connections stronger???

Learning
  Set the input layer
  Run the network
  Change weights based on output

Three Types of Learning
  Unsupervised / Hebbian
    Feedback = none
    Increase weights between nodes that are both active
  Supervised
    Feedback = desired output
    Change weights based on difference between actual output and desired output
  Reinforcement
    Feedback = good/bad
    Change weights to increase good responses and decrease bad responses

History
  1940's: Research on supervised learning in two-layer networks.
  1969: Minsky and Papert point out a simple function that cannot be learned by two-layer networks.
    Research on computational modeling dies.
  1980's: Books on backpropagation algorithm published
    McClelland, Rumelhart, Hinton: Connectionists
    Algorithm specifies supervised learning for networks > two layers.
    Rekindles research on computational modeling.

Connectionists
  "Train 'em up" approach.
  Unrealistic modeling of cognitive processes.
  No temporal encoding.
  Now backlash against connectionist approach.

Spiking Model: Leaky Integrator
  Each node also has time of last spike, $S_i$
  Activation function models membrane potential.
  Update $A_i(t+1)$ based on decay, input
  If $A_i(t+1) \geq$ Threshold:
    $S_i = t + 1$
    $A_i(t+1)$ = resting value
  Input to i from s models PSP = $w_{si} \cdot f(t - S_s)$
    Instead of $A_s(t)$
  (figure: plot of f)

Graded vs. Spiking
  Graded: input from s is $w \cdot A_s(t)$
  Spiking: input from s is $w \cdot f(t - S_s)$, where $S_s$ = time s last spiked
  If $t = 6$ and $S_s = 2$, f is evaluated at $t - S_s = 4$.

Learning in Spiking Models
  Spike Timing Dependent Plasticity
    Change weights based on timing of spikes in sending and receiving nodes.
  Unsupervised learning, or two-layer supervised.
  Cannot use backpropagation because activation function is not differentiable.

Computational Neuroscientists
  Usually use spiking models
  Try to be realistic
  Don't usually approach high-level cognitive problems

Alternative: Abstract Model
  Does not depend on implementations.
  Specify nature of neural representation within/between levels of processing.
  E.g., claims that:
    Abstract letter representations fire serially, ~15 ms/letter
    At lower level: parallel encoding, activation gradient

Conclusion
  It's important to think about implications of neural substrate for cognitive representations.
  A computational approach doesn't necessarily mean doing simulations.
...
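
The graded-activation worked example from the Neural Networks slides (net input 0.57 into unit 4) can be reproduced in a few lines. This is a minimal sketch: the names `net_input` and `squash` are hypothetical, and `squash` assumes a logistic sigmoid with gain 1, which the slides do not confirm. Under that assumption the activation comes out near 0.517 rather than the 0.60 shown on the slide, so the slide's S presumably has a steeper slope.

```python
import math

def net_input(weights, activations):
    """I_i = sum over senders s of w_si * A_s."""
    return sum(w * a for w, a in zip(weights, activations))

def squash(x, gain=1.0):
    """Logistic sigmoid; an assumed stand-in for the slide's S."""
    return 1.0 / (1.0 + math.exp(-gain * x))

# Values taken from the slide's worked example.
weights_into_4 = [0.3, 0.9, 0.3]
sender_activations = [0.4, 0.2, 0.9]

I4 = net_input(weights_into_4, sender_activations)
A4 = squash(I4 - 0.5)  # F = S(I - 0.5)

print(f"I4 = {I4:.2f}")
print(f"A4 = {A4:.3f}")
```

Raising the assumed `gain` moves the result toward the slide's value; the net input is independent of that choice.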
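
The interactive-activation update rule and the idea of counting cycles until a settling criterion is met can be sketched as follows. This is a deliberately reduced illustration, not the McClelland & Rumelhart model: it tracks a single node with a constant net input, so lateral inhibition, feedback, and the "others < 0.2" half of the criterion are left out. The function names and the parameter values are assumptions chosen for the example.

```python
def ia_update(a, rest, net, decay):
    """One step of A(t+1) = A(t) - d*(A(t) - R) + I(t+1)*(1.0 - A(t))."""
    return a - decay * (a - rest) + net * (1.0 - a)

def cycles_to_settle(rest, net, decay, threshold=0.8, max_cycles=1000):
    """Count update cycles until activation exceeds the threshold.

    Starts from A(0) = R. Returns None if the node never gets there.
    """
    a = rest
    for cycle in range(1, max_cycles + 1):
        a = ia_update(a, rest, net, decay)
        if a > threshold:
            return cycle
    return None

# Two nodes that differ only in resting level (hypothetical values).
low_rest = cycles_to_settle(rest=0.0, net=0.5, decay=0.1)
high_rest = cycles_to_settle(rest=0.2, net=0.5, decay=0.1)
print(low_rest, high_rest)
```

With these values the node with the higher resting level crosses 0.8 one cycle earlier, which is the kind of settling-time difference the slides propose comparing with reaction times.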
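
The leaky-integrator node from the Spiking Model slides can be sketched in the same style. The slides give only the shape of the rule, so several pieces here are assumptions: the PSP kernel `psp` is taken to be a decaying exponential with a made-up time constant, the decay term reuses the form from the interactive-activation equation, and the threshold comparison uses "greater than or equal".

```python
import math

def psp(dt, tau=2.0):
    """Assumed PSP kernel f(t - S_s): zero before the spike, exponential decay after."""
    if dt < 0:
        return 0.0
    return math.exp(-dt / tau)

def step(a, t, last_spikes, weights, rest=0.0, decay=0.2, threshold=1.0):
    """Advance one receiving node from time t to t + 1.

    last_spikes[s] is S_s, the time sender s last spiked (None if it never did).
    Returns (new activation, spike time or None).
    """
    net = sum(
        w * psp(t - s)
        for w, s in zip(weights, last_spikes)
        if s is not None
    )
    a_next = a - decay * (a - rest) + net
    if a_next >= threshold:
        return rest, t + 1  # spike: record S_i = t + 1 and reset to rest
    return a_next, None

# The slide's example timing: t = 6, sender last spiked at S_s = 2.
a_next, spike_time = step(a=0.0, t=6, last_spikes=[2], weights=[0.5])
print(a_next, spike_time)
```

The contrast with a graded unit is in the `net` line: the sender contributes through the time since its last spike, not through its current activation.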