Pigeon has the potential to earn the reward 6 times in an hour.
o Scallop Pattern: The cumulative record shows a period in which responding drops, then a slow, gradual increase, then a peak just before the next reinforcement is scheduled. (Looks like exponential buds)

VARIABLE: Rewards on variable ratio and variable interval schedules are provided following a variable amount of work or length of time.
- VARIABLE Ratio schedule: A pigeon on a VR-10 schedule must peck an average of 10 times to get a reward, but the exact number of pecks that yields a reward changes across trials. (Trial 1: 5 times, Trial 2: 15 times, Trial 3: 10 times. Average is 10)
o Capable of supporting very constant and high response rates. (Slot machine example). The cumulative record of responses may look almost like a diagonal line with no pauses.
o The slope of the VR cumulative record reflects the average number of responses required before reinforcement is delivered. So, VR schedules with more frequent reinforcement will be steeper than VR schedules with less frequent reinforcement. E.g., a VR-10 schedule will have a steeper slope than VR-40.
- VARIABLE Interval schedule: A pigeon on a VI-10 min schedule will be rewarded for the first response following an interval that averages 10 minutes, but the exact length of time between rewards changes across trials. (Trial 1: 5 minutes, Trial 2: 15 minutes, Trial 3: 10 minutes. Average is 10 minutes)
o Reinforcement can become available at any time, but the subject will have an idea of how often reinforcement occurs. The subject tends to respond at a steady rate, since it will not want to miss a reinforcement opportunity. So, the cumulative record will show a straight, diagonal line. VI-2 will have a steeper slope than VI-10, since VI-2 offers more frequent reinforcement.

Human Examples:
- Manufacturer gives worker $30 after 3 shirts are sewn. Schedule: FR-3.
- After some random number of plays around a pre-set mean, a slot machine returns rewards. (Very low mean). Schedule: VR-very low mean.
- Psychology quizzes every week: Students will study a lot right before the quiz, but after the quiz the studying (response) will drop for a period of time before slowly picking up and then peaking before the next scheduled quiz. Schedule: FI-1 week (this is the scallop pattern).

Partial reinforcement (PR) is far more robust (stronger) than continuous reinforcement (CR). It is better to use PR than CR: once CR stops, the subject realizes right away that no reinforcement will be received, and the response drops. If PR stops, the subject does not realize right away that reinforcement has stopped, so responding continues for longer.
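The schedule rules in these notes can be checked with a small sketch. This is an illustration only, not part of the course material. The function names are hypothetical. It assumes each interval is timed from the previous reward and that VR requirements are drawn uniformly around the mean. It only counts when rewards are delivered; it does not model how the subject's response rate changes.

```python
import random


def rewards_on_ratio(requirements, total_responses):
    """Count rewards earned on a ratio schedule.

    `requirements` lists how many responses each successive reward costs:
    all equal for FR (e.g. [3, 3, 3]), varying around a mean for VR.
    """
    rewards = 0
    remaining = total_responses
    for needed in requirements:
        if remaining < needed:
            break
        remaining -= needed
        rewards += 1
    return rewards


def rewards_on_interval(intervals, response_times):
    """Count rewards earned on an interval schedule.

    The first response made after the current interval has elapsed is
    rewarded. Assumption: each interval is timed from the previous reward.
    `intervals` are all equal for FI, varying around a mean for VI.
    """
    pending = iter(intervals)
    wait = next(pending, None)
    last_reward = 0
    rewards = 0
    for t in response_times:
        if wait is not None and t - last_reward >= wait:
            rewards += 1
            last_reward = t
            wait = next(pending, None)
    return rewards


def vr_requirements(mean, trials, rng):
    """Hypothetical VR generator: uniform on 1..(2*mean - 1), which averages `mean`."""
    return [rng.randint(1, 2 * mean - 1) for _ in range(trials)]


every_minute = range(1, 61)  # one response per minute for an hour

print(rewards_on_interval([10] * 6, every_minute))     # FI-10 min: 6 rewards
print(rewards_on_interval([5, 15, 10], every_minute))  # VI trials from the notes: 3 rewards
print(rewards_on_ratio([3] * 10, 9))                   # FR-3, 9 shirts sewn: 3 rewards
print(rewards_on_ratio([5, 15, 10], 30))               # VR trials from the notes: 3 rewards
```

The FI-10 min call reproduces the "6 times in an hour" figure. The VR and VI calls reuse the trial values from the notes (5, 15, 10), which average 10.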
- Winter '14
- Instrumental Conditioning