CS 6375 Homework 6
Chenxi Zeng, UTD ID: 11124236

1 Paper Reference

Intrinsically Motivated Reinforcement Learning, Satinder Singh, Andrew G. Barto, and Nuttapong Chentanez.

Abstract: The authors present initial results from a computational study of intrinsically motivated reinforcement learning aimed at allowing artificial agents to construct and extend hierarchies of reusable skills needed for competent autonomy.

States: The agent is assumed to have intrinsic or hardwired notions of interesting or "salient" events, so there are two kinds of states: salient and not salient.

Actions: The agent behaves in its environment according to an ε-greedy policy.

Reward: The agent's intrinsic reward is generated in a way suggested by the novelty response of dopamine neurons.

The algorithm is shown on the next page.
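As a rough illustration of how such an intrinsic reward could be computed, here is a minimal Python sketch. All names are hypothetical, and the novelty rule used here (reward equal to the current prediction error of a salient event, decaying as the event becomes predictable) only approximates the dopamine-like response the paper describes; it is not the authors' exact algorithm.

```python
import random
from collections import defaultdict

EPSILON = 0.1   # exploration rate for the epsilon-greedy policy
ALPHA   = 0.1   # learning rate for the event-probability estimate

Q = defaultdict(float)            # Q[(state, action)] -> value estimate
event_prob = defaultdict(float)   # estimated P(salient event | state, action)

def epsilon_greedy(state, actions):
    """With probability EPSILON pick a random action; otherwise act greedily on Q."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def intrinsic_reward(state, action, salient_occurred):
    """Novelty-style reward: large while a salient event is still surprising,
    shrinking toward zero as the event becomes predictable."""
    p = event_prob[(state, action)]
    target = 1.0 if salient_occurred else 0.0
    event_prob[(state, action)] = p + ALPHA * (target - p)  # update prediction
    return (1.0 - p) if salient_occurred else 0.0
```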
2

(a) $J^*(S_n) = 10 + 10\gamma + 10\gamma^2 + \cdots = \frac{10}{1-\gamma}$.
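A quick numerical check of the closed form (a minimal sketch; γ = 0.9 is an illustrative choice, not a value given in the problem):

```python
# Truncate the infinite sum 10 + 10*g + 10*g^2 + ... and compare it
# with the closed form 10 / (1 - g).
gamma = 0.9
partial = sum(10 * gamma**k for k in range(1000))
closed_form = 10 / (1 - gamma)
print(partial, closed_form)  # both print ~100.0
```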
(b) Since the largest reward, 10, is at state $S_n$ and all other rewards are 1, the optimal policy is easy to find: at states $S_1$ to $S_{n-1}$, go right until reaching $S_n$.
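The policy can be verified by value iteration. Below is a minimal sketch assuming a simple n-state chain; the chain length, the left/right actions, and treating $S_n$ as absorbing are assumptions inferred from parts (a) and (b), not details given in the problem:

```python
# Value iteration on an assumed n-state chain: every state pays reward 1
# except the last, which pays 10 and absorbs, so J*(S_n) = 10 / (1 - gamma).
n, gamma = 6, 0.9
V = [0.0] * n

def reward(s):
    return 10.0 if s == n - 1 else 1.0

for _ in range(2000):
    V = [reward(s) + gamma * max(V[max(s - 1, 0)],        # go left
                                 V[min(s + 1, n - 1)])    # go right
         for s in range(n)]

# Greedy policy with respect to the converged values.
policy = ["right" if V[min(s + 1, n - 1)] >= V[max(s - 1, 0)] else "left"
          for s in range(n)]
print(policy)                    # 'right' everywhere, as claimed
print(V[-1], 10 / (1 - gamma))   # J*(S_n) matches part (a)
```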