hw6_Solution_Part I - CS 6375 Homework 6, Chenxi Zeng, UTD ID: 11124236

CS 6375 Homework 6
Chenxi Zeng, UTD ID: 11124236

1 Paper Reference

Intrinsically Motivated Reinforcement Learning, Satinder Singh, Andrew G. Barto, and Nuttapong Chentanez.

Abstract: The authors present initial results from a computational study of intrinsically motivated reinforcement learning, aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that are needed for competent autonomy.

States: We assume that the agent has intrinsic or hardwired notions of interesting or "salient" events, so there are two kinds of states: salient and not salient.

Actions: The agent behaves in its environment according to an ε-greedy policy (a minimal sketch of ε-greedy selection appears at the end of this document).

Reward: The agent's intrinsic reward is generated in a way suggested by the novelty response of dopamine neurons.

The algorithm is shown on the next page.

2

(a) J*(S_n) = 10 + 10γ + 10γ² + … = 10 / (1 − γ).

(b) Since the largest reward is 10, received at state S_n, and all other rewards are 1, the optimal policy is easy to find: at state 1...
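As a quick numerical sanity check on the closed form in 2(a), the short Python snippet below (not part of the original solution; γ = 0.9 is an arbitrary example value) compares a truncated partial sum of 10 + 10γ + 10γ² + … against 10 / (1 − γ):

```python
# Sanity check for 2(a): the series 10 + 10*gamma + 10*gamma^2 + ...
# should converge to 10 / (1 - gamma) for 0 <= gamma < 1.
gamma = 0.9  # example discount factor; any value in [0, 1) works

partial_sum = sum(10 * gamma**k for k in range(1000))  # truncated series
closed_form = 10 / (1 - gamma)

print(partial_sum)   # ~100.0
print(closed_form)   # 100.0
```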
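The Actions paragraph in Problem 1 describes ε-greedy behavior. The following is a minimal sketch of ε-greedy action selection under a tabular Q-value interface; the function name and arguments are illustrative assumptions, not code from the paper:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Choose a random action with probability epsilon;
    otherwise choose an action with the highest Q-value
    (ties broken uniformly at random)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    best = max(q_values)
    best_actions = [a for a, q in enumerate(q_values) if q == best]
    return random.choice(best_actions)

# Example: with these Q-values, action 1 is greedy, so it is
# selected with probability (1 - epsilon) + epsilon / 3.
print(epsilon_greedy([0.0, 1.0, 0.5]))
```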