hw6_Solution_Part I - CS 6375 Homework 6, Chenxi Zeng, UTD

CS 6375 Homework 6
Chenxi Zeng, UTD ID: 11124236

1 Paper Reference

Intrinsically Motivated Reinforcement Learning, Satinder Singh, Andrew G. Barto, and Nuttapong Chentanez.

Abstract: The authors present initial results from a computational study of intrinsically motivated reinforcement learning, aimed at allowing artificial agents to construct and extend hierarchies of reusable skills needed for competent autonomy.

States: We assume that the agent has intrinsic or hardwired notions of interesting or salient events, so there are two states: salient and not salient.

Actions: The agent behaves in its environment according to an $\epsilon$-greedy policy.

Reward: The agent's intrinsic reward is generated in a way suggested by the novelty response of dopamine neurons.

The algorithm is shown on the next page; an illustrative sketch of the loop is also given at the end of this solution.

2

(a) $J^*(S_n) = 10 + 10\gamma + 10\gamma^2 + \cdots = \dfrac{10}{1-\gamma}$.

(b) Since the largest reward, 10, is at state $S_n$ and all other rewards are 1, the optimal policy is easy to find: at state 1...
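As a quick numerical sanity check of the closed form in part 2(a), the truncated geometric series can be compared with $10/(1-\gamma)$. This is a minimal sketch in Python; the choice $\gamma = 0.9$ is an arbitrary illustrative assumption, since the homework's actual discount factor does not appear in this preview.

```python
# Sanity check for part 2(a): 10 + 10*gamma + 10*gamma**2 + ...
# should equal the closed form 10 / (1 - gamma).
# gamma = 0.9 is an illustrative assumption, not a value from the homework.
gamma = 0.9
series = sum(10 * gamma**t for t in range(2000))  # truncated infinite sum
closed_form = 10 / (1 - gamma)
print(round(series, 6), round(closed_form, 6))  # both print 100.0
```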
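And for the setup in part 1, below is a minimal sketch of an intrinsically motivated Q-learning loop in the spirit of Singh, Barto, and Chentanez: $\epsilon$-greedy action selection with a novelty-style intrinsic reward that decays as a salient event becomes familiar. The two-state toy environment, the 0.3 trigger probability, and the 1/(1 + count) reward shape are all illustrative assumptions here, not the paper's actual playroom domain, reward schedule, or option machinery.

```python
import random
from collections import defaultdict

ACTIONS = [0, 1]  # two abstract actions (illustrative assumption)

def intrinsic_reward(count):
    # Novelty-style reward: large for a new salient event, decaying
    # toward zero as the event becomes familiar (a rough stand-in for
    # the dopamine-like novelty response described in the paper).
    return 1.0 / (1 + count)

def epsilon_greedy(q, state, epsilon=0.1):
    # With probability epsilon explore uniformly; otherwise act greedily.
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def step(state, action):
    # Hypothetical dynamics: action 1 triggers the salient event 30% of
    # the time; everything else leads back to the non-salient state.
    if action == 1 and random.random() < 0.3:
        return "salient"
    return "not_salient"

q = defaultdict(float)    # Q-values indexed by (state, action)
salient_count = 0         # how many times the salient event has occurred
alpha, gamma = 0.1, 0.9
state = "not_salient"

for _ in range(10_000):
    action = epsilon_greedy(q, state)
    next_state = step(state, action)
    reward = 0.0
    if next_state == "salient":
        reward = intrinsic_reward(salient_count)
        salient_count += 1
    # Standard Q-learning backup driven entirely by the intrinsic reward.
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    state = next_state

print({k: round(v, 3) for k, v in q.items()})
```

Running this, the Q-value of the event-triggering action rises while the salient event is still novel and then fades as the intrinsic reward wears off, which is the qualitative behavior the novelty-based reward is meant to produce.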
