CS 6375 Homework 6

Chenxi Zeng, UTD ID: 11124236

1 Paper Reference

Intrinsically Motivated Reinforcement Learning, Satinder Singh, Andrew G. Barto, and Nuttapong Chentanez.

Abstract: The authors present initial results from a computational study of intrinsically motivated reinforcement learning, aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that are needed for competent autonomy.

States: The agent is assumed to have intrinsic or hardwired notions of interesting or "salient" events, so there are two states: salient and not salient.

Actions: The agent behaves in its environment according to an ε-greedy policy.

Reward: The agent's intrinsic reward is generated in a way suggested by the novelty response of dopamine neurons.

The algorithm is shown on the next page.

2 (a) J*(S_n) = 10 + 10γ + 10γ² + … = 10 / (1 − γ).

(b) Since the largest reward, 10, is at state S_n and all other rewards are 1, the optimal policy is easy to find: at state 1...
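The value in 2(a) is a geometric series, and the policy in the paper summary is ε-greedy. A minimal sketch of both (function names and the horizon/ε values are my own illustration, not from the paper):

```python
import random

def discounted_return(reward, gamma, horizon=10_000):
    # Truncated sum of reward * gamma^t, approximating the infinite
    # series 10 + 10*gamma + 10*gamma^2 + ... from part 2(a).
    return sum(reward * gamma**t for t in range(horizon))

def epsilon_greedy(q_values, epsilon=0.1):
    # With probability epsilon, explore a uniformly random action;
    # otherwise exploit the greedy (argmax) action.
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

gamma = 0.9
# The truncated sum matches the closed form 10 / (1 - gamma) = 100.
assert abs(discounted_return(10, gamma) - 10 / (1 - gamma)) < 1e-6
```

With ε = 0 the policy is purely greedy, which is what the closed-form value J*(S_n) = 10 / (1 − γ) assumes: the agent stays at S_n and collects reward 10 forever.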
 Spring '09
 yangliu