hw2 - CS 6375 Machine Learning Fall 2010 Assignment 2...

CS 6375 Machine Learning Fall 2010
Assignment 2: Markov Decision Processes and Reinforcement Learning

Part I: Due by Sunday, October 3, 11:59 p.m.
Part II: Due by Sunday, October 10, 11:59 p.m.

Submission instructions for the written problems: As in Assignment 1, you may either slip a written (hard-copy) solution under Eduardo’s office door (do not leave it in the rack outside his office) or submit your solution electronically via eLearning. If you choose to submit electronically, submit only one PDF file containing your solution to all the problems. Files in any other format will be ignored. Submission directories that contain more than one file will also be ignored. According to Eduardo, most UTD computers have a PDF printer installed and he can explain to you how to use it if needed. He will also be happy to point you to free PDF printers for Windows and Linux.

Regardless of the submission method you use, make sure that your name appears at the beginning of your submission. Five points will be taken off if you fail to do so. Whenever possible, you should provide brief justifications for your solution.

Part I: Programming (50 points)

In this problem you will implement the value iteration algorithm for finding the optimal policy for each state of an MDP using Bellman’s equation. Your program should assume as input a file that contains a description of an MDP. Below is a sample input file:

    s1 5 (a1 s1 0.509) (a1 s2 0.491) (a2 s1 0.31) (a2 s3 0.69)
    s2 10 (a1 s1 0.4) (a1 s2 0.3) (a1 s3 0.3) (a2 s2 0.5) (a2 s3 0.5)
    s3 -5 (a1 s1 0.3) (a1 s2 0.3) (a1 s3 0.4) (a2 s1 0.2) (a2 s2 0.8)

Each line in this file stores information for one state in the given MDP. For instance, the first
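The preview cuts off before the input format is fully explained, so the following is only a rough sketch of how value iteration over the sample file might look, not the assignment's reference solution. It rests on several assumptions that the visible text does not confirm: the number after each state name is read as that state's reward, each triple (action next-state probability) is read as a transition probability, the discount factor is 0.9, and the update is V(s) = R(s) + gamma * max over actions of the expected value of the next state. The names parse_mdp and value_iteration are hypothetical.

```python
import re

# Sample input copied from the assignment text.
SAMPLE = """\
s1 5 (a1 s1 0.509) (a1 s2 0.491) (a2 s1 0.31) (a2 s3 0.69)
s2 10 (a1 s1 0.4) (a1 s2 0.3) (a1 s3 0.3) (a2 s2 0.5) (a2 s3 0.5)
s3 -5 (a1 s1 0.3) (a1 s2 0.3) (a1 s3 0.4) (a2 s1 0.2) (a2 s2 0.8)
"""

def parse_mdp(text):
    """Parse one state per line: name, reward (assumed), then (action next prob) triples."""
    rewards, transitions = {}, {}
    for line in text.splitlines():
        if not line.strip():
            continue
        head = line.split("(", 1)[0].split()
        state = head[0]
        rewards[state] = float(head[1])  # ASSUMPTION: second field is the state's reward
        actions = {}
        for action, nxt, prob in re.findall(r"\((\S+)\s+(\S+)\s+([^\s)]+)\)", line):
            actions.setdefault(action, []).append((nxt, float(prob)))
        transitions[state] = actions
    return rewards, transitions

def expected_value(outcomes, values):
    """Expected value of the next state for one action's (next_state, prob) list."""
    return sum(prob * values[nxt] for nxt, prob in outcomes)

def value_iteration(rewards, transitions, gamma=0.9, tol=1e-9, max_iters=10000):
    """Repeat the Bellman update until the largest change falls below tol.

    ASSUMPTION: gamma=0.9 and the reward-on-state form of the update are
    illustrative choices; the visible assignment text specifies neither.
    """
    values = {s: 0.0 for s in rewards}
    for _ in range(max_iters):
        new_values = {
            s: rewards[s] + gamma * max(
                expected_value(outs, values) for outs in transitions[s].values()
            )
            for s in rewards
        }
        delta = max(abs(new_values[s] - values[s]) for s in rewards)
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {}
    for s in rewards:
        policy[s] = max(
            transitions[s],
            key=lambda a: expected_value(transitions[s][a], values),
        )
    return values, policy

rewards, transitions = parse_mdp(SAMPLE)
values, policy = value_iteration(rewards, transitions)
for s in sorted(values):
    print(s, policy[s], round(values[s], 4))
```

With a discount factor below 1 the update is a contraction, so the loop should terminate well before max_iters; a different discount factor or reward convention would change both the values and possibly the policy.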

This note was uploaded on 11/03/2010 for the course COMPUTER S CS6375 taught by Professor Vincent Ng during the Fall '10 term at University of Texas at Dallas, Richardson.
