L2

ALP
• Ignores policy
• Potentially large error
• Weak PL guarantees: L1

ABP
• Concurrently optimizes value and policy
• Strong PL guarantees: L∞
• Approximate algorithm for solving ABP ≈ convergent version of API

Value Error   Policy Error   L∞ norm...
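The contrast between L1 and L∞ guarantees in the slide text can be illustrated with a minimal sketch. It is not taken from the slides: the three-state values, the weights, and the helper names `weighted_l1` and `l_inf` are all made up for illustration. It only shows the general fact that a small weighted L1 error can hide a large error in a single state, while the L∞ norm reports the worst state.

```python
# Hypothetical 3-state example; every number here is an illustrative assumption.
v_true = [10.0, 8.0, 6.0]      # assumed exact state values
v_approx = [10.1, 8.1, 3.0]    # assumed approximation, poor in the last state
weights = [0.495, 0.495, 0.01] # assumed state weights; the last state is rarely weighted

# Per-state absolute value error.
errors = [abs(a - t) for a, t in zip(v_approx, v_true)]


def weighted_l1(errs, w):
    """Weighted L1 norm: sum of w_i * |e_i| over all states."""
    return sum(wi * ei for wi, ei in zip(w, errs))


def l_inf(errs):
    """L-infinity norm: the largest per-state error."""
    return max(errs)


l1_error = weighted_l1(errors, weights)
linf_error = l_inf(errors)

print(f"weighted L1 error: {l1_error:.3f}")
print(f"L-infinity error:  {linf_error:.3f}")
```

Under these assumed numbers the weighted L1 error is small even though one state is off by a wide margin, which is one way to read why a guarantee stated in the L∞ norm is described as stronger than one stated in L1.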
This note was uploaded on 02/12/2010 for the course COMPUTER S 10586 taught by Professor Jilinwang during the Fall '09 term at Zhejiang University.