b) A simple reflex agent is sufficient.
   a. The agent hits unless the world state is such that a hit would cause a bust (that is, it hits unless the current total plus the peeked card is greater than 21).
   b. No. The best action (for achieving target one) can always be uniquely determined from the current percept, so keeping a model neither helps nor changes the action selected.
   c. Any change to the task environment that would require remembering information from past percepts (that is, keeping a model) is acceptable as an answer here.
c) Any pseudocode that expresses something like this is acceptable:
   if (CalcSum(agent) + next card) <= 21 then return "hit"/true else return "stand"/false
d) Yes. The agent knows how the dealer plays
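
The hit/stand rule for the simple reflex agent can also be written as a short runnable sketch. The version below is one possible Python rendering, not the required answer. It assumes the hand is given as a list of integer card values with aces already valued, and the names calc_sum and reflex_agent are hypothetical stand-ins for CalcSum and the agent's decision function.

```python
from typing import List


def calc_sum(hand: List[int]) -> int:
    """Total of the agent's current cards (assumes each card is already an integer value)."""
    return sum(hand)


def reflex_agent(hand: List[int], peeked_card: int) -> str:
    """Simple reflex rule: decide from the current percept only, with no stored model.

    Hit when taking the peeked card keeps the total at or below 21, otherwise stand.
    """
    if calc_sum(hand) + peeked_card <= 21:
        return "hit"
    return "stand"


if __name__ == "__main__":
    print(reflex_agent([10, 5], 6))  # total would be exactly 21
    print(reflex_agent([10, 5], 7))  # total would be 22, a bust
```

The function reads nothing except its two arguments, which is the sense in which the agent needs no memory of past percepts.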
