Depending on the nature of the interface mappings and whether the environment receives actions from sources other than the learner, there are four basic types of learning from the environment. It is important to identify these types because different learning methods may only work for different types of configurations.

2.3.1 Transparent Environment

In the first type of learning from the environment, the environment has no hidden states and the perception interface mapping ψ is one-to-one. We call this type of learning “learning from a transparent environment” because the learner can clearly see the inside of the environment. Learning from a transparent environment is not always easy. The size of the environment may be very large, and there may be too much for the learner to see. In order to act effectively in this type of environment, a compact, high-level model must be abstracted from the environment.

2.3.2 Translucent Environment

In the second type of learning from the environment, the environment has hidden states or the perception interface mapping ψ is many-to-one. We call this type of learning “learning from a translucent environment” because the learner cannot see
all the states of the environment at once (but can infer them through experience). Clearly, learning from a translucent environment is more difficult than learning from a transparent environment. The learner is bound to experience nondeterminism and it must look into the history of its actions and observations in order to construct appropriate hidden model states.

2.3.3 Uncertain Environment

In the third type of learning from the environment, the interface mappings are noisy. This means that the observations of the learner do not truly reflect the outputs of the environment, and the actions of the learner are not delivered to the environment undisturbed. This kind of learning occurs frequently in the real world because the sensors and the effectors (devices that carry out actions) cannot be built so that they are completely precise. We call this type of learning “learning from uncertain (or noisy) environments.” To learn from such environments, the learner must abstract models that can filter out the noise or that best estimate the uncertainty.

2.3.4 Semicontrollable Environment

In the fourth type of learning from the environment, the environment receives actions from sources other than the learner. From the learner’s point of view, the action interface mapping η seems to be a one-to-many mapping and the environment changes its outputs by itself. We call this type of learning “learning from a semicontrollable environment” because the environment receives actions that the learner cannot control. An extreme of this type of learning occurs when the learner’s actions have no effect at all on the behavior of the environment yet the output of the environment keeps changing. The learner cannot control the environment but observes the changes of the outputs.
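As a rough illustration of why these distinctions matter, the following Python sketch builds a hypothetical four-state environment and records which next observations follow each (observation, action) pair. Everything named here (the states, the actions "step" and "stay", the functions psi_transparent and psi_translucent, and the external_period parameter) is an assumption made for illustration, not part of the formalism in the text. With a one-to-one ψ the learner's experience looks deterministic; with a many-to-one ψ, or when another source also acts on the environment, the same pair can lead to different outcomes. Noisy mappings (the uncertain case) are not modeled.

```python
# Illustrative sketch only: a hypothetical environment with states 0..3.
# The action "step" advances the state cyclically; "stay" leaves it unchanged.

def transition(state, action):
    """Deterministic environment dynamics (an assumption of this sketch)."""
    return (state + 1) % 4 if action == "step" else state

def psi_transparent(state):
    """One-to-one perception mapping: every state has its own observation."""
    return f"s{state}"

def psi_translucent(state):
    """Many-to-one perception mapping: states 0, 1, 2 all look 'dark'."""
    return "light" if state == 3 else "dark"

def observed_outcomes(psi, external_period=0, steps=24):
    """Map each (observation, action) pair to the set of next observations.

    If external_period > 0, a second source also applies "step" every
    external_period ticks, imitating a semicontrollable environment.
    """
    state = 0
    outcomes = {}
    for t in range(steps):
        action = "step" if t % 2 == 0 else "stay"  # fixed learner policy
        before = psi(state)
        state = transition(state, action)
        if external_period and t % external_period == 0:
            state = transition(state, "step")      # action from another source
        outcomes.setdefault((before, action), set()).add(psi(state))
    return outcomes

def looks_deterministic(outcomes):
    """True if every (observation, action) pair had exactly one outcome."""
    return all(len(nexts) == 1 for nexts in outcomes.values())

if __name__ == "__main__":
    print(looks_deterministic(observed_outcomes(psi_transparent)))   # True
    print(looks_deterministic(observed_outcomes(psi_translucent)))   # False
    print(looks_deterministic(
        observed_outcomes(psi_transparent, external_period=3)))      # False
```

In the translucent run, the pair ("dark", "step") is followed by both "dark" and "light", which is the apparent nondeterminism that the learner would have to resolve by consulting its history of actions and observations.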
We call learning in this extreme case “learning from an observation-only environment.” In general, we will refer to the type of environment that changes without actions from the learner as a “time-variant environment.” The reasons for the changes in the environment vary, but these reasons have no significance from the learner’s point of view. Learning from this type of environment is even more difficult. The learner must model not only