CS 165A Artificial Intelligence, Winter 2011. Assignment #1, due Thursday, Jan 20, before class.
Notes:
Be sure to re-read the Policy on Academic Integrity on the course syllabus. Any updates or corrections will be posted on the
make a trip to a new destination, the taxi might take a while to consult
its map and plan the best route. But the next time a similar trip is
requested, the SPEEDUP LEARNING planning process should be much
faster. This is called speedup learning, and is d
public policy involves both millions of dollars and life and death. For
example, in deciding what levels of a carcinogenic substance to allow
into the environment, policy makers must weigh the prevention of deaths
against the economic hardship that might
17.1, find all the threshold values for the cost of a step, such that the
optimal policy changes when the threshold is crossed.

17.5 Prove that the calculations in the prediction and estimation phases
of the basic decision cycle (Equations (17.8) and (17.
In both of these problems, the reasoning is diagnostic. But belief
networks are not limited to diagnostic reasoning and in fact can make
four distinct kinds of inference:
- Diagnostic inferences (from effects to
to generate a large number of concrete models of the domain that are
consistent with the network distribution. They give an approximation of
the exact evaluation. In the general case, exact inference in belief
networks is known to be NP-hard. It is fairly
John calling. Over the course of 1000 days, we expect one burglary, for
which John is very likely to call. However, John also calls with
probability 0.05 when there actually is no alarm, about 50 times over
1000 days. Thus, we expect to receive about 50 fal
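The expected-count arithmetic in this passage can be checked with a short sketch; the 0.05 call probability and the 1000-day window are taken from the text:

```python
# Expected number of calls over 1000 days, using the probability
# quoted in the text: P(John calls | no alarm) = 0.05.
days = 1000
p_false_call = 0.05          # John calls even though there is no alarm

expected_false_calls = days * p_false_call
print(expected_false_calls)  # -> 50.0
```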
polytrees. Each simple network has one or more
variables instantiated to a definite value. P(X|E) is computed as a
weighted average over the values computed by each polytree. A set of
variab
within a well-understood reasoning system. Work on such a language is
one of the most important topics in knowledge representation research,
and some progress has been made recently (Bacchus, 1990; Bacchus
et al., 1992). Overall, the potential payoff of comb
closely related to the general computational technique of dynamic
programming. Slightly more complex methods are needed to handle
the case where the length of the action sequence is unbounded. We
briefly discussed the use of system gain and discounting. S
Boolean nodes are replaced by a meganode that takes on four possible
values: TT, TF, FT, and FF. The meganode has only one parent, the
Boolean variable Cloudy, so there are two conditioning cases. Once the
network has been converted to a polytree, a linea
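The meganode construction described above can be sketched directly. Since the two Boolean nodes are conditionally independent given their shared parent Cloudy, each meganode entry is a product of the component probabilities; the node names and CPT numbers below are illustrative assumptions, not taken from the text:

```python
# Combine two Boolean nodes (here called Sprinkler and Rain) that share
# the parent Cloudy into one meganode with values TT, TF, FT, FF.
# Hypothetical CPTs, keyed by the value of Cloudy:
p_sprinkler = {True: 0.1, False: 0.5}   # P(Sprinkler=T | Cloudy)
p_rain      = {True: 0.8, False: 0.2}   # P(Rain=T      | Cloudy)

meganode_cpt = {}
for cloudy in (True, False):            # the two conditioning cases
    for s in (True, False):
        for r in (True, False):
            ps = p_sprinkler[cloudy] if s else 1 - p_sprinkler[cloudy]
            pr = p_rain[cloudy]      if r else 1 - p_rain[cloudy]
            label = ("T" if s else "F") + ("T" if r else "F")
            meganode_cpt[(cloudy, label)] = ps * pr

# Each row of the meganode's CPT still sums to 1.
for cloudy in (True, False):
    row = sum(meganode_cpt[(cloudy, v)] for v in ("TT", "TF", "FT", "FF"))
    assert abs(row - 1.0) < 1e-9
```

The product form is what makes the conversion cheap: the meganode's CPT is never larger than the product of the component tables.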
measures the probability that the cost is less than or equal to any given
amount; that is, it integrates the original distribution. If the cumulative
distribution for S1 is always to the right of the cumulative distribution for
S2, then stochastically spea
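A dominance check of this kind can be sketched for discrete cost distributions. The convention used below (a distribution whose CDF is everywhere at least as high tends to have lower cost) and the two example distributions are assumptions for illustration:

```python
# First-order stochastic dominance between two discrete cost
# distributions, given as {cost: probability} maps (hypothetical data).
def cdf(dist, x):
    """P(cost <= x): the cumulative distribution evaluated at x."""
    return sum(p for c, p in dist.items() if c <= x)

def dominates(d1, d2):
    """True if d1's CDF is everywhere >= d2's CDF, i.e. d1's CDF
    never lies to the right of d2's (d1 tends to cost less)."""
    points = sorted(set(d1) | set(d2))
    return all(cdf(d1, x) >= cdf(d2, x) for x in points)

s1 = {2: 0.5, 4: 0.5}        # cheaper plan
s2 = {3: 0.5, 6: 0.5}        # costlier plan
print(dominates(s1, s2))     # -> True
```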
another rule, namely, that M generally is effective against D. Given this
rule, and the student's prior knowledge, the student can now explain why
the expert prescribes M in this particular case. We can generalize this
example to come up with the entailme
operationality of each subgoal in the rule. A subgoal is operational,
roughly speaking, if it is "easy" to solve. For example, the subgoal
Primitive(z) is easy to solve, requiring at most two steps, whereas the
subgoal Simplify(y + z, w) could lead to an
candidates for the performance element. Decision-tree learning
algorithms that provide real-valued output can also be used (see, for
example, Quinlan's (1993) model trees), but cannot use
the gradient descent method. The formulation of the induc
Nationality(x, n) ∧ Nationality(y, n) ∧ Language(x, l) ⇒ Language(y, l)   (21.4)
(Literal translation: "If x and y have a common nationality n and x speaks
language l, then y also speaks it.") It is not difficult to show that, given
this sentence and the observation National
proof. After such an experience, we would like the program to solve the
same problem much more quickly the next time. The
technique of memoization has long been used in computer science to
speed up programs by saving the results of computation
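A minimal sketch of the memoization technique described here, using a cache keyed by the function's arguments (the Fibonacci example is chosen only because its naive recursion repeats subproblems heavily):

```python
# Memoization: save each result so repeated subproblems are solved once.
def memoize(f):
    cache = {}
    def wrapper(*args):
        if args not in cache:
            cache[args] = f(*args)   # compute once, then reuse
        return cache[args]
    return wrapper

@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))  # -> 102334155, computed in linear rather than
                #    exponential time thanks to the cache
```

The same idea appears below as "memoizing" the forward-chained belief-network computations: work done once is stored and looked up rather than redone.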
visited. That is, the most important aspect of an implicit representation is
not that it takes up less space, but that it allows for inductive
generalization over input states. For this reason, methods that learn such
representations are said to perform i
whose left-hand side consists of the leaves of the proof tree, and whose
right-hand side is the variabilized goal (after applying the necessary
bindings from the generalized proof). 4. Drop any conditions that are
true regardless of the values of the vari
But the same generalization would be forthcoming from a traveller
entirely ignorant of colonial history. The relevant prior knowledge in this
case is that, within any given country, most people tend to speak
neural network would in some cases have to be exponentially larger in
order to represent the same input/output mapping as a belief network
(else we would be able to solve hard problems in polynomial time).
Practically speaking, any neural network that can
identify the necessary conditions for those same steps to apply to another
case. We will use for our reasoning system the simple backward-chaining
theorem prover described in Chapter 9. The proof tree for
Derivative(X², X) = 2X is too large to use as an exa
knowledge as well as with the new observations, the effective hypothesis
space size is reduced to include only those theories that are consistent
with what is already known. 2. For any given set of observations, the
size of the hypothesis required to cons
general rules from individual observations. As an example, consider the
problem of differentiating and simplifying algebraic expressions
(Exercise 10.4). If we differentiate an expression such as X² with
respect to X, we obtain 2X. (Notice that we use a c
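The X² example can be sketched with a tiny symbolic differentiator. The tuple-based expression encoding, the power-rule-only scope, and the small simplifier are all assumptions made for illustration, not the representation used by Exercise 10.4:

```python
# A minimal symbolic differentiator covering only the power rule.
# Expressions are tuples: ('pow', var, n) means var**n, and
# ('mul', k, e) means k * e.
def derivative(expr, var):
    op = expr[0]
    if op == 'pow' and expr[1] == var:        # d/dx x^n = n * x^(n-1)
        n = expr[2]
        return ('mul', n, ('pow', var, n - 1))
    raise NotImplementedError("only the power rule is sketched here")

def simplify(expr):
    # x^1 -> x: the kind of cleanup a simplifier would perform
    if expr[0] == 'mul' and expr[2][0] == 'pow' and expr[2][2] == 1:
        return ('mul', expr[1], expr[2][1])
    return expr

d = simplify(derivative(('pow', 'X', 2), 'X'))
print(d)  # -> ('mul', 2, 'X'), i.e. 2X
```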
attribute is defined in such a way that, all other things being equal,
higher values of the attribute correspond to higher utilities. For example,
if we choose as an attribute in the airport problem AbsenceOfNoise, then
the greater its value, the better t
previously considered to be the entire agent: it takes in percepts and
decides on actions. The learning element takes some knowledge about
the learning element and some feedback on how the agent is doing, and
determines how the performance element should
because the row must sum to 1. In Figure 15.9, we dropped one of the
two columns, but here we show all four.) The tricky part about clustering
is choosing the right meganodes. There are several ways to make this
choice, but all of them ultimately produce
and the first round is done. To estimate P(WetGrass|Cloudy) (or, in
general, P(X|E)), we repeat the process many times, and then compute
the ratio of the number of runs where WetGrass and Cloudy are true to
the number of runs where just Cloudy is true. This
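The repeat-and-count procedure above can be sketched as rejection sampling on a small sprinkler-style network. The network topology and CPT numbers below are illustrative assumptions, not the figures from the text:

```python
import random

# Rejection-sampling estimate of P(WetGrass=T | Cloudy=T) in a tiny
# hypothetical network: Cloudy -> Sprinkler, Cloudy -> Rain,
# (Sprinkler or Rain) -> WetGrass.
def sample():
    cloudy    = random.random() < 0.5
    sprinkler = random.random() < (0.1 if cloudy else 0.5)
    rain      = random.random() < (0.8 if cloudy else 0.2)
    wet       = random.random() < (0.9 if (sprinkler or rain) else 0.0)
    return cloudy, wet

def estimate(runs=100_000):
    consistent = wet_and_consistent = 0
    for _ in range(runs):
        cloudy, wet = sample()
        if cloudy:                       # keep only runs matching evidence
            consistent += 1
            if wet:
                wet_and_consistent += 1
    return wet_and_consistent / consistent  # ratio described in the text

print(round(estimate(), 2))  # close to 0.74 for these CPTs
```

Runs where the evidence (Cloudy = true) does not hold are simply discarded, which is exactly why this scheme becomes wasteful when the evidence is unlikely.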
is by no means the general case. Most human learning takes place in the
context of a good deal of background knowledge. Some psychologists
and linguists claim that even newborn babies exhibit knowledge of the
world. Whatever the truth of this claim, there
involve many repeated calculations. A better way to arrange things is to
"memoize" the computations by forward-chaining from the evidence
variables. Given careful bookkeeping, the entire computation can be
done in linear time. It is interesting to note th
Equation (20.8). 20.10 Extend the standard game-playing environment
(Chapter 5) to incorporate a reward signal. Put two reinforcement
learning agents into the environment (they may of course share the agent
program) and have them play against each other.
"Pandemonium" system (Selfridge, 1959; Selfridge and Neisser, 1960)
involved a relatively fine-grained distributed control regime reminiscent
of neural networks. Cragg and Temperley (1954; 1955) drew parallels
between McCulloch-Pitts neural networks and "sp
Derivative(u², u) = 2u. If the knowledge base contains such a rule, then
any new case that is an instance of this rule can be solved immediately.
This is, of course, merely a trivial example of a very general
phenomenon. Once something is understood, it can
mutations occur, and although most of these mean a quick death for the
mutated individual, some mutations lead to new successful species. The
publication of Darwin's On the Origin of Species by Means of Natural
Selection was a major turning point in the
The only reward signal was given at the end of each game. The
evaluation function was represented by a fully connected neural network
with a single hidden layer containing 40 nodes. Simply by repeated
application of Equation (20.8), TD-gammon learned to p