Principles and Applications of Probabilistic Learning
Padhraic Smyth
Department of Computer Science
University of California, Irvine
www.ics.uci.edu/~smyth

Probabilistic Modeling vs. Function Approximation
Two major themes in machine learning:
1. Function approximation/black box methods
   - e.g., for classification and regression
   - Learn a flexible function y = f(x)
   - e.g., SVMs, decision trees, boosting, etc.
2. Probabilistic learning
   - e.g., for regression, model p(y|x) or p(y,x)
   - e.g., graphical models, mixture models, hidden Markov models, etc.
Both approaches are useful in general
In this tutorial we will focus only on the 2nd approach, probabilistic modeling
Probabilistic Learning Tutorial: P. Smyth, UC Irvine, August 2005
Motivations for Probabilistic Modeling
- leverage prior knowledge
- generalize beyond data analysis in vector-spaces
- handle missing data
- combine multiple types of information into an analysis
- generate calibrated probability outputs
- quantify uncertainty about parameters, models, and predictions in a statistical manner
[Figure: a probabilistic model generates real-world data through P(Data | Parameters) (generative model); inference runs in the opposite direction, from data back to the model, through P(Parameters | Data)]
Outline
1. Review of probability
2. Graphical models
3. Connecting probability models to data
4. Models with hidden variables
5. Case studies
   (i) Simulating and forecasting rainfall data
   (ii) Curve clustering with cyclone trajectories
   (iii) Topic modeling from text documents
Part 1: Review of Probability
Notation and Definitions
X is a random variable
- Lower-case x is some possible value for X
- X = x is a logical proposition: that X takes value x
- There is uncertainty about the value of X, e.g., X is the Dow Jones index at 5pm tomorrow
p(X = x) is the probability that proposition X = x is true
- often shortened to p(x)
If the set of possible x's is finite, we have a probability distribution and \sum_x p(x) = 1
If the set of possible x's is continuous, p(x) is a density function, and p(x) integrates to 1 over the range of X

Example
Let X be the Dow Jones Index (DJI) at 5pm Monday August 22nd (tomorrow)
X can take real values from 0 to some large number
p(x) is a density representing our uncertainty about X
- This density could be constructed from historical data
After 5pm, p(x) = 1 for some value of x (no uncertainty), once we hear from Wall Street what x is
Probability as Degree of Belief
Different agents can have different p(x)'s
- Your p(x) and the p(x) of a Wall Street expert might be quite different
- OR: if we were on vacation we might not have access to stock market information; we would still be uncertain about p(x) after 5pm
So we should really think of p(x) as p(x | B_I)
- where B_I is background information available to agent I
- (will drop explicit conditioning on B_I in notation)
Thus, p(x) represents the degree of belief that agent I has in proposition x, conditioned on available background information

Comments on Degree of Belief
Different agents can have different probability models
- There is no necessarily correct p(x)
- Why? Because p(x) is a model built on whatever assumptions or background information we use
- Naturally leads to the notion of updating: p(x | B_I) -> p(x | B_I, C_I)
This is the subjective Bayesian interpretation of probability
- Generalizes other interpretations (such as frequentist)
- Can be used in cases where frequentist reasoning is not applicable
- We will use degree of belief as our interpretation of p(x) in this tutorial
Note!
- Degree of belief is just our semantic interpretation of p(x)
- The mathematics of probability (e.g., Bayes rule) remain the same regardless of our semantic interpretation
Multiple Variables
p(x, y, z)
- Probability that X = x AND Y = y AND Z = z
- Possible values: cross-product of X, Y, Z
e.g., X, Y, Z each take 10 possible values
- x, y, z can take 10^3 possible values
- p(x, y, z) is a 3-dimensional array/table
- Defines 10^3 probabilities
- Note the exponential increase as we add more variables
e.g., X, Y, Z are all real-valued
- x, y, z live in a 3-dimensional vector space
- p(x, y, z) is a non-negative function defined over this space, integrates to 1

Conditional Probability
p(x | y, z)
- Probability of x given that Y = y and Z = z
- Could be hypothetical, e.g., "if Y = y and if Z = z"
- Could be observational, e.g., we observed values y and z
- can also have p(x, y | z), etc.
- "all probabilities are conditional probabilities"
Computing conditional probabilities is the basis of many prediction and learning problems, e.g.,
- p(DJI tomorrow | DJI index last week)
- expected value of [DJI tomorrow | DJI index last week]
- most likely value of a parameter given observed data
Computing Conditional Probabilities
Variables A, B, C, D
- All distributions of interest related to A, B, C, D can be computed from the full joint distribution p(a, b, c, d)
Examples, using the Law of Total Probability
- p(a) = \sum_{b,c,d} p(a, b, c, d)
- p(c, d) = \sum_{a,b} p(a, b, c, d)
- p(a, c | d) = \sum_{b} p(a, b, c | d), where p(a, b, c | d) = p(a, b, c, d) / p(d)
These are standard probability manipulations: however, we will see how to use these to make inferences about parameters and unobserved variables, given data

Conditional Independence
A is conditionally independent of B given C iff
p(a | b, c) = p(a | c)
(also implies that B is conditionally independent of A given C)
In words, B provides no information about A, if value of C is known
Example:
- a = patient has upset stomach
- b = patient has headache
- c = patient has flu
Note that conditional independence does not imply marginal independence
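The sums in the Law of Total Probability examples can be sketched directly on a small table. This is a minimal illustration, assuming four binary variables and an arbitrary made-up joint table; the names `joint`, `p_a`, `p_cd`, `p_d` and `p_ac_given_d` are hypothetical and do not come from the tutorial.

```python
import itertools

# Hypothetical joint p(a, b, c, d) over four binary variables, stored as a dict.
# The weights are arbitrary illustrative numbers, normalized to sum to 1.
values = (0, 1)
weights = {abcd: 1 + abcd[0] + 2 * abcd[1] * abcd[2] + abcd[3]
           for abcd in itertools.product(values, repeat=4)}
total = sum(weights.values())
joint = {abcd: w / total for abcd, w in weights.items()}

def p_a(a):
    """p(a) = sum over b, c, d of p(a, b, c, d)."""
    return sum(joint[(a, b, c, d)]
               for b, c, d in itertools.product(values, repeat=3))

def p_cd(c, d):
    """p(c, d) = sum over a, b of p(a, b, c, d)."""
    return sum(joint[(a, b, c, d)]
               for a, b in itertools.product(values, repeat=2))

def p_d(d):
    """p(d) = sum over a, b, c of p(a, b, c, d)."""
    return sum(joint[(a, b, c, d)]
               for a, b, c in itertools.product(values, repeat=3))

def p_ac_given_d(a, c, d):
    """p(a, c | d) = sum over b of p(a, b, c, d) / p(d)."""
    return sum(joint[(a, b, c, d)] for b in values) / p_d(d)

if __name__ == "__main__":
    print([p_a(a) for a in values])
    print(p_ac_given_d(1, 0, 1))
```

Each function sums over every combination of the variables being removed, which is why the cost grows exponentially with the number of variables summed over.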
Two Practical Problems
(Assume for simplicity each variable takes K values)
Problem 1: Computational Complexity
- Conditional probability computations scale as O(K^N), where N is the number of variables being summed over
Problem 2: Model Specification
- To specify a joint distribution we need a table of O(K^N) numbers
- Where do these numbers come from?

Two Key Ideas
Problem 1: Computational Complexity
- Idea: Graphical models
- Structured probability models lead to tractable inference
Problem 2: Model Specification
- Idea: Probabilistic learning
- General principles for learning from data
Part 2: Graphical Models
"Probability theory is more fundamentally concerned with the structure of reasoning and causation than with numbers."
Glenn Shafer and Judea Pearl, Introduction to Readings in Uncertain Reasoning, Morgan Kaufmann, 1990
Graphical Models
Represent dependency structure with a directed graph
- Node <-> random variable
- Edges encode dependencies
- Absence of edge -> conditional independence
- Directed and undirected versions
Why is this useful?
- A language for communication
- A language for computation
Origins:
- Wright 1920s
- Independently developed by Spiegelhalter and Lauritzen in statistics and Pearl in computer science in the late 1980s

Examples of 3-way Graphical Models
[Graph: nodes A, B, C with no edges]
Marginal Independence: p(A,B,C) = p(A) p(B) p(C)
Examples of 3-way Graphical Models
[Graph: A -> B, A -> C]
Conditionally independent effects: p(A,B,C) = p(B|A) p(C|A) p(A)
B and C are conditionally independent given A
e.g., A is a disease, and we model B and C as conditionally independent symptoms given A

Examples of 3-way Graphical Models
[Graph: A -> C, B -> C]
Independent Causes: p(A,B,C) = p(C|A,B) p(A) p(B)
Examples of 3-way Graphical Models
[Graph: A -> B -> C]
Markov dependence: p(A,B,C) = p(C|B) p(B|A) p(A)

Real-World Example
Monitoring Intensive-Care Patients
- 37 variables
- 509 parameters instead of 2^37
[Figure: directed graph over the 37 patient-monitoring variables, with nodes such as PULMEMBOLUS, INTUBATION, KINKEDTUBE, VENTLUNG, HYPOVOLEMIA, LVFAILURE, HR, and BP (figure courtesy of Kevin Murphy/Nir Friedman)]
Directed Graphical Models
[Graph: A -> C, B -> C]
p(A,B,C) = p(C|A,B) p(A) p(B)
In general,
p(X_1, X_2, ..., X_N) = \prod_{i=1}^{N} p(X_i | parents(X_i))
- Probability model has simple factored form
- Directed edges => direct dependence
- Absence of an edge => conditional independence
- Also known as belief networks, Bayesian networks, causal networks

Example
[Graph over A, B, C, D, E, F, G: D -> B, D -> E, B -> A, B -> C, E -> F, E -> G]
Example
[Graph: D -> B, D -> E, B -> A, B -> C, E -> F, E -> G]
p(A, B, C, D, E, F, G) = \prod p(variable | parents)
= p(A|B) p(C|B) p(B|D) p(F|E) p(G|E) p(E|D) p(D)

Example
[Same graph, with C = c and G = g observed]
Say we want to compute p(a | c, g)
Example
[Same graph, with C = c and G = g observed]
Direct calculation: p(a|c,g) = \sum_{b,d,e,f} p(a, b, d, e, f | c, g)
Complexity of the sum is O(K^4)

Example
Reordering (using factorization):
\sum_b p(a|b) \sum_d p(b|d,c) \sum_e p(d|e) \sum_f p(e,f|g)
Example
Reordering:
\sum_b p(a|b) \sum_d p(b|d,c) \sum_e p(d|e) \sum_f p(e,f|g)
The innermost sum is \sum_f p(e,f|g) = p(e|g)

Example
Reordering:
\sum_b p(a|b) \sum_d p(b|d,c) \sum_e p(d|e) p(e|g)
The innermost sum is \sum_e p(d|e) p(e|g) = p(d|g)
Example
Reordering:
\sum_b p(a|b) \sum_d p(b|d,c) p(d|g)
The innermost sum is \sum_d p(b|d,c) p(d|g) = p(b|c,g)

Example
Reordering:
\sum_b p(a|b) p(b|c,g) = p(a|c,g)
Complexity is O(K), compared to O(K^4)
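The saving from reordering the sums can be checked numerically. The sketch below assumes the seven-variable network with factorization p(A|B) p(C|B) p(B|D) p(F|E) p(G|E) p(E|D) p(D), fills it with random conditional probability tables (hypothetical values, K = 3), and compares the brute-force sum over b, d, e, f with a version that pushes each sum inward. Unlike the conditional shorthand on the slides, this version works with unnormalized products of the model's factors and normalizes once at the end; all function names are illustrative.

```python
import itertools
import random

K = 3  # number of values per variable (assumed for illustration)

def random_cpt(n_parents, rng):
    """Map each tuple of parent values to a list of K probabilities."""
    table = {}
    for pa in itertools.product(range(K), repeat=n_parents):
        w = [rng.random() + 0.1 for _ in range(K)]
        s = sum(w)
        table[pa] = [x / s for x in w]
    return table

rng = random.Random(0)
pD = random_cpt(0, rng)[()]   # p(d)
pB = random_cpt(1, rng)       # p(b | d)
pE = random_cpt(1, rng)       # p(e | d)
pA = random_cpt(1, rng)       # p(a | b)
pC = random_cpt(1, rng)       # p(c | b)
pF = random_cpt(1, rng)       # p(f | e)
pG = random_cpt(1, rng)       # p(g | e)

def joint(a, b, c, d, e, f, g):
    """Factored joint: product of p(variable | parents)."""
    return (pD[d] * pB[(d,)][b] * pE[(d,)][e] * pA[(b,)][a]
            * pC[(b,)][c] * pF[(e,)][f] * pG[(e,)][g])

def brute_force(c, g):
    """p(a | c, g) by summing the joint over b, d, e, f: K^4 terms per a."""
    unnorm = [sum(joint(a, b, c, d, e, f, g)
                  for b, d, e, f in itertools.product(range(K), repeat=4))
              for a in range(K)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def eliminate(c, g):
    """Same quantity with the sums pushed inward, one variable at a time."""
    # sum_f p(f | e) = 1, so F drops out entirely.
    m_d = [sum(pE[(d,)][e] * pG[(e,)][g] for e in range(K)) for d in range(K)]
    m_b = [sum(pD[d] * pB[(d,)][b] * m_d[d] for d in range(K)) for b in range(K)]
    unnorm = [sum(pA[(b,)][a] * pC[(b,)][c] * m_b[b] for b in range(K))
              for a in range(K)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

if __name__ == "__main__":
    print(brute_force(0, 1))
    print(eliminate(0, 1))
```

Each intermediate list (`m_d`, `m_b`) plays the role of a message: it summarizes everything on one side of the graph as a function of a single variable.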
A More General Algorithm
Message Passing (MP) Algorithm
- Pearl, 1988; Lauritzen and Spiegelhalter, 1988
- Declare 1 node (any node) to be a root
- Schedule two phases of message-passing
  - nodes pass messages up to the root
  - messages are distributed back to the leaves
- In time O(N), we can compute P(.)

Sketch of the MP algorithm in action
[Figure sequence: messages numbered 1 through 4 are passed in order along the edges of the example graph]
Complexity of the MP Algorithm
Efficient
- Complexity scales as O(N K^m)
  - N = number of variables
  - K = arity of variables
  - m = maximum number of parents for any node
- Compare to O(K^N) for brute-force method

Graphs with loops
[Graph over A, B, C, D, E, F, G containing a loop]
Message passing algorithm does not work when there are multiple paths between 2 nodes
Graphs with loops
[Graph over A, B, C, D, E, F, G containing a loop]
General approach: cluster variables together to convert graph to a tree

Reduce to a Tree
[Graph: B and E merged into a single cluster node (B, E); remaining nodes D, A, C, F, G]
Reduce to a Tree
[Graph: B and E merged into a single cluster node (B, E); remaining nodes D, A, C, F, G]
Good news: can perform MP algorithm on this tree
Bad news: complexity is now O(K^2)

Probability Calculations on Graphs
Structure of the graph reveals
- Computational strategy
- Dependency relations
Complexity is typically O(K^{max(number of parents)})
- If single parents (e.g., tree), -> O(K)
- The sparser the graph the lower the complexity
Technique can be automated
- i.e., a fully general algorithm for arbitrary graphs
- For continuous variables: replace sum with integral
- For identification of most likely values: replace sum with max operator
Hidden Markov Model (HMM)
[Graph: hidden chain S_1 -> S_2 -> S_3 -> ... -> S_n, with each hidden S_t -> observed Y_t]
Two key assumptions:
1. hidden state sequence is Markov
2. observation Y_t is CI of all other variables given S_t
Widely used in speech recognition, protein sequence models
Motivation: switching dynamics, low-d representation of Ys, etc.

HMMs as graphical models
Computations of interest
- p(Y) = \sum_s p(Y, S = s) -> forward-backward algorithm
- arg max_s p(S = s | Y) -> Viterbi algorithm
Both algorithms
- computation time linear in T (the sequence length)
- special cases of MP algorithm
Many generalizations and extensions
- Make state S continuous -> Kalman filters
- Add inputs -> convolutional decoding
- Add additional dependencies in the model
- Generalized HMMs
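The first computation of interest, p(Y) = \sum_s p(Y, S = s), can be sketched with the forward half of the forward-backward algorithm. This is a minimal illustration assuming discrete observations and small hand-picked tables; `init`, `trans`, `emit` and the function names are hypothetical, not taken from the tutorial. A brute-force sum over all state sequences is included only to check the recursion.

```python
import itertools

def forward_likelihood(obs, init, trans, emit):
    """p(Y) via the forward recursion; cost is linear in the sequence length."""
    K = len(init)
    # alpha[s] = p(y_1..y_t, S_t = s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(K)]
    for y in obs[1:]:
        alpha = [emit[s][y] * sum(alpha[r] * trans[r][s] for r in range(K))
                 for s in range(K)]
    return sum(alpha)

def brute_force_likelihood(obs, init, trans, emit):
    """p(Y) by summing p(Y, S = s) over all K^T hidden state sequences."""
    K = len(init)
    total = 0.0
    for states in itertools.product(range(K), repeat=len(obs)):
        p = init[states[0]] * emit[states[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[states[t - 1]][states[t]] * emit[states[t]][obs[t]]
        total += p
    return total

# Illustrative two-state, two-symbol model (made-up numbers).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]   # trans[r][s] = p(S_t = s | S_{t-1} = r)
emit = [[0.9, 0.1], [0.3, 0.7]]    # emit[s][y] = p(Y_t = y | S_t = s)
obs = [0, 1, 1, 0, 1]

if __name__ == "__main__":
    print(forward_likelihood(obs, init, trans, emit))
    print(brute_force_likelihood(obs, init, trans, emit))
```

Replacing the inner `sum` with `max` (and tracking the maximizing predecessor) would give the Viterbi recursion for the most likely state sequence.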
Part 3: Connecting Probability Models to Data
[Figure: a probabilistic model generates real-world data through P(Data | Parameters) (generative model); inference runs in the opposite direction through P(Parameters | Data)]
Recommended References for this Section:
- All of Statistics, L. Wasserman, Chapman and Hall, 2004 (Chapters 6, 9, 11)
- Pattern Classification and Scene Analysis, 1st ed., R. Duda and P. Hart, Wiley, 1973, Chapter 3
Plate Notation
[Graph: model parameters \theta -> y_i, with y_i inside a plate i = 1:n]
Data = {y_1, ..., y_n}
Plate = rectangle in graphical model
- variables within a plate are replicated in a conditionally independent manner

Example: Gaussian Model
[Graph: \mu and \sigma -> y_i, with y_i inside a plate i = 1:n]
Generative model:
p(y_1, ..., y_n | \mu, \sigma) = \prod_{i=1}^{n} p(y_i | \mu, \sigma)
= p(data | parameters)
= p(D | \theta), where \theta = {\mu, \sigma}
The Likelihood Function
Likelihood = p(data | parameters) = p(D | \theta) = L(\theta)
Likelihood tells us how likely the observed data are conditioned on a particular setting of the parameters
Details
- Constants that do not involve \theta can be dropped in defining L(\theta)
- Often easier to work with log L(\theta)

Comments on the Likelihood Function
- Constructing a likelihood function L(\theta) is the first step in probabilistic modeling
- The likelihood function implicitly assumes an underlying probabilistic model M with parameters \theta
- L(\theta) connects the model to the observed data
- Graphical models provide a useful language for constructing likelihoods
Example: Binomial Likelihood
Example with definition of IID binomial likelihood
Gaussian Model and Likelihood
Model assumptions: 1. ys are conditionally independent given model 2. each y comes from a Gaussian (Normal) density
Plots of likelihood for different data sets
Conditional Independence (CI)
CI in a likelihood model means that we are assuming data points provide no information about each other, if the model parameters are assumed known.
p(D | \theta) = p(y_1, ..., y_N | \theta) = \prod_{i=1}^{N} p(y_i | \theta)   (CI assumption)
Works well for (e.g.)
- Patients randomly arriving at a clinic
- Web surfers randomly arriving at a Web site
Does not work well for
- Time-dependent data (e.g., stock market)
- Spatial data (e.g., pixel correlations)
Example: Markov Likelihood
Motivation: wish to model data in a sequence where there is sequential dependence,
- e.g., a first-order Markov chain for a DNA sequence
Markov modeling assumption: p(y_t | y_{t-1}, y_{t-2}, ..., y_1) = p(y_t | y_{t-1})
\theta = K x K matrix of transition probabilities
L(\theta) = p(D | \theta) = p(y_1, ..., y_N | \theta) = \prod_t p(y_t | y_{t-1}, \theta)

Maximum Likelihood (ML) Principle
(R. Fisher ~ 1922)
[Graph: model parameters \theta -> y_i, with y_i inside a plate i = 1:n]
Data = {y_1, ..., y_n}
L(\theta) = p(Data | \theta) = \prod_i p(y_i | \theta)
Maximum Likelihood: \theta_ML = arg max_\theta { Likelihood(\theta) }
Select the parameters that make the observed data most likely
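For the first-order Markov likelihood, the maximizing transition probabilities are the normalized transition counts (a standard result, stated here rather than derived). A minimal sketch on a made-up DNA-like string follows; `markov_mle` and the example sequence are hypothetical, and the term for the first symbol is ignored.

```python
from collections import Counter

def markov_mle(seq, alphabet):
    """ML transition probabilities T[prev][cur] = count(prev -> cur) / count(prev -> any)."""
    counts = Counter(zip(seq, seq[1:]))  # counts of adjacent pairs
    T = {}
    for prev in alphabet:
        row_total = sum(counts[(prev, cur)] for cur in alphabet)
        # A symbol with no outgoing transitions gets a row of zeros in this sketch.
        T[prev] = {cur: (counts[(prev, cur)] / row_total if row_total else 0.0)
                   for cur in alphabet}
    return T

seq = "AACGTACGAA"          # illustrative sequence, not real data
T = markov_mle(seq, "ACGT")

if __name__ == "__main__":
    for prev, row in T.items():
        print(prev, row)
```

With so few observations many entries are exactly 0 or 1, which is one practical motivation for the prior-based (MAP and Bayesian) estimates discussed later in this part.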
Example: ML for Gaussian Model
[Figure: Gaussian likelihood with the Maximum Likelihood Estimate \theta_ML marked]

Maximizing the Likelihood
More generally, we analytically solve for the \theta value that maximizes the function L(\theta)
- With p parameters, L(\theta) is a scalar function defined over a p-dimensional space
2 situations:
- We can analytically solve for the maxima of L(\theta)
  - This is rare
- We have to resort to iterative techniques to find \theta_ML
  - More common
General approach
- Define a generative probabilistic model
- Define an associated likelihood (connect model to data)
- Solve an optimization problem to find \theta_ML
Analytical Solution for Gaussian Likelihood
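The standard closed-form result for the IID Gaussian model is that \mu_ML is the sample mean and \sigma^2_ML is the average squared deviation from it (dividing by n, not n - 1). The sketch below computes both and evaluates the log-likelihood so the maximum can be checked against nearby parameter values; the data and function names are hypothetical.

```python
import math

def gaussian_log_likelihood(ys, mu, sigma):
    """log L(mu, sigma) for IID Gaussian data."""
    n = len(ys)
    return (-n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi)
            - sum((y - mu) ** 2 for y in ys) / (2 * sigma ** 2))

def gaussian_mle(ys):
    """Closed-form ML estimates: sample mean and (biased) sample standard deviation."""
    n = len(ys)
    mu = sum(ys) / n
    sigma = math.sqrt(sum((y - mu) ** 2 for y in ys) / n)
    return mu, sigma

ys = [2.1, 1.9, 3.2, 2.8, 2.5]   # illustrative data
mu_ml, sigma_ml = gaussian_mle(ys)

if __name__ == "__main__":
    print(mu_ml, sigma_ml)
    print(gaussian_log_likelihood(ys, mu_ml, sigma_ml))
```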

Graphical Model for Regression
[Graph: x_i -> y_i, inside a plate i = 1:n]
Example: ML for Linear Regression
Generative model:
- y = ax + b + Gaussian noise
- p(y) = N(ax + b, \sigma)
Conditional Likelihood
L(\theta) = p(y_1, ..., y_N | x_1, ..., x_N, \theta) = \prod_i p(y_i | x_i, \theta),   \theta = {a, b}
Can show (homework problem!) that, up to constants that do not involve a and b,
log L(\theta) = - \sum_i [y_i - (a x_i + b)]^2
i.e., finding a, b to maximize log-likelihood is the same as finding a, b that minimizes least squares

ML and Regression
Multivariate case
- multiple x's, multiple regression coefficients
- with Gaussian noise, the ML solution is again equivalent to least-squares (solutions to a set of linear equations)
Non-linear multivariate model
- With Gaussian noise we get log L(\theta) = - \sum_i [y_i - f(x_i; \theta)]^2
- Conditions for the \theta that maximizes L(\theta) lead to a set of p nonlinear equations in p variables
- e.g., f(x_i; \theta) = a multilayer neural network with 1000 weights
  - Optimization = finding the maximum of a non-convex function in 1000 dimensional space!
  - Typically use iterative local search based on gradient (many possible variations)
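For the one-dimensional linear model, the least-squares (and hence ML, under Gaussian noise) values of a and b have a well-known closed form. A minimal sketch, with made-up data and hypothetical function names:

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def sum_squared_error(xs, ys, a, b):
    """The quantity whose negative is the log-likelihood, up to constants."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]   # illustrative noisy observations
a, b = fit_line(xs, ys)

if __name__ == "__main__":
    print(a, b, sum_squared_error(xs, ys, a, b))
```

For the non-linear case described above no such closed form exists in general, which is why gradient-based iterative search is used instead.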
The Bayesian Approach to Learning
[Graph: prior -> parameters \theta -> y_i, with y_i inside a plate i = 1:n]
Prior(\theta) = p(\theta | \alpha), where \alpha denotes the parameters of the prior
Maximum A Posteriori: \theta_MAP = arg max_\theta { Likelihood(\theta) x Prior(\theta) }
Fully Bayesian: p(\theta | Data) = p(Data | \theta) p(\theta) / p(Data)

The Bayesian Approach
Fully Bayesian: p(\theta | Data) = p(Data | \theta) p(\theta) / p(Data)
= Likelihood x Prior / Normalization term
Estimating p(\theta | Data) can be viewed as inference in a graphical model
ML is a special case = MAP with a flat prior
More Comments on Bayesian Learning
Fully Bayesian: report full posterior density p(\theta | D)
- For simple models, we can calculate p(\theta | D) analytically
- Otherwise we empirically estimate p(\theta | D)
  - Monte Carlo sampling methods are very useful
Bayesian prediction (e.g., for regression):
p(y | x, D) = \int p(y, \theta | x, D) d\theta
= \int p(y | \theta, x) p(\theta | D) d\theta
-> prediction at each \theta is weighted by p(\theta | D)
[theoretically preferable to picking a single \theta (as in ML)]

More Comments on Bayesian Learning
In practice
- Fully Bayesian is theoretically optimal but not always the most practical approach
  - e.g., computational limitations with large numbers of parameters
  - assessing priors can be tricky
- Bayesian approach particularly useful for small data sets
- For large data sets, Bayesian, MAP, ML tend to agree
- ML/MAP are much simpler => often used in practice
Example of Bayesian Estimation
- Definition of Beta prior
- Definition of Binomial likelihood
- Form of Beta posterior
- Examples of plots with prior + likelihood -> posterior

Example: Bayesian Gaussian Model
[Graph: priors -> parameters -> y_i, with y_i inside a plate i = 1:n]
Note: priors and parameters are assumed independent here
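The Beta prior with a binomial likelihood is the standard conjugate pair: a Beta(alpha, beta) prior combined with k successes in n trials gives a Beta(alpha + k, beta + n - k) posterior. The sketch below uses made-up counts and a made-up prior to compare the ML estimate, the MAP estimate, and the posterior mean; all names and numbers are hypothetical.

```python
def beta_posterior(alpha, beta, successes, trials):
    """Conjugate update: Beta prior + binomial data -> Beta posterior parameters."""
    return alpha + successes, beta + (trials - successes)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) density."""
    return alpha / (alpha + beta)

def beta_mode(alpha, beta):
    """Mode of a Beta(alpha, beta) density; valid when alpha > 1 and beta > 1."""
    return (alpha - 1) / (alpha + beta - 2)

prior = (2.0, 2.0)                      # assumed prior, mildly favoring 0.5
successes, trials = 7, 10               # assumed data
post = beta_posterior(*prior, successes=successes, trials=trials)

theta_ml = successes / trials           # maximum likelihood estimate
theta_map = beta_mode(*post)            # maximum a posteriori estimate
theta_mean = beta_mean(*post)           # posterior mean

if __name__ == "__main__":
    print(post, theta_ml, theta_map, theta_mean)
```

With a flat Beta(1, 1) prior the posterior mode equals the ML estimate, which matches the remark that ML is MAP with a flat prior.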
Example: Bayesian Regression
[Graph: x_i -> y_i, inside a plate i = 1:n]
Model:
y_i = f[x_i; \theta] + e,   e ~ N(0, \sigma^2)
p(y_i | x_i) ~ N( f[x_i; \theta], \sigma^2 )

Other Examples
Bayesian examples
- Bayesian neural networks
Richer probabilistic models
- Random effects models
Learning graphical model structure
- Chow-Liu trees
- General graphical model structures
Learning to align curves
- Alignment of growth curves
Learning Shapes and Shifts
Data = smoothed growth acceleration data from teenagers
EM used to learn a spline model + time-shift for each curve
[Figure panels: Original data; Data after Learning]

Model Uncertainty
How do we know what model M to select for our likelihood function?
- In general, we don't!
- However, we can use the data to help us infer which model from a set of possible models is best
Method 1: Bayesian Approach
Can evaluate the evidence for each model, p(M | D) = p(D | M) p(M) / p(D)
- Can get p(D | M) by integrating p(D, \theta | M) over parameter space (this is the marginal likelihood)
In theory
- p(M | D) is how much evidence exists in the data for model M
- More complex models are automatically penalized because of the integration over higher-dimensional parameter spaces
In practice
- p(M | D) can rarely be computed directly
- Monte Carlo schemes are popular
- Also: approximations such as BIC, Laplace, etc.

Comments on Bayesian Approach
Bayesian Model Averaging (BMA):
- Instead of selecting the single best model, for prediction average over all available models (theoretically the correct thing to do)
- Weights used for averaging are p(M | D)
Empirical alternatives
- e.g., Stacking, Bagging
- Idea is to learn a set of unconstrained combining weights from the data, weights that optimize predictive accuracy
  - emulate BMA approach
  - may be more effective in practice
Method 2: Predictive Validation
Instead of the Bayesian approach, we could use the probability of new unseen test data as our metric for selecting models
E.g., 2 models
- If p(D | M1) > p(D | M2) then M1 is assigning higher probability to new data than M2
- This will (with enough data) select the model that predicts the best, in a probabilistic sense
- Useful for problems where we have very large amounts of data and it is easy to create a large validation data set D

Example of Predictive Validation
Example from Web or text data
[Figure sequence: the data-generating process (truth) shown against a K=1 model class, then against a simple model class and a complex model class, with the closest model in each class measured in terms of KL distance]
Simple model class: best model is relatively far from truth => High Bias
Complex model class: best model is closer to truth => Low Bias
However, a different model in the complex class (marked in the figure) could be the model that best fits the observed data => High Variance
Part 4: Models with Hidden Variables

Hidden or Latent Variables
In many applications there are 2 sets of variables:
- Variables whose values we can directly measure
- Variables that are hidden, cannot be measured
Examples:
- Speech recognition
  - Observed: acoustic voice signal
  - Hidden: label of the word spoken
- Face tracking in images
  - Observed: pixel intensities
  - Hidden: position of the face in the image
- Text modeling
  - Observed: counts of words in a document
  - Hidden: topics that the document is about
Mixture Models
p(Y) = \sum_k p(Y | S = k) p(S = k)
[Graph: S -> Y, where S is a hidden discrete variable and Y is the observed variable(s)]
Motivation:
1. models a true process (e.g., fish example)
2. approximation for a complex process

[Figure sequence: density plots of p(x) versus x showing Component 1 and Component 2, the component models, and the resulting mixture model]
A Graphical Model for Clustering
[Graph: S -> Y_1, ..., Y_j, ..., Y_d]
S: hidden discrete (cluster) variable
Y_1, ..., Y_d: observed variable(s) (assumed conditionally independent given S)
Clusters = p(Y_1, ..., Y_d | S = s)
Probabilistic Clustering = learning these probability distributions from data

Hidden Markov Model (HMM)
[Graph: hidden chain S_1 -> S_2 -> S_3 -> ... -> S_n, with each hidden S_t -> observed Y_t]
Two key assumptions:
1. hidden state sequence is Markov
2. observation Y_t is CI of all other variables given S_t
Widely used in speech recognition, protein sequence models
Motivation?
- S can provide non-linear switching
- S can encode low-dim time-dependence for high-dim Y
Generalizing HMMs
[Graph: observed Y_1, ..., Y_n with two hidden state chains S_1, ..., S_n and T_1, ..., T_n]
Two independent state variables, e.g., two processes evolving at different time-scales

Generalizing HMMs
[Graph: observed Y_1, ..., Y_n, hidden states S_1, ..., S_n, and inputs I_1, ..., I_n]
Inputs I provide context to influence switching, e.g., external forcing variables
Model is still a tree -> inference is still linear
Generalizing HMMs
[Graph: observed Y_1, ..., Y_n with direct edges between successive Ys, hidden states S_1, ..., S_n, and inputs I_1, ..., I_n]
Add direct dependence between Ys to better model persistence
Can merge each S_t and Y_t to construct a tree-structured model

Mixture Model
[Graph: S_i -> y_i, inside a plate i = 1:n]
Likelihood(\theta) = p(Data | \theta) = \prod_i p(y_i | \theta)
= \prod_i [ \sum_k p(y_i | s_i = k, \theta) p(s_i = k) ]
Learning with Missing Data
Guess at some initial parameters \theta_0
E-step (Inference)
- For each case, and each unknown variable, compute p(S | known data, \theta_0)
M-step (Optimization)
- Maximize L(\theta) using p(S | ..)
- This yields new parameter estimates \theta_1
This is the EM algorithm:
- Guaranteed to converge to a (local) maximum of L(\theta)
- Dempster, Laird, Rubin, 1977

E-Step
[Graph: S_i -> y_i, inside a plate i = 1:n]
M-Step and E-Step
[Graphs: S_i -> y_i, inside a plate i = 1:n, shown once for each step]

The E (Expectation) Step
[Figure: n objects, with the current K components and parameters]
E step: Compute p(object i is in group k)

The M (Maximization) Step
[Figure: n objects, with new parameters for the K components]
M step: Compute \theta, given n objects and memberships
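The E and M steps can be sketched for the simplest case. The example below assumes a mixture of two one-dimensional Gaussian components, synthetic data, and a crude initialization at the smallest and largest data values; these choices and all names are illustrative assumptions, not the tutorial's setup. It records the log-likelihood at each iteration, which EM should never decrease.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_two_gaussians(ys, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture; returns parameters and log-likelihood history."""
    mu = [min(ys), max(ys)]        # crude initial guess (assumption of this sketch)
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]
    history = []
    n = len(ys)
    for _ in range(n_iter):
        # E-step: membership probabilities p(object i is in group k)
        resp = []
        loglik = 0.0
        for y in ys:
            joint = [weight[k] * normal_pdf(y, mu[k], sigma[k]) for k in range(2)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([j / total for j in joint])
        history.append(loglik)
        # M-step: re-estimate parameters from the weighted memberships
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / n
            mu[k] = sum(r[k] * y for r, y in zip(resp, ys)) / nk
            var = sum(r[k] * (y - mu[k]) ** 2 for r, y in zip(resp, ys)) / nk
            sigma[k] = math.sqrt(var)
    return weight, mu, sigma, history

# Synthetic data: 200 points near 0 and 200 points near 5 (fixed seed for repeatability).
rng = random.Random(0)
ys = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
weight, mu, sigma, history = em_two_gaussians(ys)

if __name__ == "__main__":
    print(weight, mu, sigma)
    print(history[0], history[-1])
```

Each iteration touches every object once per component, in line with a per-iteration cost that grows with n times K.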
ANEMIA PATIENTS AND CONTROLS
[Figure: scatter plot of Red Blood Cell Hemoglobin Concentration (vertical axis) versus Red Blood Cell Volume (horizontal axis)]
Data from Prof. Christine McLaren, Dept of Epidemiology, UC Irvine

Complexity of EM for mixtures
[Figure: n objects and K models]
Complexity per iteration scales as O( n K f(d) )
EM ITERATION 1 4.4 4.4
EM ITERATION 3
Red Blood Cell Hemoglobin Concentration
4.3
Red Blood Cell Hemoglobin Concentration
3.4 3.5 3.6 3.7 3.8 3.9 4
4.3
4.2
4.2
4.1
4.1
4
4
3.9
3.9
3.8
3.8
3.7 3.3
3.7 3.3
3.4
3.5
3.6
3.7
3.8
3.9
4
Red Blood Cell Volume Probabilistic Learning Tutorial: P. Smyth, UC Irvine, August 2005
Probabilistic Learning Tutorial: P. Smyth, UC Irvine, August 2005
Red Blood Cell Volume
EM ITERATION 5 4.4 4.4
EM ITERATION 10
Red Blood Cell Hemoglobin Concentration
4.3
Red Blood Cell Hemoglobin Concentration
3.4 3.5 3.6 3.7 3.8 3.9 4
4.3
4.2
4.2
4.1
4.1
4
4
3.9
3.9
3.8
3.8
3.7 3.3
3.7 3.3
3.4
3.5
3.6
3.7
3.8
3.9
4
Red Blood Cell Volume Probabilistic Learning Tutorial: P. Smyth, UC Irvine, August 2005
Probabilistic Learning Tutorial: P. Smyth, UC Irvine, August 2005
Red Blood Cell Volume
19
EM ITERATION 15 4.4 4.4
EM ITERATION 25
Red Blood Cell Hemoglobin Concentration
4.3
Red Blood Cell Hemoglobin Concentration
3.4 3.5 3.6 3.7 3.8 3.9 4
4.3
4.2
4.2
4.1
4.1
4
4
3.9
3.9
3.8
3.8
3.7 3.3
3.7 3.3
3.4
3.5
3.6
3.7
3.8
3.9
4
Red Blood Cell Volume Probabilistic Learning Tutorial: P. Smyth, UC Irvine, August 2005
Probabilistic Learning Tutorial: P. Smyth, UC Irvine, August 2005
Red Blood Cell Volume
[Figure: ANEMIA DATA WITH LABELS. Anemia Group and Control Group. Axes: Red Blood Cell Volume, Red Blood Cell Hemoglobin Concentration]
[Figure: LOG-LIKELIHOOD AS A FUNCTION OF EM ITERATIONS. Axes: EM Iteration, Log-Likelihood]
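The E step, M step, and log-likelihood curve described on these slides can be sketched in a few lines. The following is a minimal illustration, not the tutorial's own code: it assumes a one-dimensional Gaussian mixture (the anemia data are two-dimensional), and the function name `em_gmm_1d`, the random initialization, and the small variance floor are all choices made for this sketch.

```python
import numpy as np

def em_gmm_1d(x, K, n_iter=25, seed=0):
    """EM for a K-component 1-D Gaussian mixture.

    Returns mixture weights, means, variances, and the log-likelihood
    measured at the start of every iteration.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Initialization (an assumption of this sketch): K random data points as means.
    mu = rng.choice(x, size=K, replace=False)
    var = np.full(K, x.var())
    alpha = np.full(K, 1.0 / K)
    log_liks = []
    for _ in range(n_iter):
        # E step: memberships w[i, k] = P(component k | x_i); O(n K) density evaluations.
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        joint = alpha * dens                          # shape (n, K)
        total = joint.sum(axis=1, keepdims=True)
        w = joint / total
        log_liks.append(float(np.log(total).sum()))
        # M step: closed-form weighted updates given the memberships.
        nk = w.sum(axis=0)
        alpha = nk / n
        mu = (w * x[:, None]).sum(axis=0) / nk
        var = (w * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9  # tiny floor for stability
    return alpha, mu, var, log_liks
```

Plotting `log_liks` against the iteration index gives a curve of the same kind as the log-likelihood figure: it should never decrease from one iteration to the next, up to numerical tolerance.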
[Figure: Example of a Log-Likelihood Surface. Axes: Mean 2, Log Scale for Sigma]
[Figure: Log-Likelihood Cross-Section. Axes: Log(sigma), Log-likelihood]
HMMs
[Figure: graphical model with hidden states S1, S2, S3, ..., SN and observations Y1, Y2, Y3, ..., YN]
E-Step (linear inference)
M-Step (closed form)

Alternatives to EM
Method of Moments
  EM is more efficient
Direct optimization
  e.g., gradient descent, Newton methods
  EM is usually simpler to implement
Sampling (e.g., MCMC)
Minimum distance, e.g.,
  $\mathrm{IMSE}(\theta) = E\left[ \left( p(x \mid \theta) - q(x) \right)^2 \right]$
Mixtures as Data Simulators

For i = 1 to N
    class_k ~ p(class_1, class_2, ..., class_K)
    x_i ~ p(x | class_k)
end

Mixtures with Markov Dependence

For i = 1 to N
    class_k ~ p(class_1, class_2, ..., class_K | class[x_{i-1}])
    x_i ~ p(x | class_k)
end

Current class depends on previous class (Markov dependence)
This is a hidden Markov model
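The two simulators above differ only in how the class is drawn. A minimal sketch follows, assuming one-dimensional Gaussian emissions; the function names and the `init` and `trans` parameters are hypothetical and do not come from the slides.

```python
import numpy as np

def sample_mixture(n, weights, means, sds, rng):
    """Mixture simulator: class ~ p(class), then x ~ p(x | class), independently per point."""
    classes = rng.choice(len(weights), size=n, p=weights)
    x = rng.normal(means[classes], sds[classes])
    return classes, x

def sample_hmm(n, init, trans, means, sds, rng):
    """Same emissions, but the class at step i is drawn given the class at step i-1."""
    classes = np.empty(n, dtype=int)
    classes[0] = rng.choice(len(init), p=init)
    for i in range(1, n):
        # Row trans[c] is p(next class | current class c): the Markov dependence.
        classes[i] = rng.choice(trans.shape[1], p=trans[classes[i - 1]])
    x = rng.normal(means[classes], sds[classes])
    return classes, x
```

If every row of `trans` equals the mixture weights, the second sampler reduces to the first, which is one way to see the plain mixture as a special case of the hidden Markov model.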
Mixtures of Sequences

For i = 1 to N
    class_k ~ p(class_1, class_2, ..., class_K)
    while non-end state
        x_ij ~ p(x_j | x_{j-1}, class_k)        (Markov sequence model)
    end
end

Produces a variable length sequence

Mixtures of Curves

For i = 1 to N
    class_k ~ p(class_1, class_2, ..., class_K)
    L_i ~ p(L_i | class_k)                      (length of curve)
    for j = 1 to L_i
        y_ij ~ f(y | x_j, class_k) + e_k        (independent variable x; class-dependent curve model)
    end
end
Mixtures of Image Models

For i = 1 to N
    class_k ~ p(class_1, class_2, ..., class_K)
    size_i ~ p(size | class_k)                  (global scale)
    for j = 1 to V_i - 1                        (number of vertices)
        intensity_j ~ p(intensity | class_k)    (pixel generation model)
    end
end

More generally..

$p(D_i) = \sum_{k=1}^{K} p(D_i \mid c_k)\, \alpha_k$

Generative Model
- select a component c_k for individual i
- generate data according to p(D_i | c_k)
- p(D_i | c_k) can be very general
- e.g., sets of sequences, spatial patterns, etc
[Note: given p(D_i | c_k), we can define an EM algorithm]
Part 5: Case Studies
(i) Simulating and forecasting rainfall data
(ii) Curve clustering with cyclones
(iii) Topic modeling from text documents
and if time permits..
(iv) Sequence clustering for Web data
(v) Analysis of time-course gene expression data

Case Study 1: Simulating and Predicting Rainfall Patterns
Joint work with:
Andy Robertson, International Research Institute for Climate Prediction
Sergey Kirshner, Department of Computer Science, UC Irvine
Spatio-Temporal Rainfall Data
Northeast Brazil 1975-2002
90-day time series
24 years
10 stations

[Figure: DATA FOR ONE RAIN-STATION. Axes: DAY, YEAR]
Modeling Goals
Downscaling
  Modeling interannual variability
  coupling rainfall to large-scale effects like El Nino
Prediction
  e.g., hindcasting of missing data
Seasonal Forecasts
  E.g. on Dec 1 produce simulations of likely 90-day winters

HMMs for Rainfall Modeling
[Figure: graphical model with inputs I1, I2, I3, ..., IN, hidden states S1, S2, S3, ..., SN, and outputs Y1, Y2, Y3, ..., YN]
S = unobserved weather state
Y = spatial rainfall pattern (outputs)
I = atmospheric variables (inputs)
Learned Weather States
States provide an interpretable view of spatio-temporal relationships in the data
Weather States for Kenya
Spatial Chow-Liu Trees
- Spatial distribution given a state is a tree structure (a graphical model)
- Useful intermediate between full pair-wise model and conditional independence
- Optimal topology learned from data using a minimum spanning tree algorithm (applied to negated pairwise mutual information, i.e., a maximum-weight spanning tree)
- Can use priors based on distance, topography
- Tree-structure over time also
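The tree-learning step can be sketched as follows. This is an illustration of the standard Chow-Liu construction, not the case study's implementation: it assumes each station is a discrete variable (for example rain / no rain per day), uses empirical mutual information as the edge weight, and builds a maximum-weight spanning tree with Kruskal's algorithm. The function names are hypothetical, and the distance or topography priors mentioned above are not included.

```python
import numpy as np
from itertools import combinations

def mutual_information(a, b):
    """Empirical mutual information (in nats) between two discrete 1-D arrays."""
    mi = 0.0
    for va in np.unique(a):
        for vb in np.unique(b):
            p_ab = np.mean((a == va) & (b == vb))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(a == va) * np.mean(b == vb)))
    return mi

def chow_liu_edges(data):
    """Edges of a maximum-weight spanning tree over pairwise mutual information.

    data has shape (n_samples, n_variables); returns n_variables - 1 edges (i, j), i < j.
    """
    d = data.shape[1]
    weighted = sorted(
        ((mutual_information(data[:, i], data[:, j]), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,  # strongest dependencies first
    )
    parent = list(range(d))  # union-find forest for cycle detection

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

With d stations this evaluates d(d-1)/2 pairwise terms, which is the sense in which the tree sits between the full pair-wise model and conditional independence.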
Missing Data
[Figure: Error rate v. fraction of missing data]
Summary
Simple empirical probabilistic models can be very helpful in interpreting large scientific data sets
  e.g., HMM states provide scientists with a basic but useful classification of historical spatial rainfall patterns
Graphical models provide glue to link together different information
  Spatial
  Temporal
  Hidden states, etc
Generative aspect of probabilistic models can be quite useful, e.g., for simulation
Missing data is handled naturally in a probabilistic framework

Case Study 2: Clustering Cyclone Trajectories
Joint work with:
Suzana Camargo, Andy Robertson, International Research Institute for Climate Prediction
Scott Gaffney, Department of Computer Science, UC Irvine
Storm Trajectories

Microarray Gene Expression Data
[Figure: TIME-COURSE GENE EXPRESSION DATA. Yeast Cell-Cycle Data, Spellman et al (1998). Axes: Time (7-minute increments), Normalized log-ratio of intensity]
Clustering non-vector data
Challenges with the data.
  May be of different lengths, sizes, etc
  Not easily representable in vector spaces
  Distance is not naturally defined a priori
Possible approaches
  convert into a fixed-dimensional vector space
    Apply standard vector clustering but loses information
  use hierarchical clustering
    But O(N^2) and requires a distance measure
  probabilistic clustering with mixtures
    Define a generative mixture model for the data
    Learn distance and clustering simultaneously

Graphical Models for Curves
Data = { (y_1, t_1), ..., (y_T, t_T) }
$y = f(t; \theta)$
e.g., $y = at^2 + bt + c$, $\theta = \{a, b, c\}$
[Figure: graphical model with nodes t and y]
Graphical Models for Curves
y ~ Gaussian density with mean = $f(t; \theta)$, variance = $\sigma^2$
[Figure: graphical model with nodes t and y inside a plate of T points]

Example
[Figure: y plotted against t]

Example
$f(t; \theta)$ <- this is hidden
[Figure: y plotted against t]

Graphical Models for Sets of Curves
Each curve: $P(y_i \mid t_i, \theta)$ = product of Gaussians
[Figure: graphical model with nodes t and y; plates over T points and N curves]
Curve-Specific Transformations
e.g., $y_i = at^2 + bt + c + \delta_i$, $\theta = \{a, b, c, \delta_1, \ldots, \delta_N\}$
Note: we can learn function parameters and shifts simultaneously with EM
[Figure: graphical model with nodes t and y; plates over T points and N curves]

Learning Shapes and Shifts
Data = smoothed growth acceleration data from teenagers
EM used to learn a spline model + time-shift for each curve
[Figure panels: Original data; Data after Learning]
Clustering: Mixtures of Curves
Each set of trajectory points comes from 1 of K models
Model for group k is a Gaussian curve model
Marginal probability for a trajectory = mixture model
[Figure: graphical model with cluster variable c and nodes t and y; plates over T points and N curves]

The Learning Problem
K cluster models
  Each cluster is a shape model $E[Y] = f(X; \theta)$ with its own parameters
N observed curves: for each curve we learn
  P(cluster k | curve data)
  distribution on alignments, shifts, scaling, etc, given data
Requires simultaneous learning of
  Cluster models
  Curve transformation parameters
Results in an EM algorithm where E and M step are tractable
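The quantity P(cluster k | curve data) can be illustrated with a small sketch. It assumes polynomial mean curves with independent Gaussian noise, as in the quadratic example on the earlier slides, and it leaves out the alignment and shift parameters. The function names and argument layout are hypothetical.

```python
import numpy as np

def curve_log_likelihood(t, y, coeffs, sigma):
    """log P(y | t, theta): a product of Gaussians around f(t; theta), summed in log space.

    coeffs are polynomial coefficients, highest degree first (so [a, b, c] is a*t^2 + b*t + c).
    """
    resid = y - np.polyval(coeffs, t)
    return float(np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2) - 0.5 * (resid / sigma) ** 2))

def cluster_posteriors(t, y, cluster_coeffs, sigmas, weights):
    """P(cluster k | curve data) by Bayes rule over K curve models."""
    log_joint = np.array([
        np.log(w) + curve_log_likelihood(t, y, c, s)
        for c, s, w in zip(cluster_coeffs, sigmas, weights)
    ])
    log_joint -= log_joint.max()  # subtract the max before exponentiating, for stability
    post = np.exp(log_joint)
    return post / post.sum()
```

Because each curve carries its own `t` and `y` arrays, curves of different lengths need no special handling, which is one of the motivations listed for the mixture approach.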
[Figure: Simulated Curves (K=2 Clusters). Horizontal axis: Time]
[Figure: Simulated Data after Alignment. Horizontal axis: Time]
Results on Simulated Data*

Method                    True Model   EM with Alignment   Standard EM   K-means
Classification Accuracy   1            0.99                0.89          0.79
Error in Mean             0            0.019               0.171         0.424
Within-Cluster            0.050        0.048               0.105         0.129
LogP (only three values survive; their assignment to methods is not recoverable): 2.01, 1.34, -7.87

*Averaged over 50 train/test sets

Clusters of Trajectories
TROPICAL CYCLONES Western North Pacific 1983-2002
Cluster Shapes for Pacific Cyclones
Summary
Graphical models provide a flexible representational language for modeling complex scientific data
  can build complex models from simpler building blocks
Systematic variability in the data can be handled in a principled way
  Variable length time-series
  Misalignments in trajectories
Generative probabilistic models are interpretable and understandable by scientists
Case Study 3: Topic Modeling from Text Documents
Joint work with:
Mark Steyvers, Dave Newman, Chaitanya Chemudugunta, UC Irvine
Michal Rosen-Zvi, Hebrew University, Jerusalem
Tom Griffiths, Brown University

Enron email data
250,000 emails
5000 authors
1999-2002
Questions of Interest
What topics do these documents span?
Which documents are about a particular topic?
How have topics changed over time?
What does author X write about?
Who is likely to write about topic Y?
Who wrote this specific document?
and so on..

Graphical Model for Clustering
[Figure: graphical model with cluster-word distributions, z = cluster for document, w = word; plates over n words and D documents]
Graphical Model for Topics
[Figure: graphical model with document-topic distributions, z = topic, w = word, topic-word distributions; plates over n words and D documents]

Topic = probability distribution over words, P( w | z )

TOPIC 209
WORD            PROB.
PROBABILISTIC   0.0778
BAYESIAN        0.0671
PROBABILITY     0.0532
CARLO           0.0309
MONTE           0.0308
DISTRIBUTION    0.0257
INFERENCE       0.0253
PROBABILITIES   0.0253
CONDITIONAL     0.0229
PRIOR           0.0219
...

TOPIC 289
WORD            PROB.
RETRIEVAL       0.1179
TEXT            0.0853
DOCUMENTS       0.0527
INFORMATION     0.0504
DOCUMENT        0.0441
CONTENT         0.0242
INDEXING        0.0205
RELEVANCE       0.0159
COLLECTION      0.0146
RELEVANT        0.0136
...
Topics vs. Other Approaches
Clustering documents
  Computationally simpler
  But a less accurate and less flexible model
LSI/LSA
  Projects words into a K-dimensional hidden space
  Less interpretable
  Not generalizable
    E.g., authors or other side-information
  Not as accurate
    E.g., precision-recall: Hofmann, Blei et al, Buntine, etc
Topic Models (aka LDA model)
  next-generation text modeling, after LSI
  More flexible and more accurate (in prediction)
  Linear time complexity in fitting the model

What can Topic Models be used for?
Queries
  Who writes on this topic?
    e.g., finding experts or reviewers in a particular area
  What topics does this person do research on?
Comparing groups of authors or documents
Discovering trends over time
Detecting unusual papers and authors
Interactive browsing of a digital library via topics
Parsing documents (and parts of documents) by topic
and more..
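Read as a data simulator, the topic model draws a document-topic distribution for each document, then a topic and a word for each word position. The sketch below is illustrative only: the symmetric Dirichlet prior with parameter `alpha` is the usual LDA choice but is an assumption here (the slides do not state the prior), and `generate_corpus` is a hypothetical name.

```python
import numpy as np

def generate_corpus(n_docs, doc_len, topic_word, alpha, rng):
    """Sample documents from a topic model.

    topic_word[k] is P(w | z = k), one row per topic. Returns word ids and the
    topic assignment of every word, one array per document.
    """
    n_topics, vocab_size = topic_word.shape
    docs, assignments = [], []
    for _ in range(n_docs):
        theta = rng.dirichlet(np.full(n_topics, alpha))   # document-topic distribution
        z = rng.choice(n_topics, size=doc_len, p=theta)   # topic for each word position
        w = np.array([rng.choice(vocab_size, p=topic_word[k]) for k in z])
        docs.append(w)
        assignments.append(z)
    return docs, assignments
```

Unlike the document-clustering model, where a single cluster label generates every word in a document, each word here can come from a different topic.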
[Figure: CHANGING TRENDS IN COMPUTER SCIENCE. Topic Probability by Year, 1990-2002, for the topics OPERATING SYSTEMS, WWW, PROGRAMMING LANGUAGES, INFORMATION RETRIEVAL]
[Figure: SECURITY-RELATED TOPICS. Topic Probability by Year, 1990-2002, for the topics COMPUTER SECURITY, ENCRYPTION]
Enron email topics

TOPIC 36
  WORD (PROB.): FEEDBACK 0.0781, PERFORMANCE 0.0462, PROCESS 0.0455, PEP 0.0446, MANAGEMENT 0.03, COMPLETE 0.0205, QUESTIONS 0.0203, SELECTED 0.0187, COMPLETED 0.0146, SYSTEM 0.0146
  SENDER (PROB.): perfmgmt 0.2195, perf eval process 0.0784, enron announcements 0.0489, *** 0.0089, *** 0.0048

TOPIC 72
  WORD (PROB.): PROJECT 0.0514, PLANT 0.028, COST 0.0182, CONSTRUCTION 0.0169, UNIT 0.0166, FACILITY 0.0165, SITE 0.0136, PROJECTS 0.0117, CONTRACT 0.011, UNITS 0.0106
  SENDER (PROB.): *** 0.0288, *** 0.022, *** 0.0123, *** 0.0111, *** 0.0108

TOPIC 54
  WORD (PROB.): FERC 0.0554, MARKET 0.0328, ISO 0.0226, COMMISSION 0.0215, ORDER 0.0212, FILING 0.0149, COMMENTS 0.0116, PRICE 0.0116, CALIFORNIA 0.0110, FILED 0.0110
  SENDER (PROB.): *** 0.0532, *** 0.0454, *** 0.0384, *** 0.0334, *** 0.0317

TOPIC 23
  WORD (PROB.): ENVIRONMENTAL 0.0291, AIR 0.0232, MTBE 0.019, EMISSIONS 0.017, CLEAN 0.0143, EPA 0.0133, PENDING 0.0129, SAFETY 0.0104, WATER 0.0092, GASOLINE 0.0086
  SENDER (PROB.): *** 0.1339, *** 0.0275, *** 0.0205, *** 0.0166, *** 0.0129
Non-work Topics

TOPIC 66
  WORD (PROB.): HOLIDAY 0.0857, PARTY 0.0368, YEAR 0.0316, SEASON 0.0305, COMPANY 0.0255, CELEBRATION 0.0199, ENRON 0.0198, TIME 0.0194, RECOGNIZE 0.019, MONTH 0.018
  SENDER (PROB.): chairman & ceo 0.131, *** 0.0102, *** 0.0046, *** 0.0022, general announcement 0.0017

TOPIC 182
  WORD (PROB.): TEXANS 0.0145, WIN 0.0143, FOOTBALL 0.0137, FANTASY 0.0129, SPORTSLINE 0.0129, PLAY 0.0123, TEAM 0.0114, GAME 0.0112, SPORTS 0.011, GAMES 0.0109
  SENDER (PROB.): cbs sportsline com 0.0866, houston texans 0.0267, houstontexans 0.0203, sportsline rewards 0.0175, pro football 0.0136

TOPIC 113
  WORD (PROB.): GOD 0.0357, LIFE 0.0272, MAN 0.0116, PEOPLE 0.0103, CHRIST 0.0092, FAITH 0.0083, LORD 0.0079, JESUS 0.0075, SPIRITUAL 0.0066, VISIT 0.0065
  SENDER (PROB.): crosswalk com 0.2358, wordsmith 0.0208, *** 0.0107, doctor dictionary 0.0101, *** 0.0061

TOPIC 109
  WORD (PROB.): AMAZON 0.0312, GIFT 0.0226, CLICK 0.0193, SAVE 0.0147, SHOPPING 0.0140, OFFER 0.0124, HOLIDAY 0.0122, RECEIVE 0.0102, SHIPPING 0.0100, FLOWERS 0.0099
  SENDER (PROB.): amazon com 0.1344, jos a bank 0.0266, sharperimageoffers 0.0136, travelocity com 0.0094, barnes & noble com 0.0089

Topical Topics

TOPIC 18
  WORD (PROB.): POWER 0.0915, CALIFORNIA 0.0756, ELECTRICITY 0.0331, UTILITIES 0.0253, PRICES 0.0249, MARKET 0.0244, PRICE 0.0207, UTILITY 0.0140, CUSTOMERS 0.0134, ELECTRIC 0.0120
  SENDER (PROB.): *** 0.1160, *** 0.0518, *** 0.0284, *** 0.0272, *** 0.0266

TOPIC 22
  WORD (PROB.): STATE 0.0253, PLAN 0.0245, CALIFORNIA 0.0137, POLITICIAN Y 0.0137, RATE 0.0131, BANKRUPTCY 0.0126, SOCAL 0.0119, POWER 0.0114, BONDS 0.0109, MOU 0.0107
  SENDER (PROB.): *** 0.0395, *** 0.0337, *** 0.0295, *** 0.0251, *** 0.0202

TOPIC 114
  WORD (PROB.): COMMITTEE 0.0197, BILL 0.0189, HOUSE 0.0169, WASHINGTON 0.0140, SENATE 0.0135, POLITICIAN X 0.0114, CONGRESS 0.0112, PRESIDENT 0.0105, LEGISLATION 0.0099, DC 0.0093
  SENDER (PROB.): *** 0.0696, *** 0.0453, *** 0.0255, *** 0.0173, *** 0.0317

TOPIC 194
  WORD (PROB.): LAW 0.0380, TESTIMONY 0.0201, ATTORNEY 0.0164, SETTLEMENT 0.0131, LEGAL 0.0100, EXHIBIT 0.0098, CLE 0.0093, SOCALGAS 0.0093, METALS 0.0091, PERSON Z 0.0083
  SENDER (PROB.): *** 0.0696, *** 0.0453, *** 0.0255, *** 0.0173, *** 0.0317
Using Topic Models for Information Retrieval
[Figure: Level 2 Precision-Recall Curve comparing TF-IDF, KLDist, and p(q|d). Axes: Recall, Precision]

Author-Topic Models
The author-topic model
  a probabilistic model linking authors and topics
    authors -> topics -> words
  Topic = distribution over words
  Author = distribution over topics
  Document = generated from a mixture of author distributions
  Learns about entities based on associated text
Can be generalized
  Replace author with any categorical doc information
  e.g., publication type, source, year, country of origin, etc
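The chain authors -> topics -> words can be written as a short generative sketch. It assumes that, for each word, one of the document's authors is picked uniformly at random; that choice, and the name `generate_document`, are assumptions of this illustration and are not stated on the slide.

```python
import numpy as np

def generate_document(doc_authors, doc_len, author_topic, topic_word, rng):
    """Sample one document under an author-topic style model.

    author_topic[a] is author a's distribution over topics;
    topic_word[k] is topic k's distribution over words.
    """
    doc_authors = np.asarray(doc_authors)
    # Author responsible for each word (uniform over the document's authors: an assumption).
    x = rng.choice(doc_authors, size=doc_len)
    # Topic drawn from that author's topic distribution.
    z = np.array([rng.choice(author_topic.shape[1], p=author_topic[a]) for a in x])
    # Word drawn from that topic's word distribution.
    w = np.array([rng.choice(topic_word.shape[1], p=topic_word[k]) for k in z])
    return x, z, w
```

Replacing `doc_authors` with any other categorical label of the document (publication type, year, country) gives the generalization mentioned above without changing the sampling code.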
Author-Topic Graphical Model
[Figure: graphical model with nodes a, x = author, z = topic, w = word, author-topic distributions and topic-word distributions; plates over n words and D documents]

Learning Author-Topic Models from Text
Full probabilistic model
  Power of statistical learning can be leveraged
  Learning algorithm is linear in number of word occurrences
    Scalable to very large data sets
  Completely automated (no tweaking required)
    completely unsupervised, no labels
Query answering
  A wide variety of queries can be answered:
    Which authors write on topic X?
    What are the spatial patterns in usage of topic Y?
    How have authors A, B and C changed over time?
  Queries answered using probabilistic inference
  Query time is real-time (learning is offline)
Author-Topic Models for CiteSeer

TOPIC 205
  WORD (PROB.): DATA 0.1563, MINING 0.0674, ATTRIBUTES 0.0462, DISCOVERY 0.0401, ASSOCIATION 0.0335, LARGE 0.0280, KNOWLEDGE 0.0260, DATABASES 0.0210, ATTRIBUTE 0.0188, DATASETS 0.0165
  AUTHOR (PROB.): Han_J 0.0196, Rastogi_R 0.0094, Zaki_M 0.0084, Shim_K 0.0077, Ng_R 0.0060, Liu_B 0.0058, Mannila_H 0.0056, Brin_S 0.0054, Liu_H 0.0047, Holder_L 0.0044

TOPIC 209
  WORD (PROB.): PROBABILISTIC 0.0778, BAYESIAN 0.0671, PROBABILITY 0.0532, CARLO 0.0309, MONTE 0.0308, DISTRIBUTION 0.0257, INFERENCE 0.0253, PROBABILITIES 0.0253, CONDITIONAL 0.0229, PRIOR 0.0219
  AUTHOR (PROB.): Friedman_N 0.0094, Heckerman_D 0.0067, Ghahramani_Z 0.0062, Koller_D 0.0062, Jordan_M 0.0059, Neal_R 0.0055, Raftery_A 0.0054, Lukasiewicz_T 0.0053, Halpern_J 0.0052, Muller_P 0.0048

TOPIC 289
  WORD (PROB.): RETRIEVAL 0.1179, TEXT 0.0853, DOCUMENTS 0.0527, INFORMATION 0.0504, DOCUMENT 0.0441, CONTENT 0.0242, INDEXING 0.0205, RELEVANCE 0.0159, COLLECTION 0.0146, RELEVANT 0.0136
  AUTHOR (PROB.): Oard_D 0.0110, Croft_W 0.0056, Jones_K 0.0053, Schauble_P 0.0051, Voorhees_E 0.0050, Singhal_A 0.0048, Hawking_D 0.0048, Merkl_D 0.0042, Allan_J 0.0040, Doermann_D 0.0039

TOPIC 10
  WORD (PROB.): QUERY 0.1848, QUERIES 0.1367, INDEX 0.0488, DATA 0.0368, JOIN 0.0260, INDEXING 0.0180, PROCESSING 0.0113, AGGREGATE 0.0110, ACCESS 0.0102, PRESENT 0.0095
  AUTHOR (PROB.): Suciu_D 0.0102, Naughton_J 0.0095, Levy_A 0.0071, DeWitt_D 0.0068, Wong_L 0.0067, Chakrabarti_K 0.0064, Ross_K 0.0061, Hellerstein_J 0.0059, Lenzerini_M 0.0054, Moerkotte_G 0.0053

Author-Profiles
Author = Andrew McCallum, U Mass:
- Topic 1: classification, training, generalization, decision, data, ...
- Topic 2: learning, machine, examples, reinforcement, inductive, ...
- Topic 3: retrieval, text, document, information, content, ...
Author = Hector Garcia-Molina, Stanford:
- Topic 1: query, index, data, join, processing, aggregate, ...
- Topic 2: transaction, concurrency, copy, permission, distributed, ...
- Topic 3: source, separation, paper, heterogeneous, merging, ...
Author = Jerry Friedman, Stanford:
- Topic 1: regression, estimate, variance, data, series, ...
- Topic 2: classification, training, accuracy, decision, data, ...
- Topic 3: distance, metric, similarity, measure, nearest, ...
AUTHOR = Outlook Migration Team (132 emails)
PROB.   TOPIC   WORDS
.9910   82      OUTLOOK, MIGRATION, NOTES, OWA, INFORMATION, EMAIL, BUTTON, SEND, MAILBOX, ACCESS
.0016   91      ENRON, CORP, SERVICES, BROADBAND, EBS, ADDITION, BUILDING, INCLUDES, ATTACHMENT, COMPETITION
.0005   77      EMAIL, ADDRESS, INTERNET, SEND, ECT, MESSAGING, BUSINESS, ADMINISTRATION, QUESTIONS, SUPPORT
.0004   83      ISSUE, GENERAL, ISSUES, CASE, DUE, INVOLVED, DISCUSSION, MENTIONED, PLACE, POINT

AUTHOR = The Motley Fool (145 emails)
PROB.   TOPIC   WORDS
.3593   17      ANALYST, SERVICES, INDUSTRY, TELECOM, ENERGY, MARKETS, FOOL, BANDWIDTH, ESOURCE, TRAINING
.0773   177     ACCOUNT, ONLINE, OFFER, TRADE, TIME, INVESTMENT, ACCOUNTS, FREE, INFORMATION, ACCESS
.0713   169     HTTP, WWW, GIF, IMAGES, ASP, SPACER, EMAIL, CGI, HTML, CLICK
.0660   200     DECEMBER, JANUARY, MARCH, NOVEMBER, FEBRUARY, WEEK, FRIDAY, SEPTEMBER, WEDNESDAY, TUESDAY

AUTHOR = Individual A (411 emails)
PROB.   TOPIC   WORDS
.1855   105     CUSTOMERS, RATE, PG, CPUC, SCE, UTILITY, ACCESS, CUSTOMER, DECISION, DIRECT
.1289   54      FERC, MARKET, ISO, COMMISSION, ORDER, FILING, COMMENTS, PRICE, CALIFORNIA, FILED
.0920   44      MILLION, BILLION, YEAR, NEWS, CORP, CONTRACTS, GAS, COMPANY, COMPANIES, WATER
.0719   124     STATE, PUBLIC, DAVIS, SAN, GOVERNOR, COMMISSION, GOV, SUMMER, COSTS, HOUR

AUTHOR = Individual B (193 emails)
PROB.   TOPIC   WORDS
.2590   178     CAPACITY, GAS, EL, PASO, PIPELINE, MMBTU, CALIFORNIA, SHIPPERS, MMCF, RATE
.0902   74      GAS, CONTRACT, DAY, VOLUMES, CHANGE, DAILY, DAN, MONTH, KIM, CONTRACTS
.0645   70      GOOD, TIME, WORK, TALK, DON, BACK, WEEK, DIDN, THOUGHT, SEND
.0599   116     SYSTEM, FACILITIES, TIME, EXISTING, SERVICES, BASED, ADDITIONAL, CURRENT, END, AREA

AUTHOR = Individual C (159 emails)
PROB.   TOPIC   WORDS
.1268   42      MEXICO, ARGENTINA, ANDREA, BRAZIL, TAX, OFFICE, LOCAL, RICHARD, COPY, STAFF
.1045   189     AGREEMENT, ENA, LANGUAGE, CONTRACT, TRANSACTION, DEAL, FORWARD, REVIEW, TERMS, QUESTIONS
.0815   176     MARK, TRADING, LEGAL, LONDON, DERIVATIVES, ENRONONLINE, TRADE, ENTITY, COUNTERPARTY, HOUSTON
.0784   135     SUBJECT, REQUIRED, INCLUDING, BASIS, POLICY, BASED, APPROVAL, APPROVED, RIGHTS, DAYS
PubMed-Query Topics

TOPIC 188
  WORD (PROB.): BIOLOGICAL 0.1002, AGENTS 0.0889, THREAT 0.0396, BIOTERRORISM 0.0348, WEAPONS 0.0328, POTENTIAL 0.0305, ATTACK 0.0290, CHEMICAL 0.0288, WARFARE 0.0219, ANTHRAX 0.0146
  AUTHOR (PROB.): Atlas_RM 0.0044, Tegnell_A 0.0036, Aas_P 0.0036, Greenfield_RA 0.0032, Bricaire_F 0.0032

TOPIC 63
  WORD (PROB.): PLAGUE 0.0296, MEDICAL 0.0287, CENTURY 0.0280, MEDICINE 0.0266, HISTORY 0.0203, EPIDEMIC 0.0106, GREAT 0.0091, EPIDEMICS 0.0090, CHINESE 0.0083, FRENCH 0.0082
  AUTHOR (PROB.): Kroly_L 0.0089, Jian-ping_Z 0.0085, Sabbatani_S 0.0080, Theodorides_J 0.0045, Bowers_JZ 0.0045

TOPIC 85
  WORD (PROB.): BOTULISM 0.1014, BOTULINUM 0.0888, TOXIN 0.0877, TYPE 0.0669, CLOSTRIDIUM 0.0340, INFANT 0.0245, NEUROTOXIN 0.0184, BONT 0.0167, FOOD 0.0134, PARALYSIS 0.0124
  AUTHOR (PROB.): Hatheway_CL 0.0254, Schiavo_G 0.0141, Sugiyama_H 0.0111, Arnon_SS 0.0108, Simpson_LL 0.0093

TOPIC 32
  WORD (PROB.): HIV 0.0916, PROTEASE 0.0563, AMPRENAVIR 0.0527, INHIBITORS 0.0366, INHIBITOR 0.0220, PLASMA 0.0204, APV 0.0169, DRUG 0.0169, RITONAVIR 0.0164, IMMUNODEFICIENC 0.0150
  AUTHOR (PROB.): Sadler_BM 0.0129, Tisdale_M 0.0118, Lou_Y 0.0069, Stein_DS 0.0069, Haubrich_R 0.0061

PubMed-Query Topics

TOPIC 40
  WORD (PROB.): ANTHRACIS 0.1627, ANTHRAX 0.1402, BACILLUS 0.1219, SPORES 0.0614, CEREUS 0.0382, SPORE 0.0274, THURINGIENSIS 0.0177, SUBTILIS 0.0152, STERNE 0.0124, INHALATIONAL 0.0104
  AUTHOR (PROB.): Mock_M 0.0203, Phillips_AP 0.0125, Welkos_SL 0.0083, Turnbull_PC 0.0071, Fouet_A 0.0067

TOPIC 89
  WORD (PROB.): CHEMICAL 0.0578, SARIN 0.0454, AGENT 0.0332, GAS 0.0312, AGENTS 0.0268, VX 0.0264, NERVE 0.0232, ACID 0.0220, TOXIC 0.0197, PRODUCTS 0.0170
  AUTHOR (PROB.): Minami_M 0.0093, Hoskin_FC 0.0092, Benschop_HP 0.0090, Raushel_FM 0.0084, Wild_JR 0.0075

TOPIC 104
  WORD (PROB.): HD 0.0657, MUSTARD 0.0639, EXPOSURE 0.0444, SM 0.0353, SULFUR 0.0343, SKIN 0.0208, EXPOSED 0.0185, AGENT 0.0140, EPIDERMAL 0.0129, DAMAGE 0.0116
  AUTHOR (PROB.): Monteiro-Riviere_NA 0.0284, Smith_WJ 0.0219, Lindsay_CD 0.0214, Sawyer_TW 0.0146, Meier_HL 0.0139

TOPIC 178
  WORD (PROB.): ENZYME 0.0938, ACTIVE 0.0429, SUBSTRATE 0.0399, SITE 0.0361, ENZYMES 0.0308, REACTION 0.0225, SUBSTRATES 0.0201, FOLD 0.0176, CATALYTIC 0.0154, RATE 0.0148
  AUTHOR (PROB.): Masson_P 0.0166, Kovach_IM 0.0137, Schramm_VL 0.0094, Barak_D 0.0076, Broomfield_CA 0.0072

PubMed-Query: Topics by Country

ISRAEL, n=196 authors
  TOPIC 188, p=0.049: BIOLOGICAL, AGENTS, THREAT, BIOTERRORISM, WEAPONS, POTENTIAL, ATTACK, CHEMICAL, WARFARE, ANTHRAX
  TOPIC 6, p=0.045: INJURY, INJURIES, WAR, TERRORIST, MILITARY, MEDICAL, VICTIMS, TRAUMA, BLAST, VETERANS
  TOPIC 133, p=0.043: HEALTH, PUBLIC, CARE, SERVICES, EDUCATION, NATIONAL, COMMUNITY, INFORMATION, PREVENTION, LOCAL
  TOPIC 104, p=0.027: HD, MUSTARD, EXPOSURE, SM, SULFUR, SKIN, EXPOSED, AGENT, EPIDERMAL, DAMAGE
  TOPIC 159, p=0.025: EMERGENCY, RESPONSE, MEDICAL, PREPAREDNESS, DISASTER, MANAGEMENT, TRAINING, EVENTS, BIOTERRORISM, LOCAL

CHINA, n=1775 authors
  TOPIC 177, p=0.045: SARS, RESPIRATORY, SEVERE, COV, SYNDROME, ACUTE, CORONAVIRUS, CHINA, KONG, PROBABLE
  TOPIC 7, p=0.026: RENAL, HFRS, VIRUS, SYNDROME, FEVER, HEMORRHAGIC, HANTAVIRUS, HANTAAN, PUUMALA, HANTAVIRUSES
  TOPIC 79, p=0.024: FINDINGS, CHEST, CT, LUNG, CLINICAL, PULMONARY, ABNORMAL, INVOLVEMENT, COMMON, RADIOGRAPHIC
  TOPIC 49, p=0.024: METHODS, RESULTS, CONCLUSION, OBJECTIVE, CONCLUSIONS, BACKGROUND, STUDY, OBJECTIVES, INVESTIGATE, DESIGN
  TOPIC 197, p=0.023: PATIENTS, HOSPITAL, PATIENT, ADMITTED, TWENTY, HOSPITALIZED, CONSECUTIVE, PROSPECTIVELY, DIAGNOSED, PROGNOSIS

Extended Models
Conditioning on non-authors
  side-information other than authors
  e.g., date, publication venue, country, etc
  can use citations as authors
Fictitious authors and common author
  Allow 1 unique fictitious author per document
    Captures document specific effects
  Assign 1 common fictitious author to each document
    Captures broad topics that are used in many documents
Semantics and syntax model
  Semantic topics = topics that are specific to certain documents
  Syntactic topics = broad, across many documents
  Probabilistic model that learns each type automatically
Scientific syntax and semantics
(Griffiths et al., NIPS 2004; slides courtesy of Mark Steyvers and Tom Griffiths, PNAS Symposium presentation, 2003)

Factorization of language based on statistical dependency patterns:
  long-range, document specific dependencies
    semantics: probabilistic topics
  short-range dependencies constant across all documents
    syntax: probabilistic regular grammar
[Figure: graphical model with topic variables z, words w, and class variables x]

[Figure, repeated on successive slides as a sentence is generated word by word]
  x = 1: emits a word from one of the topics
    z = 1 (0.4): HEART 0.2, LOVE 0.2, SOUL 0.2, TEARS 0.2, JOY 0.2
    z = 2 (0.6): SCIENTIFIC 0.2, KNOWLEDGE 0.2, WORK 0.2, RESEARCH 0.2, MATHEMATICS 0.2
  x = 2: OF 0.6, FOR 0.3, BETWEEN 0.1
  x = 3: THE 0.6, A 0.3, MANY 0.1
  Transition probabilities shown on the arcs between classes: 0.9, 0.8, 0.7, 0.3, 0.2, 0.1
  Generated text, slide by slide: THE; THE LOVE; THE LOVE OF; THE LOVE OF RESEARCH
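The word-by-word generation shown on these slides can be imitated with a small sketch. The emission probabilities and the topic weights (0.4 and 0.6) are taken from the slide; the transition table `TRANS` and the starting class are HYPOTHETICAL, because the arc-to-class assignment of the transition probabilities is not recoverable from the extracted text. All identifiers are illustrative.

```python
import numpy as np

# Topic-word distributions and document topic weights, as listed on the slide.
TOPICS = {
    1: (["HEART", "LOVE", "SOUL", "TEARS", "JOY"], [0.2] * 5),
    2: (["SCIENTIFIC", "KNOWLEDGE", "WORK", "RESEARCH", "MATHEMATICS"], [0.2] * 5),
}
TOPIC_WEIGHTS = {1: 0.4, 2: 0.6}
# Word distributions of the two syntactic classes, as listed on the slide.
SYNTAX = {
    2: (["OF", "FOR", "BETWEEN"], [0.6, 0.3, 0.1]),
    3: (["THE", "A", "MANY"], [0.6, 0.3, 0.1]),
}
# HYPOTHETICAL transitions: TRANS[x] = probabilities of moving to class 1, 2, 3.
TRANS = {1: [0.1, 0.8, 0.1], 2: [0.7, 0.0, 0.3], 3: [0.9, 0.0, 0.1]}

def generate_sentence(length, rng, start_class=3):
    """Generate words by walking the class chain; class 1 defers to the topics."""
    x, words = start_class, []
    for _ in range(length):
        if x == 1:
            # Semantic class: pick a topic by the document's weights, then a word from it.
            z = int(rng.choice([1, 2], p=[TOPIC_WEIGHTS[1], TOPIC_WEIGHTS[2]]))
            vocab, probs = TOPICS[z]
        else:
            # Syntactic class: emit directly from the class's own word distribution.
            vocab, probs = SYNTAX[x]
        words.append(str(rng.choice(vocab, p=probs)))
        x = int(rng.choice([1, 2, 3], p=TRANS[x]))
    return words
```

Calling `generate_sentence(4, np.random.default_rng(0))` returns four words drawn this way; a sequence such as THE LOVE OF RESEARCH is one possible outcome under these assumed transitions.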
Semantic topics
29: AGE, LIFE, AGING, OLD, YOUNG, CRE, AGED, SENESCENCE, MORTALITY, AGES, CR, INFANTS, SPAN, MEN, WOMEN, SENESCENT, LOXP, INDIVIDUALS, CHILDREN, NORMAL
46: SELECTION, POPULATION, SPECIES, POPULATIONS, GENETIC, EVOLUTION, SIZE, NATURAL, VARIATION, FITNESS, MUTATION, PER, NUCLEOTIDE, RATES, RATE, HYBRID, DIVERSITY, SUBSTITUTION, SPECIATION, EVOLUTIONARY
51: LOCI, LOCUS, ALLELES, ALLELE, GENETIC, LINKAGE, POLYMORPHISM, CHROMOSOME, MARKERS, SUSCEPTIBILITY, ALLELIC, POLYMORPHIC, POLYMORPHISMS, RESTRICTION, FRAGMENT, HAPLOTYPE, GENE, LENGTH, DISEASE, MICROSATELLITE
71: TUMOR, CANCER, TUMORS, BREAST, HUMAN, CARCINOMA, PROSTATE, MELANOMA, CANCERS, NORMAL, COLON, LUNG, APC, MAMMARY, CARCINOMAS, MALIGNANT, CELL, GROWTH, METASTATIC, EPITHELIAL
115: MALE, FEMALE, MALES, FEMALES, SPERM, SEX, SEXUAL, MATING, REPRODUCTIVE, OFFSPRING, PHEROMONE, SOCIAL, EGG, BEHAVIOR, EGGS, FERTILIZATION, MATERNAL, PATERNAL, FERTILITY, GERM
125: MEMORY, LEARNING, BRAIN, TASK, CORTEX, SUBJECTS, LEFT, RIGHT, SONG, TASKS, HIPPOCAMPAL, PERFORMANCE, SPATIAL, PREFRONTAL, COGNITIVE, TRAINING, TOMOGRAPHY, FRONTAL, MOTOR, EMISSION

Syntactic classes
5: IN, FOR, ON, BETWEEN, DURING, AMONG, FROM, UNDER, WITHIN, THROUGHOUT, THROUGH, TOWARD, INTO, AT, INVOLVING, AFTER, ACROSS, AGAINST, WHEN, ALONG
8: ARE, WERE, WAS, IS, WHEN, REMAIN, REMAINS, REMAINED, PREVIOUSLY, BECOME, BECAME, BEING, BUT, GIVE, MERE, APPEARED, APPEAR, ALLOWED, NORMALLY, EACH
14: THE, THIS, ITS, THEIR, AN, EACH, ONE, ANY, INCREASED, EXOGENOUS, OUR, RECOMBINANT, ENDOGENOUS, TOTAL, PURIFIED, TILE, FULL, CHRONIC, ANOTHER, EXCESS
25: SUGGEST, INDICATE, SUGGESTING, SUGGESTS, SHOWED, REVEALED, SHOW, DEMONSTRATE, INDICATING, PROVIDE, SUPPORT, INDICATES, PROVIDES, INDICATED, DEMONSTRATED, SHOWS, SO, REVEAL, DEMONSTRATES, SUGGESTED
26: LEVELS, NUMBER, LEVEL, RATE, TIME, CONCENTRATIONS, VARIETY, RANGE, CONCENTRATION, DOSE, FAMILY, SET, FREQUENCY, SERIES, AMOUNTS, RATES, CLASS, VALUES, AMOUNT, SITES
30: RESULTS, ANALYSIS, DATA, STUDIES, STUDY, FINDINGS, EXPERIMENTS, OBSERVATIONS, HYPOTHESIS, ANALYSES, ASSAYS, POSSIBILITY, MICROSCOPY, PAPER, WORK, EVIDENCE, FINDING, MUTAGENESIS, OBSERVATION, MEASUREMENTS
33: BEEN, MAY, CAN, COULD, WELL, DID, DOES, DO, MIGHT, SHOULD, WILL, WOULD, MUST, CANNOT, REMAINED, ALSO, THEY, BECOME, MAG, LIKELY
(PNAS, 1991, vol. 88, 4874-4876) A23 generalized49 fundamental11 theorem20 of4 natural46 selection46 is32 derived17 for5 populations46 incorporating22 both39 genetic46 and37 cultural46 transmission46. The14 phenotype15 is32 determined17 by42 an23 arbitrary49 number26 of4 multiallelic52 loci40 with22 two39-factor148 epistasis46 and37 an23 arbitrary49 linkage11 map20, as43 well33 as43 by42 cultural46 transmission46 from22 the14 parents46. Generations46 are8 discrete49 but37 partially19 overlapping24, and37 mating46 may33 be44 nonrandom17 at9 either39 the14 genotypic46 or37 the14 phenotypic46 level46 (or37 both39). I12 show34 that47 cultural46 transmission46 has18 several39 important49 implications6 for5 the14 evolution46 of4 population46 fitness46, most36 notably4 that47 there41 is32 a23 time26 lag7 in22 the14 response28 to31 selection46 such9 that47 the14 future137 evolution46 depends29 on21 the14 past24 selection46 history46 of4 the14 population46.
(PNAS, 1996, vol. 93, 14628-14631) The14 ''shape7'' of4 a23 female115 mating115 preference125 is32 the14 relationship7 between4 a23 male115 trait15 and37 the14 probability7 of4 acceptance21 as43 a23 mating115 partner20. The14 shape7 of4 preferences115 is32 important49 in5 many39 models6 of4 sexual115 selection46, mate115 recognition125, communication9, and37 speciation46, yet50 it41 has18 rarely19 been33 measured17 precisely19. Here12 I9 examine34 preference7 shape7 for5 male115 calling115 song125 in22 a23 bushcricket*13 (katydid*48). Preferences115 change46 dramatically19 between22 races46 of4 a23 species15, from22 strongly19 directional11 to31 broadly19 stabilizing45 (but50 with21 a23 net49 directional46 effect46). Preference115 shape46 generally19 matches10 the14 distribution16 of4 the14 male115 trait15. This41 is32 compatible29 with21 a23 coevolutionary46 model20 of4 signal9-preference115 evolution46, although50 it41 does33 nor37 rule20 out17 an23 alternative11 model20, sensory125 exploitation150. Preference46 shapes40 are8 shown35 to31 be44 genetic11 in5 origin7.
(graylevel = semanticity, the probability of using LDA over HMM)
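In the annotated abstracts above, each word appears to be followed by the number of the topic or syntactic class assigned to it (for example, selection46 matches list 46 and for5 matches class 5 in the word lists). A minimal Python sketch, assuming that reading of the notation, recovers (word, id) pairs and groups the words by id. The regular expression, the function names, and the sample string are illustrative and not part of the tutorial.

```python
import re
from collections import defaultdict

# Assumed annotation format: a word, an optional '*', then the integer id
# of the topic or class assigned to that token (e.g. "selection46").
TOKEN = re.compile(r"([A-Za-z]+)\*?(\d+)")

def parse_annotated(text):
    """Return a list of (word, assigned_id) pairs found in annotated text."""
    return [(word.lower(), int(idx)) for word, idx in TOKEN.findall(text)]

def group_by_id(pairs):
    """Map each assigned id to the words carrying it, in order of appearance."""
    groups = defaultdict(list)
    for word, idx in pairs:
        groups[idx].append(word)
    return dict(groups)

# A fragment of the 1996 abstract, copied from the annotated text.
sample = ("The14 shape7 of4 preferences115 is32 important49 in5 many39 "
          "models6 of4 sexual115 selection46, mate115 recognition125")

pairs = parse_annotated(sample)
by_id = group_by_id(pairs)

# Words assigned to list 115 (MALE FEMALE ... MATING ...) in this fragment.
print(by_id[115])
```

Grouping by id makes it easy to check which words in an abstract were assigned to a content-bearing topic such as 46 or 115 and which were assigned to a function-word class such as 4 or 14.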
Summary
State-of-the-art probabilistic text models can be constructed from large text data sets
  Can yield better performance than other approaches such as clustering, LSI, etc.
  An advantage of the probabilistic approach is that a wide range of queries can be supported by a single model
  See also recent work by Buntine and colleagues
Learning algorithms are slow but scalable
  Linear in the number of word tokens
  Applying this type of Monte Carlo statistical learning to millions of words was unheard of a few years ago
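To illustrate why Monte Carlo learning of this kind can be linear in the number of word tokens, here is a minimal sketch of one sweep of a collapsed Gibbs sampler for a plain LDA topic model. This is an assumption made for illustration: the slides do not spell out the sampler here, and the sketch omits the HMM component of the composite model. The hyperparameters alpha and beta, the function names, and the toy corpus are all hypothetical.

```python
import random

def init_state(docs, K, V, rng=random):
    """Randomly assign a topic to every token and build the count tables."""
    z = [[rng.randrange(K) for _ in doc] for doc in docs]
    ndk = [[0] * K for _ in docs]      # tokens in document d assigned to topic k
    nkw = [[0] * V for _ in range(K)]  # tokens of word w assigned to topic k
    nk = [0] * K                       # total tokens assigned to topic k
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1
    return z, ndk, nkw, nk

def gibbs_sweep(docs, z, ndk, nkw, nk, V, alpha=0.1, beta=0.01, rng=random):
    """One sweep: each token is visited once and costs O(K) work."""
    K = len(nk)
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove the token's current assignment from the counts.
            ndk[d][k] -= 1
            nkw[k][w] -= 1
            nk[k] -= 1
            # Unnormalized conditional probability of each topic for this token.
            weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + V * beta)
                       for t in range(K)]
            k = rng.choices(range(K), weights=weights)[0]
            # Add the token back under its newly sampled topic.
            z[d][i] = k
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1

# Toy corpus: documents are lists of word ids drawn from a vocabulary of size 5.
rng = random.Random(0)
docs = [[0, 1, 0, 2], [3, 4, 3, 4], [0, 1, 3, 4]]
z, ndk, nkw, nk = init_state(docs, K=2, V=5, rng=rng)
for _ in range(50):
    gibbs_sweep(docs, z, ndk, nkw, nk, V=5, rng=rng)
```

Each sweep touches every token exactly once and does work proportional to the number of topics K, so for fixed K the cost per sweep grows linearly with the number of word tokens. Many sweeps are usually needed, which is consistent with the "slow but scalable" description.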
Concluding Comments
The probabilistic approach is worthy of inclusion in a data miner's toolbox
  Systematic handling of missing information and uncertainty
  Ability to incorporate prior knowledge
  Integration of different sources of information
  However, not always the best choice for black-box predictive modeling
Conclusion
Graphical models in particular provide:
  A flexible and modular representational language for modeling
  Efficient and general computational inference and learning algorithms
Many recent advances in theory, algorithms, and applications
  Likely to continue to see advances in new powerful models, more efficient scalable learning algorithms, etc.
Examples of New Research Directions
Modeling and Learning
  Probabilistic relational models: work by Koller et al., Russell et al., etc.
  Conditional Markov random fields: information extraction (McCallum et al.)
  Dirichlet processes: flexible non-parametric models (Jordan et al.)
  Combining discriminative and generative models (e.g., Haussler and Jaakkola)
Applications
  Computer vision: particle filters
  Robotics: map learning
  Statistical machine translation
  and many more
References
To be provided as part of an updated set of slides