Mathematical Foundations for the Study of Language and Communication
COGS 501

Fall 2011
INTRODUCING THE Z-TRANSFORM
After reading this section, you may want to look at Chapter 1, "Signal Processing Basics," in the
User's Guide for the Matlab Signal Processing Toolbox. There are a few things that we have not
covered, and will not cover  for
Variance, covariance, correlation
This continues our exploration of the semantics of the inner product.
As you doubtless know, the variance of a set of numbers is defined as the "mean squared
difference from the mean". The inner product of a vector with i
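The link between variance and the inner product can be checked numerically. The following is a minimal sketch, assuming the population form of the variance (division by N rather than N-1); the data vector is made up for illustration.

```matlab
% Hypothetical example data; any real-valued column vector would do.
x = [2 4 4 4 5 5 7 9]';
N = length(x);
d = x - mean(x);          % differences from the mean
v = (d' * d) / N;         % inner product of d with itself, divided by N
% var(x, 1) normalizes by N rather than N-1, so it should agree with v.
disp([v, var(x, 1)])
```

For this particular vector the mean is 5 and the squared differences sum to 32, so both values come out to 4.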
Towards the Discrete Fourier Transform
We spent some time earlier on the first slogan of signal processing:
The response of a linear shift-invariant system S to an arbitrary input x is the
convolution of x with the impulse response of S.
This turned out t
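The slogan can be illustrated with a small numerical check. This is only a sketch: the impulse response h and the input x are hypothetical values, and the loop builds the response by superposing scaled, shifted copies of h, which should match what conv returns.

```matlab
% Hypothetical impulse response and input, chosen only for illustration.
h = [1 0.5 0.25];         % impulse response of an assumed system S
x = [3 0 -1 2];           % arbitrary input
y = conv(x, h);           % response predicted by the slogan

% Build the same response by superposition: each input sample x(k)
% launches a copy of h, scaled by x(k) and shifted to start at k.
y2 = zeros(1, length(x) + length(h) - 1);
for k = 1:length(x)
    idx = k:(k + length(h) - 1);
    y2(idx) = y2(idx) + x(k) * h;
end
disp(max(abs(y - y2)))    % should be zero up to rounding
```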
Introduction to subspace methods
Many interesting and important techniques center around the ideas of embedding a
set of points in a higher-dimensional space, or projecting a set of points into a lower-dimensional
space.
It's important to b
Homework #1: A note on correlation
In the instructions for homework #1, we asked you to take a look at the r² coefficient from using
multiple regression to predict second-formant values in the TIMIT data as a (linear, additive)
function of sex and vowel i
Problem Set #1 - COGS 501
Due Monday 9/21/2011
This is a simple exercise in linear regression. The data that we'll use are formant frequencies
taken from the analysis of human speech. Formants are concentrations
of energy in frequency and time, which sho
Basic Linear Algebra Review
Linear Algebra has become as basic and as applicable as calculus, and fortunately it is easier.
Gilbert Strang
Today, most scientific mathematics is applied linear algebra, in whole or in part. This is true of
most inferential
Problem set #7 - COGS501
Due 11/23/2011
1. K-means clustering
Download KM1.m, and put it somewhere that Matlab can find it. Execute it to see a simple demo, in
which three random clouds of points are generated, mixed, and then separated again by k-means
c
Impulse Response and Convolution
Digital signal processing is (mostly) applied linear algebra.
The relevance of matrix multiplication turned out to be easy to grasp for color matching. We had
fixed dimensions of 1 (number of test lights), 3 (number of pri
Adding phonetics to speech-community dynamics
[A problem, in blue text, is at the end of the discussion.]
1. Reciprocal learning of continuous random variables
Instead of using a linear-learning model to learn the probability of a set of discrete outcomes
Simple models of speech community dynamics
[There are three problems, in blue text, interspersed among these notes.]
1. Probability matching by linear learning
At least since Bush & Mosteller 1951, it's been understood that a "linear learner", which updat
COGS501 - Problem Set #5
The MATLAB fft function implements the Discrete Fourier Transform given by the equation
$$X(k) = \sum_{n=1}^{N} x(n)\, e^{-2\pi i (k-1)(n-1)/N}, \qquad 1 \le k \le N,$$
where N = length(x).
In MATLAB terms, this means that for a real (or complex) input vector x of length N, fft returns a complex vector X
of
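As a sanity check on what fft computes, the sum can be evaluated directly and compared with the built-in function. This is a sketch that assumes the standard DFT definition, X[k] = sum over n of x[n] exp(-2*pi*i*k*n/N) with zero-based k and n; the test vector is hypothetical.

```matlab
x = [1 2 0 -1 3]';                 % hypothetical test vector
N = length(x);
n = (0:N-1)';                      % zero-based sample indices
X = zeros(N, 1);
for k = 0:N-1
    % MATLAB indexes from 1, so DFT bin k is stored at X(k+1).
    X(k+1) = sum(x .* exp(-2i*pi*k*n/N));
end
disp(max(abs(X - fft(x))))         % should be near zero
```

The direct sum costs on the order of N^2 operations, so it is useful only for checking small cases, not as a replacement for fft.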
Problem Set #4 (due Wed. 10/12)
1. Create a vector of length 24 containing an impulse at location 5, a step edge at location 12,
and a unit-slope ramp starting from location 18. For example:
X=zeros(24,1); X(5)=1; X(12:24)=1; X(18:24)=(X(18:24)+(1:7)');
Problem set #3  COGS501
(due 10/05/2011)
Follow the instructions given in problem set #2 to set up the Peterson/Barney vowel data.
1. Apply principal components analysis to the F1, F2, F3 data as a whole.
Plot the (first two dimensions) of the results us
Problem Set #2  COGS 501
(due 9/28/2011)
1. Download the file pb.Table1 to a place accessible to Matlab. (This is the data from Peterson &
Barney, "Control Methods Used in a Study of the Vowels", JASA 23 (1), 1951.)
The table columns are:
1: m = man, w=w
Homework 6 - Due Wed. 11/2/2011
1. As background, spend some time experimenting with the properties of complex exponential sequences
x[n] = z^n
or in MATLAB terms
x = z.^n
where z is an arbitrary complex number, and n ranges over integers.
In the general ca
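One way to start experimenting with x = z.^n is sketched below. The particular z (magnitude 0.9, angle pi/8) and the range of n are hypothetical choices; the two checks confirm that the magnitude of z^n is |z|^n and that its phase advances by angle(z) at each step.

```matlab
n = 0:15;                        % a finite range of integers for illustration
z = 0.9 * exp(1i * pi / 8);      % hypothetical z with magnitude 0.9, angle pi/8
x = z .^ n;

% The magnitude of z^n is |z|^n and its phase advances by angle(z) per step.
err_mag = max(abs(abs(x) - abs(z) .^ n));
err_polar = max(abs(x - abs(z).^n .* exp(1i * angle(z) * n)));
disp([err_mag, err_polar])       % both should be near zero
```

Trying |z| below, equal to, and above 1 shows decaying, constant-amplitude, and growing sequences respectively.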
Hamming's Rule
The basic idea of linear regression is to predict the M values of a dependent variable (which we
treat as a column vector b), on the basis of the values of N independent variables (which we put in
the rows of an M by N matrix A), using the
Causal Moving Average (FIR) Filters
We've discussed systems in which each sample of the output is a weighted sum of (certain of
the) samples of the input.
Let's take a causal "weighted sum" system, where causal means that a given output sample
depends
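A causal weighted-sum system of this kind can be sketched as follows, assuming the usual sense of causal (each output sample depends only on the current and earlier input samples) and treating input samples before the start as zero. The weights b and the input x are hypothetical.

```matlab
b = [0.5 0.3 0.2];              % hypothetical weights on x[n], x[n-1], x[n-2]
x = [1 4 2 0 3 5];              % hypothetical input
y = filter(b, 1, x);            % FIR filter: the denominator coefficient is 1

% Direct causal weighted sum, treating samples before the start as zero.
y2 = zeros(size(x));
for n = 1:length(x)
    for k = 1:length(b)
        if n - k + 1 >= 1
            y2(n) = y2(n) + b(k) * x(n - k + 1);
        end
    end
end
disp(max(abs(y - y2)))          % should be zero up to rounding
```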