In this problem, we explore applications of linear algebra to
web search algorithms. Google ranks pages based upon what would happen
if someone randomly wandered around the web by clicking on links. If your
page gets visited a lot more than other pages, then you will have a higher
page rank. The graph below is a mini-web with only six sites; an arrow
indicates a link from one site to another.
The mini-web can be described by a matrix that has six rows and columns.
Site 1 has links to sites 2, 3 and 4; thus the first column of the matrix A will
be [0; 1/3; 1/3; 1/3; 0; 0], the second column will be [0; 0; 0; 0;
1; 0], and so forth. Note that the entries in each column always add up
to one. Enter the matrix into MATLAB. For a high degree of accuracy it is
better to enter 1/3 than 0.33333.
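One possible way to enter the matrix is sketched here: start from a zero matrix and fill it in column by column. Only the two columns stated above are filled in; columns 3 to 6 are left as placeholders to be read off the graph, and the variable name colsums is just an illustrative choice.

```matlab
A = zeros(6);                        % 6-by-6 matrix, one column per site
A(:,1) = [0; 1/3; 1/3; 1/3; 0; 0];   % site 1 links to sites 2, 3 and 4
A(:,2) = [0; 0; 0; 0; 1; 0];         % site 2 links only to site 5
% A(:,3) ... A(:,6): fill in from the graph in the same way
colsums = sum(A);                    % column sums; every completed column should give 1
```

Checking colsums after all six columns are entered is a quick way to catch a mistyped entry.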
Now suppose everybody starts browsing the web on site 1, which is linked
to sites 2, 3 and 4. After following one link, one third will be on site 2, one
third will be on site 3 and one third will be on site 4. If the entries
of the vector x represent the fractions of visitors on each site (percentages
scaled to one), then the initial vector is
>> x = [ 1; 0; 0; 0; 0; 0]
To get the percentages after following one link, the x-vector is multiplied by A (A*x),
which is [ 0; 1/3; 1/3; 1/3; 0; 0]. Thus, the original group of people
who started on the site 1 is now split into three groups. At the next iter-
ation, one third of the original group will follow the links on site 2, thus
the distribution from this group will be (1/3)a2 (a2 = second column of A).
But we must also include the browsing of those who initially went to sites
3 and 4. Hence the distribution will be (1/3)a2 + (1/3)a3 + (1/3)a4, which, in matrix
vector notation is Ax. Thus we must repeatedly multiply by A (x = A*x at each step)
to get the distribution of visitors after one, two, three, etc. links. To get
the distribution after 15 iterations directly from the initial x, use y = A^15*x.
Answer the following questions.
1. Explain why a45 = 0.
2. If everybody starts browsing the web on site 1, what percentage of
visitors will be on site 5 after following three links?
3. If everybody starts browsing the web on site 1, find the distribution y
after 15 iterations.
4. What happens to the vector y if you multiply it by A again?
5. Based on y from above, rank the sites from highest (most visited) to
lowest (least visited).
6. What is the initial vector x if people choose their first site at random
(so that initially there is the same number of people on each site)?
7. What happens to this vector after 15 multiplications by A?
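The procedure described before the questions can be rehearsed on a smaller web first. The sketch below uses a hypothetical three-site web (it is not the six-site graph of this problem), and the names B, x1, x2, combo, total and change are illustrative assumptions.

```matlab
% Hypothetical 3-site web: site 1 links to sites 2 and 3,
% site 2 links to site 3, site 3 links to site 1.
B = [0   0 1;
     1/2 0 0;
     1/2 1 0];
x  = [1; 0; 0];                          % everybody starts on site 1
x1 = B*x;                                % distribution after one link
x2 = B*x1;                               % distribution after two links
combo = (1/2)*B(:,2) + (1/2)*B(:,3);     % same vector as x2, built from columns of B
y  = B^15*x;                             % distribution after 15 links
total  = sum(y);                         % stays 1 up to rounding (columns of B sum to 1)
change = norm(B*y - y);                  % how much one further step changes y
```

Comparing combo with x2 illustrates why the matrix-vector product is a weighted sum of columns, and a small value of change suggests the distribution has nearly settled.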