Question

In this problem, we explore applications of linear algebra to web search algorithms. Google ranks pages based upon what would happen if someone randomly wandered around the web by clicking on links. If your page gets visited a lot more than other pages, then you will have a higher page rank. The graph below is a mini-web with only six sites; an arrow indicates a link from one site to another.

The mini-web can be described by a matrix A that has six rows and six columns. Site 1 has links to sites 2, 3 and 4; thus the first column of A will be [0; 1/3; 1/3; 1/3; 0; 0], the second column will be [0; 0; 0; 0; 1; 0], and so forth. Note that the entries in each column always add up to one. Enter the matrix into MATLAB. For a high degree of accuracy it is better to enter 1/3 than 0.33333.
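A minimal sketch of entering the matrix column by column is given here. Only the two columns stated in the text are filled in; columns 3 to 6 depend on the graph, which is not reproduced in this text, so they are left as zeros and must be completed from the arrows. The column-sum check and the comparison vector `approxCol` are illustrative additions, not part of the original problem.

```matlab
% Sketch: build A one column at a time.
% Only columns 1 and 2 are given in the text; columns 3-6 must be
% read off the graph and are left as zeros here.
A = zeros(6);
A(:,1) = [0; 1/3; 1/3; 1/3; 0; 0];   % site 1 links to sites 2, 3 and 4
A(:,2) = [0; 0; 0; 0; 1; 0];         % site 2 links only to site 5

% Every completed column should sum to one.
colSums = sum(A, 1)

% Why 1/3 is preferred over 0.33333: a column entered with the rounded
% value (hypothetical vector, for comparison only) no longer sums to one.
approxCol = [0; 0.33333; 0.33333; 0.33333; 0; 0];
errRounded = abs(sum(approxCol) - 1)   % on the order of 1e-5
errExact   = abs(sum(A(:,1)) - 1)      % at the level of rounding error
```

Once all six columns are entered, every entry of `colSums` should equal one; an entry that does not is a quick sign of a mistyped column.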

Now suppose everybody starts browsing the web on site 1, which is linked to sites 2, 3 and 4. After following one link, one third will be on site 2, one third will be on site 3 and one third will be on site 4. If the entries of the vector x represent the percentages (scaled to one) of visitors on the sites, then the initial vector is

>> x = [ 1; 0; 0; 0; 0; 0]

To get the percentages after following one link, compute

>> A*x

which is [0; 1/3; 1/3; 1/3; 0; 0]. Thus, the original group of people who started on site 1 is now split into three groups. At the next iteration, one third of the original group will follow the links on site 2, thus the distribution from this group will be $\frac{1}{3}a_2$ ($a_2$ = second column of A). But we must also include the browsing of those who initially went to sites 3 and 4. Hence the distribution will be $\frac{1}{3}a_2 + \frac{1}{3}a_3 + \frac{1}{3}a_4$, which in matrix-vector notation is Ax, where x is now the distribution after the first link. Thus we must repeatedly apply A to the current vector (for example with the command x = A*x) to get the distribution of visitors after one, two, three, etc. links. To get the distribution after 15 iterations directly from the initial x, use y = A^15*x.

Answer the following questions:

1. Explain why $a_{45} = 0$.
2. If everybody starts browsing the web on site 1, what percentage of visitors will be on site 5 after following three links?
3. If everybody starts browsing the web on site 1, find the distribution y after 15 iterations.
4. What happens to the vector y if you multiply it by A again?
5. Based on y from above, rank the sites from highest (most visited) to lowest (least visited).
6. What is the initial vector x if people choose their first site at random (so that initially there is the same number of people on each site)?
7. What happens to this vector after 15 multiplications by A?
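The mechanics behind the iteration questions can be sketched on a smaller example. The 3-site matrix `B` below is hypothetical and is not the six-site graph from the problem; it only illustrates, under that assumption, that updating the vector link by link matches a single matrix power, and how a uniform starting vector can be built. The names `B`, `y_loop`, `gap`, `change`, `x0`, `y0` and `total` are introduced here for illustration.

```matlab
% Hypothetical 3-site web (NOT the six-site graph from the problem):
% site 1 -> sites 2 and 3, site 2 -> site 3, site 3 -> site 1.
B = [0   0 1;
     1/2 0 0;
     1/2 1 0];

x = [1; 0; 0];            % everybody starts on site 1

% Follow 15 links one at a time, updating the vector at each step.
y_loop = x;
for k = 1:15
    y_loop = B*y_loop;
end

% The same distribution in one step, using a matrix power.
y = B^15*x;
gap = norm(y - y_loop)    % difference between the two computations

% One more multiplication: how much does the vector still change?
change = norm(B*y - y)

% Uniform start: the same share of visitors on every site.
n = size(B, 1);
x0 = ones(n, 1)/n;
y0 = B^15*x0;
total = sum(y0)           % the entries still add up to one
```

For the actual problem, the same commands apply with A in place of `B` and six-entry vectors, once all six columns of A have been entered from the graph.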