statement expresses more than one idea, then
you might confuse your readers about the
subject of your paper. For example: Companies
need to exploit the marketing potential of the
Internet, and Web pages can provide both
advertising and customer support. T
in non-decreasing order in each node, and non-decreasing order between nodes. More
precisely, this means that $E_{p,i} \le E_{p,j}$ for all
relevant $i < j$ and $p$, and that $E_{p,i} \le E_{q,j}$ for
$0 \le p < q < P$ and all relevant $i, j$. The speedup offered
by a parallel algorithm fo
degree of efficiency. Merging algorithms that
require less memory are possible but they are
quite computationally expensive [Ellis and
Markov 1998; Huang and Langston 1988;
Kronrod 1969].
1.1.2.2 Extending to P processors
Can we now produce a P processor
p
of the fast
signature. A 16 bit index table is then formed
which takes a 16 bit hash value and gives an
index into the sorted signature table which
points to the first entry in the table which has a
matching hash. Once the sor
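The lookup structure described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the tuple layout of the sorted table, and in particular the `hash16` folding of a 32 bit fast signature into 16 bits are all assumptions made for the example.

```python
from typing import List, Tuple

Entry = Tuple[int, int, int]  # (16 bit hash, fast signature, block number)


def hash16(fast_sig: int) -> int:
    """Hypothetical 16 bit hash of a 32 bit fast signature (an assumption)."""
    return ((fast_sig & 0xFFFF) + (fast_sig >> 16)) & 0xFFFF


def build_tables(fast_sigs: List[int]) -> Tuple[List[Entry], List[int]]:
    """Sort the signatures by 16 bit hash and build the 2^16 entry index."""
    sorted_sigs = sorted((hash16(s), s, i) for i, s in enumerate(fast_sigs))
    index = [-1] * (1 << 16)  # -1 marks "no block has this hash"
    # Walk backwards so each slot ends up holding the FIRST matching position.
    for pos in range(len(sorted_sigs) - 1, -1, -1):
        index[sorted_sigs[pos][0]] = pos
    return sorted_sigs, index


def candidates(fast_sig: int, sorted_sigs: List[Entry], index: List[int]) -> List[int]:
    """Return block numbers whose fast signature equals fast_sig."""
    h = hash16(fast_sig)
    pos = index[h]
    found = []
    if pos < 0:
        return found
    # Scan the run of entries sharing this 16 bit hash.
    while pos < len(sorted_sigs) and sorted_sigs[pos][0] == h:
        if sorted_sigs[pos][1] == fast_sig:
            found.append(sorted_sigs[pos][2])
        pos += 1
    return found
```

The index costs one list lookup per byte offset, and the common case of no matching hash is rejected without touching the sorted table at all.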
message digest algorithms commonly used in
cryptographic applications. These algorithms
are believed to have the following properties
(where b is the number of bits in the signature)
[Schneier 1996]: The probability that a
randomly generated block has the
the first pass when sorting 50 million 64 bit
random elements on the 128 processor AP1000
for varying values of k. Note the narrow y range
on the graph. The results show that even for
quite high values of k nearly 90% of slices are
completed after the fir
files to be transferred through my modem for
archiving, distribution or testing on computers
at the other end of the link or on the other side
of the world. The time taken to transfer the
changes gave me plenty of opportunity to think
about better ways of
sorting we can perform some simple
calculations which are very revealing and which
provide a great deal of guidance in the
algorithm design. It is well known that
comparison-based sorting algorithms on a
single CPU require $\log N!$ time, which is well
appr
not a problem except when disk space is very
tight. Even without this restriction it would, in
many cases, be wise for the reconstruction to
happen to a temporary file followed by an
atomic rename as otherwise the transfer would
be susceptible to network
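The temporary file plus atomic rename approach mentioned above might look like the sketch below. It is only an illustration under stated assumptions: `write_atomically` and its `chunks` parameter are hypothetical names, and the reconstructed data is assumed to arrive as an iterable of byte strings. The temporary file is created in the destination directory so that the rename stays on one filesystem.

```python
import os
import tempfile


def write_atomically(dest_path: str, chunks) -> None:
    """Write chunks to a temporary file, then rename it over dest_path.

    If anything fails part way through (for example the link drops and the
    chunk iterator raises), the old file at dest_path is left untouched.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, dest_path)  # atomic replacement of the old file
    except BaseException:
        # Clean up the partial temporary file and re-raise the error.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The cost is the one noted in the text: for a short time both the old file and the new copy occupy disk space.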
[1] [0 1]).
6 A method for avoiding infinity-padding using balancing is given in Section 1.2.5
so infinity-padding is not strictly needed but it
provides some useful concepts nonetheless.
1.2 Algorithm Details
procedure
hypercube_balance(integer base, in
and Plaxton 1993] and demonstrate that it is
asymptotically optimal in terms of I/O
operations on general multi-level storage
hierarchies. Their algorithm is aimed at larger
values of k than are considered in this chapter
and concentrates on providing a f
investigation of fast signature algorithms which
depended on all the bytes in a block, while
requiring very little computation to find the
signature values for every byte offset in a file.
Perhaps the simplest such algorithm is $R(a) = \sum_i a_i$.
This would be ver
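To make the "very little computation at every byte offset" requirement concrete, here is a minimal sketch that takes the simplest candidate to be a plain sum of the bytes in a block (an assumption about the formula above). The function names and the `block_len` parameter are hypothetical. Sliding the window by one byte costs one addition and one subtraction, whatever the block length.

```python
from typing import List


def byte_sum(block: bytes) -> int:
    """Signature of a block, assumed here to be the plain sum of its bytes."""
    return sum(block)


def rolling_sums(data: bytes, block_len: int) -> List[int]:
    """Signature of every block_len-byte window of data, one per byte offset."""
    if block_len <= 0 or len(data) < block_len:
        return []
    current = sum(data[:block_len])  # full computation only for offset 0
    sums = [current]
    for k in range(block_len, len(data)):
        # Slide the window: add the incoming byte, drop the outgoing one.
        current += data[k] - data[k - block_len]
        sums.append(current)
    return sums
```

A plain sum ignores byte order, so `byte_sum(b"ab")` equals `byte_sum(b"ba")`; a signature this weak is only usable as a cheap first filter.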
nomenclature a little.
2 Unless the link is
asymmetric. If the link was very fast from B to A
but slow from A to B then it wouldn't matter
how big S is.
3.2 Designing a remote update algorithm
means that S cannot uniquely
identify all possible files $b_i$ w
than the other node, then the virtual infinity
elements are all required in the other node, and
no transfer of real elements is required. This
means that if a final balancing phase is
introduced where elements are drawn from the
last node to fill the lowe
1.2.12 we don't strictly need to obtain a sorted
list in each cell when this two processor merge
is being used as part of a larger parallel merge
but it does simplify the discussion.
1.1 How fast can it go?
[Figure: Step 1, Step 2, Step 3; Cell 1, Cell 2, Cell 3, ...]
nodes in a MIMD machine. To obtain good
memory utilization when sorting small
elements, linked lists are avoided. Thus, the lists
of elements referred to below are implemented
using arrays, without any storage overhead for
pointers.
algorithm unusable. For example, the signature
could be just the first 4 bytes of each block. This
would be very easy to compute but the
algorithm would fail to produce the right result
when two different blocks had their first 4 bytes
in common.
3.2.3 Tw
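The failure described for a signature made of the first 4 bytes of each block can be seen in a tiny sketch. The function name and the sample blocks are made up for illustration.

```python
def prefix_signature(block: bytes) -> bytes:
    """Weak signature from the example: just the first 4 bytes of the block."""
    return block[:4]


# Two different blocks that share their first 4 bytes.
old_block = b"HEAD-old-contents"
new_block = b"HEAD-new-contents"

# The signatures match even though the blocks differ, so an algorithm that
# trusted this signature alone would reuse old_block where new_block is needed.
signatures_match = prefix_signature(old_block) == prefix_signature(new_block)
blocks_differ = old_block != new_block
```

Both `signatures_match` and `blocks_differ` are true here, which is exactly the case where the reconstructed file would be wrong.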
$\frac{N}{P} \log P$) time.
1.1.2.1 The two processor problem
Let us first consider the simplest
possible parallel merging problem, where we
have just two processors and each processor
starts with $N/2$ elements. The result we want is
that the first processor ends up w
will transfer the whole file.
3.2.2 A second try
We can solve this problem by getting A to
generate signatures not just at block
boundaries, but at all byte boundaries. When A
compares the signature at each byte boundary
with each of the signatures $S_j$ on
we have implicitly performed padding of the
nodes with infinity elements, thus guaranteeing
the correct behavior of the algorithm.
1.2.4 Balancing
The aim of the balancing phase of the
algorithm is to produce a distribution of the
elements on the nodes th
varied links within this relatively new discipline.
I hope you will enjoy reading it as much as I
enjoyed the research which it describes.
Chapter 1 Internal Parallel Sorting
My first
introduction to the problem of parallel sorting
came from a problem in
thoroughly enjoyable. That enjoyment is largely
a result of the interaction that I have had with
my supervisors, colleagues and, in the case of
rsync, the people who have tested and used the
resulting software. I feel very privileged to have
worked with m
totally lacked a parallel sorting routine. I had
been expecting that there would be a routine
that is the parallel equivalent of the ubiquitous
qsort() routine found in standard C libraries.
The lack of such a routine was quite a surprise
and prompted me
signature table then a single byte literal is
emitted and the search continues at the next
byte. At first glance this search algorithm
appears to be $O(n^2)$ in the file size, because for
a fixed block size the number of blocks with
matching 16 bit hash
showing how it can be implemented with
minimal memory overhead. Each stage of the
algorithm is analyzed more carefully resulting in
a more accurate estimate of the expected
running time of the algorithm. The algorithm
was first presented in [Tridgell and
in the next chapter.
14 It rises as n or n
depending on the definition of optimal.
3.3 Choosing the block size
some extra memory
and bookkeeping.
3.2.7 Reconstructing the file
One of the simplest parts of the rsync algorithm
is reconstructing the file o
A and uses these to construct $a_i$. For this
algorithm to be effective and efficient we need
the following conditions: the signature R
needs to be cheap to compute at every byte
offset in a file;
6 I call them R and H for rolling
checksum and hash respecti