@ Parallel Bubble Sort and Theorem Z

<Knuth theorem Z, on searching and sorting, zero-one principle>:

If a network with n input elements sorts all 2^n sequences of 0’s and 1’s into non-decreasing order, it will sort any arbitrary sequence of n integers into non-decreasing order.

Proof of Knuth’s 0-1 principal:

If f(x) is any monotonic function, with f(x) <= f(y) whenever x <= y.

When f(x) is monotonic min(f(x),f(y)) = f(min(x,y)) and max(f(x),f(y)) = f(max(x,y))

and if a given network N transforms (x_1,...,x_n) into (y_1,...,y_n):

N(x_1,...,x_n) = (y_1,...,y_n)

then that the network will transform f(x_1),...,f(x_n) into f(y_1),...,f(y_n):

N( f(x_1),...,f(x_n) ) = ( f(y_1),...,f(y_n) )

Proof:

Suppose the network has only one comparator C<i,j> comparing

elements at indexes i and j

the i’th element will be min( f(x_i), f(x_j) ) = f( min(x_i, x_j) ).

the j’th element will be max( f(x_i), f(x_j) ) = f( max(x_i,x_j) ).

all other elements k will be f(x_k).

The result will be the same as applying the network to (x_1, … , x_n), and the applying function f to the result.

This proves the base case of an induction. Assume that the above holds for a network of q comparators. Then add a comparator between any outputs of the network of q comparators to get a q+1 comparator network.

The effect of this addition is exactly the same as in the base case where there is only one comparator, this shows that the order of the application of the network, and of the monotonic function are interchangeable for a q+1 comparator network.

By induction this shows that the application of a comparator network to some input (x_1,...,x_n) and the application of a monotonic function to that network can commute.

N( f(x) ) = f( N(x) )

Suppose there is some sequence a=(a_1,... , a_n ) that network N fails to sort.

Then N(a)=(b_1,... , b_n ), and there is at least one position i such that b_i > b_i+1. Choose the first such position. Define monotonic function f that maps all numbers less that b_i to 0 and all numbers greater than or equal to b_i to 1. Since N(f(a))=f(N(a)), applying the network to the output of the function will produce an unsorted sequence. This means if there exists an integer sequence not sorted by N, then f(a) will be a bit sequence not sorted by N. Equivalently if there is no bit sequence that is left unsorted by N, then there can be no integer sequence that is left unsorted by N.

References:

Knuth Searching & Sorting Pg223.

http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/networks/nulleinsen.htm

//Compare-Exchange

define CE(A,B):

if ( A > B ):

temp = A

A = B

B = A

Definition: A comparator is a mapping [i : j] : A^n

[i:j](a)i = min(a_i, a_j)

[i:j](a)j = max(a_i, a_j)

[i:j](a)k = a_k for all k with k != i, k != j

for all a in A^n.

Problem statement:

Given a sequence S of n integers produce a sequence S’ such that each integer present in S is present in S’, and occurs as many times in S’ as in S. Additionally, every element at some index i of S’ must be less than or equal to the element at index i+1 of S’ if such an index exists.

Algorithm Definition:

Input:

A sequence S of n integers, A sequence P of n unique processors.

A processor in P at index i may communicate with processors at indexes i-1 (if i-1>0)

and i+1 (if i+1<n-1).

Algorithm:

0) For all indices i from 0 to n-1 send S[i] to P[i].

1) For all indices (i) of P such that i%2==0:

if(i<n-1):

CE( P[i], P[i+1] )

2) For all indices (i) of P such that i%2==1:

if(i<n-1):

CE( P[i], P[i+1] )

3) Repeat steps 1, 2 n times.

4) Return resulting sequence.

Proof of Correctness for parallel bubble sort:

The algorithm above can be implemented by a comparator network N that can sort input sequence S in n cycles. The processors will form a line and each can communicate with its successor and predecessor. Each processor will initially store a unique input from sequence S. At each odd numbered cycle all odd indexed processors will perform a compare-exchange (CE) operation with their successive neighbor. At each even numbered cycle all even indexed processors will perform a compare-exchange (CE) operation with their successive neighbor. At the end of n such stages the elements of S will be stored in order along the line of processors. Thus the result S’ can be obtained by visiting the processors in the order that they are aligned in the line and appending the value stored at the processor to the output.

By applying the 0-1 principal it is sufficient to show that the 1 dimensional even-odd comparator network N will produce correctly sorted bit sequences for all of 2^n possible bit strings.

The claim then becomes: An even-odd transposition comparator network N can sort any of the 2^n possible n-length bit strings in O(n) time.

Base Case: In the case of a single element the algorithm will produce the same element on output, which is correct.

Inductive Assumption: Assume that the network N(with n-1 comparators) can sort any of the 2^n possible bit strings of length n in O(n) time.

It can be shown that a even-odd transposition network Y(with n comparators) can sort any of the 2^(n+1) possible bitstrings of length n+1 in in O(n+1) time.

Proof:

*Note that as the algorithm executes no 0 stored at processor i will ever be swapped into any processor index greater than i. There is no mechanism for this to happen, exchanges may only happen when a 1 and a 0 are compared, and in that case the 1 will always be routed to the higher indexed processor. Similarly a 1 stored at processor i will never be swapped into an index less than i by the same reasoning.

Take bitstring S=a=(a_0,...,a_n) of n+1 bits, and a CE network Y of n comparators

Case a_n=1:

In this case the comparator between a_n and a_n-1 is superfluous since the bit at a_n=1 is in the correct position (at processor index n), and will remain there regardless of the value of a_n-1. Take the network where the comparator between a_n and a_n-1 is removed, and input line a_n is routed directly to the last output line of the network call this network X(with n-1 comparators). When network X is applied to inputs (a_0,...,a_n-1) by the inductive assumption network X will sort them. Since the value remains at processor n remains remains 1 the final output of the network must be completely ordered. Ordering inputs a_0,...,a_n-1 takes O(n) time by the inductive assumption, and no additional work was required to sort input a_n (it started out in the correct position).

Case a_n=0:

If a_0=a_1=...=a_n+1=0 Then the network is already sorted.

If on the other hand, at least one of the elements a_0,..,a_n is 1, then consider the state of the network after n cycles (iterations of the odd-even compare-exchanges). Because as shown above the progression of the network never moves a 0 ahead of a 1 or a 1 before a 0, then we know that the last comparator in the network (the n’th) will not disorder any elements at any iteration, and the 0 coming in from a_n will never be swapped into the a_n-1’th spot. Since the comparator will not disorder any elements we can consider the network formed if the n+1’th input is removed, and the compare exchange algorithm is run n times. This matches the network in our inductive assumption, if there is at least a single 1 on inputs a_0,...,a_n-1 then at the n’th iteration that 1 (or some other 1) will have been swapped into the processor at a_n-1 (by our inductive assumption). Then on the n+1’th iteration of the algorithm the comparator we earlier disregarded will compare the 1 coming from processor containing a_n-1 and the 0 at the processor containing a_n, and exchange them, thereby fully ordering the network, as we know that a_0...a_n-1 will still be ordered after this last exchange operation.

As the even-odd transposition network has been shown to order all possible 2^n bitstrings in O(n) time, the 0-1 principle states that the network will order any n integers in O(n) time. The fact that this all takes O(n) time is also clear from the fact that the sorting network implementing the algorithm has a depth of n.

Reference:

http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/networks/oetsen.htm