问题描述:

I am reading "Insertion Sort is O(nlogn) by Michael A. Bender , Martín Farach-Colton , Miguel Mosteiro" and I don't quite understand how the algorithm works and how to implement it even with the help of Wikipedia. The following is the description of the algorithm extracted from the original article.

1) Let A be an n-element array to be sorted. These elements are inserted one at

a time in random order into a sorting array S of size (1 + ε)n.

So the first step is creating array of size (1 + ε)n. Let ε = 1, then I need to create an array with twice the size of the original array.

2) The insertions proceed in log(n) rounds as follows.

Each round doubles the number of elements inserted into S and doubles the prefix of S where elements reside.

I understand that there will be outer loop that will loop log(n) time. Each round, I need to double the number of elements from A (original array) to S array. *What I don't really understand is "double the prefix of S"*.

3) Specifically, round i

^{th}ends when element 2^{i}is inserted and the elements are rebalanced. Before the rebalance, the 2^{i}elements are in the first (1 + ε)2^{i}positions.A rebalance moves them into the first (2 + 2ε)2

^{i}positions, spreadingthe elements as evenly as possible. We call 2 + 2ε

the spreading factor.

From what I understand is that for every round, we will do "rebalance". "rebalance" will uniformly spread the original element in S array so that it leaves some gap between the element. The formula to spread the element is: *k* = *i* * (1 + ε) where *i* is old index, and *k* is a new index.

4) The insertion of 2

^{i−1}intercalated elements within round i^{th}is performed thebrute force way: search for the target position of the element to be inserted by

binary search (amongst the 2

^{i−1}support positions in S), and move elementsof higher rank to make room for the new element. Not all elements of higher

rank need to be moved, only those in adjacent array positions until the nearest

gap is found.

This part shows how to insert each element into S array. First, use binary search to search for where the element should belongs. Then, shift the higher rank until it hit the gap.

This is the translation of the algorithm from what I understand (where A is array to sort and array start with index of 1):

def LibrarySort(A)

n ← length(A)

S ← array of size (1 + ε) * n

for i ← 1 to n

S[i] = null

for i ← 1 to floor(log(n) + 1)

for j ← 2

^{i - 1}to 2^{i}index = binarysearch(S, A[j])

insert(S, A[j], index)

rebalance()

Then for insertion() function takes 3 parameters: array, item to insert, and location.

`def insert(S, item, index)`

if S[index] != null

tmp ← S[index]

i ← index + 1

while i <= length(S) and S[i] != null

swap(tmp, S[i])

i++

S[index] ← item

Questions

- Is what I understand correct?
- What is
*"double the prefix of S"*?

Ad "double the prefix of S": The array (memory) is allocated once at the beginning to the size of (1 + *ε*) *n*, where *n* is total number of elements to be sorted. But the elements are added gradually and as they are added, they are not spread across the whole array, but only some prefix of it. When *m* elements are rebalanced, they are spread across first (1 + *ε*) ** m** elements of the array. This is the prefix. When

Ad correctness: I see one slight mistake:

The formula to spread the element is: k = i * (1 + ε) where i is old index, and k is a new index.

The quoted description does not say what the formula is, but it can't be this one. Because this would map array of length *m* to length (1 + *ε*) *m*, but the description says you are mapping array of length (1 + *ε*) *m* to array of length 2 (1 + *ε*) *m*.

A simple expression would be *k* = 2 *i* where *i* is *old* index, but that would not spread the elements evenly. To spread the elements evenly, the formula is *k* = (2 + 2 *ε*) *i*, but *i* is index *excluding any gaps*.