The insertion sort algorithm

Introduction

Insertion sort is one of the simplest sorting algorithms. It is quite
efficient for small arrays. Its runtime increases with the square of the
number of elements to be sorted. Therefore it should not be used with large
data sets. For large data sets algorithms like merge sort or heap sort are far
more efficient.

All sort algorithms work on abstract arrays. An abstract array can have
different implementations. The sort algorithm just need random access to the
elements.

An abstract array has the following interface.

  deferred class
    ABSTRACT_ARRAY[G]
  feature
    content: ghost LIST[G]

    count: NATURAL
      deferred
      ensure
        Result = content.count
      end

    [](i:NATURAL)
      require
        i < count
      deferred
      ensure
        Result ~ content[i]
      end

    swap(i,j:NATURAL)
      require
        i < j
        j < a.count
      deferred
      ensure
        a.content = old a.content.swapped(i,j)
      end
  end

For the purposes of sorting we need access only to the content (ghost access
is sufficient), the length of the array, the individual elements in a random
access manner. The only procedure we need is to swap two elements.

Needed list functions

The ghost function “content” returns the elements as a list. This function is
needed to describe the properties well. We never need the list in actual
computations. But the data type is such a powerful type that it can serve as a
“swiss army kife” in many algorithms. The following subsections describe what
properties are needed from lists.

Univerally quantified expressions

If we need to verify that all elements of the concatenation of two lists “u+v”
satisfy a certain condition “e”, then it is sufficient to prove that all
elements of the lists “u” and “v” individually satisfy the corresponding
condition.

  all(u,v:LIST[G], a:G, e:BOOLEAN)
    require
      all(x:G) x in u => e
      all(x:G) x in v => e
      e[x:=a]
    ensure
      all(x:G) x in (u+v) => e
      all(x:G) x in (u+a) => e
      all(x:G) x in (a::u) => e
    end

Take and drop

The function “l.take(i)” returns a list with the first “i” elements of the
list “l” or the whole list in case that “i” exceeds the number of elements in
the list. “l.drop(i)” returns the list “l” where the first “i” elements have
been removed. The functions “take” and “drop” satisfy the following properties.

  all(l:LIST[G], i:NATURAL)
    ensure
      l.take(i).count = min(i,l.count)

      l.take(i) + l.drop(i) = l

      l.take(l.count) = l
      l.drop(l.count) = nil

      i/=0 => i<=l.count
      => l.take(i) = l.take(i-1) + l[i-1]

      i<l.count
      => l.drop(i) = l[i]::l.drop(i+1)

      i/=0 => i<l.count
      => l = l.take(i-1) + l[i-1]::l[i]::l.drop(i-1)
    end

Tail zipping

A list can be zipped with its own tail. Tail zipping enjoys the following
properties.

  all(u,v:LIST[G], e1,e2:G)
    ensure
      nil.tail_zipped = nil

      (e1::nil).tail_zipped = nil

      (e1::e2::u).tail_zipped = [e1,e2]::(e2::u).tail_zipped

      u=nil => (u+v).tail_zipped = v.tail_zipped

      v=nil => (u+v).tail_zipped = u.tail_zipped

      (u+e1+v).tail_zipped = (u+e1).tail_zipped + (e1::v).tail_zipped

      u/=nil
      => v/=nil
      => (u+v).tail_zipped
         = u.tail_zipped + [u.last,v.first] + v.tail_zipped

    end

Example:

  ([1,2,3,4]).tail_zipped = [[1,2],[2,3],[3,4]]

A list is sorted if in the tail zipped list all tuples have a first component
which is less or equal than its second component.

Swapping

The expression “l.swapped(i,j)” returns a list which is equal to “l” except
that the elements in position “i” and “j” are swapped.

  swapped:
  all(l:LIST[G], j:NATURAL)
    require
      j /= 0
      j < l.count
    ensure
      swap1: l.swapped(j-1,j).is_permutation(l)
          -- "is_permutation" is an equivalence relation

      swap2: l.swapped(j-1,j).take(j-1) = l.take(j-1)
      swap3: l.swapped(j-1,j).drop(j)   = l[j-1]::l.drop(j+1)
      swap4: l.swapped(j-1,j).drop(j-1) = l[j]::l[j-1]::l.drop(j+1)
    end

The insertion sort algorithm

The algorithm starts with an empty prefix of the content. An empty list is
always sorted. Then it loops over all elements successively and inserts the
new elements into the correct position of the already sorted prefix of the
list. The algorithm terminates if all elements are inserted at their proper
location.

We put the needed functions and procedures in a module “insertion_sorter”.

  -- module: insertion_sorter

  immutable class
    INSERTION_SORTER[G:ORDER]
  end

The class INSERTION_SORTER is not needed to create objects from it. It just
serves to express the constraint that the formal generic G has to satisfy the
concept of a total order relation.

Predicate “is_sorted”

We need a predicate which tells whether a list is sorted and whether an array
is sorted.

  is_sorted(l:LIST[G]): ghost BOOLEAN
    ensure
      Result = all(x,y:G) [x,y] in l.tail_zipped =>  (x<=y)
    end


  is_sorted(a:ABSTRACT_ARRAY[G]): ghost BOOLEAN
    ensure
      Result = a.content.is_sorted
    end

Note that the predicates need not be computable. It is easy to make them
computable. But it is not worth the effort.

If we have two lists “u+e” and “e::v” which are individually sorted the sum
“u+e+v” is sorted as well.

  glueing:
  all(u,v:LIST[G], e:G)
    require
      (u+e).is_sorted
      (e::v).is_sorted
    ensure
      (u+e+v).is_sorted
    end

This claim can be proved by

  all(u,v:LIST[G], e:G)
    require
      r1: (u+e).is_sorted
      r2: (e::v).is_sorted
    check
      all(x,y:G)
        require
          [x,y] in (u+e+v).tail_zipped
        check
          (u+e+v).tail_zipped = (u+e).tail_zipped + (e::v).tail_zipped
          if [x,y] in (u+e).tail_zipped then
            x<=y  -- r1
          else
            [x,y] in (e::v).tail_zipped
            x<=y  -- r2
          end
        ensure
          x<=y
        end
    ensure
      (u+e+v).is_sorted
    end

The basic algorithm

The procedure “sort” has to satisfy the contract:

  feature
    sort(a:ABSTRACT_ARRAY[G])
      ensure
        a.is_sorted
        a.content.is_permutation(old a.content)
      end
  end

The algorithm needs a loop which keeps some prefix of length “i” of “a”
sorted. I.e. the loop works with the invariant

  i <= a.count
  a.content.take(i).is_sorted
  a.content.is_permutation(old a.content)

The invariant is satisfied if we initialize “i” with zero. At each step of the
algorithm we have to make the sorted prefix of “a” one element larger. The
following algorithm does the job.

  feature
    sort(a:ABSTRACT_ARRAY[G])
      local
        i: NATURAL := 0
      do
        from invariant
          i <= a.count
          a.content.take(i).is_sorted
          a.content.is_permutation(old a.content)
        variant
          a.count - i
        until
          i = a.count
        loop
          insert_ith(a,i)  -- see below
          i := i + 1
        end
      ensure
        a.is_sorted
      end
  end

It is easy to verify that after termination of the loop the result is
valid. The exit condition states “i=a.count” and the invariant guarantees
“a.content.take(i).is_sorted”. Both imply the postcondition.

In order to prove the algorithm correct we need to make sure that the
invariant is maintained in each iteration of the loop. In order to prove this
we need the assignment axiom.

  -- Assigment axiom: "x" must be a local variable and
  --                  must not be an attribute of the surrounding class!
  all[G](x,exp:G, cond:BOOLEAN)
    require
      cond[x:=exp]
    do
      x := exp
    ensure
      cond
    end

It basically states the following: “If we want to have a boolean condition
“cond” valid after the assignment statement “x:=exp” then we have to guarantee
that “cond[x:=exp]” (i.e. cond with all free occurrences of “x” substituted by
“exp”) is valid before the assignment.

With the assignment axiom we can backtransform the invariant through the
statement “i:=i+1” and get

  require
    r1: i+1 <= a.count
    r2: a.content.take(i).is_sorted
    r3: a.content.is_permutation(old a.content)
  do
    i := i + 1
  ensure
    i <= a.count
    a.content.take(i+1).is_sorted
    a.content.is_permutation(old a.content)
  end

The procedure “insert_ith” cannot modify “i” (NATURAL is immutable). Therefore
the condition “r1” can be derived from the invariant “i<=a.count” and the
negated exit condition “i/=a.count”.

The condition “a.content.take(i+1).is_sorted” must be established by
“insert_ith” and the condition “a.content.is_permutation(old a.content)” must
be maintained.

Therefore the procedure “insert_ith” has to satisfy the contract

  require
    i < a.count                  -- from the loop invariant and negated exit
                                 -- condition

    a.content.take(i).is_sorted  -- from the loop invariant

    a.content.is_permutation(old a.content) -- from the loop invariant
  do

    insert_ith(a,i)

  ensure
    a.content.take(i+1).is_sorted
    a.content.is_permutation(old a.content)
  end

The procedure “insert_ith”

Contract of “insert_ith”

We need a procedure “insert_ith” which satisfies the contract

  feature{NONE}
    insert_ith(a:ABSTRACT_ARRAY[G], i:NATURAL)
      require
        i < a.count
        a.content.take(i).is_sorted
      ensure
        a.content.take(i+1).is_sorted
        a.content.is_permutation(old a.content)
      end
  end

The procedure can be private, therefore the “feature{NONE}”.

Description of the loop

We need a loop to establish the postcondition, because the element “a[i]” has
to be put in the correct position. We swap it with the previous element as
long as the previous element greater than the element to be inserted. The loop
terminates if the previous element is not greater than the element to be
inserted.

In order to specify the algorithm we need a predicate that a list is sorted
except at one position.

  is_sorted_except(l:LIST[G], j:NATURAL): ghost BOOLEAN
    ensure
      Result = check
        sorted_without_j: (l.take(j)+l.drop(j+1)).is_sorted
        sorted_from_j:    l.drop(j).is_sorted
      end
    end

At the enty point of the routine the condition

  a.content.take(i+1).is_sorted_except(i)

is valid because “a.content.take(i).is_sorted” is already valid and
“a.content.take(i+1)” has just one element more (the element “i”). Therfore it
is sorted except at position “i”. Note that “a.content.take(i+1).drop(i+1)” is
empty and “a.content.take(i+1).drop(i)” has just one element.

The algorithm uses a loop with the loop variable “j”. The variable “j” has
initially the value “i”. The loop body decreases the variable “j” at each
iteration maintaining the invariant.

We use the following invariant

  j <= i
  a.content.is_permutation(old a.content)
  a.content.take(i+1).is_sorted_except(j)

and the following pattern for the algorithm:

  insert_ith(a:ABSTRACT_ARRAY[G], i:NATURAL)
    require
      i < a.count
      a.content.take(i).is_sorted
    local
      j:NATURAL := i
    do
      from invariant
        inv1: j <= i
        inv2: a.content.is_permutation(old a.content)
        inv3: a.content.take(i+1).is_sorted_except(j)
      variant
        j
      until
        -- a.content.take(i+1).is_sorted
      loop
        -- establish the invariant for decreased j
        j := j-1
      end
    ensure
      a.content.take(i+1).is_sorted
      a.content.is_permutation(old a.content)
    end

The algorithm is already correct, we just need to fill in the missing parts.

Exit condition

Since we have the invariant “inv3” we know that “a.content.take(i+1).drop(j)”
is always sorted. In case of “j=0” this is identical to “a.content.take(i+1)”
and we can terminate the loop. Therefore “j=0” is one possible exit condition.

But there is a second possibility. The element at the exception position “j”
might fit already well. This is the case if “a[j-1]<=a[j]” is valid. I.e. the
complete exit condition is

  j=0 or a[j-1]<=a[j]  -- exit condition

We have to prove that the exit condition and the invariant guarantee the
desired result. The following lemma verifies exactly that condition.

  all(l:LIST[G], j:NATURAL)
    require
      r1: j < l.count
      r2: l.is_sorted_except(j)
      r3: j=0 or l[j-1]<=l[j]
    check
      if j=0 then
        c1: l.drop(0).is_sorted  -- r2
        c2: l.is_sorted
      else
        c3: l[j-1] <= l[j]                         -- r3 and "j/=0"
        c4: (l[j]::l.drop(j+1)).is_sorted          -- r3, "sorted_from_j"
        c5: (l[j-1]::l[j]::l.drop(j+1).is_sorted   -- c3,c4
        c6: (l.take(j-1)+l[j-1]).is_sorted         -- r3, "sorted_without_j"
        c7: (l.take(j-1)+l[j-1]::l[j]::l.drop(j+1)).is_sorted
            -- paste c5,c6
        c8: l.is_sorted  -- c7
      end
    ensure
      l.is_sorted -- c2,c8
    end

Establish the invariant for decreased “j”

At the entry point of the loop body the exit condition is invalid (otherwise
the loop would be terminated). The negated exit condition is

  j/=0 and a[j] < a[j-1]  -- negated exit condition

i.e. “a[j-1]” and a[j] are out of order. The element a[j] is still to small
for its position. We simply swap the two elements. Since the sequence without
“a[j]” is perfectly sorted, the invariant is kept. Just the possible out of
order element has been pushed one position down.

We get the complete loop

  from invariant
    inv1: j <= i
    inv2: a.content.is_permutation(old a.content)
    inv3: a.content.take(i+1).is_sorted_except(j)
  variant
    j
  until
    j=0  or a[j-1]<=a[j]
  loop
    a.swap(j-1,j)
    j := j-1
  end

The following lemma verifies that the invariant “inv3” is maintained.

  all(l,ls:LIST[G], i,j:NATURAL)
    require
      r1: j /= 0
      r2: j < l.count
      r3: l.is_sorted_except(j)
      r4: l[j]<=l[j-1]
    check
      c01: (l.take(j) + l.drop(j+1)).is_sorted  -- r3, def "is_sorted_except"
      c02: l.drop(j).is_sorted                  -- r3, def "is_sorted_except"
      c03: (l[j]::l.drop(j+1)).is_sorted        -- c02
      c04: (l[j-1]::l.drop(j+1)).is_sorted      -- c01

      c09: (l.take(j-1) + l[j-1]::l.drop(j+1)).is_sorted  -- c01, assoc
      c10: (l[j]::l[j-1]::l.drop(j+1)).is_sorted          -- c03,c04,r4
    ensure
      l.swapped(j-1,j).is_sorted_except(j-1)
               -- c09, c10, swap2, swap3, swap4, def "is_sorted_except"
    end
  end

The complete algorithm

  -- module: insertion_sorter

  immutable class
      INSERTION_SORTER[G:ORDER]
  end


  feature
      is_sorted(l:LIST[G]): ghost BOOLEAN
          ensure
              Result = all(x,y:G) [x,y] in l.tail_zipped =>  (x<=y)
          end


      is_sorted(a:ABSTRACT_ARRAY[G]): ghost BOOLEAN
          ensure
              Result = a.content.is_sorted
          end

      sort(a:ABSTRACT_ARRAY[G])
          local
              i: NATURAL := 0
          do
              from invariant
                  i <= a.count
                  a.content.take(i).is_sorted
                  a.content.is_permutation(old a.content)
              variant
                  a.count - i
              until
                  i = a.count
              loop
                  insert_ith(a,i)  -- see below
                  i := i + 1
              end
          ensure
              a.is_sorted
          end
  end


  feature {NONE}
      is_sorted_except(l:LIST[G], j:NATURAL): ghost BOOLEAN
          ensure
              Result = check
                  sorted_without_j: (l.take(j)+l.drop(j+1)).is_sorted
                  sorted_from_j:    l.drop(j).is_sorted
              end
          end

      insert_ith(a:ABSTRACT_ARRAY[G], i:NATURAL)
          require
              i < a.count
              a.content.take(i).is_sorted
          local
              j:NATURAL := i
          do
              from invariant
                  inv1: j <= i
                  inv2: a.content.is_permutation(old a.content)
                  inv3: a.content.take(i+1).is_sorted_except(j)
              variant
                  j
              until
                  j=0  or a[j-1]<=a[j]
              loop
                  a.swap(j-1,j)
                  j := j-1
              end
          ensure
              a.content.take(i+1).is_sorted
              a.content.is_permutation(old a.content)
          end
  end
Advertisements
  1. Leave a comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: