I am taking one online course on algorithms and thought would write a programming tutorial (or at least my thought process) along with implementation of merge-sort algorithm.

————————— merge-sort rules——————————-

If array size is 1 it’s always sorted

If array size is 2 then sorting is trivial and could be done as (if a[1] < a[2] then good otherwise reverse it)

In all other case divide array in 2 parts and perform sort individually.

Once both sided are sorted the merge them back iteratively by checking each element one by one in the half array

—————————————-First attempt ————————————————-

def merge_sort(arr): l = len(arr) if l <= 1: return arr elif l ==2: if arr[0] < arr[1]: return arr else: return [arr[1], arr[0]] else: half = l // 2 left = merge_sort(arr[:half]) right = merge_sort(arr[half:]) sorted_arr = [] for idx in xrange(0,half): if left[0] <= right[0]: v = left.pop(0) else: v = right.pop(0) sorted_arr.append(v) sorted_arr = sorted_arr + left + right return sorted_arr

Looks ok (hmmm!!!!)

As a programmer the most important thing that you need to check a code for is ____ –>

**“Correctness”**

let’s write some tests

What are good tests ?

1. Checks for “common” values correctness

2. Checks for “extreme” cases correctness

3. Checks for exceptions and correctness

from back of my mind I can think of following cases

test1 = [1,2,3,4,5,6] (Checks what happens when array is already sorted)

test2 = [4,3,1,7,6,5] (checks for the real sorting problem under normal kind of case (even numbers)

test3 = [4,3,1,7,6,5,2] (checks similar to test 2 along just odd numbers)

test4 = [] (Extreme case when array is empty)

test5 = [4,4,4,4,4] (Extreme case when all elements are same)

test6 =[2,2,1,1,5,3,6,100] (Not extreme but good case having some duplicates)

———————————————Input————————————————-

test1 = [1,2,3,4,5,6] test2 = [4,3,1,7,6,5] test3 = [4,3,1,7,6,5,2] test4 = [] test5 = [4,4,4,4,4] test6 =[2,2,1,1,5,3,6,100] print merge_sort(test1) print merge_sort(test2) print merge_sort(test3) print merge_sort(test4) print merge_sort(test5) print merge_sort(test6)

——————————————output at first attempt——————————————

[1, 2, 3, 4, 5, 6] [1, 4, 3, 5, 7, 6] [1, 2, 4, 3, 5, 6, 7] [] [4, 4, 4, 4, 4] [1, 1, 2, 2, 3, 5, 6, 100]

Hmm so the code has some bug for test2 and test3 😦

what happend?

The code has a logic flaw (instead of a for there should be a while) posting the code without explaining the logical flaw as I think that is quite trivial

def merge_sort(arr): l = len(arr) if l <= 1: return arr elif l ==2: if arr[0] < arr[1]: return arr else: return [arr[1], arr[0]] else: half = l // 2 left = merge_sort(arr[:half]) right = merge_sort(arr[half:]) sorted_arr = [] while len(left) != 0 and len(right) != 0: if left[0] <= right[0]: v = left.pop(0) else: v = right.pop(0) sorted_arr.append(v) sorted_arr = sorted_arr + left + right return sorted_arr

——— So all the test passed after this ——

Next step –>

**Fix style**

Style depends a lot on programmers taste but there are certain things that everyone (at least every python programmer) would appreciate. So I am going to just touch those points

def merge_sort_withstyle(arr): """ The program is a style corrected version written by Tanmay for mergesort """ length = len(arr) sorted_arr = arr if length return arr if length <= 1 if length <= 1: return arr elif length == 2: return arr if arr[0] < arr[1] else [arr[1], arr[0]] else: half = length//2 left = merge_sort_withstyle(arr[:half]) right = merge_sort_withstyle(arr[half:]) while len(left) != 0 and len(right) != 0: value = left.pop(0) if left[0] < right[0] else right.pop(0) sorted_arr.append(value) sorted_arr += left + right return sorted_arr

Next step (as a mathematician and a really good programmer)

**Optimize **

(have fun !! )