Counting number of inversion in an array (python)

def count_inversion(array):
    (count,sorted_array) = count_inversion_helper(array)
    return count

def count_inversion_helper(array1):
    """"
    counts the number for inversion in one array we assume the inversions
    are related to sorted array from
     0 to n (pep8 tested with 10.0 score)
    :author name: Tanmay Datta
    :param array1: array to count (list type)
    :return: number of inversoin
    :type array1: List(Num)
    """"
    length = len(array1)
    if length <= 1:
        #Trivial case and no inversion
        return (0, array1)
    if length == 2:
        if array1[0] &amp< array1[1]:
            return (0, array1)
        else:
            return (1, [array1[1], array1[0]]) # sorted array
    else:
        half = length/2
        right_array = array1[half:]
        (left_inversion, left_array) = count_inversion_helper(array1[:half])
        (right_inversion, right_array) = count_inversion_helper(array1[half:])
        sorted_array = []
        left_idx = 0
        right_idx = 0
        left_size = len(left_array)
        right_size = len(right_array)
        total_inversions = left_inversion + right_inversion
        while True:
            if right_array[right_idx] < left_array[left_idx]:
                sorted_array.append(right_array[right_idx])
                total_inversions += left_size - left_idx
                right_idx += 1

            else:
                sorted_array.append(left_array[left_idx])
                left_idx += 1
            if left_idx >= left_size:
                sorted_array += right_array[right_idx:]
                break
            if right_idx >= right_size:
                sorted_array += left_array[left_idx:]
                break
        return (total_inversions, sorted_array)

Another funky implementation which uses a “bisect” function already given by python libraries

import bisect

def ms_funky(arr):

sorted_array = sorted(arr)

r = 0

for n in arr:

    pos = bisect.bisect_left(sorted_array,n)

    r += pos

    sorted_array.pop(pos)

return r

Advertisements

Merge sort in python (implementation,testing,re implementation, style fix)

I am taking one online course on algorithms and thought would write a programming tutorial (or at least my thought process) along with implementation of merge-sort algorithm.

————————— merge-sort rules——————————-

If array size is 1 it’s always sorted

If array size is 2 then sorting is trivial and could be done as (if a[1] < a[2] then good otherwise reverse it)

In all other case divide array in 2 parts and perform sort individually.

Once both sided are sorted the merge them back iteratively by checking each element one by one in  the half array

—————————————-First attempt ————————————————-

def merge_sort(arr):

l = len(arr)

if l <= 1:

    return arr

elif l ==2:

if arr[0] < arr[1]:

    return arr

else:

    return [arr[1], arr[0]]

else:

    half = l // 2

    left = merge_sort(arr[:half])

    right = merge_sort(arr[half:])

    sorted_arr = []

    for idx in xrange(0,half):

    if left[0] <= right[0]:

        v = left.pop(0)

    else:

        v = right.pop(0)

    sorted_arr.append(v)

    sorted_arr = sorted_arr + left + right

    return sorted_arr

Looks ok (hmmm!!!!)

As a programmer the most important thing that you need to check a code for is ____ –>

“Correctness”

let’s write some tests

What are good tests ?

1. Checks for “common” values correctness

2. Checks for “extreme” cases correctness

3. Checks for exceptions and correctness

from back of my mind I can think of following cases

test1 = [1,2,3,4,5,6] (Checks what happens when array is already sorted)

test2 = [4,3,1,7,6,5] (checks for the real sorting problem under normal kind of case (even numbers)

test3 = [4,3,1,7,6,5,2]  (checks similar to test 2 along just odd numbers)

test4 = [] (Extreme case when array is empty)

test5 = [4,4,4,4,4] (Extreme case when all elements are same)

test6 =[2,2,1,1,5,3,6,100] (Not extreme but good case having some duplicates)

———————————————Input————————————————-


test1 = [1,2,3,4,5,6]

test2 = [4,3,1,7,6,5]

test3 = [4,3,1,7,6,5,2]

test4 = []

test5 = [4,4,4,4,4]

test6 =[2,2,1,1,5,3,6,100]

print merge_sort(test1)

print merge_sort(test2)

print merge_sort(test3)

print merge_sort(test4)

print merge_sort(test5)

print merge_sort(test6)

——————————————output at first attempt——————————————


[1, 2, 3, 4, 5, 6]
[1, 4, 3, 5, 7, 6]
[1, 2, 4, 3, 5, 6, 7]
[]
[4, 4, 4, 4, 4]
[1, 1, 2, 2, 3, 5, 6, 100]

Hmm so the code has some bug for test2 and test3 😦

what happend?

The code has a logic flaw (instead of a for there should be a while) posting the code without explaining the logical flaw as I think that is quite trivial

def merge_sort(arr):

l = len(arr)

if l <= 1:

return arr

elif l ==2:

if arr[0] < arr[1]:

return arr

else:

return [arr[1], arr[0]]

else:

half = l // 2

left = merge_sort(arr[:half])

right = merge_sort(arr[half:])

sorted_arr = []

while len(left) != 0 and len(right) != 0:

if left[0] <= right[0]:

v = left.pop(0)

else:

v = right.pop(0)

sorted_arr.append(v)

sorted_arr = sorted_arr + left + right

return sorted_arr

——— So all the test passed after this ——

Next step –>

Fix style

Style depends a lot on programmers taste but there are certain things that everyone (at least every python programmer) would appreciate. So I am going to just touch those points

def merge_sort_withstyle(arr):

"""

The program is a style corrected version written by Tanmay for

mergesort

"""

length = len(arr)

sorted_arr = arr if length

return arr if length <= 1

if length <= 1:

return arr

elif length == 2:

return arr if arr[0] < arr[1] else [arr[1], arr[0]]

else:

half = length//2

left = merge_sort_withstyle(arr[:half])

right = merge_sort_withstyle(arr[half:])

while len(left) != 0 and len(right) != 0:

value = left.pop(0) if left[0] < right[0] else right.pop(0)

sorted_arr.append(value)

sorted_arr += left + right

return sorted_arr

Next step (as a mathematician and a really good programmer)

Optimize 

(have fun !! )

Python gotcha

While browsing for some of the python “gotcha” I realized that everyone has the list gotcha but in my opinion that is always incomplete Consider this code

def gotcha(bad_list=[]):
bad_list.append(3)
print bad_list
def gotcha_again(bad_list=[]):
bad_list = bad_list + [3]
print bad_list

print "calling gotcha_again"
gotcha_again()
gotcha_again()
print "calling gotcha"
gotcha()
gotcha()

What happens when you call these two functions twice ?

If you have read the gotcha and applied it without understanding the reason, then you might expect both answers to be same.

However they would print :
calling gotcha_again
[3]
[3]
calling gotcha
[3]
[3, 3]

The reason has to with the inconsistency in the python language core( again my opinion). Python language is confused (although less than c++) b/w functional and OO and procedural language. Sometimes it would follow the “purity” concept of the functional languages and sometime it would completely violates it.

One other example of this conflict is

list.sort()

sorted(list)

Why do you need two behavior for sorting list ?

…. May be for creating an interview question ;)…

Why should anyone care about consistency.

…It creates “well behaved” product..
While developing a framework I realized this fact the hard way. If all the functions have similar behavior then the need to go through the documentation for each functions is reduced considerably. You can make valid and reasonable assumptions about even those parts of codes which you have not thoroughly studied.

Example.. now if someone is not expert in python then because of this sorting irregularity he/she can never be sure if any function on list would change the input list or would return some value. You have to confirm it always (Not that confirming is bad but still…)

 

Funtional programming with Scala

My first impression of scala is that it’s too verbose. Thing that could be done in may be 1 line in haskell and 2 lines in erlang takes around 5-10 lines(Might not look like a big difference for 1 line but production codes easily span thousands of line even with the most expressive languages and here the factor is of 5). Anyways, I am still new so might not be my place to judge it.

Here goes implementation of some functions / practice problems in scala.

1. Pascal triangle such that pascal(r,c) such function return value at rth row and cth column

a. Haskel version used ZipWith [Signature ((A->A->A) -> [A] -> [A] -> [A]) which is not in scala so I wrote my own version.

(I don’t like the fact that I was not able to pattern match the arguments to map.. to do –> Find a way to pattern match the tuple)

def zipWith[A] (f: (A,A) => A, l1: List[A], l2: List[A]) : List [A] = {

val zipList = l1.zip(l2)

zipList.map(x=>f(x._1,x._2))

}

———————– Now goes my implementation of the above function —————————-

def pascal(c: Int, r: Int) : Int  {

def pascalH(pr: List[Int], currentRow :Int): List [Int] =

{

if (r==currentRow) pr else

{

val l1  = pr :+ 0

val l2 = List(0) ++ pr

val next_list = zipWith((x:Int,y:Int)=> x+y), l1,l2)

pascalH(next_list,cr+1)

}

val thePascalRow = pascalH(List(1),0)

thePascalRow(c)

}

—– Note that the helper definition works here because scala (as other functional languages) has lexical closure..

I am still new to scala to could not figure out how to use + instead of that lambda in the zipWith call — if you know please leave a comment

Debugging/Reverse engineering Linux shell script.

(Intended for someone[me] who don’t faint on sight of some assembler code and can look for relevant information from the dump of a binary).

(In draft)

1.Debugging

using option vx for debugging shell script might be a good idea when you can not find anything wrong by the visual inspection

[howTo]

say you have a shellscript called DoGreatStuff

called with parameters me you they $4 etc

call it as : set -vf DoGreatStuff me you they $4 etc

Example run for cd for list in my computer is

set -vx ls $1
set -vx ls $1
__vte_prompt_command
++ __vte_prompt_command
__vte_osc7)”
__vte_osc7)
__vte_osc7
+++ __vte_osc7
__vte_urlencode “${PWD}”)”
__vte_urlencode “${PWD}”)
__vte_urlencode “${PWD}”
++++ __vte_urlencode /home/tanmay
++++ LANG=C
++++ str=/home/tanmay
++++ ‘[‘ -n /home/tanmay ‘]’
++++ safe=/home/tanmay
++++ printf %s /home/tanmay
++++ str=
++++ ‘[‘ -n ” ‘]’
++++ ‘[‘ -n ” ‘]’
+++ printf ’33]7;file://%s%s\a’ tanmay /home/tanmay
++ printf ’33]0;%s@%s:%s07%s’ tanmay tanmay ‘~’ ”

Ignore other things but you can see that ls is a C program.. (Not a good example here but I hope explains the point here)

Conditional parameters substitution in Unix

As a developer who is coding for future, there are conditions when we encounter uncertainty (Monads !! ) 

Shell scripts provide a way to construct some expressions based on these uncertainties. One way is to use “Dollar-Braces”

syntax for that is ${variable-defaultValue}

so if you start a shell and do 

echo ${myVar-variable not defined}

you would get response as

$> variable not defined

now if you do

myVar=nowDefined

echo ${myVar-variable not defined}

you will get

nowDefined

Linux Dollars($)

Linux/Unix shell provides some of very nice functionalities (That should always be kept in mind while developing own software).  While writing shell scripts there are number of inbuilt “special” variables provided for developers. Following are the variables that starts with dollar sign ‘$’

1. $# –> # sign occasionally stands for number and this variable tells number of positional parameters in a shell script eg. myCommand -u -v hello dev

has $# = 4

2. $- –> This one tells you about starting parameters for your shell (bash, korn etc)

if I run it on my bash terminal it gives output of himBH

3. $? –> ? is used to ask questions, here too it’s purpose is to ask: “what happened to last executed command”

0 == success

1 == failure

2 == some diaster

4. $$ –> What is my own PID

5. $! –> bang(!) pattern on shell has a use similar to its use here (for last command) here it stores the answer for “what is the process number of last background command”

6. $0 –> What is my name (shell script’s name) on the terminal it would give you -bash or -korn etc

7. $* –> list of all the shell argument

8 $@ –> Same as $* except when enclosed in double quotes