Hash tables are extremely useful data structures: they provide storage and retrieval in O(1) amortized time. Several algorithmic problems that are otherwise quite expensive to solve can be handled very efficiently using hash tables. In this post, we consider some of these problems:
Problem 1: Remove duplicate elements from an unsorted array of size N.
Solution: show
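The hidden solution isn't reproduced here, but a minimal sketch of the standard hash-based approach in Python (using a set as the hash table; function name is mine) might look like:

```python
def remove_duplicates(arr):
    # Set membership tests are O(1) amortized, so a single pass
    # over the N elements runs in O(N) expected time overall.
    seen = set()
    result = []
    for x in arr:
        if x not in seen:
            seen.add(x)
            result.append(x)
    return result
```

For example, remove_duplicates([2, 4, 6, 4, 6]) returns [2, 4, 6], preserving first occurrences.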
Problem 2: Find the intersection of K unsorted arrays of N elements each. The intersection consists of the elements that appear in all K arrays.
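One hash-based sketch for this problem (my own code, not the hidden solution): count, for each value, how many of the K arrays contain it.

```python
def intersect_k(arrays):
    # A hash map from value to the number of arrays containing it
    # gives O(1) amortized updates, so the pass is O(K*N) expected.
    counts = {}
    for arr in arrays:
        for x in set(arr):   # dedupe within a single array first
            counts[x] = counts.get(x, 0) + 1
    return [x for x, c in counts.items() if c == len(arrays)]
```

For example, intersect_k([[1, 2, 3], [2, 3, 4], [3, 2, 5]]) returns 2 and 3.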
Problem 3: Make a linked list support operations in O(1) time, where the operations are insertion after a node of any given value and deletion of a node of any given value.
Problem 4: Find all unique pairs of elements in an array that sum to S. For example, if array = {2,4,6,4,6} and S = 8, then the answer is {(2,6), (4,4)}.
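A sketch of one hash-based approach for this problem (my own code and names):

```python
def unique_pairs(arr, s):
    # For each x, check whether its complement s - x was seen earlier;
    # a set of normalized (small, large) tuples keeps pairs unique.
    seen = set()
    pairs = set()
    for x in arr:
        if s - x in seen:
            pairs.add((min(x, s - x), max(x, s - x)))
        seen.add(x)
    return pairs
```

For example, unique_pairs([2, 4, 6, 4, 6], 8) returns {(2, 6), (4, 4)}, matching the example above.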
Problem 5: Consider an array containing unique elements. Find a triplet of elements in the array that sums to S (an extension of problem 4). Can hash tables improve the running time of your algorithm?
Problem 6: Consider two strings of size M and N. Perform string matching in O(M+N) time.
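Problem 6 is typically solved with a rolling hash (Rabin-Karp), which runs in O(M+N) expected time. A sketch (my own code):

```python
def rabin_karp(text, pattern):
    # Compare a rolling hash of each length-M window of text against
    # the pattern hash; verify matches to rule out hash collisions.
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    base, mod = 256, 10**9 + 7
    high = pow(base, m - 1, mod)      # weight of the outgoing character
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        if w_hash == p_hash and text[i:i + m] == pattern:
            return i
        if i + m < n:                 # slide the window one character
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1
```

For example, rabin_karp("hello world", "world") returns 6, the index where the match starts.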
March 26, 2010
March 19, 2010
Dragon and Knight
Problem: A dragon and knight live on an island. This island has seven poisoned wells, numbered 1 to 7. If you drink from a well, you can only save yourself by drinking from a higher numbered well. Well 7 is located at the top of a high mountain, so only the dragon can reach it.
One day they decide that the island isn't big enough for the two of them, and they have a duel. Each of them brings a glass of water to the duel, they exchange glasses, and drink. After the duel, the knight lives and the dragon dies.
Why did the knight live? Why did the dragon die?
Solution:
The dragon knows that the knight can't reach well 7. So he reasons: "If I give the knight water from well 6 or 7, he will die for sure." He therefore fetches water from well 6 or 7 for the knight.
The knight knew that no matter which well his own glass came from, the dragon would simply go and drink from well 7 afterwards and live. So he brought plain water for the dragon, and before the duel he drank from well 1 himself.
When they exchanged glasses and drank, the dragon rushed to well 7 and drank its poison, thinking it would cure the poison he had just drunk (but he had drunk plain water), so the dragon died. The knight, on the other hand, was already poisoned from well 1, so what the dragon gave him effectively cured him. So he lived.
Problem: Now suppose the dragon and the knight are equally intelligent. Who is expected to die?
Answer: Both survive the battle.
Solution:
After the battle, the dragon drinks from well 1 to guarantee that he is poisoned. He then drinks from well 7 to cure himself. Whether the knight gave him plain water or poison from wells 1-6, the dragon is now cured and survives.
As for the knight, he drinks from well 1 before the duel. If the dragon brought him poison, then he is cured. But if the dragon brought plain water, he must make sure he still cures himself with poison from a well higher than 1. So after the battle the knight drinks from well 1 again, which guarantees that he is poisoned at level 1, and then drinks from well 2 to cure himself.
So both the dragon and the knight survive the battle.
March 18, 2010
Number and Age of David's Kids
Problem: The product of the ages of David's children is the square of the sum of their ages. David has fewer than eight children. No two of his children have the same age. None of his children is more than 14 years old, and all of them are at least two years old. How many children does David have, and what are their ages?
Answer: (12,6,4,2)
Solution:
A way to analyze this problem mathematically is to write the general form of the equations, differentiate, and maximize with respect to the largest age. That leads to a Fibonacci number series, but complex heuristics then have to be employed to derive the solution (and we cannot guarantee that it is unique).
This is a constraint satisfaction problem, and the best way to solve it is to write code. The following pseudocode solves the problem in a general sense:
function find_ages(kidAges)
    s = sum(kidAges)
    p = product(kidAges)
    if s*s == p
        print kidAges
    l = length(kidAges)
    if l >= 7
        return
    else if l == 0
        x = 15   // first child's age can go up to 14 (the loop below runs to x-1)
    else
        x = kidAges[l]
    end
    for i = 2 to x-1
        kidAges1 = clone(kidAges)
        kidAges1.add(i)
        find_ages(kidAges1)
    end
end

// call
find_ages([])
After running the code, we see the following solution
[12, 6, 4, 2]
Note that in the above pseudocode we cloned the kidAges array. If we don't clone, the same array is shared and mutated across recursive calls, and the search doesn't work properly. The downside of cloning is that it uses a huge amount of memory. How do we write the code without cloning? Check the Python script attached below.
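The attached script isn't reproduced here, but a clone-free version can be sketched with backtracking: mutate one shared list, appending before the recursive call and popping after it (code and names below are my own):

```python
def find_ages(kid_ages, solutions):
    # Ages are generated in strictly decreasing order, which enforces
    # uniqueness; the shared list is appended/popped instead of cloned.
    s = sum(kid_ages)
    p = 1
    for a in kid_ages:
        p *= a
    if kid_ages and s * s == p:
        solutions.append(list(kid_ages))
    if len(kid_ages) >= 7:
        return
    hi = 14 if not kid_ages else kid_ages[-1] - 1
    for i in range(2, hi + 1):
        kid_ages.append(i)        # extend in place...
        find_ages(kid_ages, solutions)
        kid_ages.pop()            # ...and undo before trying the next age

solutions = []
find_ages([], solutions)
```

After the search, solutions contains [12, 6, 4, 2] (sum 24, product 576 = 24^2).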
Source Code: david_kids_ages.py
March 11, 2010
Largest Sum of Consecutive Numbers
Problem: Given an array of N integers (both positive and negative), find the contiguous sub-sequence with the largest sum.
For ex: Let A = {1 2 -5 4 5 -1 2 -11} then largest sum is 10 (start = 4, end = 7)
Solution:
The naive solution to this problem can be formulated and coded in O(N^3) time: consider every possible pair of start and end indices, compute the sum of the elements between them, and keep track of the maximum sum. The following code does that:
max = -Inf
imax = jmax = -1
for i = 1 to N
    for j = i to N
        // get sum of elements enclosed in index (i,j)
        s = 0
        for k = i to j
            s += A[k]
        if s > max
            max = s
            imax = i
            jmax = j

Clearly the above algorithm is O(N^3). We can definitely improve its running time: the operation performed many times (O(N^2) times) is computing the sum of the elements between indices i and j. Can we do that quickly? Indeed we can.
If we save the cumulative sum of elements in an array CSUM, then CSUM[i] indicates the sum of elements enclosed in indices (1,i), CSUM[j]-CSUM[i-1] would then indicate the sum of elements enclosed in indices (i,j). The following algorithm captures this intuition.
CSUM[0] = 0
for i = 1 to N
    CSUM[i] = CSUM[i-1] + A[i]

max = -Inf
imax = jmax = -1
for i = 1 to N
    for j = i to N
        s = CSUM[j] - CSUM[i-1]
        if s > max
            max = s
            imax = i
            jmax = j

This O(N^2) algorithm is the best we can do with this approach. The problem can, however, be solved in O(N) time using a clever observation. Consider a running sub-sequence with sum s > 0, and let the next element n be such that s + n < 0. This run cannot be a prefix of the largest sub-sequence: any sub-sequence containing s and n does better by dropping everything up to and including n (easy to prove using contradiction). This idea is captured in the following O(N) algorithm:
max = -Inf
imax = jmax = -1
i_temp = 1
s_temp = 0
for i = 1 to N
    s_temp += A[i]
    if s_temp > max    // keep track of max so far
        max = s_temp
        imax = i_temp
        jmax = i
    if s_temp < 0      // abort this sub-sequence
        s_temp = 0
        i_temp = i + 1
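The O(N) pseudocode above, as a runnable Python sketch (0-indexed; names are mine):

```python
def max_subarray(a):
    # Kadane's algorithm: grow a running sum, record the best seen,
    # and restart the run whenever the running sum drops below zero.
    best = float('-inf')
    best_range = (-1, -1)
    start = 0
    running = 0
    for i, x in enumerate(a):
        running += x
        if running > best:
            best = running
            best_range = (start, i)
        if running < 0:        # this prefix can't help any later subarray
            running = 0
            start = i + 1
    return best, best_range
```

For the example array [1, 2, -5, 4, 5, -1, 2, -11] this returns (10, (3, 6)): the 0-indexed range 3..6 holds 4 + 5 - 1 + 2 = 10, matching start = 4, end = 7 in 1-indexed form.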
Problem: Given an array of N integers (both positive and negative), find the contiguous sub-sequence with the largest absolute sum. This differs from the problem at the top in that we want the absolute value of the sum.
For ex: Let A = {1 2 -5 4 5 -1 2 -11} then largest absolute sum is 11.
Solution:
Simply keep track of max and min at the same time in the above O(N) solution.
Assuming that we have to use the above algorithm without modification, then we can take these steps to get max absolute sum in O(N) time:
Instead of trying the above two methods, we can try a very nice and simple approach
1. Run the O(N) algorithm on array A to get the max
2. Create array B by multiplying all numbers in A by -1
3. Run the O(N) algorithm on array B to get the max
4. Pick the larger of the two runs' outputs
// build the csum array as before
csum[0] = 0
for i = 1 to N
    csum[i] = csum[i-1] + A[i]

answer = max(csum) - min(csum)

The answer is the maximum element of csum minus the minimum element of csum: the sub-sequence between the indices of those two extremes has the largest absolute sum.
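The same idea as a runnable Python sketch (function name is mine):

```python
def max_abs_subarray_sum(a):
    # The sum of any contiguous range (i, j] equals csum[j] - csum[i],
    # so the largest absolute sum is the largest gap between any two
    # cumulative sums: max(csum) - min(csum).
    csum = [0]
    for x in a:
        csum.append(csum[-1] + x)
    return max(csum) - min(csum)
```

For A = [1, 2, -5, 4, 5, -1, 2, -11] this returns 11, matching the example above.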
February 3, 2010
Polynomial Evaluation
Problem: Given a polynomial with degree bound n = m^r, how do we evaluate it at n different points such that the polynomial can be recovered from those n points (i.e. the coefficients can be computed back from the point values)?
Note: The goal is not to actually recover the coefficients, but to choose the evaluation points cleverly, as that choice impacts the running time of the algorithm.
Solution:
Let the polynomial be A(x) = a0 + a1 x + a2 x^2 + ... + a(n-1) x^(n-1).
A(x) has a degree bound of n. Its actual degree can be lower than n-1 if a(n-1) equals zero but that doesn't matter here.
Using Horner's rule, we can compute a polynomial in O(n) time by arranging the terms as follows:
A(x) = a0 + x (a1 + x (a2 + ... + x (a(n-2) + x a(n-1))))
So it will take O(n^2) time to evaluate the polynomial at n different points. But we can use divide and conquer to do better, i.e. O(n lg n). We can create m different polynomials such that each term of the ith polynomial has degree mod m = i.
=> A(x) = [ a0 + a(m) x^m + a(2m) x^2m ... ] +
[ a1 + a(m+1) x^(m+1) + a(2m+1) x^(2m+1) ... ] +
.
.
.
[ a(m-1) x^(m-1) + a(2m-1) x^(2m-1) + ... ]
=> A(x) = A[0](y) + x A[1](y) + ... + x^i A[i](y) + ... + x^(m-1) A[m-1](y)
where y = x^m, and
A[i](x) = a(i) + a(m+i) . x + ... + a(n-m+i) . x^(n/m-1)
Still, the recurrence for evaluating each point is T(n) = m T(n/m) + O(m) = O(n), so evaluating the polynomial at n points takes O(n^2), which is no improvement.
Now we can choose the n points cleverly. Let w be the principal nth root of unity.
=> w^n = 1
=> w = e^(2 . pi . i / n)
= cos (2 . pi /n) + i . sin(2 . pi / n)
The two interesting properties of w are
1. w^(n/2) = -1
2. [w^(kn/m + p)]^m = w^(kn + mp) = w^(mp)
So the n points at which the polynomial is evaluated are w^0, w^1, ..., w^(n-1). The lower-order polynomials of size n/m then need to be evaluated only at the points w^(mp) for p = 0, 1, ..., n/m - 1, since by property 2 the values cycle after that and the computation can be reused. This comes from the observation that
A(w^p) = A[0](w^(mp)) + w^p A[1](w^(mp)) + ... + w^(m-1) A[m-1](w^(mp))
Also,
A(w^(kn/m+p)) = A[0](w^(mp)) + w^(kn/m+p) A[1](w^(mp)) + ... + [w^(kn/m+p)]^(m-1) A[m-1](w^(mp))
So A[0], ..., A[m-1] are evaluated at p = { 0, 1, ..., n/m - 1 }, and that computation is reused for all the remaining powers of w.
The general recurrence of this algorithm is:
T(n) = m T(n/m) + O(n)
=> T(n) = O(n lg n)
m = 2 works well in practice and is what the Fast Fourier Transform employs in order to interpolate the coefficients of the polynomial. This is generally used for multiplying two polynomials to get a higher-order polynomial: first evaluate the two lower-order polynomials at n points, then multiply the point values, and finally recover the coefficients of the higher-order polynomial from those products.
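For m = 2, the scheme above is the radix-2 FFT. A minimal Python sketch of the evaluation step (my own code, following the recurrence above):

```python
import cmath

def fft(a):
    # Evaluate the polynomial with coefficients a (len(a) a power of 2)
    # at the n complex nth roots of unity w^p = e^(2*pi*i*p/n).
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2])   # A[0](y): terms with degree mod 2 == 0
    odd = fft(a[1::2])    # A[1](y): terms with degree mod 2 == 1
    out = [0j] * n
    for p in range(n // 2):
        w = cmath.exp(2j * cmath.pi * p / n)   # w^p
        out[p] = even[p] + w * odd[p]          # A(w^p) = A[0](w^(2p)) + w^p A[1](w^(2p))
        out[p + n // 2] = even[p] - w * odd[p] # uses property 1: w^(n/2) = -1
    return out
```

For example, fft([1, 2, 3, 4]) evaluates 1 + 2x + 3x^2 + 4x^3 at the points 1, i, -1, -i.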