January 31, 2009

Loops in Linked List

Problems with linked lists arise mainly from mistakes made while building them or from a bad memory manager. Let us discuss two well-known linked list problems.

Problem: Find whether two linked lists intersect (i.e., merge) at some node, and find the node at which the two lists intersect.

Solution: show
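One standard approach to finding the intersection point, shown here as a minimal illustrative sketch (the Node class and function names are assumptions, not from the original post): compute both list lengths, advance the longer list by the difference, then walk both lists in lockstep until the pointers meet at a common node.

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def list_length(head):
    n = 0
    while head:
        n += 1
        head = head.next
    return n

def intersection_node(head1, head2):
    len1, len2 = list_length(head1), list_length(head2)
    # Advance the longer list so both have the same number of nodes left.
    while len1 > len2:
        head1, len1 = head1.next, len1 - 1
    while len2 > len1:
        head2, len2 = head2.next, len2 - 1
    # Walk in lockstep; the first shared node is the intersection point.
    while head1 is not head2:
        head1, head2 = head1.next, head2.next
    return head1   # None if the lists do not intersect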

Problem: A linked list might contain a loop. How do we detect the existence of a loop and find the node at which the loop starts? Propose an O(n)-time algorithm that uses no extra space and does not modify the linked list.

Solution: show
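A well-known O(n), constant-space technique for this is Floyd's tortoise-and-hare algorithm. The sketch below is illustrative (it assumes nodes with a next pointer, as in the Node class above): a slow and a fast pointer first meet somewhere inside the loop; resetting one pointer to the head and advancing both one step at a time then finds the node where the loop starts.

def find_loop_start(head):
    # Phase 1: detect a loop with pointers moving at different speeds.
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            break
    else:
        return None              # fast reached the end: no loop
    # Phase 2: move one pointer back to the head; advancing both one step
    # at a time, they meet exactly at the node where the loop starts.
    slow = head
    while slow is not fast:
        slow = slow.next
        fast = fast.next
    return slow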

Monty Hall Problem

This is a very famous probability problem. It illustrates that a careful probability calculation can be more convincing than what we perceive as "obvious". I'll post another problem that defeats "obvious reasoning".

Monty Hall problem: Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

Solution: show
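The well-known answer is that switching is advantageous (it wins with probability 2/3). As a sanity check only, here is a small Monte Carlo simulation sketch; the function and variable names are illustrative, not from the original post.

import random

def monty_hall_trial(switch):
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def estimate(switch, trials=100000):
    return sum(monty_hall_trial(switch) for _ in range(trials)) / trials

# estimate(True) comes out close to 2/3, estimate(False) close to 1/3.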

Red Green Cards: A box has three cards. The first card has both sides green, the second card has both sides red, and the third card has one green side and one red side. A card is picked at random and one of its sides is observed to be green. What is the probability that the other side of the card is also green?

Solution: show
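Similarly, a small simulation sketch of the card experiment (the names are illustrative); counting both sides of each card equally shows the estimate comes out near 2/3.

import random

def card_trial():
    # Cards as (side1, side2); pick a card, then a random side to observe.
    cards = [('G', 'G'), ('R', 'R'), ('G', 'R')]
    side1, side2 = random.choice(cards)
    if random.random() < 0.5:
        return side1, side2
    return side2, side1

def prob_other_side_green(trials=100000):
    green_shown = both_green = 0
    for _ in range(trials):
        shown, hidden = card_trial()
        if shown == 'G':
            green_shown += 1
            both_green += (hidden == 'G')
    return both_green / green_shown   # close to 2/3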

January 30, 2009

Finding Second Smallest Element

Jargon: The order statistics problem is to find the kth smallest element in an unsorted array A[1..n]. It can be solved in O(n) average time using the randomized select algorithm, or in O(n) worst-case time using an algorithm that chooses the pivot element carefully (median of medians).

Problem: How do we find the second smallest element in an array A[1..n] of size n? We would like to do it in significantly fewer than 2n comparisons.

Answer: It can be done with n + ceil(log2 n) - 2 comparisons.

Solution: show

Code for the n + ceil(log2 n) - 2 algorithm: second_smallest.py
Alternate implementation using linked list: second_smallest_linkedlist.py
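A sketch of the tournament idea behind the n + ceil(log2 n) - 2 bound, for illustration (this is not the linked implementation; it assumes an array with at least two elements): pair up elements, let the smaller of each pair advance, and remember every value that lost directly to the eventual minimum; the second smallest must be among those at most ceil(log2 n) losers.

def second_smallest(a):
    # Tournament: pair elements, the smaller of each pair advances.
    # Track, for every surviving element, the values it beat directly;
    # the second smallest is the smallest value beaten by the champion.
    players = [(x, []) for x in a]          # (value, values it has beaten)
    while len(players) > 1:
        nxt = []
        for i in range(0, len(players) - 1, 2):
            (x, beat_x), (y, beat_y) = players[i], players[i + 1]
            if x <= y:
                beat_x.append(y)
                nxt.append((x, beat_x))
            else:
                beat_y.append(x)
                nxt.append((y, beat_y))
        if len(players) % 2 == 1:            # odd element gets a bye
            nxt.append(players[-1])
        players = nxt
    champion, beaten = players[0]
    return min(beaten)

# second_smallest([5, 2, 9, 1, 7]) == 2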

January 23, 2009

String Permutations

Printing all permutations of a string is a very common interview question. We'll discuss this problem and some interesting variations of it. The original problem says: print all permutations of a string. For example, if the string is "abc" there are 6 permutations: {abc, acb, bac, bca, cab, cba}. Assume for now that the string contains no duplicate characters.

Solution: show

Source Code: permute_unique_chars.py
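A minimal recursive sketch of one common approach, shown for illustration (not necessarily the linked implementation): fix each character in turn at the current position by swapping, recurse on the rest, and undo the swap.

def permute(s):
    chars = list(s)
    def rec(i):
        if i == len(chars) - 1:
            print(''.join(chars))
            return
        for j in range(i, len(chars)):
            chars[i], chars[j] = chars[j], chars[i]   # fix position i
            rec(i + 1)
            chars[i], chars[j] = chars[j], chars[i]   # undo the swap
    rec(0)

# permute("abc") prints abc, acb, bac, bca, cba, cab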

Let us now assume that the string can contain duplicates. How do we print all non-redundant permutations of the string? For example, if the string is "abab" the permutations are {baab, abba, abab, baba, bbaa, aabb}.

Solution: show

Source Code: permute_chars.py
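One common way to avoid printing redundant permutations, sketched here for illustration (not necessarily the linked implementation): at each position, try each distinct remaining character exactly once, for example by keeping character counts.

from collections import Counter

def permute_unique(s):
    counts = Counter(s)
    current = []
    def rec():
        if len(current) == len(s):
            print(''.join(current))
            return
        for ch in sorted(counts):            # each distinct character once
            if counts[ch] > 0:
                counts[ch] -= 1
                current.append(ch)
                rec()
                current.pop()
                counts[ch] += 1
    rec()

# permute_unique("abab") prints aabb, abab, abba, baab, baba, bbaa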

Some other constraints could be added to make the problem more interesting. One such constraint is a partial ordering of characters: for example, 'a' should appear before 'c', and 'b' should appear before 'c', in all permutations. There can also exist orderings that admit no valid permutation at all, such as 'a' precedes 'b', 'b' precedes 'c', 'c' precedes 'a'. Implementing the permutation algorithm in these cases requires checking for ordering violations either at each level of recursion or only at the leaves, depending on whether checking for violations is more costly than generating all permutations.

An interesting problem that involves a special kind of ordering, and can be computed very efficiently, is the following: assume that the string is made up of an equal number of '{' and '}'. Print all permutations of this string that would satisfy a C compiler, i.e., every closing bracket is balanced by a preceding opening bracket. For example, for '{{}}' the permutations are {{}} and {}{}.

Solution: show

Source Code: permute_brackets.py
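A sketch of the standard counting approach, for illustration (not necessarily the linked implementation): at each step place a '{' if any remain, and place a '}' only if it closes an already-open '{'.

def permute_brackets(n, prefix='', open_used=0, close_used=0):
    # n = number of '{' (and of '}'); only balanced prefixes are extended
    if close_used == n:
        print(prefix)
        return
    if open_used < n:
        permute_brackets(n, prefix + '{', open_used + 1, close_used)
    if close_used < open_used:
        permute_brackets(n, prefix + '}', open_used, close_used + 1)

# permute_brackets(2) prints {{}} and {}{}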

January 22, 2009

Party Friends

Problem: In any party, there exist two people with the same number of friends. Is this true or false? Assume that the party has more than one person.

Solution: show

January 18, 2009

Innocents and Criminals: Finding Minority Entity

Problem: There are two types of people in a particular city: innocents and criminals. All you know is that the innocents are in the majority and would like to get rid of the criminals, while the criminals would like to protect themselves from persecution. You can ask any number of yes/no questions of any person in the city. Propose an algorithm that requires asking the minimum number of questions.

Hint: A trivial solution requires asking O(n^2) questions. It can be done with fewer than 2n questions.

Solution: show

January 16, 2009

Finding First Non-Zero Digit of N Factorial

Find the first non-zero digit of N! (counting from the right, i.e., its last non-zero digit). Assume N! to be extremely large.

Solution:
If the product contained no 2's or 5's, finding the last digit would be trivial: simply multiply the running result by the next number and take it mod 10. The problem is complicated because 2's and 5's produce trailing zeros. So the first step is to strip out the 2's and 5's, solve the simpler problem, cancel each 5 against a 2, and finally multiply the leftover 2's back in.

For example, 7! = 1 * 2 * 3 * 4 * 5 * 6 * 7. Extracting all 2's and 5's leaves the product 1 * 3 * 3 * 7 = 63, whose last digit is 3. We extracted four 2's and one 5, so three extra 2's remain. Multiplying 3 by 2^3 = 8 gives 24, so the answer is 4. To check: 7! = 5040.
def first_nonzero_digit_of_factorial(n):
    d = 1      # last digit of the product with all 2's and 5's removed
    n2 = 0     # surplus count of 2's (each 5 cancels one 2)
    for i in range(2, n + 1):
        j = i
        while j % 10 == 0:    # remove trailing zeros (a paired 2 and 5)
            j //= 10
        while j % 2 == 0:     # count factors of 2
            j //= 2
            n2 += 1
        while j % 5 == 0:     # each factor of 5 cancels a counted 2
            j //= 5
            n2 -= 1
        d = (d * j) % 10
    return (d * pow(2, n2, 10)) % 10   # multiply the remaining 2's back in

January 13, 2009

Count Trailing Zeros in N Factorial

How many trailing zeros does N factorial contain? For example, 5! = 120, so the answer is 1. Propose an algorithm to count the trailing zeros. Assume that N! is very large and does not fit into the word size of the computer.

Solution:
This problem is straightforward. N! = 1*2*3*4*5* .... *N. If we count all the factors of 5 in this product, that is exactly the number of trailing zeros: each 5 pairs with a 2 to produce a trailing zero, and there are always more 2's than 5's, so counting 5's gives the right answer. The following code does that:
def count_trailing_zeros_naive(n):
    count = 0
    i = 5
    while i <= n:
        j = i
        while j % 5 == 0:      # count every factor of 5 in i
            j //= 5
            count += 1
        i += 5
    return count
The outer loop runs n/5 times and the inner loop is bounded by log_5(n) iterations, giving a crude O(n log n) bound (the total work is in fact O(n)). Either way, we can clearly do better than this.

Actually, we can solve this problem in O(log n) time. For a given divisor d, the term floor(n/d) counts the multiples of d that are at most n. For example, floor(17/5) = 3: the multiples of 5 that are at most 17 are 5, 10, 15. So we count the multiples of 5, then the multiples of 25, and so on.
def count_trailing_zeros(n):
    count = 0
    d = 1
    r = 1
    while r > 0:
        d *= 5
        r = n // d          # floor(n/d): multiples of d (5, 25, 125, ...) up to n
        count += r
    return count
This algorithm runs in O(log n) time and is an asymptotic improvement over the previous one.

January 12, 2009

Picking k Elements Randomly from Stream

Consider the problem of picking K elements uniformly at random from a stream of N elements. Solve the problem both when N is known and when N is unknown (but finite).

Solution:
For K=1, the solution is posted in my previous blog post Picking an Element Randomly from Stream.

Case 1: N is known
In this case the problem can be solved in O(N) time using O(K) space. The idea is to pick the first element with probability K/N. If it is picked, pick the next element with probability (K-1)/(N-1); otherwise pick it with probability K/(N-1). To see that the second element is still picked with overall probability K/N, consider the two cases:
  1. The first element is picked, and then the second is picked: probability = K/N * (K-1)/(N-1).
  2. The first element is not picked, and then the second is picked: probability = (1 - K/N) * K/(N-1).
So the selection probability of the second element is K/N * (K-1)/(N-1) + (1 - K/N) * K/(N-1) = K/N. The same logic extends to the remaining elements. The following code implements this approach.
import random

def sample_k_known_n(A, K):
    N = len(A)
    for i in range(N):
        # probability = (still needed) / (elements left, including A[i])
        if random.random() < K / (N - i):
            print(A[i])
            K -= 1
            if K == 0:
                break
    # When K equals the number of elements left, the probability above is 1,
    # so the remaining elements are always printed; no extra pass is needed.
Case 2: N is unknown.
If we do not know N beforehand, and yet we want to guarantee that all elements are selected with probability K/N, we can use Reservoir Sampling algorithm.
def reservoir_sample(stream, K):
    S = []
    for i, x in enumerate(stream, start=1):
        if i <= K:
            S.append(x)                  # the first K elements fill the reservoir
        else:
            j = random.randint(1, i)     # uniform index in 1..i
            if j <= K:
                S[j - 1] = x             # replace slot j with probability K/i
    return S
The first K elements of the stream are automatically selected, so for N = K this solves the problem. Now consider N = K+1. Before the last element arrives, every element is in the reservoir with probability 1. The last element should be selected with probability K/N = K/(K+1), and so should every other element. The index j is less than or equal to K with probability K/(K+1), so the last element is accepted with probability K/(K+1) and rejected with probability 1/(K+1). An element already in the reservoir at index j is evicted only if the last element is accepted and lands at index j; the chance of that is K/(K+1) * 1/K = 1/(K+1). So every element ends up excluded with probability 1/(K+1). Now consider N = K+2: up to index i = K+1 every element is excluded with probability 1/(K+1), and applying the same argument once more shows that each element is excluded with probability 2/(K+2). The induction continues in the same way for larger N.

January 7, 2009

Optimal Coin Change Problem

Given an unlimited supply of coins of denominations C1, C2, ..., CN, we wish to make change for a value V. Give an algorithm for producing change with the minimum number of coins.

Complexity: Depending on the coin denominations, we can use either a fast greedy algorithm or dynamic programming.

Solution:
Consider two specific systems of coin denominations:
a) 1, C, C^2, C^3, ..., C^(N-1)
b) 1, C, 2C, 3C, ...., (N-1)C

For both of these systems, the problem can be solved with a greedy approach. Paying first with the highest denomination coin reduces the remaining value the most (compared with any lower denomination), and an optimal solution must use as many coins of the highest denomination as possible; this claim can be proved by contradiction. The greedy algorithm is fast, running in O(N), and looks as follows:
def greedy_change(value, coins):
    # coins listed in increasing order of denomination
    count = 0
    for c in reversed(coins):
        count += value // c           # use as many of this coin as possible
        value %= c
    return count
The generic coin change problem cannot be solved with this greedy approach, because the claim that we should use the highest denomination coin as much as possible no longer holds; doing so can lead to suboptimal solutions, or to no solution at all. For example, if the coin denominations are 1, 4, 6 and we want change for 8, using one coin of denomination 6 forces us to use two coins of denomination 1, but the optimal solution is two coins of denomination 4. The key observation is the optimal substructure of the problem: assume we already have optimal change counts for all values smaller than i. To produce change for value i, we add one coin to an optimal solution for some smaller value i - C[j], choosing the coin C[j] that minimizes change[i]. The resulting O(V*N) algorithm looks like the following:
def min_coins(value, coins):
    INF = float('inf')
    change = [0] + [INF] * value      # change[i] = min coins to make value i
    for i in range(1, value + 1):
        for c in coins:
            if i >= c and change[i - c] + 1 < change[i]:
                change[i] = change[i - c] + 1
    return change[value]              # INF means the value cannot be made
To print the actual choice of coins, we can keep a separate array that records which coin C[j] was chosen for each value i, as in the sketch below.
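A minimal sketch of that bookkeeping (the chosen array and function name below are illustrative additions, not from the original post):

def min_coins_with_choice(value, coins):
    INF = float('inf')
    change = [0] + [INF] * value
    chosen = [0] * (value + 1)           # coin used to reach value i
    for i in range(1, value + 1):
        for c in coins:
            if i >= c and change[i - c] + 1 < change[i]:
                change[i] = change[i - c] + 1
                chosen[i] = c
    if change[value] == INF:
        return None                      # no way to make the value
    result = []
    while value > 0:
        result.append(chosen[value])
        value -= chosen[value]
    return result                        # e.g. [4, 4] for value 8, coins [1, 4, 6]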

Source Code: coin_change.py

January 5, 2009

Picking an Element Randomly from Stream

How do we pick a number randomly from a stream (or a linked list) of finite but unknown length N? The solution must use constant space and guarantee that each number is picked with probability 1/N.

Solution:
If we know the length of the stream, the problem is trivial: simply generate a random number r between 1 and N and declare the number at index r to be the desired random number.

For an unknown but finite N, we need all elements to be selected with equal probability. Let us first build the loop-invariant property for the solution: after we have seen k elements, the currently selected number (say n) must have been selected with probability 1/k. When we see a new element (say n1), we set n = n1 with probability 1/(k+1). This ensures that n1 is selected with probability 1/(k+1). In the other case, n (the one NOT replaced by n1) remains selected with probability 1/k * (1 - 1/(k+1)) = 1/(k+1). Hence this strategy ensures that each number is selected with probability 1/k after k elements, and therefore with probability 1/N at the end. The following code maintains this loop invariant:
import random

def pick_random(stream):
    n = None       # currently selected element
    N = 0          # number of elements seen so far
    for x in stream:
        N += 1
        if random.random() < 1.0 / N:   # replace current pick with probability 1/N
            n = x
    return n
After the algorithm processes all elements, n would contain the randomly selected element and N would contain the number of elements in the stream.

January 1, 2009

Generating Random Numbers

How do we generate a random integer between A and B using an unbiased coin? For example, if A = 3 and B = 6, the algorithm should generate 3, 4, 5, 6 with equal probability.

Solution:
Without loss of generality, consider the problem of generating a random number r in the range [0, n] with n = B - A; we can return r + A as the answer to the original problem. The number n can be represented using k bits, where k = 1 + [log2 n] ([x] denotes the largest integer not exceeding x). These k bits can be generated with the unbiased coin trivially in O(k) tosses. The generated number r can be greater than n, which happens with probability 1 - (n+1)/2^k; if that happens, we repeat the experiment. The code for the algorithm looks like the following:
import math
import random

def coin_toss():
    # unbiased coin: returns 1 for heads, 0 for tails
    return random.randint(0, 1)

def random_between(A, B):
    n = B - A
    if n == 0:
        return A
    k = 1 + int(math.log2(n))            # bits needed to represent n
    r = 0
    for _ in range(k):
        r = (r << 1) | coin_toss()       # append one random bit
    if r > n:
        return random_between(A, B)      # reject and retry
    return A + r
This function can, in principle, call itself forever. Let us calculate the expected number of times random_between(A, B) is called. It is called once by the actual caller and called again whenever r > n. Let the expected number of calls be e, and note that Pr(r <= n) = (n+1)/2^k. Then
e = 1 * Pr(r <= n) + (1 + e) * Pr(r > n)  =>  e = 2^k/(n+1)
The worst case is n = 2^m, for which k = m + 1, giving
e = 2^(m+1)/(2^m + 1) < 2
So the expected number of calls is at most 2, which gives good assurance against long chains of retries. Note that e can also be obtained by observing that the number of retries is a geometrically distributed random variable with mean e - 1. Another way to reach the same conclusion is to observe that the probability of failure, p = 1 - (n+1)/2^k, is less than 1/2, so the chance of many retries in a row is small.

The above solution can also be explained in simpler terms. Assume that we have players numbered A to B and we are running a tournament. In each round, the players are paired up and call a coin toss; the winner of each toss advances to the next round. If pairs cannot be formed, we add dummies to complete them. If a normal player wins the tournament we are done, but if a dummy wins we repeat the experiment. For example, if the initial numbers are 3, 4, 5, 6, 7 and we add 'd' as a dummy, a possible tournament could run as follows:
round 1: (3 vs 4), (5 vs 6), (7 vs d) --- winners --> 3, 5, d
round 2: (3 vs 5), (d vs d) -- winners --> 5, d
round 3: (5 vs d) -- winner --> 5
If d wins the last round, we repeat the experiment. This is essentially the same algorithm as the one above.