Problem: Given an array of integers, find the longest subsequence of elements which monotonically increases. for ex. array = {1 4 8 2 5 7 3 4 6}, the longest subsequence = {1 2 3 4 6}
Solution: show
Problem: Given a string, find the longest size palindrome in the string. for ex. string = "aaabbccaccbaaa", solution = "bccaccb"
Solution: show
February 15, 2009
January 31, 2009
Loops in Linked List
Problems with linked lists occur mainly while building or using a bad memory manager. Let us discuss two well known linked list problems.
Problem: Find if two linked lists short at some node. Find the node at which the two lists short.
Solution: show
Problem: A linked list might contain a loop. How do we detect existence of the loop and find the node from which loop starts. Propose an O(n) time algorithm that doesn't take extra space and doesn't modify the linked list.
Solution: show
Problem: Find if two linked lists short at some node. Find the node at which the two lists short.
Solution: show
Subtract the length of smaller list from the larger one. Move a pointer ahead on larger list by the difference of lengths. Now put a pointer on head of smaller list. More two pointers together. The point at which they meet is the point where they short. The following code gives the idea.
d = length(head1) - length(head2) l1, l2 = head1, head2 if d < 0 d *= -1 l1, l2 = head2, head1 ptr1 = l1 for i <- 1 to d ptr1 = ptr1->next ptr2 = l2 while ptr1 != null if ptr2 == ptr1 return ptr1 ptr1 = ptr1->next ptr2 = ptr2->nextWe can use this idea to find the shorted node to find the solution to next problem.
Problem: A linked list might contain a loop. How do we detect existence of the loop and find the node from which loop starts. Propose an O(n) time algorithm that doesn't take extra space and doesn't modify the linked list.
Solution: show
To find if the linked list contains a loop, run two pointers over the linked list, one that move one node at a time and other that moves two nodes at a time. Both of these pointers meet if the linked list contains a loop, otherwise the pointer moving with twice speed reaches the end while the slower pointer reaches midway.
fastptr = head slowptr = head while fastptr != null and fastptr->next != null fastptr = fastptr->next->next slowptr = slowptr->next if fastptr == slowptr return "loop exists" return "loop doesnt exist"If loop exists, then fast pointer meets the slow pointer before slow pointer completes one rotation of the loop. This fact can be easily proved. Once the pointers meet, we can count the number of nodes in the loop.
loopNodeCount = 0 do loopNodeCount += 1 slowptr = slowptr->next while slowptr != fastptr return loopNodeCountNow using the trick used in the two linked list shorting problem above, we start two pointers, one from the head and other after advancing "loopNodeCount" nodes ahead. The node at which both the pointers meet is the loop node.
advptr = head for i in range(1, loopNodeCount) advptr = advptr->next normalptr = head while normalptr != advptr normalptr = normalptr->next advptr = advptr->next return advptr
Monty Hall Problem
This is a very famous probability problem. This problem illustrates that probability is sometimes more convincing than "what we perceive as obvious". I'll post another problem that defeats "obvious reasoning".
Monty Hall problem: Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
Solution: show
Red Green Cards: A box has three cards. First card has both sides green, second card has both sides red, third card has one side green and one side red. A card is picked at random and its one side is observed to be green. What is the probability that the other side of the card is also green.
Solution: show
Monty Hall problem: Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
Solution: show
Well for a mathematician stuck on the game show, I would suggest tossing the unbiased coin. The reason being that Game Show host might be clever and seeing that mathematician got the prize behind door 1, would play this card in order to entice the mathematician to switch. Mathematician can be well aware of this trick of Game show host and might not switch. But game show host thought so .. errr. This is a recursive logic and would go to infinity.
Under the assumption that game show host has no hidden motive and performs this step always, the best choice is to switch. That increases the chances by 33.33%. There is a very simple explanation to this. The prize could be behind any doors. Since you pick door 1. Assume game show hosts says that either keep door 1 or take both door 2 and 3, you would go for both door. That doubles up wining chances to 66%. He opens door 3 for you and you open door 2 when you say switch.
Under the assumption that game show host has no hidden motive and performs this step always, the best choice is to switch. That increases the chances by 33.33%. There is a very simple explanation to this. The prize could be behind any doors. Since you pick door 1. Assume game show hosts says that either keep door 1 or take both door 2 and 3, you would go for both door. That doubles up wining chances to 66%. He opens door 3 for you and you open door 2 when you say switch.
Red Green Cards: A box has three cards. First card has both sides green, second card has both sides red, third card has one side green and one side red. A card is picked at random and its one side is observed to be green. What is the probability that the other side of the card is also green.
Solution: show
Well it seems that answer should be 1/2. But its 2/3. For explanation, let us label card 1 as g1|g2, second card as r1|r2, third card as g3|r3. Now since one side of the card is green. It can either be g1, g2, g3. Three events and two point to first card.
January 30, 2009
Finding Second Smallest Element
Jargon: Order Statistics problem is to find the kth smallest element in an unsorted array A[1..n]. This problem can be solved in O(n) time in average case using randomized select algorithm or in O(n) time in worst case time using an algorithm that chooses the pivot element carefully.
Problem: How to find second smallest element in an array A[1..n] of size n. We would like to do in much less than 2n comparisons.
Answer: Time Complexity = n + ceil(log n) - 2
Solution: show
Code for n + log n - 2 algorithm: second_smallest.py
Alternate implementation using linked list: second_smallest_linkedlist.py
Problem: How to find second smallest element in an array A[1..n] of size n. We would like to do in much less than 2n comparisons.
Answer: Time Complexity = n + ceil(log n) - 2
Solution: show
Finding the smallest element is trivial. The following code would find the smallest element using n-1 comparisons.
Another algorithm that is very easy to implement but sadly takes 2n - 2logn - 1 in terms of array element comparisons. Simply build a min heap (rooted at 1). Return the smaller of numbers at index 2 or 3 in the array. The cost of min heap building is = 2 * [n/4 + 2n/8 + ...... + (logn - 1) * n / 2^logn]. The solution to this arithmetic - geometric series is 2n - 2logn - 2. Comparing elements at 2nd and 3rd index adds one to the cost. Its very easy to implement though.
min = A[1] for i = 2 to n if A[i] < min min = A[i] return minThe trivial algorithm to find the second minimum is to keep another variable (smin) along with min and if an element knocks out smin then check if it knocks out min and accordingly do the book-keeping. It can be done using 2n-3 comparisons and takes constant space. A point to observe is that the second smallest element is knocked out by the smallest one at some stage. If we preserve which element is knocked out by whom then finding the second smallest element would just require us to find the smallest amongst the elements knocked out by the smallest element. But in order to do that we need to build the algorithm in a way that we are playing a pair wise tournament. Lets say we play a tournament and knockout n/2 elements after n/2 comparisons. The recurrence relation looks like this:
T(n) = T(n/2) + n/2Solving this recurrence gives that T(n) = n - 1. The height of the recurrence tree is lg n. So if we just check these elements knocked by root (min) we get the second minimum.For. E.x.
1 5 2 6 \ / \ / 1 2 \ / 1As soon as we build the tree using n-1 comparisons, we can check the second largest amongst the number knocked out by 1. i.e. 2, 5. so the second minimum is 2. The number of comparison is n + ceil(log n) - 2, where log is taken to the base 2. We implement this algorithm by maintaining a list of indices knocked by an element. The python code is towards the end of this post.
Another algorithm that is very easy to implement but sadly takes 2n - 2logn - 1 in terms of array element comparisons. Simply build a min heap (rooted at 1). Return the smaller of numbers at index 2 or 3 in the array. The cost of min heap building is = 2 * [n/4 + 2n/8 + ...... + (logn - 1) * n / 2^logn]. The solution to this arithmetic - geometric series is 2n - 2logn - 2. Comparing elements at 2nd and 3rd index adds one to the cost. Its very easy to implement though.
Code for n + log n - 2 algorithm: second_smallest.py
Alternate implementation using linked list: second_smallest_linkedlist.py
January 23, 2009
String Permutations
Printing all permutations of a string is a very common interview question. We'll discuss this problem and some interesting variations of it. The original problem of string permutation says, "print all permutations of a string". As an example, if the string is "abc" there are 6 permutations {abc, acb, bac, bca, cab, cba}. Assume that string contains no duplicate elements.
Solution: show
Source Code: permute_unique_chars.py
Lets us assume that string can contain duplicates. How do we print all non-redundant permutations of the string. For ex. If string is "abab" the permutations are {baab, abba, abab, baba, bbaa, aabb}
Solution: show
Source Code: permute_chars.py
Some other constraints could be added to make the problem more interesting. One such constrain could be of partial ordering of characters. For Ex. 'a' should appear before 'c', and 'b' should appear before 'c' in all permutations. There can exist ordering that leads to no permutation of the string such as 'a' precedes 'b', 'b' precedes 'c', 'c' precedes 'a'. Additionally implementing the permutation algorithm in these cases require checking for ordering violation at each permutation level or just at the leaf of recursion, depending on whether checking for ordering violations is more costly or generating all permutations.
An interesting problem, that talks of special kind of ordering and can be very efficiently computed is as follows: Assume that the string is made up of equal number of '{' and '}'. You need to print all permutations of this string which could satisfy a C compiler i.e. an ending bracket is balanced by an opening bracket. For ex. for '{{}}' the permutations are [{{}}, {}{}].
Solution: show
Source Code: permute_brackets.py
Solution: show
There are many ways in which this problem can be solved. The most easiest of the ways is to code it using simple yet powerful recursion technique.
There are other ways in which string permutation can be printed that require circular shifting, but it cant get any simpler than above.
function permute(str, d) if d == length(str) print str else for i <- d to length(str) swap(str[d] <-> str[i]) // swap character for permutation permute(str, d + 1) swap(str[d] <-> str[i]) // undo swap for parent callAbove code performs the swapping operation twice. First time to generate all possible permutations of character at that level and second time to restore to the original string. It is important to restore to the original string (as passed to this call), otherwise the algorithm ends up printing redundant permutations. Additionally, this code assumes that there are no duplicates in the string, else it fails.
There are other ways in which string permutation can be printed that require circular shifting, but it cant get any simpler than above.
Source Code: permute_unique_chars.py
Lets us assume that string can contain duplicates. How do we print all non-redundant permutations of the string. For ex. If string is "abab" the permutations are {baab, abba, abab, baba, bbaa, aabb}
Solution: show
Clearly the solution to first problem fails here. This is due to fact that we generate all permutations of the string after the swapping step. Same character could get swapped to dth spot again, leading to redundant permutations. A very simple trick solves the game. The trick is to pre-sort the string on alphabetical order. Now we need to ensure that while we are swapping, to generate permutation for that particular character, if that matches the last swapped character we ignore generating permutations. Effectively, we are doing a cyclic shift and before the end of the call we reset the cyclic shift. The following code implements the same:
function permute(str, d) if d == length(str) print str else lastSwap = Nil for i <- d to length(str) if lastSwap == str[i] continue else lastSwap = str[i] swap(str[d] <-> str[i]) // swap character for permutation permute(str, d + 1) for i <- d to length(str)-1 str[i] = str[i+1] str[length(str)] = lastIf the string is pre-sorted alphabetically then the above algorithm works magically. Note that in this case, after permute is called, we do not undo the swap. This ensures that we are cyclic shifting and hence the intermediate str from d to l is also sorted. After the for loop call, we reverse the cyclic shift. The reason is that sorting ensures that if there are duplicates then the second occurrence of the duplicate element is picked immediately after we finish generating permutations for the first occurrence of the duplicate element. If sorting is not allowed, then you need to maintain a local hashtable (local to each function call), that maintains characters swapped to dth spot and ensures that duplicate candidates are discarded. Presort and make life simple.
Source Code: permute_chars.py
Some other constraints could be added to make the problem more interesting. One such constrain could be of partial ordering of characters. For Ex. 'a' should appear before 'c', and 'b' should appear before 'c' in all permutations. There can exist ordering that leads to no permutation of the string such as 'a' precedes 'b', 'b' precedes 'c', 'c' precedes 'a'. Additionally implementing the permutation algorithm in these cases require checking for ordering violation at each permutation level or just at the leaf of recursion, depending on whether checking for ordering violations is more costly or generating all permutations.
An interesting problem, that talks of special kind of ordering and can be very efficiently computed is as follows: Assume that the string is made up of equal number of '{' and '}'. You need to print all permutations of this string which could satisfy a C compiler i.e. an ending bracket is balanced by an opening bracket. For ex. for '{{}}' the permutations are [{{}}, {}{}].
Solution: show
The hard question is how many such permutations exist. For N open and N close brackets, the number of such permutations is Nth Catalan number = 1/(n+1) * factorial(2*N)/factorial(N)^2. The solution is derived from the problem: Number of paths in rectangular grid.
Permutation problems can be easily solved using recursion. The trick is to formulate all the possible cases at a particular level and code according to the formulation. The steps used below illustrate this approach. Let's formulate the problem in general sense. Assume that at a given step of recursion, we have n open brackets and m closed brackets that are not used so far. Now we can formuate each and every case for a given (n,m) and code the possible substeps.
Permutation problems can be easily solved using recursion. The trick is to formulate all the possible cases at a particular level and code according to the formulation. The steps used below illustrate this approach. Let's formulate the problem in general sense. Assume that at a given step of recursion, we have n open brackets and m closed brackets that are not used so far. Now we can formuate each and every case for a given (n,m) and code the possible substeps.
Problem Formulation: (n,m) ---> NOT POSSIBLE, if m < n (n,m) ---> PRINT AND DONE, if m = 0 and n = 0 (n,m) ---> (n-1, m) if n > 0 + (n, m-1) if m > 0 and m > nThe above formulation is generic and takes into account all possible input values of (n,m). If we put following constraint on the initial value of n and m such as following, then actual implementation can omit some of the above cases:
Constraints: n > 0 m == nThe actual implementation that follows above constraints and problem formuation:
function permute(str, n, m) if m == 0 print str else if n > 0 permute (str + '{', n - 1, m); if m > n permute (str + '}', n, m - 1);To enforce the above mentioned constraints, permute function should be kept as inner/private function and the actual function that should be exposed is as follows:
function GenerateCStyleBrackets(N) if N <= 0 return str = new String[2 * N] permute(str, N, N) // str is passed as pointer.The memory cost is O(N). The maximum stack depth is also O(N). So total space cost is O(N).
Source Code: permute_brackets.py
January 22, 2009
Party Friends
Problem: In any party, there exists two people with same number of friends. Is is true or false. Assume that party has more than one people.
Solution: show
Solution: show
Yes, The statement is true. We can prove it by contradiction.
Let there be N people at the party.
Let all the N people have different number of friends.
Since a person can have maximum of N-1 friends and a minimum of 0 friends. Hence the possible number of friends for a person is 0, 1, 2, ..., N-1. Since, we are assuming that there are no two people with same number of friends, so everyone has different number of friends. This means that someone has 0 friend, someone has 1 friend, so on and someone has N-1 friends. But 0 & N-1 cannot coexist. This is because if a person has N-1 friends then that means he is friend with every one, including the one who has 0 friends, which is a contradiction. Hence by pigeon hole principle at least two people should have the same number of friends.
Let there be N people at the party.
Let all the N people have different number of friends.
Since a person can have maximum of N-1 friends and a minimum of 0 friends. Hence the possible number of friends for a person is 0, 1, 2, ..., N-1. Since, we are assuming that there are no two people with same number of friends, so everyone has different number of friends. This means that someone has 0 friend, someone has 1 friend, so on and someone has N-1 friends. But 0 & N-1 cannot coexist. This is because if a person has N-1 friends then that means he is friend with every one, including the one who has 0 friends, which is a contradiction. Hence by pigeon hole principle at least two people should have the same number of friends.
January 18, 2009
Innocents and Criminals: Finding Minority Entity
Problem: There are two types of people in a particular city, innocents and criminals. All you know is that innocents are in majority and would like to get rid of criminals and criminals would like to protect themselves from persecution. You can ask any number of questions, with yes/no type answer, from any person in the city. Propose an algorithm which requires asking the minimum number of questions.
Hint: Trivial solution requires asking O(n^2) questions. You can do it in less than 2n questions.
Solution: show
Hint: Trivial solution requires asking O(n^2) questions. You can do it in less than 2n questions.
Solution: show
The trivial solution is very simple and is in-fact O(n^2) time algorithm. Round up every person and ask him about the status of everyone else. People with majority vote of criminal are criminals and people with minority vote of criminal are innocents. Following code solves it:
The most efficient algorithm requires asking only 2N-2 questions. The key to solution of this problem lies in the solution to the problem of Finding Majority Element. If we can find one innocent person in the city, then he can label everyone else truthfully. We know for sure that an innocent will tell the truth. Criminals can lie or say truth depending on circumstances.
Assume that we formulate the problem this way. Let us consider a pool of people, who claim to be innocents. Also let us say that we declare that we will pick the first person in the pool to be our innocent man and he will reveal the identity of everyone else. Then definitely both innocents and criminals would try their best to capture the first spot in the pool.
Now, we play a game like this. We choose a person randomly to represent the first person in the pool. Now we select a new person randomly and ask him should the last person inducted in the pool is Innocent. If he says Yes, then we add him to the pool. If he says No then we remove him and the last person from the pool and discard them from further selection. If the pool is empty we select a person not selected previously to get the first spot in the pool.
We repeat the process until we don't have anymore people to consider. The intuition is that even if all Criminals join the pool initially (by lying), then innocents would boot them out as they are in majority. If more innocents join the pool initially then criminals will not be able to boot all innocents out. Hence the first person remaining in the pool is indeed an innocent. We can see this argument working inductively as well.
The following code implements above logic with the help of a stack:
for i = 1 to N for j = 1 to N if person[i].isCriminal(j) // doesnt matter if i == j vote[j] += 1 else vote[j] -= 1 for j <- 1 to n if vote[j] > 0 person[j].persecute()
The most efficient algorithm requires asking only 2N-2 questions. The key to solution of this problem lies in the solution to the problem of Finding Majority Element. If we can find one innocent person in the city, then he can label everyone else truthfully. We know for sure that an innocent will tell the truth. Criminals can lie or say truth depending on circumstances.
Assume that we formulate the problem this way. Let us consider a pool of people, who claim to be innocents. Also let us say that we declare that we will pick the first person in the pool to be our innocent man and he will reveal the identity of everyone else. Then definitely both innocents and criminals would try their best to capture the first spot in the pool.
Now, we play a game like this. We choose a person randomly to represent the first person in the pool. Now we select a new person randomly and ask him should the last person inducted in the pool is Innocent. If he says Yes, then we add him to the pool. If he says No then we remove him and the last person from the pool and discard them from further selection. If the pool is empty we select a person not selected previously to get the first spot in the pool.
We repeat the process until we don't have anymore people to consider. The intuition is that even if all Criminals join the pool initially (by lying), then innocents would boot them out as they are in majority. If more innocents join the pool initially then criminals will not be able to boot all innocents out. Hence the first person remaining in the pool is indeed an innocent. We can see this argument working inductively as well.
The following code implements above logic with the help of a stack:
stack.push(person[1]) for i <- 2 to N if !stack.isEmpty() && person[i].isCriminal(stack.top()) stack.pop() else stack.push(person[i]) return stack.bottom()
January 16, 2009
Finding First Non-Zero Digit of N Factorial
Find the first non-zero digit of N!. Assume N! to be extremely large.
Solution:
If we assume that the product doesnt contain either of 2's and 5's, then finding last digit is trivial. Simple multiply the result with next number and take its mod 10. The problem is complicated as 2's and 5's lead to trailing 0's. So the first step is to remove 2's and 5's. Then solve the problem. The cancel 5's with 2's and multiply back the left over 2's.
For Ex. 7! = 1 * 2 * 3 * 4 * 5 * 6 * 7. We extract 2's and 5's so the product is 1 * 3 * 3 * 7. The trailing digit is 3. We have extracted four 2's and one 5. So we get 3 extra 2's. So multiply to 3 we get 4 as the answer. To check 7! = 5040
Solution:
If we assume that the product doesnt contain either of 2's and 5's, then finding last digit is trivial. Simple multiply the result with next number and take its mod 10. The problem is complicated as 2's and 5's lead to trailing 0's. So the first step is to remove 2's and 5's. Then solve the problem. The cancel 5's with 2's and multiply back the left over 2's.
For Ex. 7! = 1 * 2 * 3 * 4 * 5 * 6 * 7. We extract 2's and 5's so the product is 1 * 3 * 3 * 7. The trailing digit is 3. We have extracted four 2's and one 5. So we get 3 extra 2's. So multiply to 3 we get 4 as the answer. To check 7! = 5040
d = 1 n2 = 1
for i = 3 to n j = i while j%10 == 0 //remove trailing zeros j /= 10 while j%2 == 0 //count of 2's j /= 2 n2 += 1 while j%5 == 0 //cancel a 5 with a 2 j /= 5 n2 -= 1 d = (d * j) % 10 d = (d * 2^n2) % 10 //multiply remaining 2's to d return d
January 13, 2009
Count Trailing Zeros in N Factorial
How many trailing zeros does N factorial contain, for ex. 5! = 120 so ans is 1. Propose an algorithm to count trailing zeros. Assume that N! is very large and does not fit into word size of the computer.
Solution:
This problem is straight forward. N! = 1*2*3*4*5* .... *N. If we count all 5's in this product, then that many trailing zeros are there. This is because a 5 is countered by 2 to produce a zero. Additionally, there are more 2's than 5's so counting 5's gives the right answer. The following pseudo-code does that:
Actually, we can solve this problem in O(log n) time. For a given i the term i/5 gives total number of 5's that come before it. For ex. 17/5 is 3, there are 3 numbers that contain fives which are less than 17 i.e. 5, 10, 15. So we count number of 5, then number of 25, then so on.
Solution:
This problem is straight forward. N! = 1*2*3*4*5* .... *N. If we count all 5's in this product, then that many trailing zeros are there. This is because a 5 is countered by 2 to produce a zero. Additionally, there are more 2's than 5's so counting 5's gives the right answer. The following pseudo-code does that:
count = 0 i = 5 while i <= n j = i while j%5 == 0 j = j/5 count += 1 i += 5 return countThis algorithm takes n/5 outer loop iterations and is bounded by log n [base=5] inner loop iterations, hence it takes O(n log n) time. We can clearly do better than this.
Actually, we can solve this problem in O(log n) time. For a given i the term i/5 gives total number of 5's that come before it. For ex. 17/5 is 3, there are 3 numbers that contain fives which are less than 17 i.e. 5, 10, 15. So we count number of 5, then number of 25, then so on.
count = 0 d = r = 1 while r > 0 d *= 5 r = [n/d] // [x] is greatest integer less than x count += r return countThis algorithm runs in O(log n) time and an asymptotic improvement on the previous algorithm.
January 12, 2009
Picking k Elements Randomly from Stream
Consider the problem of picking K elements randomly from a stream of N elements. Solve the problem for known and unknown (but finite N).
Solution:
For K=1, the solution is posted in my previous blog post Picking an Element Randomly from Stream.
Case 1: N is known
In this case problem can be solved in O(N) time using O(K) space. The idea is that pick the first element with probability K/N. If the element is picked then pick the next element with probability (K-1)/(N-1) otherwise pick it with probability K/(N-1). To see that the second element is still picked with probability K/N. There are two cases.
If we do not know N beforehand, and yet we want to guarantee that all elements are selected with probability K/N, we can use Reservoir Sampling algorithm.
Solution:
For K=1, the solution is posted in my previous blog post Picking an Element Randomly from Stream.
Case 1: N is known
In this case problem can be solved in O(N) time using O(K) space. The idea is that pick the first element with probability K/N. If the element is picked then pick the next element with probability (K-1)/(N-1) otherwise pick it with probability K/(N-1). To see that the second element is still picked with probability K/N. There are two cases.
- First element is picked. So second element is picked with probability = K/N * (K-1)/(N-1).
- First element is not picked. So second element is picked with probability = (1-K/N) * K/(N-1).
for i = 1 to N if rand <= K/(N+1-i) print A[i] K -= 1 if K == 0 break end end end if K != 0 print A[N-K:N] endCase 2: N is unknown.
If we do not know N beforehand, and yet we want to guarantee that all elements are selected with probability K/N, we can use Reservoir Sampling algorithm.
for i = 1:K S[i] = A[i] end for i = K+1:length(A) j = random(1,i) if j <= K S[j] = A[i] end endThe first K elements of the stream are automatically selected. So for N = K this solves the problem. Now consider N = K+1. For i <= N-1 all elements are selected with probability 1 so far. The last element should be selected with probability K/N and so should others be. The index j is less than or equal to K with probability K/(K+1), so last element is rejected with probability 1/(K+1). For the elements in the array at index j they can be rejected if last element is accepted and its index is j. The chances of that is K/(K+1) * 1/K = 1/(K+1). So all elements are rejected with 1/(K+1) probability. Now consider N = K+2. Until index i=K+1 all elements are rejected with probability 1/(K+1). Using same logic as we applied above, we can see that an element gets rejected with probability 2/(K+2).
January 7, 2009
Optimal Coin Change Problem
Given an unlimited supply of coins of denominations C1, C2, ..., CN we wish to make change for a value V. Give an algorithm for producing change with minimum number of coins.
Complexity: Depending on denomination of coins, we can either use fast greedy algorithm or dynamic programming.
Solution:
Consider two specific types of denominations of coins
a) 1, C, C^2, C^3, ..., C^(N-1)
b) 1, C, 2C, 3C, ...., (N-1)C
For both of these arrangements, the problem can be solved using a greedy approach. We see that if we first pay with the highest denomination coin, the value (for which change is required) reduces greatest (as compared to any lower denomination). Additionally we see that optimal solution would require us to use as many highest denomination coin as possible. We can prove this claim using contradiction. The greedy algorithm that is pretty fast and runs in O(N), is as follows:
Source Code: coin_change.py
Complexity: Depending on denomination of coins, we can either use fast greedy algorithm or dynamic programming.
Solution:
Consider two specific types of denominations of coins
a) 1, C, C^2, C^3, ..., C^(N-1)
b) 1, C, 2C, 3C, ...., (N-1)C
For both of these arrangements, the problem can be solved using a greedy approach. We see that if we first pay with the highest denomination coin, the value (for which change is required) reduces greatest (as compared to any lower denomination). Additionally we see that optimal solution would require us to use as many highest denomination coin as possible. We can prove this claim using contradiction. The greedy algorithm that is pretty fast and runs in O(N), is as follows:
change = 0 for j = N to 1 change += int(value/C[j]) value = value % C[j] return changeThe generic problem of coin change cannot be solved using the greedy approach, because the claim that we have to use highest denomination coin as much as possible is wrong here and it could lead to suboptimal or no solutions in some cases. For ex, if coin denomination are 1, 4, 6 and we want change for 8. Using one coin of denomination 6 forces us to use 2 coins of denomination 1, but clearly the optimal solution is to use 2 coin of denomination 4 each. The key observation to make is the optimal structure of the sub-problem. If we assume for the time being that we have optimal change for all values < i. To produce change for value i, we need to add a coin to some previous solution with value j < i. We need to choose j that minimizes the value of change[i], the O(V*N) algorithm looks like the following:
change[0] = 0 // no coin required. for i = 1 to V change[i] = INF for j = 1 to N if i >= C[j] and change[i] > change[i - C[j]] change[i] = 1 + change[i - C[j]] return change[n]To print the choice of coins, we can keep a separate array to track which coin, cj was chosen for a given i.
Source Code: coin_change.py
January 5, 2009
Picking an Element Randomly from Stream
How to pick a number randomly from a stream (or a linked list) with a finite but unknown length N? Solution must use constant space and guarantee that number is picked with probability = 1/N.
Solution:
If we know the length of the stream, then problem is trivial. Simply generate a random number r between 1 and N. Declare the number at index r as the desired random number.
For an unknown but finite N, we need that elements are selected with equal probability. Let us first build the loop invariant property for the solution. Say we have seen k elements so far, the selected number (say n) must be selected with probability 1/k. As we see a new element (say n1), we should set n=n1 with probability 1/(k+1). This ensures that n1 is selected with probability 1/(k+1). In the other case, n (the one NOT replaced by n1) is selected with probability = 1/k * (1-1/(k+1)) = 1/(k+1). Hence this strategy ensures that a number a selected with probability 1/N for N>=k. This loop invariant is adhered by the following code:
Solution:
If we know the length of the stream, then problem is trivial. Simply generate a random number r between 1 and N. Declare the number at index r as the desired random number.
For an unknown but finite N, we need that elements are selected with equal probability. Let us first build the loop invariant property for the solution. Say we have seen k elements so far, the selected number (say n) must be selected with probability 1/k. As we see a new element (say n1), we should set n=n1 with probability 1/(k+1). This ensures that n1 is selected with probability 1/(k+1). In the other case, n (the one NOT replaced by n1) is selected with probability = 1/k * (1-1/(k+1)) = 1/(k+1). Hence this strategy ensures that a number a selected with probability 1/N for N>=k. This loop invariant is adhered by the following code:
n=0, N=0 while Stream.hasElement() ++N if rand <= 1/N n = Stream.nextElement() return nAfter the algorithm processes all elements, n would contain the randomly selected element and N would contain the number of elements in the stream.
January 1, 2009
Generating Random Numbers
How to generate random integer between A and B, using an unbiased coin. For ex. If A = 3 and B = 6 then your algorithm should generate 3, 4, 5, 6 with equal probabilities.
Solution:
Without loss of generality, we consider the problem of generating random number r in range [0,n] with n=B-A. We can return r+A as the solution to input problem. The number n can be represented using k bits where k = 1 + [log n] ([x] indicates the largest integer smaller than x). Now the k-bits can be generated using the unbiased coin trivially in O(k). The generated number r can be greater than n with probability = 1-(n+1)/2^k. If that happens then repeat the experiment. Pseudo code for the algorithm looks like the following:-
The above solution can be explained in simpler terms as well. Assume that we have people numbered A to B and we are playing a tournament. In each round of the tournament, two players are paired up, they call on the coin toss, player who wins goes to next level. In case if pairs can not be formed we add dummies to form pairs. If a normal player wins we are done, but if dummy wins we repeat the experiment again. So for example if we have 3, 4, 5, 6, 7 as initial number, and we add 'd' as dummy. Then a possible tournament could be as follows:
Solution:
Without loss of generality, we consider the problem of generating random number r in range [0,n] with n=B-A. We can return r+A as the solution to input problem. The number n can be represented using k bits where k = 1 + [log n] ([x] indicates the largest integer smaller than x). Now the k-bits can be generated using the unbiased coin trivially in O(k). The generated number r can be greater than n with probability = 1-(n+1)/2^k. If that happens then repeat the experiment. Pseudo code for the algorithm looks like the following:-
function random(A, B) n = B - A k = 1 + int(log n) r = 0 for i = 1 to k bit = coin_toss(); // head means 1, tail means 0 r += bit r <<= 1 if r > n return random(A, B) return rThis function can go into infinite loop. Let us calculate the expected number of times random(A,B) is called. It is called once by the the actual caller and called again if r > n. Let us assume that the expected number of calls be e. The probability that r <= n is given as Pr(r <= n) = (n+1)/2^k.
=> e = 1 * Pr(r <= n) + (1 + e) * Pr(r > n) => e = 2^k/(n+1)In the worst case, n=2^m+1. In this case, k=1+m. We get
==> e = 2 * 2^m/(2^m + 2) < 2So expected number of internal calls is at-most 2. This gives a good assurance against going into infinite loops. Note that e can be estimated by using the observation that random variable e - 1 is a geometric distribution. Another way to reach above conclusion is by observing probability of failure, p = 1-(n+1)/2^k < 1/2. Since the probability of failure is less than 1/2, the chances that we run into infinite loop is quite low.
The above solution can be explained in simpler terms as well. Assume that we have people numbered A to B and we are playing a tournament. In each round of the tournament, two players are paired up, they call on the coin toss, player who wins goes to next level. In case if pairs can not be formed we add dummies to form pairs. If a normal player wins we are done, but if dummy wins we repeat the experiment again. So for example if we have 3, 4, 5, 6, 7 as initial number, and we add 'd' as dummy. Then a possible tournament could be as follows:
round 1: (3 vs 4) , (5 vs 6) , (7 vs d) --- winner --> 3, 5, d round 2: (3 vs 5), (d vs d) -- winner --> 5, d round 3: (5 vs d) -- winner --> 5If d wins the last round, we repeat the experiment. This algorithm is just the same as earlier algorithm.
Subscribe to:
Posts (Atom)