Hash tables are extremely useful data structures: they provide storage and retrieval in O(1) amortized time. Several algorithmic problems that are otherwise quite expensive to solve can be handled very efficiently using hash tables. In this post, we consider some of these problems:
Problem 1: Remove duplicate elements from an unsorted array of size N.
Solution: show
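The hidden solution isn't reproduced here, but a minimal sketch of the standard hash-based approach in Python (using a set as the hash table; function name is mine) might look like:

```python
def remove_duplicates(arr):
    # Set membership tests are O(1) amortized, so a single pass
    # over the N elements runs in O(N) expected time overall.
    seen = set()
    result = []
    for x in arr:
        if x not in seen:
            seen.add(x)
            result.append(x)
    return result
```

For example, remove_duplicates([2, 4, 6, 4, 6]) returns [2, 4, 6], preserving first occurrences.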
Problem 2: Find the intersection of K unsorted arrays of N elements each. The intersection consists of the elements that appear in all K arrays.
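One hash-based sketch for this problem (my own code, not the hidden solution): count, for each value, how many of the K arrays contain it.

```python
def intersect_k(arrays):
    # A hash map from value to the number of arrays containing it
    # gives O(1) amortized updates, so the pass is O(K*N) expected.
    counts = {}
    for arr in arrays:
        for x in set(arr):   # dedupe within a single array first
            counts[x] = counts.get(x, 0) + 1
    return [x for x, c in counts.items() if c == len(arrays)]
```

For example, intersect_k([[1, 2, 3], [2, 3, 4], [3, 2, 5]]) returns 2 and 3.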
Problem 3: Make a linked list support operations in O(1) time, where the operations are insertion after a node of any given value and deletion of a node of any given value.
Problem 4: Find all unique pairs of elements in an array that sum to S. For example, if array = {2,4,6,4,6} and S = 8, then the answer is {(2,6), (4,4)}.
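A sketch of one hash-based approach for this problem (my own code and names):

```python
def unique_pairs(arr, s):
    # For each x, check whether its complement s - x was seen earlier;
    # a set of normalized (small, large) tuples keeps pairs unique.
    seen = set()
    pairs = set()
    for x in arr:
        if s - x in seen:
            pairs.add((min(x, s - x), max(x, s - x)))
        seen.add(x)
    return pairs
```

For example, unique_pairs([2, 4, 6, 4, 6], 8) returns {(2, 6), (4, 4)}, matching the example above.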
Problem 5: Consider an array containing unique elements. Find a triplet of elements in the array that sums to S (an extension of problem 4). Can hash tables improve the running time of your algorithm?
Problem 6: Consider two strings of size M and N. Perform string matching in O(M+N) time.
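Problem 6 is typically solved with a rolling hash (Rabin-Karp), which runs in O(M+N) expected time. A sketch (my own code):

```python
def rabin_karp(text, pattern):
    # Compare a rolling hash of each length-M window of text against
    # the pattern hash; verify matches to rule out hash collisions.
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    base, mod = 256, 10**9 + 7
    high = pow(base, m - 1, mod)      # weight of the outgoing character
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        if w_hash == p_hash and text[i:i + m] == pattern:
            return i
        if i + m < n:                 # slide the window one character
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1
```

For example, rabin_karp("hello world", "world") returns 6, the index where the match starts.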
March 26, 2010
March 19, 2010
Dragon and Knight
Problem: A dragon and knight live on an island. This island has seven poisoned wells, numbered 1 to 7. If you drink from a well, you can only save yourself by drinking from a higher numbered well. Well 7 is located at the top of a high mountain, so only the dragon can reach it.
One day they decide that the island isn't big enough for the two of them, and they have a duel. Each of them brings a glass of water to the duel, they exchange glasses, and drink. After the duel, the knight lives and the dragon dies.
Why did the knight live? Why did the dragon die?
Solution:
The dragon knows that the knight can't reach well 7. So he reasons: "If I give the knight water from well 6 or 7, he will die for sure." He therefore fetches water from well 6 or 7 for the knight.
The knight knew that no matter which well his own glass came from, the dragon would simply go and drink from well 7 afterwards and live. So he brought plain water for the dragon, and before the duel he drank from well 1 himself.
When they exchanged glasses and drank, the dragon rushed to well 7 and drank its poison, thinking it would cure the poison he had just drunk (but he had drunk plain water), so the dragon died. The knight, on the other hand, was already poisoned from well 1, so what the dragon gave him effectively cured him. So he lived.
Problem: Now suppose the dragon and the knight are equally intelligent. Who is expected to die?
Answer: Both survive the battle.
Solution:
After the battle, the dragon drinks from well 1 to guarantee that he is poisoned. He then drinks from well 7 to cure himself. Whether the knight gave him plain water or poison from wells 1-6, the dragon is now cured and survives.
As for the knight, he drinks from well 1 before the duel. If the dragon brought him poison, then he is cured. But if the dragon brought plain water, he must make sure he still cures himself with poison from a well higher than 1. So after the battle the knight drinks from well 1 again, which guarantees that he is poisoned at level 1, and then drinks from well 2 to cure himself.
So both the dragon and the knight survive the battle.
March 18, 2010
Number and Age of David's Kids
Problem: The product of the ages of David's children is the square of the sum of their ages. David has fewer than eight children. No two of his children have the same age. None of his children is more than 14 years old, and all of them are at least two years old. How many children does David have, and what are their ages?
Answer: (12,6,4,2)
Solution:
A way to analyze this problem mathematically is to write the general form of the equations, differentiate, and maximize with respect to the largest age. That leads to a Fibonacci number series, but complex heuristics then have to be employed to derive the solution (and we cannot guarantee that it is unique).
This is a constraint satisfaction problem, and the best way to solve it is to write code. The following pseudocode solves the problem in a general sense:
function find_ages(kidAges)
    s = sum(kidAges)
    p = product(kidAges)
    if s*s == p
        print kidAges
    l = length(kidAges)
    if l >= 7
        return
    else if l == 0
        x = 15   // first child's age can go up to 14 (the loop below runs to x-1)
    else
        x = kidAges[l]
    end
    for i = 2 to x-1
        kidAges1 = clone(kidAges)
        kidAges1.add(i)
        find_ages(kidAges1)
    end
end

// call
find_ages([])
After running the code, we see the following solution
[12, 6, 4, 2]
Note that in the above pseudocode we cloned the kidAges array. If we don't clone, the same array is shared and mutated across recursive calls, and the search doesn't work properly. The downside of cloning is that it uses a huge amount of memory. How do we write the code without cloning? Check the Python script attached below.
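The attached script isn't reproduced here, but a clone-free version can be sketched with backtracking: mutate one shared list, appending before the recursive call and popping after it (code and names below are my own):

```python
def find_ages(kid_ages, solutions):
    # Ages are generated in strictly decreasing order, which enforces
    # uniqueness; the shared list is appended/popped instead of cloned.
    s = sum(kid_ages)
    p = 1
    for a in kid_ages:
        p *= a
    if kid_ages and s * s == p:
        solutions.append(list(kid_ages))
    if len(kid_ages) >= 7:
        return
    hi = 14 if not kid_ages else kid_ages[-1] - 1
    for i in range(2, hi + 1):
        kid_ages.append(i)        # extend in place...
        find_ages(kid_ages, solutions)
        kid_ages.pop()            # ...and undo before trying the next age

solutions = []
find_ages([], solutions)
```

After the search, solutions contains [12, 6, 4, 2] (sum 24, product 576 = 24^2).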
Source Code: david_kids_ages.py
March 11, 2010
Largest Sum of Consecutive Numbers
Problem: Given an array of N integers (both positive and negative), find the contiguous sub-sequence with the largest sum.
For ex: Let A = {1 2 -5 4 5 -1 2 -11} then largest sum is 10 (start = 4, end = 7)
Solution:
The naive solution to this problem can be formulated and coded in O(N^3) time: consider every possible pair of start and end indices, compute the sum of the elements between them, and keep track of the maximum sum. The following code does that:
max = -Inf
imax = jmax = -1
for i = 1 to N
    for j = i to N
        // get sum of elements enclosed in index (i,j)
        s = 0
        for k = i to j
            s += A[k]
        if s > max
            max = s
            imax = i
            jmax = j

Clearly the above algorithm is O(N^3). We can definitely improve its running time: the operation performed many times (O(N^2) times) is computing the sum of the elements between indices i and j. Can we do that quickly? Indeed we can.
If we save the cumulative sum of elements in an array CSUM, then CSUM[i] indicates the sum of elements enclosed in indices (1,i), CSUM[j]-CSUM[i-1] would then indicate the sum of elements enclosed in indices (i,j). The following algorithm captures this intuition.
CSUM[0] = 0
for i = 1 to N
    CSUM[i] = CSUM[i-1] + A[i]

max = -Inf
imax = jmax = -1
for i = 1 to N
    for j = i to N
        s = CSUM[j] - CSUM[i-1]
        if s > max
            max = s
            imax = i
            jmax = j

This O(N^2) algorithm is the best we can do with this approach. The problem can, however, be solved in O(N) time using a clever observation. Consider a running sub-sequence with sum s > 0, and let the next element n be such that s + n < 0. This run cannot be a prefix of the largest sub-sequence: any sub-sequence containing s and n does better by dropping everything up to and including n (easy to prove using contradiction). This idea is captured in the following O(N) algorithm:
max = -Inf
imax = jmax = -1
i_temp = 1
s_temp = 0
for i = 1 to N
    s_temp += A[i]
    if s_temp > max    // keep track of max so far
        max = s_temp
        imax = i_temp
        jmax = i
    if s_temp < 0      // abort this sub-sequence
        s_temp = 0
        i_temp = i + 1
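The O(N) pseudocode above, as a runnable Python sketch (0-indexed; names are mine):

```python
def max_subarray(a):
    # Kadane's algorithm: grow a running sum, record the best seen,
    # and restart the run whenever the running sum drops below zero.
    best = float('-inf')
    best_range = (-1, -1)
    start = 0
    running = 0
    for i, x in enumerate(a):
        running += x
        if running > best:
            best = running
            best_range = (start, i)
        if running < 0:        # this prefix can't help any later subarray
            running = 0
            start = i + 1
    return best, best_range
```

For the example array [1, 2, -5, 4, 5, -1, 2, -11] this returns (10, (3, 6)): the 0-indexed range 3..6 holds 4 + 5 - 1 + 2 = 10, matching start = 4, end = 7 in 1-indexed form.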
Problem: Given an array of N integers (both positive and negative), find the contiguous sub-sequence with the largest absolute sum. This differs from the problem at the top in that we want the absolute value of the sum.
For ex: Let A = {1 2 -5 4 5 -1 2 -11} then largest absolute sum is 11.
Solution:
Simply keep track of max and min at the same time in the above O(N) solution.
Assuming that we have to use the above algorithm without modification, then we can take these steps to get max absolute sum in O(N) time:
Instead of trying the above two methods, we can try a very nice and simple approach
1. Run the O(N) algorithm on array A to get the max
2. Create array B by multiplying all numbers in A by -1
3. Run the O(N) algorithm on array B to get the max
4. Pick the larger of the two runs' outputs
// build the csum array as before
csum[0] = 0
for i = 1 to N
    csum[i] = csum[i-1] + A[i]

answer = max(csum) - min(csum)

The answer is the maximum element of csum minus the minimum element of csum: the sub-sequence between the indices of those two extremes has the largest absolute sum.
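The same idea as a runnable Python sketch (function name is mine):

```python
def max_abs_subarray_sum(a):
    # The sum of any contiguous range (i, j] equals csum[j] - csum[i],
    # so the largest absolute sum is the largest gap between any two
    # cumulative sums: max(csum) - min(csum).
    csum = [0]
    for x in a:
        csum.append(csum[-1] + x)
    return max(csum) - min(csum)
```

For A = [1, 2, -5, 4, 5, -1, 2, -11] this returns 11, matching the example above.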
February 3, 2010
Polynomial Evaluation
Problem: Given a polynomial with degree bound n = m^r, how do we evaluate it at n different points such that the polynomial can be recovered from those n points (i.e. the coefficients can be computed back from the point values)?
Note: The goal is not to actually recover the coefficients, but to choose the evaluation points cleverly, as that choice impacts the running time of the algorithm.
Solution:
Let the polynomial be A(x) = a0 + a1 x + a2 x^2 + ... + a(n-1) x^(n-1).
A(x) has a degree bound of n. Its actual degree can be lower than n-1 if a(n-1) equals zero but that doesn't matter here.
Using Horner's rule, we can compute a polynomial in O(n) time by arranging the terms as follows:
A(x) = a0 + x (a1 + x (a2 + ... + x (a(n-2) + x a(n-1))))
So it will take O(n^2) time to evaluate the polynomial at n different points. But we can use divide and conquer to do better, i.e. O(n lg n). We can create m different polynomials such that each term of the ith polynomial has degree mod m = i.
=> A(x) = [ a0 + a(m) x^m + a(2m) x^2m ... ] +
[ a1 + a(m+1) x^(m+1) + a(2m+1) x^(2m+1) ... ] +
.
.
.
[ a(m-1) x^(m-1) + a(2m-1) x^(2m-1) + ... ]
=> A(x) = A[0](y) + x A[1](y) + ... + x^i A[i](y) + ... + x^(m-1) A[m-1](y)
where y = x^m, and
A[i](x) = a(i) + a(m+i) . x + ... + a(n-m+i) . x^(n/m-1)
Still, the recurrence for evaluating each point is T(n) = m T(n/m) + O(m) = O(n), so evaluating the polynomial at n points takes O(n^2), which is no improvement.
Now we can choose the n points cleverly. Let w be the principal nth root of unity.
=> w^n = 1
=> w = e^(2 . pi . i / n)
= cos (2 . pi /n) + i . sin(2 . pi / n)
The two interesting properties of w are
1. w^(n/2) = -1
2. [w^(kn/m + p)]^m = w^(kn + mp) = w^(mp)
So the n points at which the polynomial is evaluated are w^0, w^1, ..., w^(n-1). The lower-order polynomials of size n/m then need to be evaluated only at the points w^(mp) for p = 0, 1, ..., n/m - 1, since by property 2 the values cycle after that and the computation can be reused. This comes from the observation that
A(w^p) = A[0](w^(mp)) + w^p A[1](w^(mp)) + ... + w^(m-1) A[m-1](w^(mp))
Also,
A(w^(kn/m+p)) = A[0](w^(mp)) + w^(kn/m+p) A[1](w^(mp)) + ... + [w^(kn/m+p)]^(m-1) A[m-1](w^(mp))
So A[0], ..., A[m-1] are evaluated at p = { 0, 1, ..., n/m - 1 }, and that computation is reused for all the remaining powers of w.
The general recurrence of this algorithm is:
T(n) = m T(n/m) + O(n)
=> T(n) = O(n lg n)
m = 2 works well in practice and is what the Fast Fourier Transform employs in order to interpolate the coefficients of the polynomial. This is generally used for multiplying two polynomials to get a higher-order polynomial: first evaluate the two lower-order polynomials at n points, then multiply the point values, and finally recover the coefficients of the higher-order polynomial from those products.
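For m = 2, the scheme above is the radix-2 FFT. A minimal Python sketch of the evaluation step (my own code, following the recurrence above):

```python
import cmath

def fft(a):
    # Evaluate the polynomial with coefficients a (len(a) a power of 2)
    # at the n complex nth roots of unity w^p = e^(2*pi*i*p/n).
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2])   # A[0](y): terms with degree mod 2 == 0
    odd = fft(a[1::2])    # A[1](y): terms with degree mod 2 == 1
    out = [0j] * n
    for p in range(n // 2):
        w = cmath.exp(2j * cmath.pi * p / n)   # w^p
        out[p] = even[p] + w * odd[p]          # A(w^p) = A[0](w^(2p)) + w^p A[1](w^(2p))
        out[p + n // 2] = even[p] - w * odd[p] # uses property 1: w^(n/2) = -1
    return out
```

For example, fft([1, 2, 3, 4]) evaluates 1 + 2x + 3x^2 + 4x^3 at the points 1, i, -1, -i.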