Hashtable are extremely useful data-structure as they provide storage and retrieval in O(1) time (amortized). Several problems of algorithms can be very efficiently solved using hashtables which otherwise turn out to be quite expensive. In this post, we consider some of these problems:

Problem 1: Remove duplicate elements from an unsorted array of size N.

Solution:

show
Lets first see the solution without hashtable. Naive solution would be to compare each pair of elements. If they are same then drop one of them. This solution would take O(N^2) time in average case and O(N) time in best case, when all the elements are same. The following code does that:

for i = 1 to N-1
if a[i] == NaN: continue
print a[i]
for j = i+1 to N
if a[i] == a[j]: a[j] = NaN

Above code prints all non-duplicates from the array. Naive solution is average case O(N^2). In practice if you believe that array consists of few numbers duplicated lots of times, say K numbers duplicated N/K times, then the running time of above algorithm is O(KN).

We can do better than worst case O(N^2) time. If we sort the elements in O(N logN) time and compare adjacent elements in O(N), we can solve the above problem in O(N logN) time.

sort(a) // O(N log N) time, O(1) space using Heap Sort
k = a[1]
for i = 2 to N
if a[i] == k: a[i] = NaN
else: k = a[i]

Using heap sort, we can make the above algorithm run in O(N log N) time in worst case and it would take O(1) extra space (apart from N for the array). Using hashtable, we can solve this problem in O(N) time (amortized) and O(N) space (for the hash table). The algorithm looks like following:

for i = 1 to N
if hash.has_key(a[i]) == false:
print a[i]
hash.put(a[i])

Above algorithm stores an unseen element in hash-table and ignores the seen element. To check if an element is seen or not, hash table provides storage/retrieval methods in O(1) time.

Problem 2: Find intersection of K unsorted array of N elements each. Intersection consists of elements that appear in all the K arrays.

Problem 3: How to make a linked list support operations in O(1) time. The operations on linked list can be insertion after any arbitrary valued node, deletion of any arbitrary valued node.

Problem 4: Find all unique pairs of element in an array that sum to S. For ex. If array = {2,4,6,4,6} and S = 8 then answer is {(2,6), (4,4)}

Problem 5: Consider an array containing unique elements. Find a triplet of elements in the array that sum to S (extension of problem 4). Can hash-tables improve the running time of your algorithm.

Problem 6: Consider two strings of size M, N. Perform string matching in size O(M+N).

Hi, Thanks for your efforts.

ReplyDeleteI did n't understand the notation NaN in the solution of Problem 1: Remove duplicate elements from an unsorted array of size N.

Can please elaborate the meaning of short form "NaN".

Thanks in advance

NaN corresponds to Not a Number. Equivalent to null or undefined in certain programming languages. It means that number is removed from that position.

ReplyDelete