Puzzles, Maths and Algorithms: 2013

December 9, 2013

Palindrome Numbers

Problem 1: Given a number n, check whether it is a palindrome or not. For e.g., n = 303 is a palindrome, where as n = 302 is not. Give an O(log n) algorithm.

Problem 2: Given a number n, find the largest palindrome that is less than or equal to n. For e.g., n = 36, then 33 is the largest palindrome less than or equal to n. Give an O(log n) algorithm.

Problem 3: Given a number n, find all the palindrome less than or equal to n. Give an efficient O(k) algorithm, where k = number of palindromes less than n.

Problem 4: Prove that all palindromes with even length is divisible by 11. For e.x. 6996 is divisible by 11.

Problem 5: Consider a palindrome number generator as follows.

Step 1: n = n + reverse(n)
Step 2: If !is_palindrome(n): Goto Step 1

Given a palindrome number m, find the smallest n which can generate m using this generator. Is it possible that no such n exist?

For e.g., m = 121, we get n = 19. To see this, we get, n1 = 19+91 = 110. In the next step we get n2 = 110 + 011 = 121.

December 1, 2013

Finding Relevant Keywords from Large Corpus of Text

Problem 1: Given a large paragraph of words containing n words (separated by space), and k keywords. Find the smallest distance between these keywords in the paragraph.

To illustrate the problem, consider that the paragraph is A B A C E D O A B. Let us say that we are interested in two keywords A and D. Then the closest they appear is at position 8 and 6 (D O A) and the minimum distance is 1 (i.e. number of non-keyword words between them).

Build an efficient algorithm with O(k) space and O(n log k) running cost.

Solution: show

To solve this problem, we need to register the last seen position of each of k keywords, as we read the paragraph word-by-word. If the next word, and if that matches a keyword, we need to update the last seen position of the matching keyword and also compute the minimum of the position register. That helps us find the smallest distance.

In order to implement this efficiently, we need a hash map and min heap (tree based implementation, not array based). Following is the code to solve this problem.

Let K[i] indicate ith keyword.
Let W[i] indicate ith word in the paragraph. 

// init map and heap
map = {}
heap = []
for i in 1 to k

  // init heap with -Inf as position of the keyword
  node = heap_node()
  node.val = -Inf

  // map stores the keyword and its pointer to node in the heap
  heap.put(node)
  map.put(K[i], node)
end

idx_start = 0
idx_end = 0
min_dist = Inf
for i in 1 to n
  w = W[i]

  // check if w is one of the keywords, if not ignore
  if !map.contains_key(w)
    continue
  end

  // get the heap node corresponding to the
  // key W[i] and update its position
  node = map.get(w)
  node.val = i

  // re-heapify the min heap
  heap.heapify()

  // min of all positions
  idx = heap.min()

  // if idx = -Inf then some keywords are not seen yet.
  if idx == -Inf
    continue
  end

  // compute distance between keywords
  dist = i - idx - k + 1
  if dist < min_dist
    idx_start = idx
    idx_end = i
    min_dist = dist
  end 
end

The above algorithm requires O(k) memory for the map and the heap. For each word, it requires O(log k) to re-heapify the min heap.

Problem 2: Given a large paragraph of words (separated by space). Consider N tuples of names and interests. If a given person and their interest appears within K words then its a match else not. Find all matches from the paragraph.

To illustrate the problem, consider that the paragraph is "John likes swimming and Jack likes Tennis". Let the interest map is <john swimming>, <john tennis>, <jack swimming>, <jack tennis>. For k = 3, 4, 5, 6 the matches found in the paragraph are <john swimming>, <jack swimming>, <jack tennis>.

Propose an efficient algorithm for two cases: k is small, k is very large.