3 different approaches to the same problem

Keywords: list node, count map, current character, unique character, scan, position, solution, string, pointer, return, step, dna. Powered by TextRank.

I came a across a seemingly simple problem today that turned out to have multiple solutions with varying efficiency. These type of problems are very well suited for step by step iterative solutions where each step you improve upon the existing solution. The problem is the simple matter of find the first unique character in a string. I'm skipping the obvious solution of starting at the beginning of the string and scanning through the string comparing the current charater with every other. This type of solution will lead to an N^2 solution which is usually unacceptable. Think that you are searching a DNA sequence of 10 million nucleotides. No way exponential solutions are going to work. So the first solution is this:

  1. scan the list from left, and keep track of the characters and the number of times we have seen a character.
  2. scan the list from left and lookup the number of times the current character was seen. If it's 1 then we've found the first non-repeating character.
Character findFirstUniqWorst(String s) {
        HashMap<Character, Integer> map = new HashMap<>();
        for (char c : s.toCharArray()) {
            if (!map.containsKey(c)) {
                map.put(c, 0);
            }
            int e = map.get(c);
            map.put(c, e + 1);
        }
        for (char c: s.toCharArray()) {
            if (map.get(c) == 1) {
                return c;
            }
        }
        throw new RuntimeException();
    }

an improvement on this would be to eliminate the 2nd scan. If we stored the position of the 1st occurance of a character with the count we could return that. Something like:

  1. scan the list from the left and build the count map. Store the position of the first occurance too like 'a'->(count, position)
  2. scan the map and return the result with count equal to 1 having to lowest position.
static class Pair {
        public int count;
        public int position;
        public Pair(int count, int position) {
            this.count = count;
            this.position = position;
        }
    }
    Character findFirstUniqMed(String s) {
        HashMap<Character, Pair> map = new HashMap<>();
        char[] chars = s.toCharArray();
        for (int i=0;i<chars.length;i++) {
            char c = chars[i];
            if (!map.containsKey(c)) {
                map.put(c, new Pair(1, i));
            } else {
                Pair e = map.get(c);
                e.count += 1;
                map.put(c, e);
            }
        }
        int index = Integer.MAX_VALUE;
        Character returnChar = null;
        for (Character c : map.keySet()) {
            if (map.get(c).count == 1) {
                if (map.get(c).position < index) {
                    returnChar = c;
                }
            }
        }
        return returnChar;
    }

this is a good option if the alphabet is small like in the DNA example. But if the alphabet is large, we still would do a large scan. So enter the third option: instead of keeping the position of the character, keep a pointer to a linked list node. This linkedlist will hold the the unique characters we've seen so far and the head of the link list will be the first.

  1. scan the list from the left.
    1. if the character is not in the map, insert a node to the tail of the list and add the (char->node) mapping
    2. if the characater is in the map, and pointer to a node exists remove the node from the link delete, set (char->null) as the mapping
    3. if the character is in the map and the pointer doesn't exists do nothing and continue with the next.

this method will scan the input string once but return the first found character o(1).

static class DoubleLinkedList {
        public Character value;
        public DoubleLinkedList prev;
        public DoubleLinkedList next;
    }
    Character findFirstUniqBest(String s) {
        char[] chars = s.toCharArray();
        DoubleLinkedList list = new DoubleLinkedList();
        HashMap<Character, DoubleLinkedList> map = new HashMap<>();
        DoubleLinkedList head, tail;
        head = list;
        tail = list;
        for (int i = 0; i < chars.length; i++) {
            char cur = chars[i];
            if (!map.containsKey(cur)) {
                DoubleLinkedList node = new DoubleLinkedList();
                node.value = cur;
                node.prev = tail;
                node.next = null;
                tail.next = node;
                map.put(cur, node);
                tail = node;
            } else {
                if (map.get(cur) != null) {
                    if (map.get(cur).prev == null) {
                        head = map.get(cur).next;
                    } else {
                        map.get(cur).prev.next = map.get(cur).next;
                    }
                    if (map.get(cur).next == null) {
                        map.get(cur).prev.next=null;
                        tail = map.get(cur).prev;
                    } else {
                        map.get(cur).next.prev = map.get(cur).prev;
                    }
                    map.put(cur, null);
                }
            }
        }
        if (head == null) throw new RuntimeException();
        return head.value == null ? head.next.value : head.value;
    }


Metadata

Similar posts

Powered by TF-IDF/Cosine similarity

First published on 2018-05-09

Generated on May 5, 2024, 8:48 PM

Index

Mobile optimized version. Desktop version.