Tag Archives: Algorithm

Algorithm Tips (1) – Avoiding Data Allocation

For next following posts, I will write down some tips when coding. These are all summaries while I practice on LeetCode. I know there are so many LeetCode answers of questions online, but that is a just question-to-answer model. If we just read the question and then answer it, we never grow up fast. We need to know why others’ algorithm is faster than mine and what’s the differences between them; in future, how can we apply others’ good places into our own. So, let’s start!

Example-1:

We all know Map (key-value pair) is a good data structure to optimize algorithm speed sometimes. Because Map’s get speed is O(1) with well-designed hash function. But if we don’t write it well, it still might lose its advantage.

Worse Case:

Map<Integer, List<Integer>> map = new HashMap();
// nums is a int[] array. 
for(int i=0; i< nums.length; i++) {
    List<Integer> value = new ArrayList();
    value.add(nums[i]);
    if (map.containsKey(i)) {       
        value.addAll(map.get(i));
    } 
    map.put(i, value);
}

Better Version:

Map<Integer, List<Integer>> map = new HashMap();
for(int i=0; i<nums.length; i++) {
    if (!map.containsKey(i)) {
        map.put(i, new ArrayList<>());
    }
    map.get(i).add(nums[i]);
}

For small volume of data, this kind of advantage might be not too obvious, but for big data, the second version is much better than the first one.

It is quite easy to understand its improvement. For the first version, it has “addAll” to insert the whole list to another one. From physical side, we know it needs to handle a lot of things, such as allocating space. In fact, ArrayList.addAll already did some optimization, like allocating enough space to add every single element in one step, rather than having to reallocate multiple times if adding large numbers of elements. To avoid these things, the second version uses an empty data (” new ArrayList<>()” ) to well get rid of space allocation issue.

Example-2:

The Question requests to design a set data structure that supports all operations in average O(1) time.

Worse Version:

List<Integer> data;
public boolean insert(int val) {
  if (data.contains(val)) return false;
  else {
    data.add(val);
    return true;
  }
}
public boolean remove(int val) {
  if (data.contains(val)) {
    int index = data.indexOf(val);
    data.remove(index);
    return true;
  } else return false;
}

Better Version:

List<Integer> data;
Map<Integer, Integer> map; // to store val->index
public boolean insert(int val) {
  if (map.containsKey(val)) return false;
  else {
    data.add(val);
    map.put(val, data.size()-1);
    return true;
  }
}
public boolean remove(int val) {
  if (!map.containsKey(val)) return false;
  else {
    int index = map.get(val);
    if (index != data.size()-1) {
      int lastValue = data.get(data.size()-1);
      data.set(index, lastValue);
      map.put(lastValue, index);
    }
    data.remove(data.size()-1);
    map.remove(val);
    return true;
  }
}

We see the great difference between the two version is that we introduce a Map to store each value’s index; for remove function, we swap the target with the last element of the list and then finally delete the last one.

The reason we improve like this is to reach O(1) purpose. ArrayList’s indexOf and remove method are opposite with this target.

  • remove(int)
    This method removes a single element at given position. After that all elements from the right are shifted to the left via System.arraycopy.all, so this method has O(n) complexity. Each call to remove the last element would not invoke System.arraycopy all, so such method call complexity would be O(1).
  • remove(Object)
    This method removes the first occurrence of a given element from the list. It iterates all list elements, so it has O(n) complexity. This method would access all array elements in any case – either read them while looking for requested element or move them from one position to the left by System.arraycopy call after requested element was found.
    Never call remove(Object) when you know at which position you can find your element.
  • contains(Object), indexOf(Object)
    The first method checks a given object whether it’s present in the list (and defined as indexOf(elem)>=0). The second method tries to find the position of given element in the list. Both methods have O(n) complexity because they are scanning an internal array from the beginning in order to find given element. 

Summary of List:

  • add elements to the end of the list
  • remove elements from the end
  • avoid contains, indexOf and remove(Object) methods
  • even more avoid removeAll and retainAll methods
  • use subList(int, int).clear() idiom to quickly clean a part of the list

Additional Read:

  1. java.util.ArrayList performance guide

 

Advertisements