Category Archives: Performance

Algorithm Tips (4) – Basic Data Structures

Here we record some basic data structures that come up constantly when coding.

  • PriorityQueue
    Sometimes we want to sort a HashMap by its values. We can do this with a PriorityQueue whose compare function looks up each key's value in the map. (Later, I will add more compare functions that order a PriorityQueue by other fields.)

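     Map<Integer, Integer> tweets = new HashMap<>();
     PriorityQueue<Integer> pq = new PriorityQueue<>(new Comparator<Integer>() {
       @Override
       public int compare(Integer a, Integer b) {
         return tweets.get(b) - tweets.get(a);
         // tweets.get(a) - tweets.get(b) gives ascending order (smallest value at the head)
         // tweets.get(b) - tweets.get(a) gives descending order (largest value at the head)
         // if the values can be very large, Integer.compare avoids subtraction overflow
       }
     });
     pq.add(key);  // key must already be present in tweets
     pq.poll();    // return the head and remove it
     pq.peek();    // return the head without removing it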
  • Set
    If we want only unique values in the output, we can use a Set as an intermediate variable to check for duplicates while adding.

    Set<String> set = new HashSet<>();
    set.add("a");
    set.remove("a");
    Iterator<String> iterator = set.iterator();
    while (iterator.hasNext()) {
      String elem = iterator.next();
    }
  • TreeMap
    A TreeMap keeps its key/value pairs sorted by key (ascending by default) and allows fast retrieval: get, put and remove take O(log n) time.

    TreeMap<String, Integer> tm = new TreeMap<>();
    tm.put("a", 2);
    tm.put("b", 1);
    tm.get("a"); // return 2
    tm.lastKey(); // return "b"
    tm.firstKey(); // return "a"
    tm.remove("a");
    TreeMap<Integer, HashSet> t = new TreeMap<>();
    // If a HashSet becomes empty, call t.remove(key) to remove its
    // entry from the TreeMap. Otherwise the key is still present and
    // is still returned by firstKey() and lastKey().
  • List
    List<String> lists = Arrays.asList("a", "b"); // initialize a List with values
    // List to array
    String[] array = lists.toArray(new String[lists.size()]);

    In a recursive method, we often need to copy the previous result into the current one. For example, if f(i-1) returns lefts (a List<List<Integer>>), then f(i) returns everything in lefts plus a copy of each list in lefts with the current value appended.
    Wrong version:

     List<List<Integer>> rec = new ArrayList<List<Integer>>();
     rec.addAll(lefts);
     for(List<Integer> left: lefts) {
       left.add(nums[index]); // mutates a list that rec already holds
       rec.add(left);
     }

    Right version:

     List<List<Integer>> rec = new ArrayList<List<Integer>>();
     rec.addAll(lefts);
     for(List<Integer> left: lefts) {
       ArrayList<Integer> elem = new ArrayList<Integer>(left); // copy first
       elem.add(nums[index]);
       rec.add(elem);
     }

    The problem with the wrong version is that it changes the contents of lefts, and rec holds references to those same lists. For example, if lefts=[[], [1]] and nums[1]=2, the wrong version outputs [[2],[1,2],[2],[1,2]]. The right version creates a new ArrayList for each element, so the later add call doesn't touch the original lists, and the output is the expected [[], [1], [2], [1,2]].

  • Stack
    Stack<String> s = new Stack<>();
    s.push("zara");
    s.peek();
    s.pop();
  • String
    int integer = 1;
    String t = String.valueOf(integer);

    Strings are very common, so the only special point is converting to char[] when needed.

    char[] arr = str.toCharArray();
    Arrays.sort(arr);
    String t = String.valueOf(arr);
    String t1 = new String(arr);
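
The copy-versus-alias pitfall described under List above can be reproduced in a few lines. This is only a sketch: the class name SubsetCopyDemo and the methods extend, extendAliased and sample are made up for illustration and stand in for the recursive step, with value playing the role of nums[index].

```java
import java.util.ArrayList;
import java.util.List;

class SubsetCopyDemo {
    // Right version: copy each list before appending, so the lists
    // already stored in lefts (and in rec) are never mutated.
    static List<List<Integer>> extend(List<List<Integer>> lefts, int value) {
        List<List<Integer>> rec = new ArrayList<>(lefts);
        for (List<Integer> left : lefts) {
            List<Integer> elem = new ArrayList<>(left); // defensive copy
            elem.add(value);
            rec.add(elem);
        }
        return rec;
    }

    // Wrong version: appends to the very lists that rec already references.
    static List<List<Integer>> extendAliased(List<List<Integer>> lefts, int value) {
        List<List<Integer>> rec = new ArrayList<>(lefts);
        for (List<Integer> left : lefts) {
            left.add(value); // also changes the element already inside rec
            rec.add(left);
        }
        return rec;
    }

    // Builds a fresh [[], [1]] so each call starts from unmodified input.
    static List<List<Integer>> sample() {
        List<List<Integer>> lefts = new ArrayList<>();
        lefts.add(new ArrayList<>());
        List<Integer> one = new ArrayList<>();
        one.add(1);
        lefts.add(one);
        return lefts;
    }

    public static void main(String[] args) {
        System.out.println(extend(sample(), 2));        // [[], [1], [2], [1, 2]]
        System.out.println(extendAliased(sample(), 2)); // [[2], [1, 2], [2], [1, 2]]
    }
}
```

Note that new ArrayList<>(lefts) only copies the outer list. The inner lists are still shared, which is exactly why the aliased version goes wrong.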

 

Algorithm Tips (3) – Building Suitable Data Structure

The more LeetCode questions we practice, the more patterns we find. We can split the questions into several categories, such as Stack, Map/HashMap, Dynamic Programming, etc. Using these existing data structures usually improves a solution, but sometimes we still need to build our own data structure to fit a question's special scenario.

Today I take “Binary Tree Maximum Path Sum” as an example. (Later, I will enhance this post by adding more examples.)

Question:

Given a binary tree, find the maximum path sum. For this problem, a path is defined as any sequence of nodes from some starting node to any node in the tree along the parent-child connections. The path must contain at least one node and does not need to go through the root.

Analysis:

For any node there are two cases: a path that runs down one side only (not a circle, so the parent can still extend it), and a path that goes through the node and joins its left and right sides (the go-through-circle case, which the parent cannot extend). We need to record both cases for each node and pass them up to the parent to evaluate. You could return an int[] with 2 elements, but that is hard to read. The better solution is to build our own data structure that describes the two cases clearly.

private class ResultType {
  int singlePath, maxPath;
  ResultType(int singlePath, int maxPath) {
    this.singlePath = singlePath;
    this.maxPath = maxPath;
  }
}

Here the two cases become two fields: singlePath records the maximum sum of a one-sided path (not a circle) ending at the current node, and maxPath records the maximum path sum found anywhere in the current node's subtree, covering both the circle case and the non-circle case. The whole code is here:

/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *   int val;
 *   TreeNode left;
 *   TreeNode right;
 *   TreeNode(int x) { val = x; }
 * }
 */
public class Solution {
   private class ResultType {
     int singlePath, maxPath;
     ResultType(int singlePath, int maxPath) {
       this.singlePath = singlePath;
       this.maxPath = maxPath;
     }
   }
   public int maxPathSum(TreeNode root) {
     ResultType res = helper(root);
     return res.maxPath;
   }
 
   public ResultType helper(TreeNode root) {
     if (root == null) 
       return new ResultType(Integer.MIN_VALUE, Integer.MIN_VALUE);
     ResultType left = helper(root.left);
     ResultType right = helper(root.right);
 
     int singlePath = Math.max(0, 
             Math.max(left.singlePath, right.singlePath)) + root.val;
     int maxPath = Math.max(left.maxPath, right.maxPath);
     maxPath = Math.max(maxPath, 
             Math.max(left.singlePath, 0)+
                      Math.max(right.singlePath, 0)+root.val);
     return new ResultType(singlePath, maxPath);
   }
}

We see that the parent's singlePath = max(0, left singlePath, right singlePath) + root value, and the parent's maxPath = max(left maxPath, right maxPath, max(left singlePath, 0) + max(right singlePath, 0) + root value).

Meanwhile, we compare each child's singlePath with zero to decide whether to include it or not.

Another clever point is to return Integer.MIN_VALUE, not zero, for the empty node. This handles cases like a tree with only one node holding a negative value: that negative value is still the maximum path sum, and a zero coming from the empty children would wrongly beat it.
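
To check the single-negative-node case by hand, here is a minimal self-contained sketch of the same algorithm. The wrapper class MaxPathSumDemo, the static nested classes and the sampleTree helper are my own additions for illustration. The recursion itself follows the solution above.

```java
class MaxPathSumDemo {
    static class TreeNode {
        int val;
        TreeNode left, right;
        TreeNode(int x) { val = x; }
    }

    static class ResultType {
        final int singlePath, maxPath;
        ResultType(int singlePath, int maxPath) {
            this.singlePath = singlePath;
            this.maxPath = maxPath;
        }
    }

    static ResultType helper(TreeNode root) {
        if (root == null) {
            // MIN_VALUE, not 0, so an empty subtree never wins a comparison
            return new ResultType(Integer.MIN_VALUE, Integer.MIN_VALUE);
        }
        ResultType left = helper(root.left);
        ResultType right = helper(root.right);
        // best one-sided path ending at this node
        int singlePath = Math.max(0, Math.max(left.singlePath, right.singlePath)) + root.val;
        // best path that passes through this node and may use both sides
        int through = Math.max(left.singlePath, 0) + Math.max(right.singlePath, 0) + root.val;
        int maxPath = Math.max(Math.max(left.maxPath, right.maxPath), through);
        return new ResultType(singlePath, maxPath);
    }

    static int maxPathSum(TreeNode root) {
        return helper(root).maxPath;
    }

    // Hypothetical sample: -10 at the root, 9 on the left, 20 on the right
    // with children 15 and 7.
    static TreeNode sampleTree() {
        TreeNode root = new TreeNode(-10);
        root.left = new TreeNode(9);
        root.right = new TreeNode(20);
        root.right.left = new TreeNode(15);
        root.right.right = new TreeNode(7);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(maxPathSum(new TreeNode(-3))); // -3
        System.out.println(maxPathSum(sampleTree()));     // 42 (15 + 20 + 7)
    }
}
```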

Algorithm Tips (2) – Avoiding Data Overflow

Today we talk about overflow. It is a subtle problem that comes up in interviews, and it is a good way to tell a careful programmer from a so-so one.

When I interviewed at a big company (I forgot its name, maybe eBay), the interviewer asked me to write a sum function. So easy?! In fact, no. You have to consider overflow in your solution.

Concept of Overflow and Underflow

First, let's understand what overflow and underflow are. Both are tied to a data type, since every data type has its own range. For example, for int in Java, Integer.MIN_VALUE and Integer.MAX_VALUE give the range. Java's arithmetic operators don't report overflow or underflow and never throw an exception for it. They simply swallow it.

  • int operators:
    When the result of an operation needs more than 32 bits, only the low 32 bits are kept and the high-order bits are discarded. If the most significant bit (MSB) of what remains is 1, the value is treated as negative.
  • floating point operators:
    Overflow results in Infinity and underflow results in 0.0.
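
The behavior in the two bullets above is easy to observe directly. The sketch below uses made-up names (OverflowDemo, wrappedSum, overflows). Math.addExact is a standard method since Java 8 that throws ArithmeticException instead of wrapping silently.

```java
class OverflowDemo {
    // Plain int addition: wraps around silently on overflow.
    static int wrappedSum(int a, int b) {
        return a + b;
    }

    // Reports whether a + b would overflow, using Math.addExact (Java 8+).
    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Low 32 bits only: MAX_VALUE + 1 wraps to MIN_VALUE.
        System.out.println(wrappedSum(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // true
        System.out.println(overflows(Integer.MAX_VALUE, 1));                       // true
        // Floating point: overflow gives Infinity, underflow gives 0.0.
        System.out.println(Double.MAX_VALUE * 2);  // Infinity
        System.out.println(Double.MIN_VALUE / 2);  // 0.0
    }
}
```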

Solutions to avoid overflow or underflow

Second, there are several ways to avoid it.

  • use long instead of int
  • use an unsigned type instead of int where the language has one (Java has no uint type)
  • use double for intermediate variables (at the cost of exact precision for very large values)
  • use the mod method. This is a very common solution in LeetCode: the question itself usually reminds you to return the answer mod 1000000007 if it is very large.
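
As a rough illustration of the first and last options, here is one way the interview "sum function" could be written. The class and method names are hypothetical. Math.floorMod is used so that negative inputs still give a result in [0, MOD).

```java
class SafeSum {
    static final int MOD = 1_000_000_007;

    // Option 1: accumulate in a long, which cannot overflow for any int[]
    // (at most about 2^31 elements, each below 2^31 in magnitude).
    static long sum(int[] nums) {
        long total = 0;
        for (int n : nums) {
            total += n;
        }
        return total;
    }

    // Option 2: keep the running total reduced modulo MOD at every step.
    static int sumMod(int[] nums) {
        long total = 0;
        for (int n : nums) {
            total = Math.floorMod(total + n, (long) MOD);
        }
        return (int) total;
    }

    public static void main(String[] args) {
        int[] big = {Integer.MAX_VALUE, Integer.MAX_VALUE};
        System.out.println(sum(big));                              // 4294967294
        System.out.println(sumMod(new int[] {1_000_000_006, 5}));  // 4
    }
}
```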

Examples

Finally, I list my solution for LeetCode-576 (Out of Boundary Paths) to help you understand how to avoid the overflow issue. (I don't repeat the question here; you can read it on LeetCode's website.) It is also a good way to see how a three-dimensional array solves the problem with dynamic programming.

public class Solution {
  public int findPaths(int m, int n, int N, int i, int j) {
    final int MOD = 1000000007;
    if (N == 0) return 0;
    long[][][] dp = new long[m][n][N+1]; //k:1-N
    for(int ii=0; ii<m;ii++) {
      for(int jj=0; jj<n;jj++) {
        int count = 0;
        if (ii-1<0) count++;
        if (ii+1>=m) count++;
        if (jj-1<0) count++;
        if (jj+1>=n) count++;
        dp[ii][jj][1] = count;
      }
    }
    for(int k=2; k<=N;k++) {
      for(int ii=0; ii<m; ii++) {
        for(int jj=0; jj<n; jj++) {
          long count = 0;
          if (ii-1>=0) count += dp[ii-1][jj][k-1];
          if (jj-1>=0) count += dp[ii][jj-1][k-1];
          if (ii+1<m) count += dp[ii+1][jj][k-1];
          if (jj+1<n) count += dp[ii][jj+1][k-1];
          dp[ii][jj][k] = count%MOD;
        }
      }
    }
    long rec = 0;
    for(int k=1; k<=N; k++) {
      rec += dp[i][j][k];
    }
    return (int)(rec%MOD);
  }
}

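Another simple LeetCode question also requires us to consider overflow: Validate Binary Search Tree. Node values can be Integer.MIN_VALUE or Integer.MAX_VALUE, so int bounds cannot sit strictly outside every valid value. Here is a solution that uses long bounds instead of int to avoid the problem.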

/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *   int val;
 *   TreeNode left;
 *   TreeNode right;
 *   TreeNode(int x) { val = x; }
 * }
 */
public class Solution {
 public boolean isValidBST(TreeNode root) {
   return isValidBSTCheck(root, Long.MIN_VALUE, Long.MAX_VALUE);
 }
 public boolean isValidBSTCheck(TreeNode root, long min, long max) {
   if (root == null) return true;
   if (root.val > min && root.val < max) {
     return isValidBSTCheck(root.left, min, root.val) && isValidBSTCheck(root.right, root.val, max);
   } else return false;
 }
}

Algorithm Tips (1) – Avoiding Data Allocation

In the following posts, I will write down some coding tips. They are all summaries from my practice on LeetCode. I know there are many LeetCode answers online, but that is just a question-to-answer model. If we only read the question and then the answer, we never grow fast. We need to know why others' algorithms are faster than ours, what the differences are, and how we can apply their good ideas in our own code in the future. So, let's start!

Example-1:

We all know Map (key-value pairs) is a good data structure for speeding up an algorithm, because a HashMap's get is O(1) on average with a well-designed hash function. But if we don't use it well, it can still lose its advantage.

Worse Version:

Map<Integer, List<Integer>> map = new HashMap<>();
// nums is an int[] array.
for(int i=0; i< nums.length; i++) {
    List<Integer> value = new ArrayList<>();
    value.add(nums[i]);
    if (map.containsKey(i)) {       
        value.addAll(map.get(i));
    } 
    map.put(i, value);
}

Better Version:

Map<Integer, List<Integer>> map = new HashMap<>();
for(int i=0; i<nums.length; i++) {
    if (!map.containsKey(i)) {
        map.put(i, new ArrayList<>());
    }
    map.get(i).add(nums[i]);
}

For a small volume of data the advantage might not be obvious, but for big data the second version is much better than the first.

The improvement is easy to understand. The first version allocates a new list on every iteration and uses addAll to copy the whole existing list into it. Physically, that means a lot of work, such as allocating space and copying elements. ArrayList.addAll is already optimized to allocate enough space for all the elements in one step, rather than reallocating multiple times when adding a large number of elements, but the copy itself remains. The second version avoids all of this: it creates an empty list (new ArrayList<>()) only when a key is first seen and then appends in place.
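
Since Java 8, the containsKey/put/get sequence of the better version can also be written with Map.computeIfAbsent, which creates the list only when the key is missing. The sketch below is an illustration under my own assumptions: the class name and the grouping key (the parity of each number) are made up so that several values share a key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupingDemo {
    // Groups numbers by parity (0 = even, 1 = odd) without ever copying a list.
    static Map<Integer, List<Integer>> groupByParity(int[] nums) {
        Map<Integer, List<Integer>> map = new HashMap<>();
        for (int n : nums) {
            int key = Math.floorMod(n, 2);
            // Allocates a new list only the first time a key is seen,
            // then appends in place.
            map.computeIfAbsent(key, k -> new ArrayList<>()).add(n);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> groups = groupByParity(new int[] {1, 2, 3, 4});
        System.out.println(groups.get(0)); // [2, 4]
        System.out.println(groups.get(1)); // [1, 3]
    }
}
```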

Example-2:

The Question requests to design a set data structure that supports all operations in average O(1) time.

Worse Version:

List<Integer> data;
public boolean insert(int val) {
  if (data.contains(val)) return false;
  else {
    data.add(val);
    return true;
  }
}
public boolean remove(int val) {
  if (data.contains(val)) {
    int index = data.indexOf(val);
    data.remove(index);
    return true;
  } else return false;
}

Better Version:

List<Integer> data;
Map<Integer, Integer> map; // to store val->index
public boolean insert(int val) {
  if (map.containsKey(val)) return false;
  else {
    data.add(val);
    map.put(val, data.size()-1);
    return true;
  }
}
public boolean remove(int val) {
  if (!map.containsKey(val)) return false;
  else {
    int index = map.get(val);
    if (index != data.size()-1) {
      int lastValue = data.get(data.size()-1);
      data.set(index, lastValue);
      map.put(lastValue, index);
    }
    data.remove(data.size()-1);
    map.remove(val);
    return true;
  }
}

The big difference between the two versions is that we introduce a Map to store each value's index. In remove, we overwrite the target with the last element of the list and then delete the last one.

The reason for this change is to reach O(1). ArrayList's contains, indexOf and remove methods work against that target.

  • remove(int)
    This method removes a single element at the given position. After that, all elements to its right are shifted left via a System.arraycopy call, so this method has O(n) complexity. Removing the last element does not need the System.arraycopy call, so that case is O(1).
  • remove(Object)
    This method removes the first occurrence of a given element from the list. It iterates all list elements, so it has O(n) complexity. This method would access all array elements in any case – either read them while looking for requested element or move them from one position to the left by System.arraycopy call after requested element was found.
    Never call remove(Object) when you know at which position you can find your element.
  • contains(Object), indexOf(Object)
    The first method checks a given object whether it’s present in the list (and defined as indexOf(elem)>=0). The second method tries to find the position of given element in the list. Both methods have O(n) complexity because they are scanning an internal array from the beginning in order to find given element. 

Summary of List:

  • add elements to the end of the list
  • remove elements from the end
  • avoid contains, indexOf and remove(Object) methods
  • even more avoid removeAll and retainAll methods
  • use subList(int, int).clear() idiom to quickly clean a part of the list
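
Two of the points in this summary can be sketched in a few lines. The class and method names below are hypothetical. subList(from, to).clear() removes the whole range in one operation, and remove(size() - 1) takes the O(1) path because nothing has to be shifted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListTrimDemo {
    // Removes the elements at indexes [from, to) in one step.
    static List<Integer> dropRange(List<Integer> data, int from, int to) {
        data.subList(from, to).clear();
        return data;
    }

    // Removes and returns the last element. The int argument selects
    // remove(int index), not remove(Object).
    static int removeLast(List<Integer> data) {
        return data.remove(data.size() - 1);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4, 5));
        System.out.println(dropRange(data, 1, 4)); // [0, 4, 5]
        System.out.println(removeLast(data));      // 5
        System.out.println(data);                  // [0, 4]
    }
}
```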

Additional Read:

  1. java.util.ArrayList performance guide

 

Server Protection and Monitor

As we all know, there was a huge DDoS attack recently that affected lots of websites. At first, our server wasn't affected. But last week I allowed password login over ssh. (In the past, we only allowed users to ssh into the server with a public key. Last week we wanted to let a user log in simply and fast, so we opened password login temporarily and forgot to close it.) This week, the server was completely compromised. The end result was that the server was choked by an ever-growing number of useless threads, and the bandwidth was used up. Here I list how I found these problems and how we tried to fix them.

Step1: check network status

By checking the network status, you will find that bandwidth usage is too high, which keeps normal customers from using the server's resources.

sudo apt-get install nethogs
sudo nethogs
sudo nethogs eth0 eth1

Step2: find exact threads

Besides the bandwidth status, you also need to know the thread status. By viewing the real-time thread status, you can spot malicious threads that take too many resources. In my case, about 300 malicious threads were created every 2 minutes. It is not hard to see why the server eventually became too slow to cope with these ever-increasing malicious threads.

sudo apt-get install htop
htop

Step3: kill useless threads

Once we know which threads are malicious, we need to kill them. In fact, killing them doesn't solve the problem at its root, because so far I only know the effect, not the cause. Luckily, we found a bash script that spawned them, so I killed this bash script as well. At that point it looked like we were finished, but we weren't. After 8 hours of stable network, the server was attacked again and I couldn't ssh into it. The final solution was to shut down and rebuild the server. So for now, I still don't know the root cause.

sudo kill -9 $(pgrep <useless_threads_main_name>)

Step4: add your own public key to ssh

To keep the attack from happening again, I brought back public key login, because I'm sure this attack happened while password login was open.

ssh <user_name>@<server_ip> 'mkdir -p ~/.ssh;cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub

Here id_rsa.pub is your own public key file.

Step5: disable password to ssh

sudo vi /etc/ssh/sshd_config
# change to no to disable tunnelled clear text passwords
PasswordAuthentication no
PubkeyAuthentication yes
sudo service ssh restart

Conclusion:

A high-security configuration is needed to avoid this kind of attack. Once such an attack happens, the best way is to back up your data as soon as possible and rebuild your server. (During the investigation, I also installed several malware detection tools and tried to use them to find the malicious scripts/code/software. Unfortunately, they all failed.)

I still need more knowledge to help me understand and find out the root cause. Keep learning.

JVM Memory Management (1)

Last week, we dealt with lots of things related to performance, but we didn't dig in and check why the problems happened. So I decided to write a post to explain JVM memory management. I know many blogs already talk about it and it is not a new topic, but I will write it down from my own understanding and show how I use some commands to verify this knowledge.

1. Check Memory Usage

Before we look at what garbage collection is, what the young generation is, etc., let's first see our application's memory usage. You can use htop to see every process's memory usage. For a Java/Scala application, you have more choices.

# get java application pid
>> jcmd
>> jps -l
# force Garbage Collection from the Shell
>> jcmd <PID> GC.run
# check which instances cost most memory
>> jmap -histo:live <PID> | head
>> jmap -histo:live <PID> | head -n 20
# check real-time memory usage status
>> jstat -gc <PID> 1000ms

2. Understand jstat output

Here we list my application jstat output:

    S0C     S1C S0U     S1U        EC      EU        OC      OU      MC      MU   CCSC   CCSU YGC  YGCT FGC  FGCT   GCT
61952.0 62976.0 0.0 32790.1 1968128.0 29322.0 2097152.0 24306.3 60888.0 60114.9 7808.0 7676.4  33 0.545  13 1.057 1.601

We explain each column’s meaning:

  • S0C/S1C: the current size of the Survivor0 and Survivor1 areas in KB
  • S0U/S1U: the current usage of the Survivor0 and Survivor1 areas in KB. Notice that one of the survivor areas is empty all the time. (See “Young Generation” for the reason.)
  • EC/EU: the current size and usage of Eden space in KB. Note that EU keeps increasing, and as soon as it reaches EC a Minor GC runs and EU drops.
  • OC/OU: the current size and current usage of Old generation in KB.
  • MC/MU and CCSC/CCSU: the current size and usage of Metaspace and of the compressed class space in KB. (On Java 7 and earlier, jstat shows PC/PU here instead: the current size and usage of Perm Gen.)
  • YGC/YGCT: YGC displays the number of GC events that occurred in the young generation. YGCT displays the accumulated time (in seconds) of GC operations for the young generation. Notice that both of them increase in the same row where the EU value drops because of a Minor GC. (See “Young Generation” for the reason.)
  • FGC/FGCT: FGC displays the number of Full GC events that occurred. FGCT displays the accumulated time of Full GC operations. Notice that Full GC time is much higher than young generation GC time.
  • GCT: total accumulated time for GC operations. Notice that it is sum of YGCT and FGCT values.
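
The same counters can also be read from inside the application through the standard java.lang.management API. The sketch below is only an illustration (GcStats and totalGcCount are made-up names). The collector names it prints depend on which GC the JVM runs, and getCollectionTime reports milliseconds, not seconds as jstat does.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    // Sums the collection counts of all collectors, roughly YGC + FGC in jstat terms.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        System.out.println("total collections: " + totalGcCount());
    }
}
```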

3. How to set JVM parameters

 Here we explain why the S0C, S1C, EC and OC values look like the ones above. Several parameters (VM switches) set these values:

  • -Xms: For setting the initial heap size when JVM starts
  • -Xmx: For setting the maximum heap size
  • -Xmn: For setting the size of the Young Generation, rest of the space goes for Old Generation
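  • -XX:PermSize: For setting the initial size of the Permanent Generation memory (Java 7 and earlier)
  • -XX:MaxPermSize: For setting the maximum size of Perm Gen (Java 7 and earlier; Java 8 replaced Perm Gen with Metaspace, sized by -XX:MetaspaceSize and -XX:MaxMetaspaceSize)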
  • -XX:SurvivorRatio: For providing a ratio of Eden space and Survivor Space, for example, if Young Generation size is 10m and VM switch is -XX:SurvivorRatio=2 then 5m will be reserved for Eden Space and 2.5m each for both Survivor spaces. The default value is 8
  • -XX:NewRatio: For providing a ratio of old/new generation sizes. The default value is 2
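
The -XX:SurvivorRatio arithmetic can be checked with a tiny calculation: the young generation is split into ratio parts for Eden plus one part for each of the two survivor spaces. The helper below is a hypothetical sketch of that formula only. A real JVM aligns the sizes and may resize the spaces adaptively, so actual numbers can differ.

```java
class YoungGenSizing {
    // survivorRatio = Eden size / size of ONE survivor space,
    // so young = (survivorRatio + 2) * survivor.
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    static double edenMb(double youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        // The example from the list: 10m young generation, SurvivorRatio=2.
        System.out.println(edenMb(10, 2));     // 5.0
        System.out.println(survivorMb(10, 2)); // 2.5
        // With the default ratio of 8, Eden takes 8 of the 10 parts.
        System.out.println(edenMb(10, 8));     // 8.0
    }
}
```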

4. JVM Memory Usage

Most memory is used by the heap. Outside the heap, memory is also consumed by Metaspace and the stacks.

(1) Java Heap

The heap is where your class instantiations or objects are stored. Instance variables are stored in Objects. When discussing Java memory and optimization we most often discuss the heap because we have the most control over it and it is where Garbage Collection and GC optimizations take place. Heap size is controlled by the -Xms and -Xmx JVM flags.

(2) Java Stack

Each thread has its own call stack. The stack stores primitive local variables and object references along with the call stack (method invocations) itself. The stack is cleaned up as stack frames move out of context, so there is no GC performed here. The -Xss JVM option controls how much memory gets allocated for each thread’s stack.

(3) Metaspace

Metaspace stores the class definitions of your objects. The size of Metaspace is controlled by -XX:MetaspaceSize (the initial threshold) and -XX:MaxMetaspaceSize (the upper limit).

(4) Additional JVM

In addition to the above values, there is some memory consumed by the JVM itself. This holds the C libraries for the JVM and some C memory allocation overhead that it takes to run the rest of the memory pools above. This type of memory can be affected by Tuning glibc Memory Behavior.
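
To see the heap and non-heap split described above for a running application, the standard MemoryMXBean and MemoryPoolMXBean interfaces can be queried. This is a minimal sketch with made-up class and method names. The pool names it prints depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MemoryUsageDemo {
    // Bytes currently used on the Java heap.
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("heap:     " + memory.getHeapMemoryUsage());
        System.out.println("non-heap: " + memory.getNonHeapMemoryUsage());

        // One line per pool, for example the Eden, Survivor, Old Gen and Metaspace pools.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is not valid
            if (usage != null) {
                System.out.println(pool.getType() + " / " + pool.getName()
                        + " used=" + usage.getUsed());
            }
        }
    }
}
```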

5. JVM Memory Model

By now we know the status of our application, but we still don’t know what Eden is, what Survivor is, etc. Here we talk about how the JVM organizes memory, so that we can finally understand how to optimize it. While reading this part, I suggest going back to parts 2 and 3 to map each concept to the real output.

The JVM memory is divided into five areas:

  • Eden
  • S0
  • S1
  • Old Memory
  • Perm (replaced by Metaspace in Java 8)

Eden + S0 + S1 === Young Gen (-Xmn)


Eden + S0 + S1 + Old Memory === JVM Heap (-Xms  -Xmx)


JVM Heap memory is physically divided into two parts-Young Generation and Old Generation. 

(1) Young Generation

Young generation is the place where all the new objects are created. When young generation is filled, garbage collection is performed. This garbage collection is called Minor GC. Young Generation is divided into three parts-Eden Memory and two Survivor Memory spaces.

  • Most of the newly created objects are located in the Eden Memory space. All new allocation happens in Eden. It only costs a pointer bump.
  • When Eden space is filled with objects, a Minor GC is performed and all surviving objects are moved to one of the survivor spaces. This is a stop-the-world copy collection into the survivor space, so dead objects cost nothing to collect.
  • Minor GC also checks the objects in the occupied survivor space and moves the survivors to the other survivor space. So at any time, one of the survivor spaces is empty.
  • Objects that survive many GC cycles are moved to the old generation memory space. Usually this is done by setting an age threshold that young generation objects must reach before they become eligible for promotion to the Old generation.

Since the Young Generation holds short-lived objects, Minor GC is usually very fast and the application is barely affected by it.

(2) Old Generation

Old Generation memory contains the objects that are long lived and survived after many rounds of Minor GC. Usually garbage collection is performed in Old Generation memory when it is full. Old Generation Garbage Collection is called Major GC and usually takes longer time. 

Major GC takes longer time because it checks all the live objects. Major GC should be minimized because it will make your application unresponsive for the garbage collection duration.

throughput collectors: -XX:+UseSerialGC -XX:+UseParallelGC -XX:+UseParallelOldGC

low-pause collectors: -XX:+UseConcMarkSweepGC -XX:+UseG1GC

6. Garbage Collection

Every garbage collection includes “Stop the world” phases, during which all application threads are paused until the phase completes. (Concurrent collectors such as CMS and G1 do part of their work while the application keeps running.)

One of the best features of the Java programming language is automatic garbage collection. There are many JVM switches that select the garbage collection strategy for the application (I will not explain each): Serial GC (-XX:+UseSerialGC), Parallel GC (-XX:+UseParallelGC), Parallel Old GC (-XX:+UseParallelOldGC), Concurrent Mark Sweep (CMS) Collector (-XX:+UseConcMarkSweepGC) and G1 Garbage Collector (-XX:+UseG1GC).

7. How to optimize JVM parameters

After all this, it may look like the JVM’s automatic garbage collection leaves nothing for us to do. In fact, there is still some tuning we can do.

(1) java.lang.OutOfMemoryError: PermGen

increase the Perm Gen memory space using -XX:PermSize and -XX:MaxPermSize

(2) a lot of Full GC operations

increase Old generation Memory space.

(3) java.lang.StackOverflowError

increase stack size by -Xss

(4) Good Practices

  • set the minimum -Xms and maximum -Xmx heap sizes to the same value
  • -Xmn value should be lower than the -Xmx value. 
  • older generation is the value of -Xmx minus the -Xmn. Generally, you don’t want the Eden to be too big or it will take long for the GC to look through it for space that can be reclaimed.
  • keep the Eden size between one fourth and one third the maximum heap size. The old generation must be larger than the new generation. 

To summarize, there is no universal solution that fixes everything. When we meet a problem, we need to use tools to find the root cause, dig into it, and then fix it.