Monthly Archives: November 2016

Scala – Performance Optimization

We have already covered a lot about performance optimization, including Slick’s performance optimization via connection pooling, JavaScript high-performance tips, MySQL performance tuning, and Play Framework Tuning (1) and Tuning (2). It would be naive to think that is enough. In this post I will list performance optimization tips for Scala.

  1. par
    If we have N tasks that are independent of each other (no required ordering, no shared variables), we can consider using a parallel collection to speed up the computation.
    A parallel collection runs the tasks concurrently, with the level of parallelism depending on the number of cores, which can greatly improve throughput.

    val list = List[String]("a", "b", "c", ...)
    list.par.foreach( r => {
      ... // your task.  
    })

    Note:
    Sometimes the sequential implementation performs better than the parallel one. That is because a parallel collection has some overhead for distributing (fork) and gathering (join) the data between cores. Parallel collections therefore pay off mainly when each task involves heavy computation.

  2. Future
    Future can be used for the same purpose as par: each task runs asynchronously and the results are collected at the end.

    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration._
    val arr = List[String]("a", "b", "c")
    val futures = arr.map( r => Future {
      ... // your task.
    })
    val future = Future.sequence(futures)
    Await.result(future, 1.hour) // block until all tasks finish, or fail after one hour
  3. Avoid unnecessary loops
    Even though parallel execution speeds things up, avoiding unnecessary loops still matters. In my code, I select tokens at random so that resources are shared fairly among customers. This relies on probability theory: over many requests, random selection evens out.
  4. Print the thread name to debug parallel execution
    println(Thread.currentThread().getName)
  5. Separate ExecutionContext
    If you don’t want your tasks to affect the default ExecutionContext, you can create an additional one to isolate them.

    implicit val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())

    newCachedThreadPool vs newFixedThreadPool
    newFixedThreadPool:
    Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available. If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks. The threads in the pool will exist until it is explicitly shut down.
    newCachedThreadPool:
    Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources. Note that pools with similar properties but different details (for example, timeout parameters) may be created using ThreadPoolExecutor constructors.
    If you have a large number of long-running tasks, I would suggest newFixedThreadPool, since it caps the number of threads. For many short-lived tasks, choose newCachedThreadPool.
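
Tips 2, 4 and 5 can be combined. The following is a minimal sketch, not a definitive implementation: it runs Futures on a dedicated fixed-size pool instead of the global ExecutionContext, records the thread name for each task, and shuts the pool down afterwards. The names DedicatedPoolSketch, runTask and runAll, the pool size of 2, and the 10-second timeout are all assumptions made for illustration.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object DedicatedPoolSketch {
  // Hypothetical stand-in for a real task: returns the task name together
  // with the name of the thread that executed it (tip 4).
  def runTask(name: String)(implicit ec: ExecutionContext): Future[(String, String)] =
    Future {
      val thread = Thread.currentThread().getName
      println(s"task $name ran on $thread")
      (name, thread)
    }

  // Runs all tasks on their own fixed-size pool (tip 5) and waits for the
  // combined result (tip 2). The pool is always shut down at the end.
  def runAll(names: List[String], poolSize: Int): List[(String, String)] = {
    implicit val ec: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(poolSize))
    try {
      val all = Future.sequence(names.map(n => runTask(n)))
      Await.result(all, 10.seconds) // assumed timeout, adjust to your tasks
    } finally {
      ec.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    val results = runAll(List("a", "b", "c"), poolSize = 2)
    println(results.map(_._1)) // task names, in submission order
  }
}
```

Future.sequence keeps the results in the same order as the input list, regardless of which thread finishes first. Using fromExecutorService rather than fromExecutor gives back a handle that can be shut down, so the extra threads do not keep the JVM alive.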

Server Protection and Monitor

As we all know, a huge DDoS attack recently affected a lot of websites. At first, our server was not affected. But last week I enabled password login over ssh. (Previously we only allowed users to log in with a public key. Last week we wanted to let a user log in quickly and simply, so we enabled passwords temporarily and then forgot to disable them.) This week, the server was thoroughly compromised. The end result was that the server was choked by a growing number of useless threads and its bandwidth was used up. Here is how I found these problems and how we tried to fix them.

Step1: check network status

By checking the network status, you will see that bandwidth usage is too high, which prevents normal customers from using the server’s resources.

sudo apt-get install nethogs
sudo nethogs
sudo nethogs eth0 eth1

Step2: find exact threads

Besides the bandwidth status, you also need to know the thread status. By viewing threads in real time, you can spot malicious threads that consume too many resources. In my case, I found that 300 malicious threads were being created every 2 minutes. It is easy to see why the server eventually became too slow to cope with this growing number of malicious threads.

sudo apt-get install htop
htop

Step3: kill useless threads

Once the malicious threads are identified, we need to kill them. Killing them does not address the root cause, though, because at this point I only knew the symptoms, not the cause. Luckily, we found a bash script that was spawning them, so I killed that script as well. At that point it looked like we were done, but we were not. After 8 hours of stable network traffic, the server was attacked again, and I could no longer ssh into it. The final solution was to shut down and rebuild the server, so I still don’t know the root cause.

sudo kill -9 $(pgrep <useless_threads_main_name>)

Step4: add your own public key to ssh

To prevent the attack from happening again, I switched back to public key authentication, because I’m sure this attack happened while password login was enabled.

ssh <user_name>@<server_ip> 'mkdir -p ~/.ssh;cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub

Here id_rsa.pub is your own public key file.

Step5: disable password login over ssh

sudo vi /etc/ssh/sshd_config
# change to no to disable tunnelled clear text passwords
PasswordAuthentication no
PubkeyAuthentication yes
sudo service ssh restart

Conclusion:

A strict security configuration is needed to avoid this kind of attack. Once an attack like this happens, the best approach is to back up your data as soon as possible and rebuild your server. (During the investigation, I also installed several malware detection tools and tried to use them to find the malicious scripts, code, or software. Unfortunately, they all failed.)

I still need more knowledge to understand and find the root cause. Keep learning.