Category Archives: MySQL

MySQL(3)-Joins

  • left join = A
    select <select_list> from tableA A left join tableB B on A.key=B.key
  • inner join = (common part between A and B)
    select <select_list> from tableA A inner join tableB B on A.key=B.key
  • right join = B
    select <select_list> from tableA A right join tableB B on A.key=B.key
  • A – (common part between A and B)
    select <select_list> from tableA A left join tableB B on A.key=B.key 
    where B.key is NULL
  • B – (common part between A and B)
    select <select_list> from tableA A right join tableB B on A.key=B.key
    where A.key is NULL
  • A + B (MySQL does not support full outer join, so emulate it with a union)
    select <select_list> from tableA A left join tableB B on A.key=B.key
    union
    select <select_list> from tableA A right join tableB B on A.key=B.key
  • A + B – (common part between A and B)
    select <select_list> from tableA A left join tableB B on A.key=B.key
    where B.key is NULL
    union
    select <select_list> from tableA A right join tableB B on A.key=B.key
    where A.key is NULL
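These join patterns are easy to try out interactively. Here is a quick sketch using SQLite from Python (tiny made-up tables with join key `k`; SQLite's LEFT and INNER JOIN behave the same way as MySQL's, though older SQLite versions lack RIGHT JOIN, so only the left-hand variants are shown):

```python
import sqlite3

# Hypothetical tables standing in for tableA/tableB above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (k INTEGER, a TEXT);
    CREATE TABLE tableB (k INTEGER, b TEXT);
    INSERT INTO tableA VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO tableB VALUES (2, 'b2'), (3, 'b3');
""")

# left join = all of A, with B's columns filled in where they match
left = conn.execute(
    "SELECT A.k, B.k FROM tableA A LEFT JOIN tableB B ON A.k = B.k ORDER BY A.k"
).fetchall()

# inner join = only the common part of A and B
inner = conn.execute(
    "SELECT A.k, B.k FROM tableA A INNER JOIN tableB B ON A.k = B.k"
).fetchall()

# A minus the common part: left join filtered to unmatched rows
only_a = conn.execute(
    "SELECT A.k FROM tableA A LEFT JOIN tableB B ON A.k = B.k WHERE B.k IS NULL"
).fetchall()

print(left)    # [(1, None), (2, 2)]
print(inner)   # [(2, 2)]
print(only_a)  # [(1,)]
```

Row 1 of tableA has no partner in tableB, so it survives the left join with a NULL, drops out of the inner join, and is exactly what the anti-join (`WHERE B.k IS NULL`) returns.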

Slick (3) – connection pool

Before going into detail about connection pooling in Slick, we need to be clear about two things. First, Slick does not provide a connection pool implementation of its own. Second, JDBC does not provide an asynchronous database driver for the JVM. So we need a thread pool in a Play Framework application to help improve performance.

1. Add connection pool to Slick:

Every Database object contains an AsyncExecutor that manages the thread pool for asynchronous execution of database I/O actions. Its size is the main parameter to tune for the best performance of the Database object. It should be set to the value that we would use for the size of the connection pool in a traditional, blocking application. Slick's default connection pool is HikariCP, a "zero-overhead" production-quality connection pool.

import play.api.db.DB
import play.api.Play.current
import slick.driver.MySQLDriver.simple._

object Contexts {
  // Build a Database on top of Play's pooled DataSource and run f
  // inside a session; the placeholders configure the AsyncExecutor.
  def withSession[Res](f: Session => Res) = {
    Database.forDataSource(DB.getDataSource(),
      AsyncExecutor(<executor_name>, <num_threads>, <queue_size>)) withSession f
  }
}

2. connection pool configuration

There are some important default settings we need to know about. Of course, we can override them to tune performance.

# HikariCP connection pool configure
play.db.hikaricp.connectionTimeout=30 seconds
play.db.hikaricp.maxLifetime=30 minutes
play.db.hikaricp.maximumPoolSize=10

3. thread pool vs connection pool

The dependency between these two pool settings lies in their maximum attributes. Assume your database can support a maximum of 50 connections; then it is straightforward to set the connection pool maximum to 50. But how does this setting affect the thread pool? Suppose the application server thread pool is configured with a maximum of 100. The application server will then allow 100 requests to be processed simultaneously, but the database has only 50 connections. Only 50 threads will get a connection; the other 50 will fight over the remaining ones, so connections will be switched between threads frequently, which slows down overall application performance.

Just adding database connections will not improve performance, and can compromise availability because of the need to reestablish all the connections during failover.

Now assume we set the thread pool maximum to the same value as the connection pool maximum, 50. The application server will allow 50 threads to run simultaneously, and the remaining requests will wait. All 50 threads get a database connection immediately, so they are processed quickly.

The example above assumes each thread uses one database connection at a time (possibly several, but sequentially); if each thread in your application uses two database connections in parallel, configure the thread pool maximum as half of the connection pool maximum.

Setting the thread pool size too large can cause performance problems: with too many concurrent threads, task-switching overhead becomes a serious bottleneck, and any threads beyond the number of available connections just contend for connections. Setting it smaller than the connection pool simply leaves some connections unused.
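The contention described above is easy to simulate: model the connection pool as a semaphore shared by more threads than there are connections (the sizes below are made-up illustration values):

```python
import threading
import time

POOL_SIZE = 5    # pretend the database allows 5 connections
THREADS = 20     # application threads competing for them

pool = threading.Semaphore(POOL_SIZE)
concurrent = 0   # threads currently holding a "connection"
peak = 0
lock = threading.Lock()

def task():
    global concurrent, peak
    with pool:                      # acquire a connection (or wait)
        with lock:
            concurrent += 1
            peak = max(peak, concurrent)
        time.sleep(0.01)            # pretend to run a query
        with lock:
            concurrent -= 1

threads = [threading.Thread(target=task) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak)   # never exceeds POOL_SIZE
```

However many threads we start, at most POOL_SIZE of them hold a connection at any moment; the other 15 just wait, which is exactly the switching and contention overhead described above.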

4. number of processor core vs the size of a thread pool

  • More threads mean more memory usage. Each thread requires a thread stack. For recent HotSpot JVMs, the minimum thread stack size is 64KB, and the default can be as much as 1MB. That can be significant. In addition, any thread that is alive is likely to own or share objects in the heap whether or not it is currently runnable, so it is reasonable to expect that more threads mean a larger memory working set.
  • A JVM cannot have more threads actually running than there are cores (or hyperthreaded cores) on the execution hardware.
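Putting the two observations together, a rough sizing sketch (the 90%-waiting figure below is an illustrative assumption, not a measured value):

```python
import os

cores = os.cpu_count() or 1

# CPU-bound work cannot usefully run more threads than cores.
cpu_bound_size = cores

# For threads that mostly block on the database, a commonly cited rule of
# thumb is cores * (1 + wait_time / compute_time). Assume 90% waiting:
wait_ratio = 9  # wait_time / compute_time (assumed)
io_bound_size = cores * (1 + wait_ratio)

# Rough stack-memory cost at the default 1 MB per thread stack:
stack_mb = io_bound_size * 1

print(cpu_bound_size, io_bound_size, stack_mb)
```

The point is not the exact numbers but that both limits, cores and memory, grow with the thread count, so "more threads" is not free.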

5. Involve connection pool to database querying

def delete(userID: Int): Boolean = Contexts.withSession { implicit session =>
  val users = TableQuery[Users]
  val filterInfo = users.filter(_.id === userID)

  if (filterInfo.exists.run) {
    filterInfo.delete
    true
  } else false
}

6. Reuse session within one logic

The more sessions we open, the faster the connection pool is used up. So within one piece of logic, we suggest passing the session around to accomplish a single purpose. To reuse a session, we pass it as an implicit parameter.

def save(traitInfo: TraitInfo)(implicit session: Session): Int = {
  val traits = TableQuery[Traits]
  (traits returning traits.map(_.id)) += traitInfo
}

7. Additional important things to know

Some Concepts on Slick we need to know/understand:

  1. For every database session (withSession blocks), Slick is opening a new connection to the database server.
  2. Session Handling:
    1. The Database object’s withSession method creates a Session, passes it to a given function and closes it afterward. If we use a connection pool, closing the Session returns the connection to the pool.
    2. Only the method actually executing the query in the database requires a Session. The executing methods are made available via implicit conversions.
    3. It is not recommended to handle the lifetime of a Session manually.
  3. All methods that execute a query take an implicit Session value. Of course, you can also pass a session explicitly if you prefer.

Basic Knowledge on connection pool:

  1. Connection pools are common in network applications, and they're often tricky to implement. For example, it is often desirable to have timeouts on acquisition from the pool, since various clients have different latency requirements.
  2. Pools are simple in principle: we maintain a queue of connections and we satisfy waiters as they come in. With additional synchronization primitives this typically involves keeping two queues: one of waiters (when there are no connections), and one of connections (when there are no waiters).
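The two-queue idea maps naturally onto a blocking queue, which keeps the waiters for us. A minimal sketch (toy code, not production-ready) with an acquisition timeout:

```python
import queue

class ConnectionPool:
    """Toy pool: a queue of connections; waiters block on get()."""

    def __init__(self, make_conn, size):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(make_conn())

    def acquire(self, timeout=None):
        try:
            # Blocks while the pool is empty, up to the timeout.
            return self._q.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no connection available")

    def release(self, conn):
        self._q.put(conn)

# object() stands in for a real connection factory.
pool = ConnectionPool(make_conn=object, size=2)
c1 = pool.acquire()
c2 = pool.acquire()
try:
    pool.acquire(timeout=0.05)      # pool exhausted -> times out
except TimeoutError as e:
    print(e)
pool.release(c1)                    # hand the connection back
```

When the pool is empty, callers queue up inside `get()`; when a connection is released, the longest-waiting caller is served first, which is the waiters-queue half of the description above.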

8. Todo In Future:

There are some points we can optimize.

  1. Slick uses prepared statements wherever possible but it does not cache them on its own. We should therefore enable prepared statement caching in the connection pool’s configuration and select a sufficiently large pool size.

MySQL(2)- performance monitor

Currently, I use MySQL as the database to store data. The whole architecture is built on the Play Framework. But I have run into problems such as MySQL connection timeouts, so MySQL performance has become much more important. Here I list some commands to monitor MySQL performance. I don't like installing additional tools for this; in fact, mysqladmin can already do a lot.

1. MysqlAdmin

  • check MySQL server status
    mysqladmin -u root -p ping
  • check MySQL version
    mysqladmin -u root -p version
  • check the current status of the MySQL server
    mysqladmin -u root -p status
    # Output:
    Uptime: 269161  Threads: 11  Questions: 36607  Slow queries: 0  Opens: 96  Flush tables: 1  Open tables: 70  Queries per second avg: 0.136
  • check extended status of all MySQL server variables and values
    mysqladmin -u root -p extended-status
  • see all MySQL server variables and values
    mysqladmin -u root -p variables
  • check all the running process of MySQL server
    mysqladmin -u root -p processlist
  • some useful MySQL flush commands
    mysqladmin -u root -p flush-hosts 
    mysqladmin -u root -p flush-tables 
    mysqladmin -u root -p flush-threads 
    mysqladmin -u root -p flush-logs 
    mysqladmin -u root -p flush-privileges 
    mysqladmin -u root -p flush-status
  • kill sleeping MySQL client process
    mysqladmin -u root -p processlist
    mysqladmin -u root -p kill #id,#id
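The one-line `status` output shown above is easy to turn into numbers, for example to feed a monitoring script. A parsing sketch for that exact format:

```python
import re

# The sample line from `mysqladmin status` above.
status_line = ("Uptime: 269161  Threads: 11  Questions: 36607  "
               "Slow queries: 0  Opens: 96  Flush tables: 1  "
               "Open tables: 70  Queries per second avg: 0.136")

# Keys may contain spaces, so match "<words>: <number>" pairs.
pairs = re.findall(r"([A-Za-z ]+?):\s*([\d.]+)", status_line)
status = {key.strip(): float(value) for key, value in pairs}

print(status["Slow queries"], status["Queries per second avg"])
```

From here it is one step to alert on, say, a rising `Slow queries` counter.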

2. Within MySQL

  • find slow query

    • show configuration/variables first
      show variables like "%slow%";

      “slow_query_log” and “slow_query_log_file” are important.

       show variables like 'long_query_time';


    • set long_query_time and turn on slow_query_log
      set long_query_time=1;
      set global slow_query_log='ON';
      
    • view log file
      cat /var/lib/mysql/<file_name>.log | more
    • show slow queries
      show global status like "%slow%";


  • check read/write collision or table lock check

    • table status
      show status like 'Table%';

      Table_locks_immediate shows the number of table-level locks that could be granted immediately; Table_locks_waited shows the number that required a wait.
      If Table_locks_immediate/Table_locks_waited > 5000, you'd better adopt the InnoDB engine, because InnoDB uses row-level locks while MyISAM uses table-level locks; for applications with highly concurrent writes, InnoDB is the better choice.
      If Table_locks_waited is too high, there is severe table-level lock contention.

  • check key lock which is related to index buffer usage status

    • show key buffer size
      show variables like "key_buffer_size";
      


    • show hit ratio
      show global status like "key_blocks_u%";
      

      Note: If “Key_blocks_unused” is too high, buffer size should be decreased.

    • modify key buffer size
      set global key_buffer_size=4194304;
  • find full table scan statement

    select * from sys.statements_with_full_table_scans;


  • others
    show [session|global] status like ...

    The difference between session and global is that session applies only to the current connection, while global covers the whole time the server has been running.

    # show all mysql operating info
    show status;
    # show insert times
    show status like "com_insert%";
    # show delete times
    show status like "com_delete%";
    # show query times
    show status like "com_select%";
    # show mysql service running time
    show status like "uptime";
    # show mysql connection count
    show status like "connections";
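The key buffer numbers from the commands above combine into a usage ratio. A sketch with made-up sample values (1024 bytes is MySQL's default key_cache_block_size; check your own with SHOW VARIABLES):

```python
# Sample values as they might come from SHOW VARIABLES / SHOW STATUS.
key_buffer_size = 4194304        # bytes, the value set earlier
key_cache_block_size = 1024      # MySQL default block size
key_blocks_unused = 3000         # from SHOW GLOBAL STATUS

# Fraction of the key buffer actually in use.
used_fraction = 1 - (key_blocks_unused * key_cache_block_size) / key_buffer_size
print(f"key buffer in use: {used_fraction:.0%}")
```

A persistently low used fraction (many unused blocks) is the signal, mentioned above, that the buffer could be shrunk.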

3. MySQL Health Check

http://www.fromdual.com/mysql-database-health-check is an online website which helps you to check MySQL status.


4. Related links worth reading

MySQL(1)- Index Optimization

  • Table Schema

This is the table layout. There are five fields.


Here is the command I use to write the data.

MonthlyMovie.objects.filter(MovieID=movieid, Time=itime).update(Count=clicktime)

Now my requirement is to load a large amount of data into this table, almost 700,000 rows.

At first, I wanted to optimize each field. For example, I tried to use smaller field types: int is better than char. Meanwhile, I used constants to define each field's length. But that kind of optimization didn't have a great influence on efficiency. Then I tried creating indexes on the table. Different indexes on different fields gave varying results.

  • Experiments

I do some experiments on it.

solution 1: idx1: MovieID, Time, Count; idx2: MovieID, Time, Rating
time: 846.411153
solution 2: idx1: Time, MovieID, Count; idx2: Time, MovieID, Rating
time: 855.032292
solution 3: idx1: Count; idx2: Rating; idx3: MovieID, Time
time: 840.703046
solution 4: idx1: Count; idx2: MovieID, Time
time: 826.489348
  • Conclusions:

    • focus on query
    • focus on sort
    • don’t create unused index
    • understand composite index
  • Tips:

    • Indexes are not just for primary keys or unique keys. If there are columns in your table that you will search on, you should almost always index them.
    • The rule also applies to partial searches like “last_name LIKE ‘A%'”. When searching from the beginning of the string, MySQL is able to use the index on that column.
    • Some searches can’t use the regular indexes. For instance, when searching for a word, as in “WHERE post_content LIKE ‘%apple%'”, you will not see a benefit from a normal index.
    • If your application contains many JOIN queries, you need to make sure that the columns you join on are indexed in both tables.
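The difference between the two LIKE patterns is visible in the query plan. A sketch using SQLite from Python (table and index names are made up; MySQL's EXPLAIN shows the same distinction):

```python
import sqlite3

# Toy table standing in for the examples above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (last_name TEXT, post_content TEXT);
    CREATE INDEX idx_last_name ON posts(last_name);
""")

def plan(sql):
    """Return the human-readable plan details for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

# Search anchored on an indexed column: the planner can use the index.
p1 = plan("SELECT * FROM posts WHERE last_name = 'Ann'")

# A substring search cannot use a normal index: full table scan.
p2 = plan("SELECT * FROM posts WHERE post_content LIKE '%apple%'")

print(p1)   # mentions idx_last_name
print(p2)   # a plain scan of the table
```

Checking the plan like this, before and after adding an index, is the quickest way to confirm whether a query actually benefits.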