Scala (17): JDBC connection pool with Hive

JDBC means Java Database Connection. So when we need to read data from database, we use JDBC to get connection between application layer and database layer. In fact, we can directly connect with database by JDBC, why we need to use connection pool. The benefit of it is that a connection pool is a cache of database connections maintained so that the connections can be used when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database.

Benefit

Dynamic web pages without connection pooling open connections to database services when they are needed and close them when the page is done servicing a particular request. Pages that use connection pooling instead maintain open connections in a pool. When the page requires access to the database, it simply uses an existing connection from the pool, and establishes a new connection only if no pooled connections are available. This reduces the overhead associated with connecting to the database to service individual request. To summarize, there are three advantages:

  1. In connection pool, connection or set of connection objects are created single time (usually at the start of the application)
  2. The connections can be reused when future requests to the database are required.
  3. It can enhance the performance of executing commands on a database.

Here our target database is Hive, first we need to start hive. Its default port is 10000.

./hive --service hiveserver -p 10002
Starting Hive Thrift Server

So you will see that you already start hiveserver on 10002, and then you can use java proxy to connect hiveserver.

There are several methods to do this thing; here we only introduce two ways, BoneCP and c3p0.

BoneCP

The reason why we use BoneCP is that Play 2.0 JDBC datasource is managed by it.

Configuration

You need to go build.sbt to add “jdbc”.

libraryDependencies ++ = Seq (
  jdbc,
  "mysql" % "mysql-connector-java" % "5.1.31",
  "org.apache.hive" % "hive-jdbc" % "0.12.0"
)

And then go to conf/application.conf to modify database engine connection properties.

db.orders.driver=org.h2.Driver
db.orders.url="jdbc:h2:mem:orders"
db.customers.driver=org.h2.Driver
db.customers.url="jdbc:h2:mem:customers"

Accessing the JDBC datasource

import play.api.db._
DB.withConnection("order") { conn =>
  // TODO
}
DB.withConnection("customers") { conn =>
  // TODO
}

If you use default configuration, you only need to use “DB.withConnection” to get connection.

c3p0

Configuration

The build.sbt part is partially same with BoneCp. We need to add one more.

"com.mchange" % "c3p0" % "0.9.2.1"

Accessing

import com.mchange.v2.c3p0.ComboPooledDataSource
// implement a datasource
val cpds = new ComboPooledDataSource()
cpds.setDriverClass(...)
cpds.setJdbcUrl(...)
cpds.setUser(...)
cpds.setPassword(...)
cpds.setInitialPoolSize(...)
// when number of connections is over than this value, you only can wait for the connection until other connection is free. 
cpds.setMaxPoolSize(...)
cpds.setMinPoolSize(...)
cpds.setMaxStatements(...)
// get a connection
cpds.getConnection
cpds.getJdbcUrl
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s