Mercury Distributed Cache for Oracle

Based on the same premise asWaffle Grid for MySQL andInfinite Cache for EnterpriseDB, Mercury gives database and system administrators the ability to improve Oracle performance by reducing the amount of disk-related I/O for reads using a unique, cluster-oriented distribution method.

What is Mercury?
Designed as a distributed exclusive main-memory cache management system, Mercury supplements the Oracle buffer cache with optimized network-based data block storage to reduce the amount of read-related disk I/O which must be performed when working with large-scale databases.

In short, if you’ve read a block from disk once, and it has already been flushed out of the Oracle buffer cache, it becomes managed by Mercury. If you request that block again at a later time, such as for an end-of-day batch processing job, Mercury will deliver Oracle the blocks it needs without ever having to go to disk.

Why would I want to use it?
If you have a large database, a memory-limited system, databases which can not easily be sharded, or you simply want to make use of memory in additional servers (such as a standby node), Mercury can improve database performance by reducing repetitive read-based disk I/O.

What versions of Mercury are there?
Mercury Express (the version released), Mercury Standard, and Mercury Enterprise. Both Mercury Standard and Enterprise require a relink of Oracle.

What Oracle Database platforms are supported?
Linux x86 and x86_64 are the only database platforms currently supported.

What memcached platforms are supported?
Any memcached system can be used as part of the cache cluster.  Similarly, this also includes third-party systems which support the memcached protocol such as Tokyo Tyrant and appliances from Schooner orGear6.

What versions of Oracle does Mercury work with?
I develop against Standard and Enterprise Editions of 10gR2 with basic testing being performed with 11g. Specific testing of Mercury Express Edition has been performed with Oracle XE for the purposes of this blog post.

What kind of performance improvement can I expect?
You will only see an improvement from Mercury when working with a database larger than memory (if O_DIRECT is disabled) or larger then the Oracle buffer cache (if O_DIRECT is enabled). For example, running a 6500 warehouse DBT-2 OLTP workload on a system with 64GB of RAM showed a 34% increase in performance after adding another 32GB of RAM on a remote server via Mercury.

DBT-3 improvements show anywhere between 5% and 43% depending on the configuration.

Does Mercury go to the network for every block request?
As Mercury has been designed to account for network overhead and minimize the number of network requests, it only goes to the network when necessary.

First, Mercury makes use of a local Bloom Filter to reduce unnecessary network requests on block read operations which have not yet been cached. Second, Mercury is able to utilize some of the latest memcached advancements, such as multi-get, to satisfy multi-block reads. Lastly, because Mercury tracks all resource usage and response time information, it can adapt to changing conditions by varying cache expiration and switching between local I/O or distributed cache depending on available system resources.

Initial testing has also begun for using InfiniBand RDMA for block transfer, which will also improve performance.

Does Mercury cache all Oracle blocks?
No. Currently, Mercury only tracks Oracle data files and caches specific block types.

Is Mercury like Exadata?
No.

Is Mercury like RAC?
No.  Unlike RAC, which has the ability to allow multiple active nodes to share database blocks, Mercury is designed for a single database node to utilize multiple nodes for cache-only purposes.  But if it helps you to visualize block transfers, you can look at a RAC diagram for Cache Fusion and take out all of the transaction-specific before-image/after-image stuff (i.e., the majority of Cache Fusion); it’s kinda like that.

Can I track Mercury statistics?
Yes! Mercury makes use of a shared, IPC-based statistics and configuration memory context which you can access using the Mercury Control utility (mercuryctl -p). Very soon, you’ll also be able to access this data from within Oracle via a SQL query.

Data collected includes the number of memcached gets/sets/hits/misses as well as the number of bytes sent/received, the Oracle PIDs using Mercury, and a wait event history table. NOTE: the wait events table in Mercury Express is defined at compile-time to hold only 50 records in a circular buffer. Standard and Enterprise editions have the ability to persist the data as it is flushed out of the buffer.

What happens to Mercury when Oracle shuts down?
One of two things can happen depending on the configuration.

By default, nothing will happen to the shared memory block when Oracle is shut down. But, it will be invalidated at Oracle startup by checking the Kernel Cache Recovery File Header. The second option is to invalidate the shared memory context when the last process detaches from it.

Can I run this in production?
NO! The current release of Mercury is a proof-of-concept with bugs and potential data integrity issues. Feel free to play with it, but use it at your own risk.

Is this supported by Oracle?
NO! This is an alpha-release of a hobby project which is most certainly not supported by Oracle. Do not use this on a production Oracle environment!

What license is Mercury released under?
A proprietary EULA that gives you the ability to play with it.

What programming language is Mercury written in?
None other than C and assembly!

Do you have access to Oracle’s source code?
No, but I someday hope to work on it!

How did you add this to Oracle without their source code?
Because I do not have access to the Oracle Database source code, almost everything Mercury does has to be performed at a level below the Oracle Kernel using function interposing.

However, because the version of Mercury for Oracle Standard/Enterprise is more closely coupled to the Oracle Database, I have replaced several Oracle kernel functions with my own variants which retain the same functionality (by calling Oracle’s own functions), but allow me to access the information needed. This is why a relink of Oracle is required for the Standard and Enterprise versions.

How did you know how to do this?
In addition to the fairly sizable amount of Oracle Database internals knowledge required to do this in a fairly safe manner, I had already written something similar to this for PostgreSQL as well as a couplecopy-on-write systems based on function interposing.

Why did you do this?
I was bored.

Can I get the source code?
No.

How do I install it?
Download Oracle Database 10g Express Edition (XE) and install is as follows (on RHEL):

[root@rhel53vm array0]# rpm -ivh oracle-xe-univ-10.2.0.1-1.0.i386.rpm
Preparing...                ########################################### [100%]
   1:oracle-xe-univ         ########################################### [100%]
Executing Post-install steps...
groupadd: group dba exists
You must run '/etc/init.d/oracle-xe configure' as the root user to
configure the database.

[root@rhel53vm array0]# /etc/init.d/oracle-xe configure

Oracle Database 10g Express Edition Configuration
-------------------------------------------------
This will configure on-boot properties of Oracle Database 10g Express
Edition.  The following questions will determine whether the database should
be starting upon system boot, the ports it will use, and the passwords that
will be used for database accounts.  Press <Enter> to accept the defaults.
Ctrl-C will abort.
Specify the HTTP port that will be used for Oracle Application Express [8080]:
Specify a port that will be used for the database listener [1521]:
Specify a password to be used for database accounts.  Note that the same
password will be used for SYS and SYSTEM.  Oracle recommends the use of
different passwords for each database account.  This can be done after
initial configuration:
Confirm the password:

Do you want Oracle Database 10g Express Edition to be started on boot (y/n) [y]:
Starting Oracle Net Listener...Done
Configuring Database...Done

Starting Oracle Database 10g Express Edition Instance...Done
Installation Completed Successfully.

To access the Database Home Page go to "http://127.0.0.1:8080/apex"

[root@rhel53vm array0]# ORACLE_HOME=/usr/lib/oracle/xe/app/oracle/product/10.2.0/server && export ORACLE_HOME && $ORACLE_HOME/bin/sqlplus system@XE

SQL*Plus: Release 10.2.0.1.0 - Production on Sun Mar 29 09:32:40 2009

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Enter password:

Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

SQL> EXEC DBMS_XDB.SETLISTENERLOCALACCESS(FALSE);

PL/SQL procedure successfully completed.

SQL> quit
Disconnected from Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

Install Mercury Express:

[root@rhel53vm array0]# tar jxf mercury-0.1.2.tar.bz2
[root@rhel53vm array0]# patch -p0 < /etc/init.d/oracle-xe mercury-0.1.2/etc/oracle-xe.patch
[root@rhel53vm array0]# cp mercury-0.1.2/dbs/mercuryXE.ora $ORACLE_HOME/dbs
[root@rhel53vm array0]# cp mercury-0.1.2/lib/* $ORACLE_HOME/lib
[root@rhel53vm array0]# cp mercury-0.1.2/bin/* $ORACLE_HOME/bin

Restart Oracle:

[root@rhel53vm array0]# /etc/init.d/oracle-xe restart

Check the Mercury stats:

[root@rhel53vm array0]# $ORACLE_HOME/bin/mercuryctl -p

You should see something similar to the following:

[root@rhel53vm array0]# $ORACLE_HOME/bin/mercuryctl -p

Mercury Express: Release 0.1.2 - Alpha

Copyright (c) 2008,2009 Jonah H. Harris.  All rights reserved.

MERCURY STATS
key = 0x037a839f

SHARED CONFIGURATION
cfgVersionNumber = 0

MEMCACHED STATS
mcSetCount  = 49
mcGetCount  = 0
mcHitCount  = 0
mcMissCount = 0

NETWORK STATS
bytesSent     = 0
bytesReceived = 0

PROCESS STATS
attachedPidCount = 22
attached PID List:
  4308
  4310
  4312
  4314
  4316
  4318
  4320
  4322
  4324
  4326
  4328
  4330
  4332
  4334
  4336
  4338
  4340
  4346
  4356
  4358
  5549
  5593

WAIT EVENT HISTORY

PID   WAIT EVENT                       P1         P2         P3 WAITED (us)
----- ------------------------ ---------- ---------- ---------- -----------
5549  memcached: set                    0          0          0           0
5549  memcached: connect                0          0          0           0
4322  disk: data block read           508  234356736       8192           0
4322  disk: data block read           508  218931200       8192           0
5527  memcached: set                    0          0          0           0
5527  memcached: connect                0          0          0           0
5505  memcached: set                    0          0          0           0
5505  memcached: connect                0          0          0           0
4710  disk: data block read           612    7028736       8192           0
4710  disk: data block read           612    7020544       8192           0
4710  disk: data block read           612    7036928       8192           0
4710  disk: data block read           612    8617984       8192           0
4710  disk: data block read           612    8552448       8192           0
4710  disk: data block read           612    8470528       8192           0
4710  disk: data block read           612    8462336       8192           0
4710  disk: data block read           612    8478720       8192           0
4710  disk: data block read           611   56532992       8192           0
4710  disk: data block read           611  160071680       8192           0
4710  disk: data block read           611   56541184       8192           0
4710  disk: data block read           612   36405248       8192           0
4710  disk: data block read           612    7897088       8192           0
4710  disk: data block read           612    7110656       8192           0
4710  disk: data block read           612    7634944       8192           0
4710  disk: data block read           612    7241728       8192           0
4710  disk: data block read           612    6979584       8192           0
4710  disk: data file open            613          0          0           0
4710  disk: data block read           612    6504448       8192           0
4710  disk: data block read           612    6496256       8192           0
4710  disk: data block read           612    6512640       8192           0
4710  disk: data block read           612    6414336       8192           0
4710  disk: data block read           612    6455296       8192           0
4710  disk: data block read           611   15867904       8192           0

Download

Click here to download .