Hibernate: Second Level Cache – EhCache

Posted: July 23, 2012 in Hibernate, Techilla

Hibernate and performance considerations: these 2 items are like twins joined at the hip. And a 2nd level cache is probably the first step towards happy customers and a fast application!

We will go over the following things in this post:

1. A brief introduction to the different caches available and why we chose EhCache.

2. Hibernate 2nd level cache implementation with EhCache, with a small application.

3. Detailed differences between the various caching strategies: read-only, nonstrict-read-write, read-write and transactional.

4. Ways to clean up the 2nd level cache.

5. Query cache.

Introduction

There are 2 cache levels in Hibernate:

  • 1st level or the session cache.
  • 2nd level or the SessionFactory level Cache.

The 1st level cache is mandatory and is taken care of by Hibernate itself.

The 2nd level cache can be implemented via various 3rd party cache implementations. We shall use EhCache to implement it; other possibilities include SwarmCache and OSCache.

Why EhCache?

In pre-3.2 Hibernate releases, EhCache is the default cache provider.

EhCache has a really vibrant development community, and believe me, that's an important consideration before choosing any open source project/tool. We don't want to be stuck midway in a project hunting for answers from a development community which doesn't answer queries or track bugs.

As implied earlier, the 'second-level' cache exists as long as the session factory is alive. It holds the 'data' for all properties and associations (and collections, if requested) of individual entities that are marked to be cached.

It is possible to configure a cluster or JVM-level (SessionFactory-level) cache on a class-by-class and collection-by-collection basis.

As a side-note, to improve on the N+1 selects, 2nd level cache is also used, though a better approach is obviously to improve the original query using the various fetch strategies.

Application Overview

Let's have an application structure like the one below. We have a state where some patients have to be transferred from their homes to hospitals. Several organizations have voluntarily decided to help with this. Each organization has several volunteers. The volunteers can be either drivers or caregivers who help in transporting the patients. Now, the entire state is split into regions so that volunteers can pick and choose the regions they want to serve in (perhaps close to home, etc.).

To summarize,

  • 1 Organization will have m volunteers.
  • 1 volunteer can be either Driver / Caregiver
  • 1 volunteer will be linked to m regions
  • 1 region will be linked to n volunteers

So, Org: Volunteer = 1 : m

Volunteer : Region = m:n

Hibernate 2nd level cache implementation with EhCache:

Step 1

Download ehcache-core-*.jar from http://ehcache.org/ and add it to your classpath. We also need an ehcache.xml on the classpath to override the default settings.

Hibernate Version: 3.6

Step 2

Sample ehcache.xml (to be put in classpath)



<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:noNamespaceSchemaLocation="ehcache.xsd" updateCheck="true"
	monitoring="autodetect" dynamicConfig="true">

	<defaultCache
		maxElementsInMemory="100000"
		eternal="false"
		timeToIdleSeconds="1000"
		timeToLiveSeconds="1000"
		overflowToDisk="false"
		/>

</ehcache>

Step 3

Enable EhCache in our hibernate.cfg.xml:

<property name="hibernate.cache.region.factory_class">net.sf.ehcache.hibernate.EhCacheRegionFactory</property>
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.use_query_cache">true</property>

Note: Prior Hibernate versions will require different hibernate properties to be enabled.

As seen above, we have the second level cache and the query cache both enabled.

The second level cache stores the entities/associations/collections (on request). The query cache stores query results in a key-based format, where the values are actually held in the 2nd level cache. So the query cache is useless without a 2nd level cache.
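As a rough illustration of this key-based layout (plain Java; the class and field names below are mine, not Hibernate's internals), the query cache can be pictured as a map from a query-plus-parameters key to a list of identifiers, which are then resolved against the entity cache:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: the query cache stores identifiers, while the
// entity state lives in the second-level cache.
public class QueryCacheSketch {
    static Map<String, List<Long>> queryCache = new HashMap<>();   // query+params -> ids
    static Map<Long, String> secondLevelCache = new HashMap<>();   // id -> entity state

    static List<String> lookup(String queryKey) {
        List<Long> ids = queryCache.get(queryKey);
        if (ids == null) return null;                 // query cache miss -> run SQL
        List<String> entities = new ArrayList<>();
        for (Long id : ids) {
            String state = secondLevelCache.get(id);
            if (state == null) return null;           // entity missing -> back to the DB
            entities.add(state);
        }
        return entities;
    }

    public static void main(String[] args) {
        secondLevelCache.put(421L, "org2");
        queryCache.put("from Organization", List.of(421L));
        System.out.println(lookup("from Organization")); // hit, resolved via entity cache
        secondLevelCache.remove(421L);                   // 2nd level entry evicted
        System.out.println(lookup("from Organization")); // miss: ids alone are useless
    }
}
```

This makes it obvious why enabling the query cache without the second-level cache buys nothing: the cached identifiers cannot be resolved.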

Recall that we can put cache strategies for both classes and collections.

Step 4

Enable the cache at class level


<class name="com.spring.model.Region" table="region">
	<cache usage="read-only" />
	<!-- other properties -->
</class>

Difference between Cache strategies in detail

usage (required) specifies the caching strategy: transactional, read-write, nonstrict-read-write or read-only

Straight from the Hibernate API below: (The explanation comes below though :)

Strategy: read only(usage=”read-only”)

  • If your application needs to read, but not modify, instances of a persistent class, a read-only cache can be used.
  • Simplest and best performing strategy.
  • Safe for use in a cluster.

Note: We shall see later that Read-Only cache allows for insertions but no updates/deletes.

For our Region persistent class above, we used the read-only cache strategy: regions are inserted directly into the database, never modified from the UI, so we can safely say that changes will not be made to the cached data.
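A minimal sketch of that contract (plain Java, not Hibernate's actual ReadOnlyCache class): loading and inserting populate the cache, while the lock() call that precedes an update or delete is refused outright:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the read-only strategy: reads and inserts are fine,
// but the lock() entry point used for updates/deletes throws.
public class ReadOnlyCacheSketch {
    private final Map<Object, Object> cache = new HashMap<>();

    public Object get(Object key) { return cache.get(key); }

    // populating the cache on load is allowed
    public void putFromLoad(Object key, Object value) { cache.put(key, value); }

    // new rows may be cached after insert, so inserts are allowed too
    public void afterInsert(Object key, Object value) { cache.put(key, value); }

    // lock() precedes update/delete; a read-only cache refuses it
    public Object lock(Object key) {
        throw new UnsupportedOperationException("Can't write to a readonly object");
    }
}
```

This mirrors the behavior noted above: with usage="read-only", attempting to update or delete a cached entity fails, while inserts go through.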

Strategy: nonstrict read/write(usage=”nonstrict-read-write”)

  • Caches data that is sometimes updated without ever locking the cache.
  • If concurrent access to an item is possible, this concurrency strategy makes no guarantee that the item returned from the cache is the latest version available in the database. Configure your cache timeout accordingly! This is an “asynchronous” concurrency strategy.
  • In a JTA environment, hibernate.transaction.manager_lookup_class has to be set

For example, for WebLogic: hibernate.transaction.manager_lookup_class=org.hibernate.transaction.WeblogicTransactionManagerLookup

  • For non-managed environments, the transaction should be closed when session.close() or session.disconnect() is invoked.
  • This is slower than read-only but obviously faster than the next one (read-write).

Strategy: read/write(usage=”read-write”)

  • Caches data that is sometimes updated while maintaining the semantics of “read committed” isolation level. If the database is set to “repeatable read”, this concurrency strategy almost maintains the semantics. Repeatable read isolation is compromised in the case of concurrent writes. This is an “asynchronous” concurrency strategy.
  • If the application needs to update data, a read-write cache might be appropriate.
  • This cache strategy should never be used if serializable transaction isolation level is required. In a JTA environment, hibernate.transaction.manager_lookup_class has to be set

For eg. hibernate.transaction.manager_lookup_class=org.hibernate.transaction.WeblogicTransactionManagerLookup

Strategy: transactional

  • Support for fully transactional cache implementations like JBoss TreeCache.
  • Note that this might be a less scalable concurrency strategy than ReadWriteCache. This is a “synchronous” concurrency strategy
  • Such a cache can only be used in a JTA environment and you must specify hibernate.transaction.manager_lookup_class.
  • Note: This isn't available for the EhCache singleton (it is available with a cache server: Terracotta).

Now, if you cannot understand the differences between nonstrict R/W vs R/W vs transactional very well from the above, I don't blame you, as I was in the same boat earlier. Let's delve a bit deeper into the cache workings, shall we?

Basically, two different cache implementation patterns are provided for:

  • A transaction-aware cache implementation might be wrapped by a “synchronous” concurrency strategy, where updates to the cache are written to the cache inside the transaction.
  • A non transaction-aware cache would be wrapped by an “asynchronous” concurrency strategy, where items are merely “soft locked” during the transaction and then updated during the “after transaction completion” phase;

Note: The soft lock is not an actual lock on the database row – only upon the cached representation of the item. In a distributed cache setup, the cache provider should have a cluster wide lock, otherwise cache correctness is compromised.

In terms of entity caches, the expected call sequences for Create / Update / Delete operations are:

DELETES :

  1. lock(java.lang.Object, java.lang.Object)
  2. evict(java.lang.Object)
  3. release(java.lang.Object, org.hibernate.cache.CacheConcurrencyStrategy.SoftLock)

UPDATES :

  1. lock(java.lang.Object, java.lang.Object)
  2. update(java.lang.Object, java.lang.Object, java.lang.Object, java.lang.Object)
  3. afterUpdate(java.lang.Object, java.lang.Object, java.lang.Object,org.hibernate.cache.CacheConcurrencyStrategy.SoftLock)

INSERTS :

  1. insert(java.lang.Object, java.lang.Object, java.lang.Object)
  2. afterInsert(java.lang.Object, java.lang.Object, java.lang.Object)

In terms of collection caches, all modification actions actually just invalidate the entry(s). The call sequence here is:

  1. lock(java.lang.Object, java.lang.Object)
  2. evict(java.lang.Object)
  3. release(java.lang.Object, org.hibernate.cache.CacheConcurrencyStrategy.SoftLock)

For an asynchronous cache, cache invalidation must be a two step process (lock -> release, or lock -> afterUpdate). Note, however, that lock() applies only to read-write, not to nonstrict-read-write. release() releases the lock, and update() updates the cache with the changes.

For a synchronous cache, cache invalidation is a single step process (evict, or update). Since this happens within the original database transaction, there is no locking. Eviction will force Hibernate to look into the database for subsequent queries whereas update will simply update the cache with the changes.
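To make the asynchronous pattern concrete, here is a toy sketch (plain Java; the names are mine, not Hibernate's): lock() inside the transaction soft-locks the entry so readers miss and fall through to the database, and afterUpdate() re-caches and releases in the after-completion phase:

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch of the asynchronous read-write pattern: step 1 soft-locks the
// entry inside the transaction, step 2 re-caches and releases it in the
// "after transaction completion" phase.
public class SoftLockSketch {
    private final Map<Object, Object> cache = new HashMap<>();
    private static final Object SOFT_LOCK = new Object();

    public Object get(Object key) {
        Object v = cache.get(key);
        return (v == SOFT_LOCK) ? null : v;   // locked -> miss -> caller reads the DB
    }

    // invoked inside the transaction, before the DB is updated
    public void lock(Object key) { cache.put(key, SOFT_LOCK); }

    // invoked after the transaction completes: re-cache and release in one step
    public void afterUpdate(Object key, Object newValue) {
        cache.put(key, newValue);
    }
}
```

While the entry is soft-locked, get() behaves like a cache miss, which is exactly how other sessions get "sent straight to the database" in the strategy tables below.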

Note that query result caching does not go through a concurrency strategy; query results are managed directly against the underlying cache regions.

Let's analyze what each cache strategy does, though TransactionalCache will most likely be overridden by the individual implementation (3rd party cache provider).

DELETE / Collection

lock()
  • READ-ONLY: throws UnsupportedOperationException("Can't write to a readonly object")
  • NONSTRICT-READ-WRITE: returns null, so no lock is applied
  • READ-WRITE: stops any other transaction reading or writing this item from/to the cache; sends them straight to the database instead (the lock does time out eventually). This implementation tracks concurrent locks of transactions which simultaneously attempt to write to an item.
  • TRANSACTIONAL: returns null, so no lock is applied

evict()
  • NONSTRICT-READ-WRITE: this.cache.remove(key);
  • READ-WRITE: does nothing
  • TRANSACTIONAL: this.cache.remove(key);

release()
  • NONSTRICT-READ-WRITE: this.cache.remove(key);
  • READ-WRITE: releases the soft lock on the item; other transactions may now re-cache the item (assuming no other transaction holds a simultaneous lock). But obviously, for this item there will be nothing, since it has been deleted.
  • TRANSACTIONAL: does nothing

UPDATE

lock()
  • READ-ONLY: throws UnsupportedOperationException("Can't write to a readonly object")
  • NONSTRICT-READ-WRITE: returns null, so no lock is applied
  • READ-WRITE: stops any other transaction reading or writing this item from/to the cache; sends them straight to the database instead (the lock does time out eventually). This implementation tracks concurrent locks of transactions which simultaneously attempt to write to an item.
  • TRANSACTIONAL: returns null, so no lock is applied

update()
  • NONSTRICT-READ-WRITE: evict(key), i.e. this.cache.remove(key); returns false
  • READ-WRITE: returns false
  • TRANSACTIONAL: updates the cache

afterUpdate()
  • NONSTRICT-READ-WRITE: release(key, lock), i.e. this.cache.remove(key); returns false
  • READ-WRITE: re-caches the updated state if and only if there are no other concurrent soft locks; releases our lock; returns false

INSERT

insert()
  • READ-ONLY: returns false
  • NONSTRICT-READ-WRITE: returns false
  • READ-WRITE: returns false
  • TRANSACTIONAL: updates the cache

afterInsert()
  • READ-ONLY: this.cache.update(key, value); returns true
  • NONSTRICT-READ-WRITE: returns false
  • READ-WRITE: adds the new item to the cache, checking that no other transaction has accessed the item
  • TRANSACTIONAL: returns false

When does Hibernate look into the cache, and when into the database?

Hibernate will look into the database if any of the below is true:

  1. Entry is not present in the cache
  2. The session in which we look for the entry is OLDER than the cached entry, meaning the session was opened earlier than the last cache loading of the entry. Thus the cache will be refreshed.
  3. The entry is currently being updated/deleted and the cache strategy is read-write.
  4. An update/delete has recently happened for a nonstrict-read-write which has caused the item to be evicted from the cache.
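The timestamp rule in points 1 and 2 can be sketched as a simple predicate (illustrative only; Hibernate's real bookkeeping lives in its region and timestamp caches):

```java
// Sketch of the freshness rule: a session opened at time ST only trusts a
// cache entry loaded at time CT when CT <= ST; otherwise it re-reads from the
// database (and reloads the cache with a fresh timestamp).
public class FreshnessCheck {
    static boolean isCacheUsable(long sessionOpenedAt, Long cachedAt) {
        if (cachedAt == null) return false;          // 1. entry not in the cache
        return cachedAt <= sessionOpenedAt;          // 2. entry newer than session -> DB
    }
}
```

The same predicate explains demo 3 below: a session opened before the entry was (re)loaded into the cache will bypass the cache even though the entry is physically present.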

Now, armed with the knowledge above (nonstrict R/W, or NSRW, never locks an entity while R/W does, and we know when Hibernate looks into the database), let's look at some code.

Let's have the domain objects (only associations and collections depicted):

Organization :

<set name="volSets" cascade="all" inverse="true">
	<key column="org_id" not-null="true" />
	<one-to-many class="com.spring.model.Volunteer" />
</set>	

Volunteer:

<many-to-one name="org" column="org_id"
	class="com.spring.model.Organization" not-null="true" />

<set name="regions" table="volunteer_region" inverse="false"
	lazy="true" fetch="select" cascade="none">
	<key column="volunteer_fk" not-null="true" />
	<many-to-many class="com.spring.model.Region">
		<column name="region_fk" not-null="true" />
	</many-to-many>
</set>

Region:


<set name="volunteers" table="volunteer_region" inverse="true"
	lazy="true" fetch="select" cascade="none">
	<key column="region_fk" not-null="true" />
	<many-to-many class="com.spring.model.Volunteer">
		<column name="volunteer_fk" not-null="true" />
	</many-to-many>
</set>

We will load the Organization and its set of volunteers in one transaction, then update the organization name in another transaction, and we will see the differences in action.

NonStrict R/W vs R/W

DEMO 1
organization.hbm.xml is marked with nonstrict-read-write

     <cache usage="nonstrict-read-write"/>

Java code:

System.out.println("session1 starts");
Session session1 = sf.openSession();
Transaction tx = session1.beginTransaction();
Organization orgFromSession1 = (Organization) session1.get(Organization.class, 421L);
//loaded in the cache at time t0
orgFromSession1.setOrgName("org2" + System.currentTimeMillis());
session1.save(orgFromSession1);
tx.commit(); //evicted from the cache
session1.close();
System.out.println("session1 ends");


System.out.println("session2 starts");
Session session2 = sf.openSession(); //session 2 opened at time t2
Transaction tx2 = session2.beginTransaction();
Organization orgFromSession2 = (Organization) session2.get(Organization.class, 421L);
System.out.println(orgFromSession2.getOrgName());
session2.close();
System.out.println("session2 ends");

Logs :

session1 starts
Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ from organization organizati0_ where organizati0_.org_id=?
Hibernate: update organization set version=?, org_name=? where org_id=? and version=?
session1 ends

session2 starts
Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ from organization organizati0_ where organizati0_.org_id=?
org21338618053131

	

Note:

a. In session 1, a select + an update.

b. In session 2, another select from the DB to fetch the item, since the item was evicted by the update.

DEMO 2

read-write cache enabled at organization.hbm.xml

<cache usage="read-write"  region="org_region"  />
	

Java code:

Same code as above
	

Logs:

session1 starts

Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ from organization organizati0_ where organizati0_.org_id=?
Hibernate: update organization set version=?, org_name=? where org_id=? and version=?

session1 ends

session2 starts
org21338617974053
session2 ends

	

Note:

a. In session 1, a select + an update.

b. In session 2, no selects, since there was no eviction; instead the cache was updated.

Now we shall tweak the code so that session 2 is opened just before transaction 1 commits. We shall also check whether the item actually exists in the cache. The changed code becomes:

DEMO 3

Session 2 is opened just before transaction 1 commits. Some diagnostic messages are added to check whether the item is indeed in the 2nd level cache, using sf.getCache().containsEntity(Organization.class, 421L).
Java code:

System.out.println("session1 starts");
Session session1 = sf.openSession();
Transaction tx = session1.beginTransaction();
Organization orgFromSession1 = (Organization) session1.get(Organization.class, 421L);
//loaded in the cache at time t0
orgFromSession1.setOrgName("org2" + System.currentTimeMillis());
session1.save(orgFromSession1);
System.out.println("session2 starts");
Session session2 = sf.openSession(); //session 2 opened at time t1
tx.commit();
System.out.println("Cache Contains? " + sf.getCache().containsEntity(Organization.class, 421L));

session1.close(); //reloaded in the cache at time t2, after the flush happens
System.out.println("session1 ends");
Transaction tx2 = session2.beginTransaction();
Organization orgFromSession2 = (Organization) session2.get(Organization.class, 421L); //should be from the database
System.out.println(orgFromSession2.getOrgName());
session2.close();
System.out.println("session2 ends");

	

Logs:

session1 starts
Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ from organization organizati0_ where organizati0_.org_id=?
session2 starts
Hibernate: update organization set version=?, org_name=? where org_id=? and version=?
Cache Contains? true
session1 ends
Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ from organization organizati0_ where organizati0_.org_id=?
org21338618256152
session2 ends

	

Note:

a. In session 1, a select + an update.

b. In session 2, again a select, since this session was opened before the cache was updated with the entry.

c. Note that the cache did contain the item.

The actual summary is already present in the code comments, but let's reiterate. The below is true for both nonstrict-R/W and R/W caches:

  • Whenever a session starts, a timestamp is added to it (ST).
  • Whenever an item is loaded in the cache, a timestamp is added to it (CT).
  • Now if ST < CT, meaning the session is older than the cached item, then if we look for the cached item in this older session, Hibernate will NOT look in the cache. Instead it will always look in the database, and consequently re-load the item into the cache with a fresh timestamp.

The above was demonstrated in demo 3, where we started the 2nd session before the cache was reloaded with the item. If you check the output, the item was actually present in the cache at the time of querying, yet the database was still consulted.

Summary of diff. between NS R/W and R/W

For NonStrict-Read-Write

• There’s no locking ever.

• So, while the object is actually being updated in the database, at the point of committing (till the database completes the commit), the cache has the old object and the database has the new object.

• Now, if any other session looks for the object, it will look in the cache and find the old object (DIRTY READ).

• However, as soon as the commit is complete, the object is evicted from the cache, so the next session that looks for the object will have to look in the database.

• If you execute the same code (Demo 1) with the diagnostic System.out.println("Cache Contains? " + sf.getCache().containsEntity(Organization.class, 421L)); placed before and after the tx.commit(), you will find that before the commit the cache contained the entry, and after the commit it's gone, hence forcing session2 to look in the database and reload the data into the cache.

So, nonstrict read/write is appropriate if you don't require absolute protection from dirty reads, or if the odds of concurrent access are so slim that you're willing to accept an occasional dirty read. Obviously the window for a dirty read is the time when the database has actually been updated but the object has not YET been evicted from the cache.
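A toy model of that lifecycle (plain Java, illustrative names only): there is no lock step, so a read racing the database commit sees the stale cached state, and completing the commit simply evicts the entry:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the nonstrict-read-write window: during the DB commit the cache
// still serves the old state (dirty read); after the commit the entry is
// simply evicted, forcing the next reader to the database.
public class NonstrictSketch {
    private final Map<Object, Object> cache = new HashMap<>();

    public Object get(Object key) { return cache.get(key); }
    public void putFromLoad(Object key, Object value) { cache.put(key, value); }

    // no lock() step: while the DB commit is in flight, readers still see old state
    public Object readDuringCommit(Object key) { return cache.get(key); }

    // release()/afterUpdate() equivalent: just evict
    public void afterCommit(Object key) { cache.remove(key); }
}
```

The dirty-read window is exactly the span between the database write and afterCommit() evicting the stale entry.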

For Read-Write

• As soon as somebody tries to update/delete an item, the item is soft-locked in the cache, so that if any other session tries to look for it, it has to go to the database.

• Now, once the update is over and the data has been committed, the cache is refreshed with the fresh data and the lock is released, so that other transactions can now look in the CACHE and don’t have to go to the database.

• So, there is no chance of Dirty Read, and any session will almost ALWAYS read READ COMMITTED data from the database/Cache.

Differences between R/W and Transactional Cache

Below adapted from http://clustermania.blogspot.sg/2009/07/with-read-write-hibernate-2nd-level.html (Supplemented with code examples of mine below)

We have to understand that since R/W is asynchronous, the updating of the cache happens outside the transaction (i.e. in the after-completion phase of the transaction). What happens if something goes wrong there?

How is cache transactionality/correctness maintained in the read-write caching strategy during transaction commit, rollback and failures (the so-called value proposition of the transactional cache)? Here is how:

1. When the application commits the transaction, the cache entry is soft-locked, thereby deflecting all reads for this entry to the DB.

2. Then changes are flushed to the DB and transactional resources are committed.

3. Once the transaction is committed (i.e. reflected in the DB), in the after-completion phase the cache entry is updated and unlocked. Any transaction starting after the update timestamp can now read the cache entry contents, since the lock has been released.

This is what happens in the different stages of transaction completion/failure:

  • Any time there is a lag between steps 2 and 3 (i.e. when the DB and cache are out of sync), you are using the DB to read the latest state, since the cache is still soft-locked.
  • If the transaction is rolled back, the cache entry remains locked and later reads from the DB refresh the cache entry state.
  • What if the node making the transactional change fails between steps 2 and 3 (i.e. the transaction is committed to the DB but not to the cache) and the cache state is preserved (e.g. in a clustered cache)? Is my cache left corrupted? Not really.

Since the cache entry is locked, other transactions keep reading from the DB directly. Later, Hibernate times out the locked cache entry, its contents are refreshed with the database state, and the cache entry is back in service for read/write operations.

Do you still need a transactional cache that either integrates with hibernate native transactions or JTA?

All a JTA transaction cache guarantees is cache state visibility across transactions and recoverability if any of the transaction phases fails.

With read-write you are guaranteed to read the correct state all the time. If the cache entry state becomes inconsistent because of a failure in the transaction commit phase, it is guaranteed to recover with the correct state. This is all a transactional cache guarantees, but at a higher cost (especially when reads outweigh writes).

Hibernate's read-write cache strategy makes a smart decision about reading from the cache or the database based on the cache entries. Any time the cache cannot guarantee correct contents, the application is deflected to the DB.

What are the caveats? We will test them below in our code sample.

  • The read-write cache might compromise the repeatable-read isolation level if an entity is read from the cache and its contents are later evicted from the 1st level (session) cache. If the transaction reads the same entry again from the DB later, and in the meantime another transaction updated the entry state, the current transaction will get a different state than what it read earlier.

Note: This should occur only if session cache contents are flushed; otherwise, once any entry is read from the 2nd level cache/DB, every subsequent read in the same transaction gets the state from the session cache, thereby guaranteeing the same state again and again. How many people really flush the session cache?

  • Cache entries might expire in lock mode. In lock mode, each entry is assigned a timeout value, and if an update doesn't unlock the entry within the specified timeout, the entry may be unlocked forcefully (this is done to avoid any permanent fallout of a cache entry from the cache, e.g. when a node fails before unlocking the entry). A genuinely delayed transaction might create a very small window where the cache contents are stale and other transactions read the old state. The cache entry timeout is a cache provider property and might be tunable if the provider supports it.

Note: For this to occur, the update has to be delayed and the read has to occur after the timeout; moreover, the stale window is minuscule. So the majority of applications are safe anyway.

Finally, one word of caution would be:

  • For entity types that are mostly updated and see concurrent reads and updates, the read-write caching strategy may not be very useful, as most reads will be deflected to the database.

Ok, let’s now put the caveats to test.

Testing the 1st caveat: repeatable reads might be compromised. What actually is a repeatable read? It means that if, within a transaction, you read a row at time T1 and read it again at time T2 (T2 > T1), the row shouldn't have changed. One important thing to remember is that Hibernate always looks for the object in the session first (1st level cache), and then in the 2nd level cache.

Java code:

System.out.println("session1 starts");
Session session1 = sf.openSession();
Transaction tx = session1.beginTransaction();
Organization orgFromSession1 = (Organization) session1.get(Organization.class, 96783514L);
session1.evict(orgFromSession1);
//loaded in previous step and then evicted from the session
System.out.println("Cache Contains? " + sf.getCache().containsEntity(Organization.class, 96783514L));
System.out.println("1:" + orgFromSession1.getOrgName());
//loaded from 2nd level, not session cache

{
System.out.println("session2 starts");
Session session2 = sf.openSession(); //session 2 opened at time t2
Transaction tx2 = session2.beginTransaction();
Organization orgFromSession2 = (Organization) session2.get(Organization.class, 96783514L);
//should be from the 2nd level cache
orgFromSession2.setOrgName("org " + System.currentTimeMillis());
session2.save(orgFromSession2);
tx2.commit(); //cache updated with new entry
System.out.println("inner " + orgFromSession2.getOrgName());
session2.close();
System.out.println("session2 ends");
}
System.out.println("Cache Contains? " + sf.getCache().containsEntity(Organization.class, 96783514L));

//we load the row again, from the database this time, since this session began before
//the cache update
orgFromSession1 = (Organization) session1.get(Organization.class, 96783514L);
System.out.println("2:" + orgFromSession1.getOrgName());
tx.commit();
session1.close();

	

In the above code, you will see the following pattern:

  • Session 1 loads an object (thereby putting it in the 2nd level cache too) and then removes it from the session using evict(). Note that it is still present in the 2nd level cache but has been removed from the session cache.
  • Session 2 updates the same object, retrieving it from the 2nd level cache, hence no DB queries. Once the update completes, the cache is refreshed with the new data.
  • Session 1 tries to read the same entity again, and this time it refers to the 2nd level cache, as the object has been evicted from the 1st level. Remember that the object is present in the 2nd level cache, but since this session started earlier, it will refer to the database for the object. Thus the object loaded in the first step differs from this one, and hence there are no repeatable reads. Note that the cache did contain the item, yet the database was consulted.

Logs:

session1 starts
Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ from organization organizati0_ where organizati0_.org_id=?

Cache Contains? true
1:org 1339490676715

session2 starts
Hibernate: update organization set version=?, org_name=? where org_id=? and version=?
inner org 1339490694572
session2 ends

Cache Contains? true
Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ from organization organizati0_ where organizati0_.org_id=?
2:org 1339490694572

	

Right, so we have analyzed the various caching strategies available.

To summarize:

  • A read-only cache can serve only reads and inserts; it cannot perform updates/deletes. Fastest performing.
  • A nonstrict read-write cache doesn't employ any locks, ever, so there's always a chance of dirty reads. However, it ALWAYS evicts the entry from the cache, so any subsequent session refers to the DB.
  • A read-write cache employs locks, but in an asynchronous manner: first the insert/update/delete occurs within the transaction, during which the cache entry is soft-locked and other sessions have to refer to the database. Once the transaction completes, the lock is released and the cache is updated (outside the transaction). In some cases, repeatable reads might be compromised.
  • Transactional caches update the database and the cache within the same transaction, so the cache is always in a consistent state with respect to the database.
  • For entity types that are mostly updated and see concurrent reads and updates, the read-write caching strategy may not be very useful, as most reads will be deflected to the database.

Collection Caching

Till now, we have discussed caching of individual entities. We can also cache collections. Recall that collection caching follows the same steps as a delete:

i.e. lock() -> evict() -> release()

Collections are not cached by default, and have to be cached explicitly like below:

<set name="volSets" cascade="all" inverse="true" batch-size="10">
	<cache usage="read-write" />

	<key column="org_id" not-null="true" />
	<one-to-many class="com.spring.model.Volunteer" />
</set>


Removing Entities and Collections from 2nd level cache

Whenever you pass an object to save(), update() or saveOrUpdate(), and whenever you retrieve an object using load(), get(), list(), iterate() or scroll(), that object is added to the internal cache of the Session.

When flush() is subsequently called, the state of that object will be synchronized with the database. If you do not want this synchronization to occur, or if you are processing a huge number of objects and need to manage memory efficiently, the evict() method can be used to remove the object and its collections from the first-level cache.

session.evict(orgFromSession1);
session.evict(orgFromSession1.getVolSets());

To evict all cached Organization entities from the 2nd level cache (Session.evict() only accepts an entity instance, so class-wide eviction goes through the SessionFactory's cache), we can call:

sf.getCache().evictEntityRegion(Organization.class);

To remove from the SessionFactory or the 2nd level cache:


sf.getCache().evictEntity(Organization.class, 421L); //entity

sf.getCache().evictCollection("com.spring.model.Organization.volSets", 421L); //collection

where sf is the SessionFactory. To evict every Organization's volunteer set at once, use evictCollectionRegion("com.spring.model.Organization.volSets") instead of passing an identifier.

Note: the collection role name is the fully qualified class name followed by the collection property name.


Query Cache

As mentioned earlier, we need to enable the below property in our hibernate.cfg.xml

<property name="hibernate.cache.use_query_cache">true</property>

This setting creates two new cache regions:

  • org.hibernate.cache.StandardQueryCache, holding the cached query results
  • org.hibernate.cache.UpdateTimestampsCache, holding timestamps of the most recent updates to queryable tables. These are used to validate the results as they are served from the query cache.

Note:

UpdateTimestampsCache region should not be configured for expiry at all. Note, in particular, that an LRU cache expiry policy is never appropriate.
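
In ehcache.xml this typically means configuring the two query-cache regions explicitly; a sketch follows (the element counts and the StandardQueryCache TTL are illustrative assumptions — the important part is eternal="true" on the timestamps region):

```xml
<!-- Illustrative ehcache.xml fragment; maxElementsInMemory and TTL values are assumptions -->
<cache name="org.hibernate.cache.StandardQueryCache"
       maxElementsInMemory="500"
       eternal="false"
       timeToLiveSeconds="120"/>

<!-- The timestamps region must never expire -->
<cache name="org.hibernate.cache.UpdateTimestampsCache"
       maxElementsInMemory="5000"
       eternal="true"/>
```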

Recall that the query cache does not cache the state of the actual entities; it caches only identifier values and results of value type. For this reason, the query cache should always be used in conjunction with the second-level cache for those entities expected to be cached as part of a query result (just as with collection caching).

But even then, individual queries are not cached; they need to be explicitly marked as cacheable, for both HQL and Criteria queries, by calling setCacheable(true) on the Query:

Criteria c2 = session.createCriteria(Guitar.class).setCacheable(true);

List volList = session.createQuery("from Volunteer vol join vol.regions").setCacheable(true).list();
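
A cached query can also be given its own region via setCacheRegion(), so that frequently-run queries get their own expiry settings in ehcache.xml (the region name "volunteerQueries" below is a made-up example):

```java
// Cacheable query pinned to a hypothetical named cache region
List volList = session.createQuery("from Volunteer vol join vol.regions")
        .setCacheable(true)
        .setCacheRegion("volunteerQueries")
        .list();
```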

Well, that’s it then! Whew! That was a long one indeed! If you have followed thus far, I would be delighted to hear your opinions on this post.

Comments
  1. Java1 on1 says:

    A tad long, but good job!

  2. Dmitry Bedrin says:

    What will happen to the cache if my JTA transaction is rolled back? Say I’m using nonstrict-read-write mode on entity Product:
    1) start JTA transaction
    2) em.persist(product)
    3) em.flush(); em.clear();
    4) em.find(Product.class, product.getId());
    5) throw RuntimeException and rollback JTA transaction

    Will my entity stay in L2 cache or not?

    • A nonstrict-R/W cache will always evict data as soon as the update operation is over (once you flush). However, since you do a lookup just after that, I presume the cache will be refreshed with the entity (nonstrict R/W being asynchronous). Just catch the exception and then put the line below outside the try-catch block to check if my assumption is correct:
      sf.getCache().containsEntity(your product)

      • Dmitry Bedrin says:

        @Test
        public void testL2Cache() throws Exception {

            final List holder = new ArrayList();

            transactionTemplate.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRES_NEW);
            transactionTemplate.execute(new TransactionCallbackWithoutResult() {
                @Override
                protected void doInTransactionWithoutResult(TransactionStatus status) {
                    Person p = new Person();
                    p.setName("person1");
                    em.persist(p);

                    holder.add(p.getId());

                    em.flush();
                    em.clear();

                    em.find(Person.class, holder.get(0));

                    status.setRollbackOnly();
                }
            });

            transactionTemplate.execute(new TransactionCallbackWithoutResult() {
                @Override
                protected void doInTransactionWithoutResult(TransactionStatus status) {
                    Assert.assertNull(em.find(Person.class, holder.get(0)));
                }
            });
        }

        This test fails – transaction with INSERT INTO PERSON statement is rolled back, but Person object stays in L2 cache. What a shame! How can I rely on the cache in this case?

        I’m using Hibernate & Ehache together with Spring in this test:

        hibernate properties:
        hibernate.cache.region.factory_class=net.sf.ehcache.hibernate.SingletonEhCacheRegionFactory

        spring xml:
        transactionManager=org.springframework.orm.jpa.JpaTransactionManager

        persistence.xml:
        transaction-type="RESOURCE_LOCAL"

        I’ll try it on JBoss now

      • Dmitry Bedrin says:

        Same stuff in JBoss – the instance isn’t removed from the cache though the transaction is rolled back

      • Dmitry Bedrin says:

        It seems to be a bug in hibernate:
        https://hibernate.onjira.com/browse/HHH-5690

      • Dmitry Bedrin says:

        The bad thing is that it doesn’t work for me with the TRANSACTIONAL strategy either. Do you have any ideas?

      • Tony says:

        Hello, I am using direct SQL to update a table that has a second-level cache (EhCache) with read-write concurrency, and I also tried with Transactional (annotation-based). Rows updated with direct SQL which are in the second-level cache need to be refreshed in the cache in order to be in sync with the database. Is it possible to refresh the second-level cache, or even to clear it so Hibernate will reread the underlying database?

  3. There are a few things going on here. (I’m using Hibernate 3.6.) NSRW and RW caches are both asynchronous, meaning the caches are updated in the afterCommit section of the transaction. So the L2 cache is NOT updated till the tx is committed.
    When you flush the data, the database session is updated, but it’s still not committed. So there’s nothing in the L2 cache yet.
    So after the flush, and before the commit, whatever fetching you do will always hit the database (L1 –> L2 –> database; L1 cleared, L2 not yet updated). This behaviour would be the same for both NSRW and RW.

    A small POC:

    		try {
    			session = sf.openSession();
    			tx = session.beginTransaction();
    			Organization org = new Organization("Test Org");
    			id = (Long) session.save(org);
    			System.out.println(id);
    			session.flush();//insert happens in the current session
    			session.clear(); 
    			System.out.println("Retrieving = "+session.get(Organization.class, id)); //from the database
    			System.out.println("Exists in L2? "+sf.getCache().containsEntity(Organization.class,id));//false,as tx. NOT committed yet
    			throw new HibernateException("Something bad happened");
    			//tx.commit();
    			//System.out.println("Exists in L2? "+sf.getCache().containsEntity(Organization.class,id));
    			//if it reaches here, previous stmt. would be true
    		} catch (HibernateException e) {
    			tx.rollback();
    			e.printStackTrace();
    			throw e;
    		}	finally{
    			tx2 = session.beginTransaction();
    			System.out.println("Retrieving after the rollback= "+session.get(Organization.class, id));//from the current session.
    			System.out.println("Exists in L2? "+sf.getCache().containsEntity(Organization.class,id));
    			//outcome depends on whether tx was successfully committed or not
    			tx2.commit();
    			session.close();
    			System.out.println("Session closed successfully");
    			
    		}
    

    Log output

    Hibernate: insert into organization (version, org_name, org_id) values (?, ?, ?)
    Hibernate: select organizati0_.org_id as org1_1_0_, organizati0_.version as version1_0_, organizati0_.org_name as org3_1_0_ 
    	from organization organizati0_ where organizati0_.org_id=?
    Retrieving = Organization [orgName=Test Org]
    Exists in L2? false
    Retrieving after the rollback = Organization [orgName=Test Org]
    Exists in L2? false
    
    org.hibernate.HibernateException: Something bad happened
    	at test.TestModule.testDmiCacheConundrum(TestModule.java:550)
    	at test.TestModule.main(TestModule.java:94)
    Session closed successfully
    Exception in thread "main" org.hibernate.HibernateException: Something bad happened
    	at test.TestModule.testDmiCacheConundrum(TestModule.java:550)
    	at test.TestModule.main(TestModule.java:94)
    
    

    Now, I didn’t test this with any transactional cache. But the L2 tx cache is updated within the tx (not after it’s over), and since the original tx was rolled back, I foresee the same conclusion: the database gets hit with the query, and the L2 cache ends up with no data. So it doesn’t look as if the Hibernate baby is playing truant here, what do you reckon?

    BTW, is there any particular reason why you are using flush() and then clear()? This question was theoretically interesting (thank you!), but I don’t see where this code block can work in real life. (I specifically mean the flush and clear part.)

    • Dmitry Bedrin says:

      I think you’re wrong – in NSRW, data is put into the cache during the load phase. See class org.hibernate.engine.TwoPhaseLoad, method initializeEntity():

      persister.getCacheAccessStrategy().putFromLoad(
              cacheKey,
              persister.getCacheEntryStructure().structure( entry ),
              session.getTimestamp(),
              version,
              useMinimalPuts( session, entityEntry )
      );

      I’ll try your test case now

      • Dmitry Bedrin says:

        I was using Hibernate 3.3 while you’re using Hibernate 3.6 – the TwoPhaseLoad class has changed. In 3.3 it was:

        boolean put = persister.getCacheAccessStrategy().putFromLoad(
                cacheKey,
                persister.getCacheEntryStructure().structure( entry ),
                session.getTimestamp(),
                version,
                useMinimalPuts( session, entityEntry )
        );

        While in 3.6 it’s smarter:

        // explicit handling of caching for rows just inserted and then somehow forced to be read
        // from the database *within the same transaction*. usually this is done by
        // 1) Session#refresh, or
        // 2) Session#clear + some form of load
        //
        // we need to be careful not to clobber the lock here in the cache so that it can be rolled back if need be
        if ( session.getPersistenceContext().wasInsertedDuringTransaction( persister, id ) ) {
            persister.getCacheAccessStrategy().update(
                    cacheKey,
                    persister.getCacheEntryStructure().structure( entry ),
                    version,
                    version
            );
        }
        else {
            boolean put = persister.getCacheAccessStrategy().putFromLoad(
                    cacheKey,
                    persister.getCacheEntryStructure().structure( entry ),
                    session.getTimestamp(),
                    version,
                    useMinimalPuts( session, entityEntry )
            );
        }

        So the problem is solved in new versions of Hibernate both with NSRW and RW strategies.
        It’s still reproducible with TRANSACTIONAL strategy in Spring (local transactions), but I think it should be fine if I use JTA and ehcache as a XA resource. Anyway NSRW is fine for me.

        "BTW is there any particular reason why you are using flush() and the clear(). This question was theoretically interesting"

        I’m new to L2 caching in Hibernate and was just playing with it to see how it actually works.
        In fact I don’t have em.flush();em.clear(); in my code

    • Dmitry Bedrin says:

      "BTW is there any particular reason why you are using flush() and the clear()."
      Actually this pattern is used when making batch inserts or updates using JPA – flush and clear session say every 1000 objects
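
      That pattern looks roughly like this (a sketch with plain JPA; "em" is an EntityManager inside an active transaction, and the batch size of 1000 is arbitrary):

      ```java
      // Batch inserts with periodic flush/clear to keep the persistence context small
      for (int i = 0; i < objects.size(); i++) {
          em.persist(objects.get(i));
          if (i % 1000 == 0) { // push a batch of inserts to the DB and free first-level cache memory
              em.flush();
              em.clear();
          }
      }
      ```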

      • Tony says:

        code to refresh Hibernate 2nd level cache implementation with EhCache:(Annotation based)

  4. Dmitry Bedrin says:

    Thanks a lot for this article – it’s the best explanation of cache concurrency strategies I’ve found on the Internet!

  5. Thanks Dmitry, it was an interesting discussion we had. Though regarding batch updates, I have always felt it easier and safer to rely on plain old JDBC-style batching, but new patterns are always handy.

    • Niteesh says:

      Hi Anirban,

      You have used sf, but nowhere defined the session factory. Can you publish a piece of code for that? It would be great. I am using the below approach, but I am not getting getCache() from sf.

      Session session1 = null;
      SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();
      session1 =sessionFactory.openSession(); ….. continues as in your code.

      • Hi Neetesh,
        I use Spring for my config.
        For a standalone,

        Configuration configuration = new Configuration().configure("com\\config-files\\hibernate.cfg.xml");
        SessionFactory sessionFactory = configuration.buildSessionFactory();

  6. Niteesh says:

    Thanks for the reply Anirban,

    But the issue here is that SessionFactory does not have the getCache() method you have used.

    Apart from that, can you suggest how I can demo data persistence in the absence of a DB? It would be great if you can throw some light on this. I just want to show it is able to display the data in the absence of the DB.

    Awaiting your response,
    Niteesh

    • Which version of Hibernate are you using?
      Just use some dataholder classes if you want to display some data.

      • Niteesh says:

        Anyways, I took care of that some other way. But as I said earlier, I need to test data persistence in the absence of a database. Can you suggest something on that, or a piece of code will do.

        Thanks again,
        Niteesh

  7. Melle says:

    Even for a user of NHibernate this is very useful, thanks. What are the performance differences you found between read-write and the nonstrict version? Does it only depend on the locks you describe, or does the choice of second-level cache provider also make a difference?

    • Thanks a lot Melle, I didn’t do any extensive performance testing for the different caches. But for an application with simultaneous reads and writes, obviously the read-write cache would not help much, as most of the reads would anyway be referred to the database.

  8. Viktor says:

    The changes in the TwoPhaseLoad class which were introduced in Hibernate 3.6 (they can be seen above) have broken a lot of our Spring transactional tests, because of the following:
    " if ( session.getPersistenceContext().wasInsertedDuringTransaction( persister, id ) ) {
    persister.getCacheAccessStrategy().update(…"
    This update() call causes an exception for READ_ONLY caches.
    And we can’t just test simple save and find scenarios in our tests like this:

    Person p = new Person();
    entityManager.persist(p);
    entityManager.flush();
    entityManager.clear();
    assertEquals(entityManager.find(Person.class, p.getId()), p);

    That’s really ugly, because it’s perfectly fine to insert an entity into the 2nd level cache, but Hibernate decides to perform an update without checking what type of cache is used.

    I wonder whether somebody has had a similar issue? I don’t see any other solution except disabling the 2nd level cache for tests completely, but we’re considering that a last resort.

  9. George Guan says:

    "Testing the 1st caveat" may be misleading. I tested on MySQL and PostgreSQL. If the transaction isolation levels are all RR (Repeatable Read), then it will not fail and "2:" will get the same as "1:", which is promised by the DB, not Hibernate.

    Just a little modification of the test to show how RR is compromised by the second cache:
    Before starting the test, make sure the default transaction isolation level = 4 (RR); if not, there is nothing to be compromised!

    1. session0 gets entity(ID) from the DB, so it is stored in the 2nd cache.
    2. session1 gets entity X(ID), directly from the 2nd cache.
    3. session1 evicts X.
    4. session2 modifies the entity(ID) and commits.
    5. session1 gets entity Y(ID), from the DB because of the OLDER TS.

    Now X != Y, which breaks RR.

    What’s happening here is that RR just uses the table snapshot taken when the first query is made (at least in MySQL and PostgreSQL), which happens at step 5. Because of the 2nd cache, the first real fetch is postponed, leading to the break of RR at the application level.

    It may not be an issue in the real world because of Read Committed + versioning + the session cache.

  10. Neeraj says:

    Hi Anirban, it’s a really nice and helpful article, but I have one query on query caching: it throws an error a lot of the time when we refresh the cache, like "net.sf.ehcache.CacheException: java.lang.ClassNotFoundException: org.hibernate.engine.TypedValue not found by org.apache.servicemix.bundles.ehcache". What can be the reason for this?

  11. Iqbal Khan says:

    Good article but I would ask you to mention other Hibernate Cache providers than Ehcache to help the reader determine which one is best for their needs.

    One such Hibernate Cache provider is JvCache. JvCache is an in-memory data grid for Java applications. It also has JvCache Express that is totally free to use. Get more details about it at http://www.alachisoft.com/jvcache/.
