Search This Blog

Saturday 2 June 2012

Hibernate And Batch Operations

There are often scenarios in our application wherein we need to perform batch updates. For example if there was a need to update the zip code of all records in Address table a batch update would be a very good way to achieve the same. A single update SQL query would ensure that the job is done at the database level.
However there arise complications in big applications wherein the update operation would be much more complex then a singe update query or the code complexity better handled in java then in a stored procedure.
The Hibernate equivalent would involve
  1. opening a session, 
  2. executing a select query to get the records to update.
  3. On each object, run the update logic. 
  4. Then schedule the affected object for a save. 
  5. Once all entities are modified we commit our transaction thus flushing all modifications to the database.
The problem with the approach is if the objects run into hundreds and thousands and even bigger values, loading such object graphs into the session might at some point result in an Out Of Memory Error.
Hibernate provides an alternative to handle this situation ( or else I wouldn't be writing this post :p ) Here too we execute the query to identify the affected objects. However instead of loading all of them into the session, Hibernate provides us with a reference to a database cursor. Instead of retrieving all objects in the result-set we retrieve one at a time. Consider the below code.
static void testBatchUpdate() {
    final int BATCH_SIZE = 5;
    final String query = "from Entity where id < :id";
    Session session = sessionFactory.openSession();
    Transaction transaction = null;
    ScrollableResults allEntities = null;
    try {
        transaction = session.beginTransaction();
        allEntities = session.createQuery(query).setLong("id", 7).scroll();
        int updateCount = 0;
        while (allEntities.next()) {//Advance to the next result
            final Entity entity = (Entity) allEntities.get(0);
            System.out.println("Updating entity with id  - "
                    + entity.getId());
            entity.setName("Updated new Name " + entity.getId());
            session.update(entity);
            if (++updateCount % BATCH_SIZE == 0) {
                System.out.println(" Total records to be flushed as yet : "
                        + updateCount);
                System.out.println("performing db session flush and clear");
                session.flush();
                session.clear();
            }
        }
        session.flush();
        session.clear();
        transaction.commit();
        session.close();
        System.out.println("Total queries fired : " + updateCount);
    } catch (Exception e) {
        e.printStackTrace();
        transaction.rollback();
    
    }
}
The hero of the story is the ScrollableResults class. The class internally is holding the reference to an open cursor in the database. The HQL query was executed using the Query.scroll() method. Unlike the list() method which fetches the entire result the scroll method returns a ScrollableResults object. 
Every time we call the get method Hibernate retrieves a single entity and adds it to the session (also the persistence cache) making it available to us. To ensure that no memory issues occur we periodically flush and clear the session (in this case after every 5 reads) The logs indicate the story
2750 [main] DEBUG org.hibernate.SQL  - 
    select
        entity0_.ID as ID0_,
        entity0_.NAME as NAME0_ 
    from
        ENTITY entity0_ 
    where
        entity0_.ID<?
2890 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#1]
...
2922 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#2]
...
2953 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#3]
..
2953 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#4]
...
2953 [main] DEBUG org.hibernate.loader.Loader  - Initializing object from Result
Set: [com.batch.scrollable.Entity#5]
...
 Total records to be flushed as yet : 5
performing db session flush and clear
2953 [main] DEBUG org.hibernate.event.def.AbstractFlushingEventListener  - flush
ing session
...
2968 [main] DEBUG org.hibernate.event.def.AbstractFlushingEventListener  - Flush
ed: 0 insertions, 5 updates, 0 deletions to 5 objects
...
2984 [main] DEBUG org.hibernate.SQL  - 
    update
        ENTITY 
    set
        NAME=? 
    where
        ID=?
...
3000 [main] DEBUG org.hibernate.event.def.DefaultFlushEntityEventListener  - Upd
ating entity: [com.batch.scrollable.Entity#6]
An alternative to using ScrollableResults would be to go for a Stateless session.

1 comment: