Lucene Index Consistency

With the latest version of Imixs-Workflow we now support a consistent Lucene search index coupled to the Java EE container-based transaction concept.

When a Lucene index is written during a Java EE transaction, coupled to JPA database operations, it quickly becomes difficult to keep the index consistent. The reason is that a Lucene index can become inconsistent if it is written within a long-running Java transaction. If the transaction fails at a late stage, there is no way to automatically roll back the index data that has already been written. This is different from the built-in rollback functionality of a SQL database, which only persists new data if the transaction succeeds. In addition, clients reading the index before a running transaction is closed will see uncommitted index data, which leads to wrong search results.

In version 4.4.0 of Imixs-Workflow we have solved this problem by implementing a Lucene event log mechanism. Instead of updating the Lucene index directly during the processing life-cycle, the Imixs-Workflow engine just creates a new eventLog entry with JPA, using the same container-managed transaction context.

@Stateless
@LocalBean
public class LuceneUpdateService {
  ....

  public void updateDocuments(Collection<ItemCollection> documents) {
     // JPA work....
     ...
     // write a JPA eventLog entry indicating that the Lucene index
     // has to be updated for the given documents
     ....
  }
  ....
}
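
To give an idea what such an event log entry could look like, the following is a minimal sketch of a JPA entity. The field names (topic, uniqueId, created) are assumptions for illustration only and may differ from the entity actually shipped with Imixs-Workflow:

@Entity
public class EventLog implements java.io.Serializable {

   @Id
   @GeneratedValue
   private long id;          // primary key generated by JPA

   private String topic;     // e.g. "lucene.update" or "lucene.remove" (assumed)
   private String uniqueId;  // $uniqueid of the workitem to be (re-)indexed

   @Temporal(TemporalType.TIMESTAMP)
   private java.util.Calendar created;   // creation time of the entry

   // getters and setters omitted for brevity
}

Because such an entry is persisted through the same container-managed transaction as the JPA work on the documents, it only becomes visible after the whole transaction has been committed.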

Now, whenever a client calls the Lucene search method, Imixs-Workflow automatically runs a flush method that updates the Lucene index based on the latest event log entries:

@TransactionAttribute(value = TransactionAttributeType.REQUIRES_NEW)
public void flush() {
   Query q = manager.createQuery("SELECT eventLog FROM EventLog AS eventLog");
   Collection<EventLog> result = q.getResultList();
   if (result != null && result.size() > 0) {
      for (EventLog eventLogEntry : result) {
         // .... update the Lucene index for each entry
         // .......
         // remove the eventLogEntry
         manager.remove(eventLogEntry);
      }
   }
}

With the annotation TransactionAttributeType.REQUIRES_NEW, the flush() method will only read eventLog entries that have already been committed. As a result, a client will only see committed updates in the Lucene index. Even while a transaction of any other business method is still running, the Lucene index will not include its ‘uncommitted’ documents. This behavior corresponds to the transaction isolation level ‘Read Committed’.
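
To illustrate how the flush mechanism ties into searching, the following simplified sketch shows a search method that first flushes the committed event log entries and then reads the Lucene index. The member fields (luceneUpdateService, indexDirectory, analyzer, documentService) and the index field names are assumptions for illustration and do not reflect the exact implementation of the Imixs LuceneSearchService:

public List<ItemCollection> search(String searchTerm) throws Exception {
   // write all committed eventLog entries into the Lucene index first
   // (flush() runs in its own transaction - REQUIRES_NEW)
   luceneUpdateService.flush();

   List<ItemCollection> result = new ArrayList<ItemCollection>();
   try (IndexReader reader = DirectoryReader.open(indexDirectory)) {
      IndexSearcher searcher = new IndexSearcher(reader);
      // parse the search term against an assumed default field 'content'
      org.apache.lucene.search.Query query = new QueryParser("content", analyzer).parse(searchTerm);
      for (ScoreDoc hit : searcher.search(query, 100).scoreDocs) {
         Document doc = searcher.doc(hit.doc);
         // load the full workitem by the $uniqueid stored in the index (assumed field name)
         result.add(documentService.load(doc.get("$uniqueid")));
      }
   }
   return result;
}

The key point of this design is that the index is updated lazily on read: a business method never touches the Lucene index inside its own transaction, so a failed transaction can never leave orphaned index entries behind.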

See also the similar discussion at: “How making stateless session beans transaction-aware?”

The Search Functionality of Imixs-Workflow

If you work with the Imixs-Workflow engine, you don’t have to deal with writing the Lucene index yourself. Imixs-Workflow hides this complexity from you when you call, for example, the processWorkItem method of the WorkflowService:

@EJB
private WorkflowService workflowService;
  ....
  // create a new workitem assigned to a workflow model
  ItemCollection workitem = new ItemCollection().model("1.0.0").task(100).event(10);
  // assign business data
  workitem.replaceItemValue("_customer", "M. Melman");
  workitem.replaceItemValue("_ordernumber", 20051234);
  // process the workitem
  workitem = workflowService.processWorkItem(workitem);
  ....
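
After the workitem has been processed and the transaction has been committed, it can be found through the Lucene index. A sketch of such a search could look like the following, assuming the find method of the DocumentService (the exact signature and query syntax may differ between versions):

@EJB
private DocumentService documentService;
  ....
  // search the index for the order processed above
  // (find() signature assumed here: searchTerm, pageSize, pageIndex)
  List<ItemCollection> hits = documentService.find("_ordernumber:20051234", 30, 0);
  for (ItemCollection hit : hits) {
     // each hit is a committed workitem - uncommitted data never shows up
     System.out.println("found workitem: " + hit.getUniqueID());
  }
  ....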

You will find more details about how to use the search functionality of Imixs-Workflow here.