Thursday, November 13, 2014

Envers: it's slower but at least it doesn't work in some cases

Why not Envers?

Envers is an auditing extension of Hibernate, following the mantra that Java developers should code in Java and not think in terms of the database. Java developers are shielded from the relational features of the database; SQL is evil as it leads to vendor lock-in (when was the last time you switched the database of an existing system?), and the ORM pretends that what you load from the database is objects. The performance often suffers, the framework at times does not support features databases have had from late 80ties and you have to fight its leaky abstraction and bugs. But back to Envers.

Envers comes with limitations that emerge from its very nature. It tracks the Java entities changed within the transaction and then issues inserts into the auditing tables and the revinfo table. For us the main problems of this approach are with its performance:

  • The number of resulting SQL statements is too high. Every insert/update/delete triggers another statement run against the database. These statements need to passed to the database over the wire, increasing the duration of the transaction.
  • Bulk updates, deletes are not supported. Operations such as 'DELETE FROM foo' are not auditable because Envers has no way to know which entities were affected. If you want auditing you have to update every entity separately, leading to potentially many updates and auditing inserts instead of one simple statement.
It should be said that when Envers knows about the entities it works correctly, so for most applications it can be good enough. We used Envers in the prototyping phase of our project but it made it hard for us to handle high loads. Hence, we investigated those relational features we are told to avoid. We put the database to work and replaced Envers with database triggers. There were some turns back and forth and in the end it took us four days to get to a state we are happy with. In this article I want to share our solution for PostgreSQL.

Auditing with database triggers

Auditing

First, let's define what we mean by auditing. We need to audit for regulatory reasons, so we need to know who (which customer, not database user) and when performed what operation. For those tables we need to audit we defined auditing tables which have the same columns as original tables.

Generic Trigger

First, we created a generic database trigger that inserts the row being modified into the auditing table:
There are few things to note here:

  • Auditing table for foo is foo_aud.
  • We used the PostgreSQL trigger variables: TG_TABLE_NAME, OLD, NEW, TG_OP.
  • The query is by design dynamic and must be run with 'EXECUTE ... USING ...'.
  • The construct '$1.column_name' references a column in a record. The record is the first parameter to the dynamic query and is either the NEW or the OLD trigger variable. You don't want to know how long it took me to figure this one out :).
  • The PostgreSQL transaction ID is used to find corresponding row in the revinfo table - see section Who and When bellow.
  • Along with the columns of the original table, the audited table has a varchar 'op' and a integer 'rev' column, containing the operation (insert, update, delete) and the rev from revinfo respectively.

The following script assigns the trigger:

Element collections

Auditing of updates to element collections works out of the box. The respective auditing tables need to have the two columns (op, rev) and the trigger must be assigned.

Who and when

The original goal was to know which user made the change and when. We ripped off Envers and introduced a table like the revinfo table. It contains one row for every transaction that modifies an audited table, together with the who and the when:
The table contains information about the application user, the time and the transaction ID. The transaction ID allows us to link auditing entries to a revinfo row. We cannot use transaction ID directly. It is unique only within one PostgreSQL installation. If we migrate the database to a different server old txid entries might clash with its txid sequence. With our solution, we only need to set txid to null and then migrate the database.

The problem that remains is inserting a row into this table for every write transaction. Ideally, we would insert it only when a transaction modifies an audited entity. So, we added an interface, @AuditedEntity, to mark the audited entities and we registered a listener on the very same events Envers does. The listener then inserts a row into the revinfo table when a transaction performs the first write operation to an audited entity.
A few points to note:

  • The insert must happen before the entity change due to the referential integrity from auditing table to the revinfo table.
  • We need to perform the revinfo insert with session.doWork(). A simple session.createSQLQuery() and then query.executeUpdate() would cause Hibernate to flush the dirty entities.
  • HQL/SQL bulk updates do not trigger Hibernate events, hence we must register the transaction manually using the register(SessionImpl session) method.
  • It is plain evil to use internal Hibernate classes like SessionImpl but we need methods from both SessionImplementor and Session interfaces. Those are combined in the EventSource but that is not available at the point of registering a transaction manually, there we only have the Session. It all boils down to the SessionImpl anyway so we bit the bullet and made the cast. We know we will suffer when we upgrade to Hibernate 5.
We register the events in a ServiceContributingIntegrator:

Conclusion

Our approach performs auditing on the database level. The application only needs to call one insert per write transaction to store user and time information under the transaction ID. The approach supports auditing of SQL/HQL bulk updates.

We haven't done any rigid performance test yet. But we do generate data for the application performance test; it results in inserting approximately 100k rows. The data generation took 50 seconds with Envers auditing, it takes only 35 seconds with database triggers so we look at the improvement of 30%.

Have we overlooked anything? Any suggestions are welcome.