Thursday, July 22, 2010

Tables, Keys and Hibernate fun

Okay, this is one of those blog posts where one has to walk the line of not saying too much for fear of giving away corporate secrets and not saying enough. So excuse any vagueness when it comes to specifics on the internals. I started digging into the corporate ODS (Operational Data Store) for a couple of reasons. I want to understand the structure of it better, what it can actually provide to us in terms of business value, to actually start utilising it and, well, just to do a bit of technical playing and get the ole gray matter working a bit on more than specs.

So, there I sat, looking at a structure which is similar to, but with some subtle differences, to what is in the day to day source system. Arming myself with Hibernate and myEclipse I decide to reverse engineer the structures - easy enough to do with the tools at hand. Imagine my surprise when after the first import the mappings showed that the unique identifier for every ODS table is a composite id of all of the columns within the table. Surprising at first considering that the source tables were built from the word go according to the guidelines and standards of the company and was designed from the ground up to have keys, referential integrity and all the good stuff that is always thrown about.

A quick investigation into the ODS tables showed that there weren't primary keys defined on any of the tables. Now, as I can't give away any corporate secrets, I've created a little sample table to work with to illustrate what happens - so I give you the SAMPLE table (created in Derby) that emulates what I saw typically in the ODS.


create table "EXAMPLE"."SAMPLE"(
"ID" INTEGER,
"TITLE" VARCHAR(128) not null,
"AVALUE" VARCHAR(255),
"AKEY" INTEGER not null,
"POSTTIME"
TIMESTAMP
);
create unique index "SAMPLEINDEX" on "EXAMPLE"."SAMPLE"("ID");


Interesting, but when one asks the myEclipse Hibernate generator to generate a mapping file, one gets the following


<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<!--
Mapping file autogenerated by MyEclipse Persistence Tools
-->
<hibernate-mapping>
<class name="za.co.passif.mappings.Sample" table="SAMPLE" schema="EXAMPLE">
<composite-id name="id" class="za.co.passif.mappings.SampleId">
<key-property name="id" type="java.lang.Integer">
<column name="ID" />
</key-property>
<key-property name="title" type="java.lang.String">
<column name="TITLE" length="128" />
</key-property>
<key-property name="avalue" type="java.lang.String">
<column name="AVALUE" />
</key-property>
<key-property name="akey" type="java.lang.Integer">
<column name="AKEY" />
</key-property>
<key-property name="posttime" type="java.sql.Timestamp">
<column name="POSTTIME" length="26" />
</key-property>
</composite-id>
</class>
</hibernate-mapping>


Not quite what I was expecting initially, but it makes sense based on the table definition. A quick visit to the ODS architect and I garnered the following:

  • Due to how the ODS is populated they will never add a primary key to a table. Entries into the ODS will always only be added, thus an update to a value on the source table will translate to a new row in the corresponding ODS table. This will enable one to see 'change over time', but does place an onus on any query using dates to derive the appropriate value as at a specific moment in time.
  • There is a column in every ODS table that sounds like, tastes like and feels like a primary key - I represented it as the AKEY column in my example. It is a unique column using a sequence to generate a unique value and can thus be used as a primary key, however it will never be indicated as such as the sequence generation is handled within code. On a personal note I would have loved it if this column could have been indicated as a primary key on table level, however personal preferences and corporate standards do not neccesarily mix.


So, how to get past this? There must be some way to instruct Hibernate that the AKEY column can be used as a primary key. And there it was, cunningly hiding in plain sight - a way to define a custom strategy for reverse engineering the mappings that will be generated.



MyEclipse has a very short, but nice overview of how to configure and set up a custom strategy (just select the Browse button and follow the link).

So, at this stage, what did I know?

  • Every ODS table has a column that can be used to uniquely identify the row called AKEY

  • The AKEY column uses a sequence to generate a unique id

  • I can use the AKEY column as a primary key


Armed with this knowledge the solution proved to be quite simple. One of the methods on the DelegationReverseEngineeringStrategy class is the getPrimaryKeyColumnNames
method. It returns a list of the all columns that make up the primary key for the specified table.

public List getPrimaryKeyColumnNames(TableIdentifier identifier)


I overrode the method with the following - not the most elegant but it will have the desired result based on the above information


package za.co.passif.tools;

import java.util.ArrayList;
import java.util.List;
import org.hibernate.cfg.reveng.DelegatingReverseEngineeringStrategy;
import org.hibernate.cfg.reveng.ReverseEngineeringStrategy;
import org.hibernate.cfg.reveng.TableIdentifier;

public class SampleStrategy extends DelegatingReverseEngineeringStrategy {

public SampleStrategy(ReverseEngineeringStrategy delegate) {
super(delegate);
}

/**
* Forces the AKEY column to be returned as the primary key for any table
* being reverse engineered
*/
public List getPrimaryKeyColumnNames(TableIdentifier identifier) {
ArrayList aList = new ArrayList();
aList.add(new String("AKEY"));
return aList;
}
}


After rerunning the reverse engineering my mapping file now looks much more like what I was expecting in the first place.


<hibernate-mapping>
<class schema="EXAMPLE" table="SAMPLE"name="za.co.passif.mappings.Sample">
<id name="akey" type="java.lang.Integer">
<column name="AKEY">
<generator class="assigned">
</id>
<property name="id" type="java.lang.Integer">
<column name="ID" unique="true">
</property>
<property name="title" type="java.lang.String">
<column name="TITLE" length="128" null="true">
</property>
<property name="avalue" type="java.lang.String">
<column name="AVALUE">
</property>
<property name="posttime" type="java.sql.Timestamp">
<column name="POSTTIME" length="26">
</property>
</class>
</hibernate-mapping>


Mission accomplished. The mapping files now define the intent of the ODS table appropriately and correctly. Overkill for a single table, but when you look at thirty or so tables the internal lazy developer kicks in. Or maybe that is just Brian's influence...

As an aside if one chooses to also generate Id classes, the id class itself does not contain all of the definitions for the data structure (this originally started me off on this road to explore the Strategy). One of the comments I came across was that 'myEclipse only works with what it knows' which is true, but it was nice to discover one can dictate its behaviour and get consistency for an approach.

And now for this old dog to learn how the syntax highlighting works. Darn new-fangled technology - in my day a marker and paper was enough... grumble...grumble...

No comments: