Hibernate: Difference between revisions

From Elvanör's Technical Wiki
 
(19 intermediate revisions by the same user not shown)
Line 1: Line 1:
= Links and documentation =
= Installation, configuration, documentation =


Some nice Hibernate documentation links:
== Installation ==
 
* [http://www.hibernate.org/hib_docs/v3/reference/en/html/index.html Hibernate reference documentation]
* [http://www.hibernate.org/hib_docs/annotations/reference/en/html_single/ Hibernate annotations documentation]
* [http://java.sun.com/javaee/5/docs/api/index.html?javax/persistence/package-summary.html Java Persistence API Javadocs (useful for annotations)]
* [http://www.hibernate.org/hib_docs/annotations/api/ Hibernate Annotations API Javadocs]
 
Useful books:
 
* Java Persistence with Hibernate (Manning)
* Pro Hibernate 3 (Apress)
 
= Installation and Configuration =


* List of mandatory JARs:
* List of mandatory JARs:
Line 29: Line 17:
   
   
Without these jars on the classpath, Hibernate won't work well (or at all). There are lots of optional jars (for second level caching, etc) as well.
Without these jars on the classpath, Hibernate won't work well (or at all). There are lots of optional jars (for second level caching, etc) as well.
== Configuration ==


* Configuration of Hibernate is simple: you must basically create a SessionFactory object. Normally the best way to configure the session factory is to  
* Configuration of Hibernate is simple: you must basically create a SessionFactory object. Normally the best way to configure the session factory is to  
Line 34: Line 24:


  SessionFactory sessionFactory = new AnnotationConfiguration().configure().buildSessionFactory();
  SessionFactory sessionFactory = new AnnotationConfiguration().configure().buildSessionFactory();
== Links and documentation ==
Some nice Hibernate documentation links:
* [http://www.hibernate.org/hib_docs/v3/reference/en/html/index.html Hibernate reference documentation]
* [http://www.hibernate.org/hib_docs/annotations/reference/en/html_single/ Hibernate annotations documentation]
* [http://java.sun.com/javaee/5/docs/api/index.html?javax/persistence/package-summary.html Java Persistence API Javadocs (useful for annotations)]
* [http://www.hibernate.org/hib_docs/annotations/api/ Hibernate Annotations API Javadocs]
Useful books:
* Java Persistence with Hibernate (Manning)
* Pro Hibernate 3 (Apress)
=== Examples ===
* In the Hibernate documentation, there are examples with Address and Person. It is not obvious at first, but the address has many persons: in the many-to-one / one-to-many example, the address is the side holding multiple persons. The many-to-one mapping is thus defined on Person (note that the column holding the foreign key will be on the Person table too).


= Mappings =
= Mappings =
Line 45: Line 54:
* The owning side in the case of a bidirectional association (one-to-one) is the side that will update the relationship when saved to the DB. The join column holding the foreign key will be present in the '''owner side.''' Note that the non-owning side should *NOT* have a column referring to the owner - in Grails there is this column, but it is useless.
* The owning side in the case of a bidirectional association (one-to-one) is the side that will update the relationship when saved to the DB. The join column holding the foreign key will be present in the '''owner side.''' Note that the non-owning side should *NOT* have a column referring to the owner - in Grails there is this column, but it is useless.


* Note that the concept of bidirectionality usually implies (for a many-to-one mapping) than one side has a reference (simple link) to another object, and this object has a collection of objects. The bidirectionality is optional in the sense that using collections on an object is not necessarily. You could always write queries to find all the objects that are the child of a given entities. However, using a collection explicitly states at the Java object models the bidirectional relationship.
* Note that the concept of bidirectionality usually implies (for a many-to-one mapping) that one side has a reference (simple link) to another object, and this object has a collection of objects. The bidirectionality is optional in the sense that using collections on an object is not necessary. You could always write queries to find all the objects that are the children of a given entity. However, using a collection explicitly states that the Java object models the bidirectional relationship.


* This is also the reason for which many-to-one unidirectional associations are frequent, but one-to-many (eg a collection with no inverse links) are less used.
* This is also the reason why many-to-one unidirectional associations are frequent, but one-to-many ones (e.g. a collection with no inverse links) are less used. Unidirectional one-to-many associations are discouraged by the Hibernate docs and are mapped by Grails with a join table (Hibernate also recommends doing this, although it is possible to map them with a foreign key).


* Hibernate differentiates between entities and components. Entities have their own lifecycle, and can support shared references, while components are similar to value types and their lifespan is equal to the lifespan of their containing parent. Components can thus be useful when the object is clearly bound to another object. Relatively speaking, components are of lesser importance than full domain entities.
* Hibernate differentiates between entities and components. Entities have their own lifecycle, and can support shared references, while components are similar to value types and their lifespan is equal to the lifespan of their containing parent. Components can thus be useful when the object is clearly bound to another object. Relatively speaking, components are of lesser importance than full domain entities.
* A true OneToOne should be enforced with the same primary key for both objects. This can be done with a special ID generator (there's an example of that in the Hibernate book). Most OneToOnes, the ones that use a foreign key, are not really true OneToOne (they should be called ToOne associations), even with a unique constraint on the foreign key.


== Embeddable Components ==
== Embeddable Components ==
Line 68: Line 79:


* Note that you cannot have two one-to-one bidirectional mappings to the same class (this is also true for many-to-one). This is because there is no true bidirectionality in the relationship, in the sense that the target object cannot know to which property it belongs in the object that has two links. At the database level, the table of one object will hold two join columns. From this object you can correctly find the target objects by looking at the join columns which will hold the target objects ids. But from a sample target object, you cannot go back to the other object as there is no way for you to know which one of the join columns you must query.
* Note that you cannot have two one-to-one bidirectional mappings to the same class (this is also true for many-to-one). This is because there is no true bidirectionality in the relationship, in the sense that the target object cannot know to which property it belongs in the object that has two links. At the database level, the table of one object will hold two join columns. From this object you can correctly find the target objects by looking at the join columns which will hold the target objects ids. But from a sample target object, you cannot go back to the other object as there is no way for you to know which one of the join columns you must query.
* Even if you add a column on a target object to store the owner's id, this does not achieve true bidirectionality: you can find the owner object, but you cannot know (from the point of view of the target) to which link it refers.
* Even if you add a column on a target object to store the owner's id, this does not achieve true bidirectionality: you can find the owner object, but you cannot know (from the point of view of the target) to which link it refers. Note that it can work if one of the associations is marked as unidirectional.
* You cannot always enforce non-nullability constraints directly via Hibernate because of the SQL constraints (SQL constraints are generally enforced by databases for each statement, and are not deferred). So sometimes an (optional = false) property must be omitted even if the property cannot be null, just because that would prevent the underlying SQL model from accepting insertions. In this case, the best approach is to enforce the business logic constraint at another level. Several options exist; one is to use an event listener and write the constraint enforcement there, to run when the event is triggered.


== SQL Types ==
== SQL Types ==
Line 90: Line 102:
* You should map an enum property with the @Enumerated annotation. I am not sure this is totally required though - it seems to work fine without.
* You should map an enum property with the @Enumerated annotation. I am not sure this is totally required though - it seems to work fine without.
* If you need to map a BigDecimal (or even Double) that may be equal to Infinity, the best thing to do in many cases is to assume a null value corresponds to infinity. This maps well to the DB; if you use a Double and the object is actually the Java constant for infinity, Hibernate will have trouble putting this in the DB and especially getting it back as a proper infinity object.
* If you need to map a BigDecimal (or even Double) that may be equal to Infinity, the best thing to do in many cases is to assume a null value corresponds to infinity. This maps well to the DB; if you use a Double and the object is actually the Java constant for infinity, Hibernate will have trouble putting this in the DB and especially getting it back as a proper infinity object.
* When you add a Nullable Date to your Java model, be careful that Hibernate will update a MySQL table with a new column that is indeed a NULL DATETIME, but all the old instances will not get a value of NULL! They will have a strange date of "00-00-00". You will need to manually update all rows with a null value on this column.
== Examples ==
* Case with an owner side that has both a Set (one-to-many) and an association (one-to-one) to the same target class. In this case, for reasons not totally clear to me, you must not define a cascade on the one-to-one association (cascading on the one-to-many is thus necessary, and works fine).
* Note that you cannot have optional = false on the homeComputer, as it would make inserting an object impossible because of the various SQL constraints.
<pre>
import javax.persistence.*
import org.hibernate.annotations.*

@Entity
@Table(name="user_hibernate")
class UserHibernate
{
    @Id
    @GeneratedValue(strategy = javax.persistence.GenerationType.IDENTITY)
    Long id

    @Version
    @Column(nullable = false)
    Long version

    String name

    // Variant 1 (bidirectionality is on the association, not the list)
    //@OneToOne(cascade=CascadeType.ALL)
    @OneToOne()
    @JoinColumn(name="home_computer_id")
    ComputerHibernate homeComputer

    @OneToMany(cascade=CascadeType.ALL, fetch=FetchType.EAGER, targetEntity=ComputerHibernate.class)
    @JoinColumn(name="owner_id", nullable=false)
    Set computers

    // Variant 2
    /*@OneToOne()
    @JoinColumn(name="home_computer_id")
    ComputerHibernate homeComputer

    @OneToMany(cascade=CascadeType.ALL, targetEntity=ComputerHibernate.class)
    @JoinColumn(name="owner_id", nullable=false)
    Set computers*/
}

@Entity
@Table(name="computer_hibernate")
class ComputerHibernate
{
    @Id
    @GeneratedValue(strategy = javax.persistence.GenerationType.IDENTITY)
    Long id

    @Version
    @Column(nullable = false)
    Long version

    String model

    // Variant 1
    @OneToOne(mappedBy = "homeComputer", optional=false)
    UserHibernate owner

    // Variant 2
    /*@ManyToOne(optional=false)
    @JoinColumn(name="owner_id", insertable=false, updatable=false, nullable=false)
    UserHibernate owner*/
}
</pre>


= Managing objects and their lifecycles =
= Managing objects and their lifecycles =
Line 112: Line 197:


* [http://www.hibernate.org/42.html This page explains sessions and transactions well.]
* [http://www.hibernate.org/42.html This page explains sessions and transactions well.]
=== Transactions with Spring ===
* Normally if you use Spring, Spring transactions wrap Hibernate ones via a HibernateTransactionManager class. At a lower level, Hibernate transactions still wrap JDBC ones. Note that the use of HibernateTransactionManager implies you are not using JTA or container managed transactions. Still, Spring transactions can be declared via annotations, without any programmatic code, which makes them closer to CMT.
* Grails uses Spring, so the same concepts apply to Grails. You can find more information [[Spring#Transaction_Support|on this page.]]
== Concurrency control ==
* Concurrency control is extremely important. If you build a web service that relies on Hibernate, at some point when traffic increases you will get concurrency problems (when updating your objects). The typical exception that Hibernate will start to throw in those cases is StaleObjectStateException. You access one object in a transaction, try to update it, while in another transaction the same object was already modified. Note that most applications not relying on Hibernate (PHP code, for instance) will also have these problems, but usually developers don't care and implicitly don't do any locking, which means that collisions happen and last commit wins.
* Note that if you don't have a lot of traffic, such problems are typically due to JavaScript. For instance in Grails, since there is a session per request model, this would mean that two HTTP requests were fired at the same time. This exception can also happen if you have a detached instance that you try to reattach, but the SQL data row was deleted in the meantime.
=== Three methods to deal with concurrent accesses ===
* Last commit wins strategy: this actually means "do not do any kind of locking". This is what Hibernate does by default, but not Grails. In this case objects are not checked at all for mid-air collisions and the last transaction commit will always overwrite the others. In many cases, this is the most appropriate strategy.
* Optimistic locking: this is Grails' default strategy, and the generally recommended approach. In this case Hibernate will check when writing the object (typically by adding the version to the WHERE clause of the UPDATE statement, rather than by a separate SELECT) that a special field (usually named version) was not incremented between the initial load of the object in the session and commit time. If the object was modified, the StaleObjectStateException is thrown (this will in turn, in Spring / Grails, throw an org.springframework.dao.OptimisticLockingFailureException). When this happens, you have to decide what to do, and this really depends on your application.
** You could abort everything and display an error message to the user;
** Or you could restart the service and the transaction (and even do that several times). Note that once an exception is thrown, the Hibernate session should always be discarded and a new one obtained.
** Hibernate allows you to turn off optimistic locking for particular properties of your classes, which can be very useful (sometimes it is just better to use last commit wins for some fields, and optimistic locking for others). Unfortunately, Grails only allows you to turn off optimistic locking globally at the class level. This may be improved soon though.
** However, note that Hibernate does not allow you to turn off optimistic locking for a particular transaction. Once configured, optimistic locking is always (or never) used.
* Pessimistic locking: with pessimistic locking, a lock is obtained at the DB level and any reads should block until the update is done. This translates to a SELECT ... FOR UPDATE statement at the SQL level. This obviously has a very big performance penalty, but may sometimes be necessary.
** Note that I was not able to use pessimistic locking with a Grails / Spring / Hibernate / MySQL setup. The correct SQL statement would be issued but other requests would not block... this has to be investigated and is probably due to a bug in one of the layers, so I should start at the lowest level (JDBC) and add layers one at a time to see where the problem is.
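The optimistic strategy and the "restart the transaction" option described above can be sketched without Hibernate at all. The following is a minimal plain-Java illustration of the version check, not Hibernate's API: InMemoryTable, VersionedRow, StaleVersionException and updateWithRetry are hypothetical names invented for this sketch, and the in-memory map merely stands in for a database table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class OptimisticLockingSketch {

    /** Immutable snapshot of a row: a payload plus its version counter. */
    static final class VersionedRow {
        final String value;
        final long version;

        VersionedRow(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Hypothetical stand-in for Hibernate's StaleObjectStateException. */
    static final class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory "table" keyed by id (assumption: replaces the real DB). */
    static final class InMemoryTable {
        private final Map<Long, VersionedRow> rows = new HashMap<>();

        synchronized void insert(long id, String value) {
            rows.put(id, new VersionedRow(value, 0L));
        }

        synchronized VersionedRow load(long id) {
            return rows.get(id);
        }

        /** Writes only if the stored version still equals the version that was read. */
        synchronized void update(long id, long expectedVersion, String newValue) {
            VersionedRow current = rows.get(id);
            if (current == null || current.version != expectedVersion) {
                throw new StaleVersionException("row " + id + " was modified concurrently");
            }
            rows.put(id, new VersionedRow(newValue, expectedVersion + 1));
        }
    }

    /** Re-reads and re-applies the change until it succeeds or maxAttempts is reached. */
    static VersionedRow updateWithRetry(InMemoryTable table, long id,
                                        UnaryOperator<String> change, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            VersionedRow read = table.load(id); // fresh state on every attempt
            try {
                table.update(id, read.version, change.apply(read.value));
                return table.load(id);
            } catch (StaleVersionException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller report the error
                }
            }
        }
    }

    public static void main(String[] args) {
        InMemoryTable table = new InMemoryTable();
        table.insert(1L, "a");

        VersionedRow stale = table.load(1L);      // first "transaction" reads version 0
        table.update(1L, stale.version, "b");     // second "transaction" commits first
        try {
            table.update(1L, stale.version, "c"); // first one now holds a stale version
        } catch (StaleVersionException e) {
            System.out.println("conflict: " + e.getMessage());
        }

        VersionedRow result = updateWithRetry(table, 1L, v -> v + "!", 3);
        System.out.println(result.value + " v" + result.version);
    }
}
```

With Hibernate itself, each retry would also need a fresh session, as noted above; the loop here only shows the re-read and re-apply part of that logic.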
=== Database isolation level ===
* An isolation level is defined at the database level. I am not yet very knowledgeable about this. It seems to be related to reads more than writes, and essentially tells the DB if you allow dirty reads, phantom reads, and such. However this can also apparently influence pessimistic locking (as databases implement locking in relation to isolation).
* This is defined at the database level, so in a JEE architecture, at the JDBC level.
=== Database level deadlocks ===
* If things go really bad, you can have a deadlock at the database level. This throws a JDBC exception. I don't know yet how to deal with these kinds of problems.
=== Troubleshooting ===
* To troubleshoot StaleObjectStateException, the best approach is to use event listeners on preLoad / postLoad and preUpdate. By printing timestamps of operations, you usually get an understanding of what's happening.


== Designing a Java Hibernate object ==
== Designing a Java Hibernate object ==
Line 120: Line 243:


* Note that Groovy objects are already Java beans so will work fine with Hibernate.
* Note that Groovy objects are already Java beans so will work fine with Hibernate.
== Deleting an object ==
* Be careful when deleting an object. Many times the consistency checks of the Hibernate constraints won't be valid anymore. This is OK when deleting, but you have to make sure you don't do anything else with the object once you removed the associations to allow the deletion. If you do something like a criteria with invalid constraints, Hibernate will throw an exception.


== Modifying an object's state ==
== Modifying an object's state ==
Line 132: Line 259:
* If you don't want to issue an UPDATE statement, you can use lock(object, LockMode.NONE). In this case, you must be sure however that the detached instance has not been modified.
* If you don't want to issue an UPDATE statement, you can use lock(object, LockMode.NONE). In this case, you must be sure however that the detached instance has not been modified.


== Collections ==

* Never try to associate a collection reference to more than one object. This will cause lots of errors. If you need to, you can copy all the objects from a collection to another one. This applies even if you deleted the object containing the collection reference first.
* Don't call empty() on a Hibernate collection! The correct Java method name is "isEmpty", but the empty() method exists for the Hibernate implementation. It should not be used though. Usually this won't be a problem in Java, but it can be harder to spot in Groovy.
* If you use a Set but still want to save more than one item in the set in a single transaction, and have the items saved in a precise order, you can use a SortedSet. Some Groovy code that works:
<pre>
// this represents a Member, with a one-to-many to designs

// order the designs by id
designs = new TreeSet(["compare": {a, b -> a.id <=> b.id}] as Comparator)

Design design = new Design()
design.id = 1
design.member = this
designs.add(design)

// reuse the variable: declaring "Design design" twice in one scope does not compile
design = new Design()
design.id = 2
design.member = this
designs.add(design)
</pre>


== Hibernate objects in an HTTP session ==
== Hibernate objects in an HTTP session ==
Line 244: Line 381:


* Open source connection managers include commons-dbcp (Apache), C3P0 (apparently better with very busy sites) and Proxool. If used with Spring, you can configure data source beans that will be later referenced by the session factory.
* Open source connection managers include commons-dbcp (Apache), C3P0 (apparently better with very busy sites) and Proxool. If used with Spring, you can configure data source beans that will be later referenced by the session factory.
== Transactions ==
* It seems very important to define the boundaries of a transaction when batch inserting entities into the DB. Without transaction demarcation (for instance in a Grails controller), insert performance is on the order of 5-10 entities / second. With a transaction, 1000 entities / second can be processed. I suspect that without a transaction, MySQL creates a transaction for every insert (or auto commits every SQL statement).
== Large read operations ==
* If you only need read-access, specifying this to Hibernate will result in a 25-30% improvement.
* Using the second level cache in a large read operation (100,000 entities for instance) can degrade performance by a factor of 5 to 10. This actually happens only after the first request, since the cache is empty at first. This seems to mean that the Hibernate second level cache is by default far slower than the DB...


= Referential Integrity =
= Referential Integrity =
Line 302: Line 448:


* "keys" is a reserved SQL keyword. Thus no variable of a persistent class (or embedded component) should be named "keys" (or at the very least, you need to use the @Column annotation to specify a column name).
* "keys" is a reserved SQL keyword. Thus no variable of a persistent class (or embedded component) should be named "keys" (or at the very least, you need to use the @Column annotation to specify a column name).
== Multithreading environment ==
* Hibernate does not work well in a multithreaded environment. In particular, the Hibernate session is not thread-safe, which means that you cannot use the same Hibernate session on multiple threads. The actual errors are linked to the use of proxies; if an entity is loaded at some point in a thread but accessed in another, you will encounter errors (Hibernate does not synchronize the threads and block them until the entity has been loaded).
* One way to deal with these painful problems is to disable lazy loading and load everything before the parallelized computations take place. Another is to use a session per thread rather than the same session in multiple threads. This should work but can cause performance problems (data loaded several times from the DB, etc).
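The "session per thread" option can be sketched with a plain ThreadLocal. This is a minimal illustration under stated assumptions, not Hibernate code: FakeSession is a hypothetical stand-in for a real Hibernate session, and in a real application it would be obtained from the session factory and closed properly at the end of the unit of work.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SessionPerThreadSketch {

    /** Hypothetical stand-in for a Hibernate session; it only records who created it. */
    static final class FakeSession {
        private static final AtomicInteger COUNTER = new AtomicInteger();
        final int id = COUNTER.incrementAndGet();
        final Thread owner = Thread.currentThread();
    }

    /** Each thread lazily gets its own FakeSession on first access. */
    private static final ThreadLocal<FakeSession> CURRENT =
            ThreadLocal.withInitial(FakeSession::new);

    static FakeSession currentSession() {
        return CURRENT.get();
    }

    /** Discards the calling thread's session, e.g. at the end of a unit of work. */
    static void closeCurrentSession() {
        CURRENT.remove();
    }

    /** Starts a worker thread and returns the session that thread obtained. */
    static FakeSession sessionSeenByNewThread() {
        FakeSession[] holder = new FakeSession[1];
        Thread worker = new Thread(() -> holder[0] = currentSession());
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to holder visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }

    public static void main(String[] args) {
        FakeSession mine = currentSession();
        FakeSession theirs = sessionSeenByNewThread();
        System.out.println("same thread reuses its session: " + (mine == currentSession()));
        System.out.println("worker got a different session: " + (mine != theirs));
    }
}
```

The cost mentioned above still applies: each thread's session loads its own copy of the data, so entities shared between threads are read from the DB several times.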
= Internals =
* Setting the log level to TRACE can be very useful when debugging a hard Hibernate problem. DEBUG is not enough. With TRACE you get a complete trace of the cascades that are taking place, the SQL statements that are prepared, etc.
* To build Hibernate (at least versions 3.3 to 3.5), you need Maven. Use the commands mvn compile / mvn install and set your JDK to a 1.5 one for the first build. After the first compilation, you can set it back to a 1.6.x JDK as the problematic classes will be already compiled.
* The mappings are basically represented by classes in the org.hibernate.mapping package. Mostly, the mappings are a list of PersistentClass. They constitute a meta-model that is created when Hibernate starts by the Configuration (org.hibernate.cfg.Configuration).
* The persisters (org.hibernate.persister package) build the SQL statements, based on the mappings / meta-model. Note that many SQL strings are built only once and reused later; there is no need to build the statements at every operation during runtime.
* Internally, a lot of things happen via event listeners. For instance, when an instance should be deleted, there is a default internal listener that will listen to a delete event (the session will trigger that event).
* An interesting class is org.hibernate.event.def.AbstractFlushingEventListener. It prepares and orders SQL statements, so if there seems to be a bug in the scheduling of SQL statements, this is a good place to start looking / debugging.

Latest revision as of 13:21, 27 March 2014

Installation, configuration, documentation

Installation

  • List of mandatory JARs:
 antlr.jar
 cglib.jar
 asm.jar
 asm-attrs.jar
 commons-collections.jar
 commons-logging.jar
 hibernate3.jar
 jta.jar
 dom4j.jar
 log4j.jar 

Without these jars on the classpath, Hibernate won't work well (or at all). There are lots of optional jars (for second level caching, etc) as well.

Configuration

  • Configuration of Hibernate is simple: you must basically create a SessionFactory object. Normally the best way to configure the session factory is to use an XML file (hibernate.cfg.xml). If you use the Hibernate annotations to map your classes, create an AnnotationConfiguration object instead of the Configuration object:

SessionFactory sessionFactory = new AnnotationConfiguration().configure().buildSessionFactory();

Links and documentation

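Some nice Hibernate documentation links:

  • Hibernate reference documentation: http://www.hibernate.org/hib_docs/v3/reference/en/html/index.html
  • Hibernate annotations documentation: http://www.hibernate.org/hib_docs/annotations/reference/en/html_single/
  • Java Persistence API Javadocs (useful for annotations): http://java.sun.com/javaee/5/docs/api/index.html?javax/persistence/package-summary.html
  • Hibernate Annotations API Javadocs: http://www.hibernate.org/hib_docs/annotations/api/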

Useful books:

  • Java Persistence with Hibernate (Manning)
  • Pro Hibernate 3 (Apress)

Examples

  • In the Hibernate documentation, there are examples with Address and Person. It is not obvious at first, but the address has many persons: in the many-to-one / one-to-many example, the address is the side holding multiple persons. The many-to-one mapping is thus defined on Person (note that the column holding the foreign key will be on the Person table too).


Mappings

  • Mappings and especially associations can be hard to understand at first.

Concepts

  • The concept of bidirectionality is *very* important. A bidirectional association means that links exist at both sides - the link can be navigated in both directions. Note that sometimes things that are possible with unidirectional associations are hard or impossible to do with bidirectional associations (usually because, in the underlying Java model, the association is not truly "bidirectional", i.e. some information is missing on one side of the link).
  • The owning side in the case of a bidirectional association (one-to-one) is the side that will update the relationship when saved to the DB. The join column holding the foreign key will be present in the owner side. Note that the non-owning side should *NOT* have a column referring to the owner - in Grails there is this column, but it is useless.
  • Note that the concept of bidirectionality usually implies (for a many-to-one mapping) that one side has a reference (simple link) to another object, and this object has a collection of objects. The bidirectionality is optional in the sense that using collections on an object is not necessary. You could always write queries to find all the objects that are the children of a given entity. However, using a collection explicitly states that the Java object models the bidirectional relationship.
  • This is also the reason why many-to-one unidirectional associations are frequent, but one-to-many ones (e.g. a collection with no inverse links) are less used. Unidirectional one-to-many associations are discouraged by the Hibernate docs and are mapped by Grails with a join table (Hibernate also recommends doing this, although it is possible to map them with a foreign key).
  • Hibernate differentiates between entities and components. Entities have their own lifecycle, and can support shared references, while components are similar to value types and their lifespan is equal to the lifespan of their containing parent. Components can thus be useful when the object is clearly bound to another object. Relatively speaking, components are of lesser importance than full domain entities.
  • A true OneToOne should be enforced with the same primary key for both objects. This can be done with a special ID generator (there's an example of that in the Hibernate book). Most OneToOnes, the ones that use a foreign key, are not really true OneToOne (they should be called ToOne associations), even with a unique constraint on the foreign key.
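At the Java object level, the bidirectional many-to-one / one-to-many case above comes down to a reference on one side and a collection on the other, kept consistent by hand. The following is a minimal plain-Java sketch with no Hibernate annotations; it reuses the Address and Person names from the documentation example, while the fields and the setAddress helper are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

public class BidirectionalSketch {

    /** The "one" side: holds the collection (the inverse side in mapping terms). */
    static final class Address {
        final String street;
        private final Set<Person> persons = new LinkedHashSet<>();

        Address(String street) {
            this.street = street;
        }

        /** Read-only view; changes go through Person.setAddress. */
        Set<Person> getPersons() {
            return Collections.unmodifiableSet(persons);
        }
    }

    /** The "many" side: holds the simple reference (where the foreign key would live). */
    static final class Person {
        final String name;
        private Address address;

        Person(String name) {
            this.name = name;
        }

        Address getAddress() {
            return address;
        }

        /** Sets the reference and keeps the collection on the other side in sync. */
        void setAddress(Address newAddress) {
            if (address != null) {
                address.persons.remove(this);
            }
            address = newAddress;
            if (newAddress != null) {
                newAddress.persons.add(this);
            }
        }
    }

    public static void main(String[] args) {
        Address home = new Address("1 Main Street");
        Address office = new Address("2 Side Street");
        Person alice = new Person("Alice");

        alice.setAddress(home);
        System.out.println(home.getPersons().contains(alice));   // navigable from the Address
        alice.setAddress(office);
        System.out.println(home.getPersons().isEmpty());          // old link removed
        System.out.println(alice.getAddress() == office);         // navigable from the Person
    }
}
```

Dropping the persons collection would leave a unidirectional many-to-one: the person still knows its address, and the reverse lookup becomes a query.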

Embeddable Components

  • The concept of component is useful but has several limitations.
  • First, an entity can have a collection of components, however the embedded components cannot have a collection in that case.
  • If used as a key in a Map collection mapping, an embedded component also cannot own collections. It seems to me that for an embeddable component used as a Map key, the @MapKey annotation does not work fully as expected. In particular it seems the column attribute of this annotation has no effect. Hibernate will create the key column name based on the name of the properties of the embedded component.
  • A component used as a Map key is a rather complex case; most of the time I think using one of the target entity's (or component's) properties as a key is better. This is perfectly possible in Hibernate.
  • A component owning a collection can only be embedded into a class once. It seems embedding into several different classes is also not possible.
  • With entities these problems are not present, and with the delete-orphan cascade option entities behave almost like embeddable components.

Limitations

  • Note that you cannot have two one-to-one bidirectional mappings to the same class (this is also true for many-to-one). This is because there is no true bidirectionality in the relationship, in the sense that the target object cannot know to which property it belongs in the object that has two links. At the database level, the table of one object will hold two join columns. From this object you can correctly find the target objects by looking at the join columns, which will hold the target objects' ids. But from a sample target object, you cannot go back to the other object as there is no way for you to know which one of the join columns you must query.
  • Even if you add a column on a target object to store the owner's id, this does not achieve true bidirectionality: you can find the owner object, but you cannot know (from the point of view of the target) to which link it refers. Note that it can work if one of the associations is marked as unidirectional.
  • You cannot always enforce non-nullability constraints directly via Hibernate because of the SQL constraints (SQL constraints are generally enforced by databases for each statement, and are not deferred). So sometimes an (optional = false) property must be omitted even if the property cannot be null, just because that would prevent the underlying SQL model from accepting insertions. In this case, the best approach is to enforce the business logic constraint at another level. Several options exist; one is to use an event listener and write the constraint enforcement there, to run when the event is triggered.

SQL Types

  • If you need to store large Strings, you need a TEXT column definition (at least in MySQL). This can be achieved via the following annotation:
@Column(name = "message", columnDefinition = "TEXT")

I am not sure this is portable between various DBs though.

  • Note that the length attribute of @Column works as expected and generates a VARCHAR column of the given length. But there is a maximum for VARCHAR columns (65,535 bytes in MySQL), and don't forget that if you use a UTF-8 character set, MySQL needs up to 3 bytes per character, so the maximum is effectively divided by 3 (roughly 21,800 characters).
  • In MySQL, even the TEXT column is quite limited (65,535 bytes, about 64 KB). You may have to use MEDIUMTEXT or LONGTEXT.
  • Do not use Float objects in Java as they are mapped to the FLOAT SQL type. The precision is then very low (about 7 significant decimal digits) and will cause problems. Use Double, or even better, most of the time you should use a BigDecimal in Java (it will be mapped to a DECIMAL SQL type in MySQL). BigDecimal allows you to do exact decimal computations.
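
The precision remark above can be checked with plain Java, without any database. The following is only a sketch; the class and method names are made up for illustration and are not part of Hibernate.

```java
import java.math.BigDecimal;

// Hypothetical demo class (not from Hibernate): shows why Float is a poor fit
// for persisted numeric values and why BigDecimal is preferred for exact amounts.
class NumericPrecisionDemo {

    // A float has a 24-bit significand, so integers above 2^24 get rounded.
    static float storeAsFloat(int value) {
        return (float) value;
    }

    // Binary doubles cannot represent 0.1, 0.2 or 0.3 exactly,
    // so the comparison below is false.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal values built from strings keep exact decimal digits.
    static BigDecimal decimalSum() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }
}
```

Here storeAsFloat(16777217) silently loses the last unit, while decimalSum() compares equal to new BigDecimal("0.3").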

Remarks

  • If you specify a one-to-many or many-to-many mapping, be sure to use the same mapping as the Java collection interface. A Set mapping won't work with a List collection, for example.
  • When using inheritance with a table-per-hierarchy strategy, the class name is saved as a string in a class column on the table. Be careful that the full class name is saved, with the package, so if later you change the package, you have to make some manual SQL modifications. It's thus better to carefully plan the name and package in advance.
  • You should map an enum property with the @Enumerated annotation. I am not sure this is totally required though - it seems to work fine without.
  • If you need to map a BigDecimal (or even Double) that may be equal to Infinity, the best thing to do in many cases is to assume a null value corresponds to infinity. This maps well to the DB; if you use a Double and the object is actually the Java constant for infinity, Hibernate will have trouble putting this in the DB and especially getting it back as a proper infinity object.
  • When you add a nullable Date to your Java model, be careful: Hibernate will update the MySQL table with a new column that is indeed a nullable DATETIME, but the existing rows will not get a value of NULL! They will hold a strange zero date ("00-00-00"). You will need to manually set this column to NULL on all existing rows.
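
The "null means infinity" convention mentioned above can be hidden behind the accessors of the class. This is a minimal plain-Java sketch; Quota and its field names are hypothetical, and the mapping annotations are left out on purpose.

```java
// Hypothetical entity-like class: the persisted field is a nullable Double,
// and the accessors interpret null as positive infinity.
class Quota {
    // Value that would be mapped to a nullable column; null stands for infinity.
    private Double storedLimit;

    // Callers always see a plain double, possibly Double.POSITIVE_INFINITY.
    double getLimit() {
        return storedLimit == null ? Double.POSITIVE_INFINITY : storedLimit;
    }

    // Positive infinity is never stored as such; it is converted to null.
    void setLimit(double limit) {
        storedLimit = (limit == Double.POSITIVE_INFINITY) ? null : Double.valueOf(limit);
    }

    // What the persistence layer would read and write.
    Double getStoredLimit() {
        return storedLimit;
    }
}
```

With this design the infinity constant never reaches the database, so nothing special has to be read back.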

Examples

  • Case with an owner side that has both a Set (one-to-many) and an association (one-to-one) to the same target class. In this case, for reasons not totally clear to me, you must not define a cascade on the one-to-one association (cascading on the one-to-many is thus necessary, and works fine).
  • Note that you cannot have optional = false on the homeComputer, as it would make inserting an object impossible because of the various SQL constraints.
import javax.persistence.*
import org.hibernate.annotations.*

@Entity
@Table(name="user_hibernate")
class UserHibernate
{
	@Id
	@GeneratedValue(strategy = javax.persistence.GenerationType.IDENTITY)
	Long id

	@Version
	@Column(nullable = false)
	Long version

	String name

	// Variant 1 (bidirectionality is on the association, not the list)

	//@OneToOne(cascade=CascadeType.ALL)
	@OneToOne()
	@JoinColumn(name="home_computer_id")
	ComputerHibernate homeComputer

	@OneToMany(cascade=CascadeType.ALL, fetch=FetchType.EAGER, targetEntity=ComputerHibernate.class)
	@JoinColumn(name="owner_id", nullable=false)
	Set computers

	// Variant 2

	/*@OneToOne()
	@JoinColumn(name="home_computer_id")
	ComputerHibernate homeComputer

	@OneToMany(cascade=CascadeType.ALL, targetEntity=ComputerHibernate.class)
	@JoinColumn(name="owner_id", nullable=false)
	Set computers*/
} 

@Entity
@Table(name="computer_hibernate")
class ComputerHibernate
{
	@Id
	@GeneratedValue(strategy = javax.persistence.GenerationType.IDENTITY)
	Long id

	@Version
	@Column(nullable = false)
	Long version

	String model

	// Variant 1

	@OneToOne(mappedBy = "homeComputer", optional=false)
	UserHibernate owner

	// Variant 2

	/*@ManyToOne(optional=false)
	@JoinColumn(name="owner_id", insertable=false, updatable=false, nullable=false)
	UserHibernate owner*/
}

Managing objects and their lifecycles

Using a session and transactions

  • When you close a session, it is *not* automatically flushed. You must either flush it manually or wrap your operations in a transaction (which is the recommended way).
  • It is recommended to wrap all database operations in Hibernate transactions. The Hibernate transaction API then wraps other transactions. If you run in a non-managed container (like Tomcat), JDBC transactions will be used. If you run in a full JEE server with JTA support, you should then use JTA transactions (and I think Hibernate transactions will wrap JTA ones anyway).
  • Note that using getCurrentSession() on a SessionFactory essentially means that you bind your session to a transaction. If it is a JTA transaction, nothing special is required (you open the JTA transaction, then just call getCurrentSession() whenever you need to use the session). If you don't use JTA (this means you use JDBC transactions), you must first call getCurrentSession() to obtain a session and then immediately create a transaction from that session (using Hibernate API). It's impossible to do anything with a session obtained via getCurrentSession() until you open a transaction.
  • If using JDBC, you must also specify a configuration option (following example in XML):
<property name="current_session_context_class">thread</property>

The jta value would bind the session to JTA transactions.

  • Implementing long sessions (sessions that span multiple transactions) is only possible if you manage the session yourself - thus you don't use getCurrentSession(). You must use sessionFactory.openSession().
  • Finally, note that you can have fully managed EJB transactions (no programmatic demarcation necessary; demarcation occurs via annotations). This is called CMT (container managed transactions) as opposed to BMT when you directly use (programmatically) JTA.

Transactions with Spring

  • Normally if you use Spring, Spring transactions wrap Hibernate ones via a HibernateTransactionManager class. At a lower level, Hibernate transactions still wrap JDBC ones. Note that the use of HibernateTransactionManager implies you are not using JTA or container managed transactions. Still, Spring transactions can be declared via annotations, without any programmatic code: in that sense they are closer to CMT.
  • Grails uses Spring, so the same concepts apply to Grails. You can find more information on this page.

Concurrency control

  • Concurrency control is extremely important. If you build a web service that relies on Hibernate, at some point when traffic increases you will get concurrency problems (when updating your objects). The typical exception that Hibernate will start to throw in those cases is StaleObjectStateException. You access one object in a transaction, try to update it, while in another transaction the same object was already modified. Note that most applications not relying on Hibernate (PHP code, for instance) will also have these problems, but usually developers don't care and implicitly don't do any locking, which means that collisions happen and last commit wins.
  • Note that if you don't have a lot of traffic, such problems are typically due to JavaScript. For instance in Grails, since there is a session per request model, this would mean that two HTTP requests were fired at the same time. This exception can also happen if you have a detached instance that you try to reattach, but the SQL data row was deleted in the meantime.

Three methods to deal with concurrent accesses

  • Last commit wins strategy: this actually means "do not do any kind of locking". This is what Hibernate does by default, but not Grails. In this case objects are not checked at all for mid-air collisions and the last transaction commit will always overwrite the others. In many cases, this is the most appropriate strategy.
  • Optimistic locking: this is Grails default strategy, and the general recommended approach. In this case Hibernate checks when the changes are flushed (typically at commit time) that a special field (usually named version) was not incremented between the initial load of the object in the session and the update; the check is done by including the version in the WHERE clause of the UPDATE statement. If the object was modified, the StaleObjectStateException is thrown (this will in turn, in Spring / Grails, throw an org.springframework.dao.OptimisticLockingFailureException). When this happens, you have to decide what to do, and this really depends on your application.
    • You could abort everything and display an error message to the user;
    • Or you could restart the service and the transaction (and even do that several times). Note that once an exception is thrown, the Hibernate session should always be discarded and a new one obtained.
    • Hibernate allows you to turn off optimistic locking for particular properties of your classes, which can be very useful (sometimes it is just better to use last commit wins for some fields, and optimistic locking for others). Unfortunately, Grails only allows you to turn off optimistic locking globally at the class level. This may be improved soon though.
    • However, note that Hibernate does not allow you to turn off optimistic locking for a particular transaction. Once configured, optimistic locking is always (or never) used.
  • Pessimistic locking: with pessimistic locking, a lock is obtained at the DB level and any reads should block until the update is done. This translates to a SELECT ... FOR UPDATE statement at the SQL level. This obviously has a very big performance penalty, but may sometimes be necessary.
    • Note that I was not able to use pessimistic locking with a Grails / Spring / Hibernate / MySQL setup. The correct SQL statement would be issued but other requests would not block... this has to be investigated and is probably due to a bug in one of the layers, so I should start at the lowest level (JDBC) and add layers one at a time to see where the problem is.
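
The optimistic locking idea and the "restart the transaction" option can be sketched without Hibernate. The following plain-Java simulation uses made-up names (VersionedRow, StaleVersionException, OptimisticRetry); it only mimics the version check and is not how Hibernate is implemented.

```java
// Stand-in for the exception raised when a version check fails (hypothetical name).
class StaleVersionException extends RuntimeException {
    StaleVersionException(String message) {
        super(message);
    }
}

// Stand-in for a database row that carries a version column.
class VersionedRow {
    private long version = 0;
    private String value = "";

    synchronized long getVersion() { return version; }
    synchronized String getValue() { return value; }

    // Mimics "UPDATE ... SET value = ?, version = version + 1 WHERE version = ?".
    synchronized void update(String newValue, long expectedVersion) {
        if (version != expectedVersion) {
            throw new StaleVersionException(
                "expected version " + expectedVersion + " but found " + version);
        }
        value = newValue;
        version++;
    }
}

class OptimisticRetry {
    // Re-reads the row and retries the update, up to maxAttempts times.
    static boolean appendWithRetry(VersionedRow row, String suffix, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long seenVersion = row.getVersion();       // "load" the object
            String updated = row.getValue() + suffix;  // work on the loaded state
            try {
                row.update(updated, seenVersion);      // "commit"
                return true;
            } catch (StaleVersionException e) {
                // Another writer committed first: start over with fresh state.
            }
        }
        return false;
    }
}
```

In a real application each retry would also use a fresh Hibernate session, as noted above.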

Database isolation level

  • An isolation level is defined at the database level. I am not yet very knowledgeable about this. It seems to be related to reads more than writes, and essentially tells the DB whether you allow dirty reads, phantom reads, and such. However this can also apparently influence pessimistic locking (as databases implement locking in relation to isolation).
  • This is defined at the database level, so in a JEE architecture, at the JDBC level.

Database level deadlocks

  • If things go really bad, you can have a deadlock at the database level. This throws a JDBC exception. I don't know yet how to deal with these kinds of problems.

Troubleshooting

  • To troubleshoot StaleObjectStateException, the best is to use Event listeners, on preLoad / postLoad and preUpdate. By printing timestamps of operations, you usually get an understanding of what's happening.

Designing a Java Hibernate object

  • Hibernate objects must be Java beans (i.e., proper setters and getters, and a no-argument constructor). You should avoid having code in the constructor, as this code will be executed each time an object gets hydrated by Hibernate from the DB. In particular, avoid setting relationships and values in the constructor - this won't be an error, but will be totally useless as Hibernate will override this every time.
  • The correct way is to have a setup() function that you call every time you create a new Hibernate object in your client code. In fact, the constructor should remain the default one.
  • Note that Groovy objects are already Java beans so will work fine with Hibernate.
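
As a rough illustration of the setup() convention, here is a plain-Java sketch. Invoice, InvoiceDemo and hydrate() are hypothetical names; hydrate() only imitates what an ORM roughly does (reflective no-argument construction followed by setters) and is not Hibernate code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical persistent class: the no-argument constructor stays empty,
// and defaults for brand-new objects live in setup().
class Invoice {
    private String status;
    private List<String> lines;

    Invoice() {
        // Intentionally empty: this also runs when an instance is rebuilt from a row.
    }

    // Called only by client code that creates a new, not yet persisted Invoice.
    Invoice setup() {
        status = "DRAFT";
        lines = new ArrayList<>();
        return this;
    }

    String getStatus() { return status; }
    void setStatus(String status) { this.status = status; }
    List<String> getLines() { return lines; }
    void setLines(List<String> lines) { this.lines = lines; }
}

class InvoiceDemo {
    // Imitates hydration: reflective construction, then values set from the "row".
    static Invoice hydrate(String statusFromDb) {
        try {
            Invoice invoice = Invoice.class.getDeclaredConstructor().newInstance();
            invoice.setStatus(statusFromDb);
            return invoice;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not instantiate Invoice", e);
        }
    }
}
```

Client code would write new Invoice().setup(), while hydrated instances never go through setup().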

Deleting an object

  • Be careful when deleting an object. Many times the consistency checks of the Hibernate constraints won't be valid anymore. This is OK when deleting, but you have to make sure you don't do anything else with the object once you removed the associations to allow the deletion. If you do something like a criteria with invalid constraints, Hibernate will throw an exception.

Modifying an object's state

  • Using evict() on an object and then resaving (updating) the object via saveOrUpdate() seems to produce strange results. It's better not to use evict() if you plan to update the object afterwards.

Detached instances

  • To reattach a detached instance to the session, you can use either update(), which will schedule an SQL UPDATE statement when the session is flushed, or merge(). update() will only work if the session does not already contain the object, while merge() will always work regardless of the session state.
    • Be careful, as the object may in fact already be loaded via other objects into the session! Thus it's hard to be always sure that your object is not yet in the session.
  • If you don't want to issue an UPDATE statement, you can use lock(object, LockMode.NONE). In this case, you must be sure however that the detached instance has not been modified.

Collections

  • Never try to associate a collection reference to more than one object. This will cause lots of errors. If you need to, you can copy all the objects from a collection to another one. This applies even if you deleted the object containing the collection reference first.
  • Don't call empty() on a Hibernate collection! The correct Java method name is "isEmpty", but the empty() method exists for the Hibernate implementation. It should not be used though. Usually this won't be a problem in Java, but it can be harder to spot in Groovy.
  • If you use a Set but still want to save more than one item in the set in a single transaction, and have the items saved in a precise order, you can use a SortedSet. Some Groovy code that works:
	// this represents a Member, with a one-to-many to designs

	designs = new TreeSet(["compare": {a, b -> a.id <=> b.id}] as Comparator)

	Design design = new Design()
	design.id = 1
	design.member = this
	designs.add(design)

	design = new Design()
	design.id = 2
	design.member = this
	designs.add(design)
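
A possible Java counterpart of the Groovy snippet above, as a sketch only: the Design class here is a simplified stand-in with just an id, and DesignOrdering is a made-up helper name.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

// Simplified stand-in for the Design entity; only the id matters for ordering.
class Design {
    final long id;

    Design(long id) {
        this.id = id;
    }
}

class DesignOrdering {
    // Builds a set that iterates in ascending id order, whatever the insertion order.
    static SortedSet<Design> newDesignSet() {
        return new TreeSet<>(Comparator.comparingLong((Design d) -> d.id));
    }
}
```

Note that with such a comparator two Design objects with the same id count as duplicates, so only the first one is kept in the set.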

Hibernate objects in an HTTP session

  • It is not recommended to store Hibernate managed objects in the HTTP session if you implement the "session per HTTP request" pattern. When you work with these objects later, they will be considered detached instances, which causes all kinds of pain (for example lazy initialization errors on collections that were not originally fetched).
  • Reattaching the instances is therefore mandatory; using get() is not a good option as all the fetching work will take place again (better then to store only the object id in the session). Calling update() or merge() is not appropriate either, as they cause extra SQL (an UPDATE for update(), a SELECT for merge()). The only nice option is to call lock(object, LockMode.NONE). This won't schedule an INSERT or UPDATE to the DB.
  • Another big disadvantage is that objects stored in the HTTP session may change after a rolled back transaction. Even if the underlying object/row in the DB won't be touched, any operation on the HTTP session object cannot of course be rolled back, so you would have to clean the mess yourself.
  • The other option is to only store in the session "light objects", that are not Hibernate managed instances. This has some nice advantages and should be considered as a solution every time.
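
A "light object" can be as simple as an immutable, serializable value class holding only what the pages need. The sketch below uses hypothetical names (UserSummary, SessionStoreDemo), and a plain Map stands in for the HTTP session attributes.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical light object: holds the entity id and a display name, nothing managed.
final class UserSummary implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long userId;
    private final String displayName;

    UserSummary(long userId, String displayName) {
        this.userId = userId;
        this.displayName = displayName;
    }

    long getUserId() { return userId; }
    String getDisplayName() { return displayName; }
}

class SessionStoreDemo {
    // The map stands in for HTTP session attributes.
    static Map<String, Object> remember(long userId, String displayName) {
        Map<String, Object> attributes = new HashMap<>();
        attributes.put("currentUser", new UserSummary(userId, displayName));
        return attributes;
    }
}
```

When the full entity is needed again, the stored id can be used to load it in the current Hibernate session.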

Casting an object to one of its subclasses

  • To understand why this is not possible directly in Hibernate, consider the fact that this is not possible in the underlying layer (the JVM). In Java you cannot change dynamically the class of an object at runtime. The same holds for Hibernate: there is no easy way to change the class of a persistent object (even to one of its subclasses).
  • So, in order to actually achieve this, proceed as you would with a Java object: create a new object and copy the properties from the previous object to this new one. In Hibernate you also need to delete the old object as part of the operation (generally flushing the session before saving the new one, because of unique constraints). Wrap this in a transaction, so that nothing is changed in the DB should an exception occur.
  • This kind of situation should be avoided (or rarely used). Once set, an object should not change its class. If need be, redesign your data model.
  • Note that such a cast can be however accomplished easily using HQL / SQL... If you use a table per hierarchy mapping strategy, you can easily change the type of the object by changing the column keeping the class name. Other difficulties may arise though, you may need to update other fields as well.
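
The "create a new object and copy the properties" step could look like the following plain-Java sketch. The classes are hypothetical; in a Hibernate setting the old object would additionally be deleted and the new one saved within the same transaction.

```java
// Hypothetical classes for illustration only.
class CatalogItem {
    Long id;      // would normally be assigned by the database
    String name;
}

class DiscountedItem extends CatalogItem {
    int discountPercent;
}

class Promotion {
    // Builds a new subclass instance from an existing object.
    // The id is deliberately not copied: the new object gets its own identity when saved.
    static DiscountedItem promote(CatalogItem source, int discountPercent) {
        DiscountedItem target = new DiscountedItem();
        target.name = source.name;
        target.discountPercent = discountPercent;
        return target;
    }
}
```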

Older remarks

  • If you have an object and a many-to-one mapping on this object (such as an Item that belongs to a Category), and you happen to have the ID of the Category but not the associated Java object, the fastest way to store the item in the database is to create a new virtual Category and call setID(known_id) on this newly created object. Then, once you save the object, the virtual Category will not be saved (updated) as it already exists. For this to happen (i.e., for the object not to be overwritten), you must base the equals() method on the ID of the object. This is not recommended. Note that by default Hibernate distinguishes between objects by using their addresses in memory.
  • Update: The previous remark would work, but is not the recommended way. The correct way is to use session.load(Category.class, category_id) and use a proxy. This means the object won't actually be fetched from the DB, but you'll be able to use it as a Java object. This is nice. By default, a mapping uses a proxy.

Cascade

Concept and Notes

  • The concept of cascade implies somehow that one object "belongs" to another one. When saving, updating, or deleting the parent object, the child will also be saved, updated or deleted.
  • Apparently, it is not possible to cascade in the other direction, that is, from the child to the parent. Saving a child when the parent is not yet saved won't persist the parent.
  • If you use cascading deletes in a circular way (e.g., A has a link to B, which links to C, which links back to A), you may run into problems. For example, if you want to delete A, hoping that all Bs and Cs will get deleted too, you will get an exception. When Hibernate tries to delete C, A will already be transient, and Hibernate will complain. You thus have to delete C manually before deleting A in this case. Another possibility would be to use, for example, the Grails event system, with a beforeDelete event on B to first remove the Cs.
  • The "merge" cascade option seems to not always work as expected. Normally merge() returns a new instance, so I would have expected that the associated (referenced) objects merge() cascades to would also be new instances. However it does not seem to be always the case (bug maybe); eg. the referenced object from the new instance is the same object (in-memory) as the one referenced from the initial instance merge() was called upon.

Cascade Options

  • Here is the list of all possible cascade options:
    • none: no automatic action on the referenced object takes place. This is the default if no cascade behaviour is set.
    • persist: Cascade any persist() operation across this relationship. Note that there is an error in the reference manual where this is called create.
    • merge: Cascade any merge() operation across this relationship.
    • lock: Cascade any lock() operation across this relationship.
    • evict: Cascade any evict() operation across this relationship.
    • replicate: Cascade any replicate() operation across this relationship.
    • refresh: Cascade any refresh() operation across this relationship.
    • save-update: If save(), update() or saveOrUpdate(), is called on the referencing object, automatically call saveOrUpdate() on all referenced objects.
    • delete: automatically delete the referenced object(s) when delete() is called on the referencing object. Note that, if the referencing object is not deleted but merely removes its reference to the referenced object, then this option will not do anything and, potentially, a garbage (or orphan) object will be left in the database.
    • delete-orphan: automatically delete any object that has been removed from a collection. This option is only available for collections, not associations, and implies that there are no shared references to the object removed from the collection. This effect can happen only when removing an object from a collection.
    • all: cascade all operations, but do not take the action of delete-orphan.
    • all-delete-orphan: cascade all operations, and take the action of delete-orphan as well.

Querying

Criteria

  • Criteria offer a very object oriented querying API. In particular, it can easily query collections (eg, test if an element of a collection matches certain conditions). The Grails criteria builder is also very convenient.
  • Note that when using a nested criteria (to define a constraint on a collection), you cannot use simple object equality. You cannot simply express: find all the instances of this class that have this element in their collection. You must use idEq, which uses the object id.

HQL

  • HQL is similar to SQL but much more adapted to Hibernate. HQL can do almost anything: the need to revert to existing SQL should be extremely, extremely rare. When you cannot use Criteria (complex query where you may need to use operators such as coalesce or such), use HQL.
  • Here is an example of an inner join, with a select clause (select fetches a property of the initial domain class, via the join):
"select decorations from LayoutElement as element inner join 
element.decorations as decorations where decorations.shopFront = ? and element = ?"
  • You can access Maps or other collections values by using the [] notation in HQL, but this works only in the where clause.
"from Decoration decoration where decoration.myMap['hello'] = 30"
  • You can refer to the index (or key) of a collection by using the index() HQL function:
"from Decoration decoration where index(decoration.myMap) = 20"
  • Unfortunately, bug #1930 apparently prevents the use of index() in a with clause on a join.

Subqueries

  • count() cannot be used directly on a subquery. Instead you need code such as:
"from com.shoopz.shopengine.ReferenceItem as reference where (select count(*) from ShopItem as item where item.reference = reference) > 1"

Query API

  • If you want to bind a parameter corresponding to an Enum class, use setParameter(), not setEntity().

Performance Tuning

  • In Hibernate, lazy loading is on by default for collections. This means the collections are fetched from the database only when they are used (never if they are not used). Warning: if the loading is triggered when the Hibernate session is already closed, it will fail.
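
The failure mode described above can be pictured with a small plain-Java simulation. FakeSession and LazyList are made-up names; this only imitates the behaviour and is not Hibernate's collection implementation.

```java
import java.util.List;
import java.util.function.Supplier;

// Stand-in for a session that can be open or closed.
class FakeSession {
    private boolean open = true;

    boolean isOpen() { return open; }
    void close() { open = false; }
}

// Simulated lazy collection: the content is fetched on first access only.
class LazyList<T> {
    private final FakeSession session;
    private final Supplier<List<T>> loader;
    private List<T> loaded; // null until the first access

    LazyList(FakeSession session, Supplier<List<T>> loader) {
        this.session = session;
        this.loader = loader;
    }

    List<T> get() {
        if (loaded == null) {
            // First access: loading needs an open session.
            if (!session.isOpen()) {
                throw new IllegalStateException("cannot load collection: session is closed");
            }
            loaded = loader.get();
        }
        return loaded;
    }
}
```

A collection touched before the session closes stays usable afterwards, while one touched for the first time after close() fails.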

Connection Pooling

  • Hibernate is intended to be used with a third party connection pool manager. A connection pool manager manages JDBC connections to the database. Hibernate's built-in pool manager is not very performant (according to the documentation).
  • Open source connection managers include commons-dbcp (Apache), C3P0 (apparently better with very busy sites) and Proxool. If used with Spring, you can configure data source beans that will be later referenced by the session factory.

Transactions

  • It seems very important to define the boundaries of a transaction when batch inserting entities into the DB. Without transaction demarcation (for instance in a Grails controller), insert performance is on the order of 5-10 entities / second. With a transaction, 1000 entities / second can be processed. I suspect that without a transaction, MySQL creates a transaction for every insert (or auto commits every SQL statement).

Large read operations

  • If you only need read-access, specifying this to Hibernate will result in a 25-30% improvement.
  • Using the second level cache in a large read operation (100,000 entities for instance) can make it 5 to 10 times slower. This actually happens only after the first request, since the cache is empty at first. This seems to mean that the Hibernate second level cache is by default far slower than the DB...

Referential Integrity

Database Requirements

  • The InnoDB engine supports referential integrity (database consistency) through foreign key constraints. MyISAM does not support this feature. When a FK constraint is in place, the database will prevent you from deleting some rows (or will prevent wrong UPDATEs etc). Normally you can tell Hibernate which FK constraint to use.
  • Note that if you convert your DB from MyISAM to InnoDB Hibernate won't recreate all the foreign key constraints automatically. You need to update them manually (or to drop all the tables and let Hibernate recreate them). I am not entirely sure of this; it seems in some cases the constraints are recreated.

Creation of database structure by Hibernate

  • If used with an underlying DB engine that supports this, Hibernate creates some foreign key constraints when started up. It appears the names of these constraints are based on the fully-qualified name of the mapped class. This means that if the class package changes, or the name changes (without a remapping to another table), the foreign keys will be recreated automatically by Hibernate, and the previous ones (based on the old name) will be left around. To keep the DB clean you should remove the old constraints.

Transactions

  • Real database transactions are only possible if the underlying DB supports them. This means that with MySQL MyISAM tables, transactions are very limited. In a Hibernate context, nothing is written to the DB unless the session is flushed. When it is, however, everything is written and cannot be rolled back later.
  • Bottom-line: use InnoDB if you need transactions (and transactions are almost mandatory when using an ORM I would say).

Hibernate & EJB3 Annotations

  • The EJB specification indicates whether the property should be accessed by field or by getter method. This depends on where you put the annotation. You should put it before the field if you want field access, and before the getter method to get property access.
  • The best documentation for Hibernate annotations is the reference Hibernate documentation, but also the javax.persistence API which documents all possible official EJB annotations.
  • Here is an example of a (Groovy) annotated class:
@Entity()
@Table(name="shop_category")
class ShopCategory
{
	@Id
	Long id
	
	@ManyToOne
	ShopCategory parentCategory
	
	String name
	
	@OneToMany(targetEntity = ShopItem.class, mappedBy="category")	
	Set items
	
	@OneToMany(cascade = [javax.persistence.CascadeType.ALL], targetEntity = ShopCategory.class, mappedBy="parentCategory")
	Set subCategories
}

Bidirectional One-to-one

  • If the association is mandatory, use optional = false on the @OneToOne annotation. This will make the foreign key column not null (no need to add nullable=false on the @JoinColumn annotation). It will also set a UNIQUE constraint on the foreign key column (this is strange; I think it should always be set, not only for mandatory associations - you can add it via unique=true if the association is optional).
@OneToOne(cascade = [CascadeType.ALL], optional = false)
@JoinColumn(name = "passport_id")
PassportAnnotated passport

Warnings

  • "keys" is a reserved SQL keyword. Thus no variable of a persistent class (or embedded component) should be named "keys" (or at the very least, you need to use the @Column annotation to specify a column name).

Multithreading environment

  • Hibernate does not work well in a multithreaded environment. In particular, the Hibernate session is not thread-safe. This means that you cannot use the same Hibernate session on multiple threads. The actual errors are linked to the use of proxies; if an entity is loaded at some point in a thread but accessed in another, you will encounter errors (Hibernate does not synchronize the threads and block them until the entity has been loaded).
  • One way to deal with these painful problems is to disable lazy loading and load everything before the parallelized computations take place. Another is to use a session per thread rather than the same session in multiple threads. This should work but can cause performance problems (data loaded several times from the DB, etc).
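
The "session per thread" idea can be sketched in plain Java with a ThreadLocal. WorkSession is a made-up stand-in for a non-thread-safe session object and SessionPerThread is a hypothetical helper; with Hibernate itself, the thread value of current_session_context_class mentioned earlier gives a comparable per-thread binding.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a non-thread-safe session object (hypothetical).
class WorkSession {
}

class SessionPerThread {
    // Each thread lazily gets its own WorkSession instance.
    private static final ThreadLocal<WorkSession> CURRENT =
        ThreadLocal.withInitial(WorkSession::new);

    static WorkSession current() {
        return CURRENT.get();
    }

    // Starts the given number of worker threads and returns the distinct sessions they used.
    static Set<WorkSession> sessionsSeenBy(int threads) {
        Set<WorkSession> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                WorkSession first = current();
                WorkSession second = current(); // same thread, same session
                if (first == second) {
                    seen.add(first);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return seen;
    }
}
```

Each worker sees exactly one session of its own, so no session object is ever shared between threads; the price is that data may be loaded once per thread.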

Internals

  • Setting the log level to TRACE can be very useful when debugging a hard Hibernate problem. DEBUG is not enough. With TRACE you get a complete trace of the cascades that are taking place, the SQL statements that are prepared, etc.
  • To build Hibernate (at least versions 3.3 to 3.5), you need Maven. Use the commands mvn compile / mvn install and set your JDK to a 1.5 one for the first build. After the first compilation, you can set it back to a 1.6.x JDK as the problematic classes will already be compiled.
  • The mappings are basically represented by classes in the org.hibernate.mapping package. Mostly, the mappings are a list of PersistentClass. They constitute a meta-model that is created when Hibernate starts by the Configuration (org.hibernate.cfg.Configuration).
  • The persisters (org.hibernate.persister package) build the SQL statements, based on the mappings / meta-model. Note that many SQL strings are built only once and reused later; there is no need to build the statements at every operation during runtime.
  • Internally, a lot of things happen via event listeners. For instance, when an instance should be deleted, there is a default internal listener that will listen to a delete event (the session will trigger that event).
  • An interesting class is org.hibernate.event.def.AbstractFlushingEventListener. It prepares and orders SQL statements, so if there seems to be a bug in the scheduling of SQL statements, this is a good place to start looking / debugging.