Hibernate: Difference between revisions

From Elvanör's Technical Wiki

Revision as of 02:52, 24 November 2009

Links and documentation

Some nice Hibernate documentation links:

Useful books:

  • Java Persistence with Hibernate (Manning)
  • Pro Hibernate 3 (Apress)

Installation and Configuration

  • List of mandatory JARs:
 antlr.jar
 cglib.jar
 asm.jar
 asm-attrs.jar
 commons-collections.jar
 commons-logging.jar
 hibernate3.jar
 jta.jar
 dom4j.jar
 log4j.jar 

Without these jars on the classpath, Hibernate won't work well (or at all). There are lots of optional jars (for second level caching, etc) as well.

  • Configuration of Hibernate is simple: you must basically create a SessionFactory object. Normally the best way to configure the session factory is to use an XML file (hibernate.cfg.xml). If you use the Hibernate annotations to map your classes, create an AnnotationConfiguration object instead of the Configuration object:

SessionFactory sessionFactory = new AnnotationConfiguration().configure().buildSessionFactory();

Mappings

  • Mappings and especially associations can be hard to understand at first.

Concepts

  • The concept of bidirectionality is *very* important. A bidirectional association means that links exist on both sides - the link can be navigated in both directions. Note that some things that are possible with unidirectional associations are hard or impossible to do with bidirectional associations (usually because, in the underlying Java model, the association is not truly "bidirectional", i.e. some information is missing on one side of the link).
  • The owning side in the case of a bidirectional association (one-to-one) is the side that will update the relationship when saved to the DB. The join column holding the foreign key will be present on the owner side. Note that the non-owning side should *NOT* have a column referring to the owner - in Grails there is this column, but it is useless.
  • Note that bidirectionality usually implies (for a many-to-one mapping) that one side has a reference (simple link) to another object, and that this object has a collection of objects. The bidirectionality is optional in the sense that using a collection on an object is not necessary: you could always write queries to find all the objects that are children of a given entity. However, using a collection states the bidirectional relationship explicitly in the Java object model.
  • This is also the reason why many-to-one unidirectional associations are frequent, while one-to-many unidirectional associations (e.g. a collection with no inverse links) are less used.
  • Hibernate differentiates between entities and components. Entities have their own lifecycle, and can support shared references, while components are similar to value types and their lifespan is equal to the lifespan of their containing parent. Components can thus be useful when the object is clearly bound to another object. Relatively speaking, components are of lesser importance than full domain entities.

Embeddable Components

  • The concept of component is useful but has several limitations.
  • First, an entity can have a collection of components; in that case, however, the embedded components cannot themselves have a collection.
  • If used as a key in a Map collection mapping, an embedded component also cannot own collections. It seems to me that for an embeddable component used as a Map key, the @MapKey annotation does not work fully as expected. In particular it seems the column attribute of this annotation has no effect. Hibernate will create the key column name based on the names of the properties of the embedded component.
  • A component used as a Map key is a rather complex case; most of the time I think using one of the target entity's (or component's) properties as a key is better. This is perfectly possible in Hibernate.
  • A component owning a collection can only be embedded into a class once. It seems embedding into several different classes is also not possible.
  • With entities these problems are not present, and with the delete-orphan cascade option entities behave almost like embeddable components.

Limitations

  • Note that you cannot have two one-to-one bidirectional mappings to the same class (this is also true for many-to-one). The relationship is not truly bidirectional: the target object cannot know which property of the owning object (the one holding the two links) refers to it. At the database level, the owner's table holds two join columns. From the owner you can find the target objects through these join columns, which hold the target objects' ids. From a given target object, however, you cannot navigate back, because there is no way to know which of the two join columns to query.
  • Even if you add a column on a target object to store the owner's id, this does not achieve true bidirectionality: you can find the owner object, but you cannot know (from the point of view of the target) to which link it refers.
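
The limitation can be seen without Hibernate at all. Here is a minimal plain-Java sketch, assuming two hypothetical classes (Person and Address, not from any real mapping): the target keeps a single back-reference to its owner, which is the in-memory equivalent of the extra owner-id column mentioned above. The back-reference says who the owner is, but not through which property the target is linked.

```java
// Plain Java only; Person, Address and TwoLinksDemo are hypothetical names for illustration.
class Address {
    final String city;
    Person owner; // back-reference: tells WHO owns this address, not THROUGH WHICH property

    Address(String city) {
        this.city = city;
    }
}

class Person {
    Address home; // first link to Address
    Address work; // second link to the same class

    void setHome(Address a) {
        home = a;
        a.owner = this;
    }

    void setWork(Address a) {
        work = a;
        a.owner = this;
    }
}

class TwoLinksDemo {
    // The target alone cannot answer this; we have to inspect the owner's two links.
    static String linkName(Address a) {
        if (a.owner == null) return "none";
        if (a.owner.home == a) return "home";
        if (a.owner.work == a) return "work";
        return "unknown";
    }

    public static void main(String[] args) {
        Person p = new Person();
        Address paris = new Address("Paris");
        Address lyon = new Address("Lyon");
        p.setHome(paris);
        p.setWork(lyon);
        System.out.println(paris.city + " is linked as: " + linkName(paris));
        System.out.println(lyon.city + " is linked as: " + linkName(lyon));
    }
}
```

Note that linkName() has to go back to the owner and compare both links, which is exactly the query "from the owning side" that always works.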

SQL Types

  • If you need to store large Strings, you need a TEXT column definition (at least in MySQL). This can be achieved via the following annotation:
@Column(name = "message", columnDefinition = "TEXT")

I am not sure this is portable between various DBs though.

  • Note that the length attribute works as expected and generates a VARCHAR column of the given length. But there is a maximum size for VARCHAR columns, and don't forget that if you use a UTF-8 character set, MySQL needs up to 3 bytes per character, thus the maximum (65,535 bytes) is actually divided by 3, which leaves roughly 21,800 characters.
  • In MySQL, even the TEXT column is quite limited (65,535 bytes, so about 64K). You may have to use MEDIUMTEXT or LONGTEXT.
  • Do not use Float objects in Java as they are mapped to the FLOAT SQL type. The precision is then very low and will cause some problems. Use Double, or even better, most of the time you should use a BigDecimal in Java (it will be mapped to a DECIMAL SQL type in MySQL). BigDecimal allows you to do precise computations.
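
To illustrate the precision remark, here is a small plain-Java sketch (the class and method names are made up for this example). It shows that a float cannot even hold every integer above 2^24, while BigDecimal sums decimal fractions exactly.

```java
import java.math.BigDecimal;

// Hypothetical demo class; nothing here is Hibernate-specific.
class PrecisionDemo {
    // A float has a 24-bit mantissa, so 16777217 (2^24 + 1) is rounded to 16777216.
    static boolean floatDropsUnit() {
        float f = 16_777_217f;
        return f == 16_777_216f;
    }

    // Adds 0.1 ten times using exact decimal arithmetic.
    static BigDecimal sumDecimal() {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal step = new BigDecimal("0.1"); // String constructor keeps the exact decimal value
        for (int i = 0; i < 10; i++) {
            total = total.add(step);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("float loses the last unit: " + floatDropsUnit());
        System.out.println("BigDecimal sum: " + sumDecimal());
    }
}
```

Always build BigDecimal values from a String (or from integers); building them from a float or double literal would import the binary rounding error.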

Remarks

  • If you specify a one-to-many or many-to-many mapping, be sure to use the same mapping as the Java collection interface. A Set mapping won't work with a List collection, for example.
  • When using inheritance with a table-per-hierarchy strategy, the class name is saved as a string in a class column on the table. Be careful that the full class name is saved, with the package, so if later you change the package, you have to make some manual SQL modifications. It's thus better to carefully plan the name and package in advance.
  • You should map an enum property with the @Enumerated annotation. I am not sure this is totally required though - it seems to work fine without.
  • If you need to map a BigDecimal (or even Double) that may be equal to Infinity, the best thing to do in many cases is to assume a null value corresponds to infinity. This maps well to the DB; if you use a Double and the object is actually the Java constant for infinity, Hibernate will have trouble putting this in the DB and especially getting it back as a proper infinity object.
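
The null-means-infinity convention can be hidden behind a pair of convenience accessors. This is only a sketch with invented names (WeightLimit, max, effectiveMax); in a mapped class the convenience accessors would have to be excluded from persistence (for example with @Transient), which is an assumption about your mapping and is not shown here.

```java
// Hypothetical bean; only the "max" property is meant to be persisted.
class WeightLimit {
    private Double max; // null in the DB (and here) stands for "no limit", i.e. infinity

    public Double getMax() {
        return max;
    }

    public void setMax(Double max) {
        this.max = max;
    }

    // Convenience accessor for client code: never returns null.
    public double effectiveMax() {
        return max == null ? Double.POSITIVE_INFINITY : max;
    }

    // Convenience mutator: stores null instead of the infinity constant.
    public void setEffectiveMax(double value) {
        this.max = (value == Double.POSITIVE_INFINITY) ? null : Double.valueOf(value);
    }

    public static void main(String[] args) {
        WeightLimit limit = new WeightLimit();
        System.out.println(limit.effectiveMax()); // the limit is unset, so this is infinity
        limit.setEffectiveMax(12.5);
        System.out.println(limit.getMax());
    }
}
```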

Managing objects and their lifecycles

Using a session and transactions

  • When you close a session, it is *not* automatically flushed. You must either flush it manually or wrap your operations in a transaction (which is the recommended way).
  • It is recommended to wrap all database operations in Hibernate transactions. The Hibernate transaction API then wraps other transactions. If you run in a non-managed container (like Tomcat), JDBC transactions will be used. If you run in a full JEE server with Java JTA support, you should then use JTA transactions (and I think Hibernate transactions will wrap JTA ones anyway).
  • Note that using getCurrentSession() on a SessionFactory essentially means that you bind your session to a transaction. If it is a JTA transaction, nothing special is required (you open the JTA transaction, then just call getCurrentSession() whenever you need to use the session). If you don't use JTA (this means you use JDBC transactions), you must first call getCurrentSession() to obtain a session and then immediately create a transaction from that session (using Hibernate API). It's impossible to do anything with a session obtained via getCurrentSession() until you open a transaction.
  • If using JDBC, you must also specify a configuration option (following example in XML):
<property name="current_session_context_class">thread</property>

The jta value would bind the session to JTA transactions.

  • Implementing long sessions (sessions that span multiple transactions) is only possible if you manage the session yourself - thus you don't use getCurrentSession(). You must use sessionFactory.openSession().
  • Finally, note that you can have fully managed EJB transactions (no programmatic demarcation necessary; demarcation occurs via annotations). This is called CMT (container managed transactions) as opposed to BMT when you directly use (programmatically) JTA.

Designing a Java Hibernate object

  • Hibernate objects must be Java beans (i.e. proper setters and getters, and a no-argument constructor). Avoid putting code in the constructor, as this code will be executed each time an object is hydrated by Hibernate from the DB. In particular, avoid setting relationships and values in the constructor - this won't be an error, but will be totally useless as Hibernate will override them every time.
  • The correct way is to have a setup() function that you call every time you create a new Hibernate object in your client code. In fact, the constructor should remain the default one.
  • Note that Groovy objects are already Java beans so will work fine with Hibernate.
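
A minimal sketch of the setup() idea, using an invented Invoice bean (the class, its fields and the "NEW" default are assumptions made for illustration, and the mapping annotations are left out):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical persistent bean; mapping metadata omitted.
class Invoice {
    private Long id;
    private String status;
    private Set<String> tags;

    // Default constructor stays empty: Hibernate calls it on every hydration.
    public Invoice() {
    }

    // Called by client code only, right after creating a brand new object.
    public Invoice setup() {
        this.status = "NEW";
        this.tags = new HashSet<>();
        return this;
    }

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getStatus() { return status; }
    public void setStatus(String status) { this.status = status; }
    public Set<String> getTags() { return tags; }
    public void setTags(Set<String> tags) { this.tags = tags; }

    public static void main(String[] args) {
        Invoice fresh = new Invoice().setup();
        System.out.println(fresh.getStatus());
    }
}
```

Client code would then write new Invoice().setup(), while objects loaded from the DB go through the empty constructor and the setters only.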

Modifying an object's state

  • Using evict() on an object and then resaving (updating) the object via saveOrUpdate() seems to produce strange results. It's better not to use evict() if you plan to update the object afterwards.

Detached instances

  • To reattach a detached instance to the session, you can use either update(), which will schedule an SQL UPDATE statement when the session is flushed, or merge(). update() will only work if the session does not already contain the object, while merge() will always work regardless of the session state.
    • Be careful, as the object may in fact already be loaded via other objects into the session! Thus it's hard to be always sure that your object is not yet in the session.
  • If you don't want to issue an UPDATE statement, you can use lock(object, LockMode.NONE). In this case, you must be sure however that the detached instance has not been modified.

Locking

  • If you get StaleObjectStateExceptions in your code, this means that you have a concurrency problem. Typically, you access one object in a transaction and try to update it, while in another transaction the same object was already modified. In Grails, since there is a session per request model, this typically means that two HTTP requests were fired at the same time. Thus it is most probably a JavaScript error (the client sending the same request twice).
  • This can also happen if you have a detached instance that you try to reattach, but the SQL data row was deleted in the meantime.

Collections

  • Never try to associate a collection reference to more than one object. This will cause lots of errors. If you need to, you can copy all the objects from a collection to another one. This applies even if you deleted the object containing the collection reference first.
  • Don't call empty() on a Hibernate collection! The correct Java method name is "isEmpty", but the empty() method exists for the Hibernate implementation. It should not be used though. Usually this won't be a problem in Java, but it can be harder to spot in Groovy.
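
For the first point, copying means creating a new collection object that holds the same elements, instead of assigning the same collection reference to a second owner. A plain-Java sketch (the helper name copyOf is made up; with Hibernate the source would be the persistent collection of the first object):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, shown with plain java.util collections.
class CollectionCopyDemo {
    // Returns a NEW set containing the same elements; the two owners never share a reference.
    static <T> Set<T> copyOf(Set<T> source) {
        return new HashSet<>(source);
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> second = copyOf(first);
        second.add("c");
        System.out.println(first.isEmpty());   // use isEmpty(), never empty()
        System.out.println(first.size() + " / " + second.size());
    }
}
```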

Hibernate objects in an HTTP session

  • It is not recommended to store Hibernate managed objects in the HTTP session if you implement the "session per HTTP request" pattern. When you work with your objects later, they will be considered detached instances and this will cause all kinds of pain (lazy initialization errors for their collections, if the collection was not originally fetched).
  • Reattaching the instances is therefore mandatory; using get() is not a good option as all the fetching work will take place again (better then to store only the object id in the session). Calling update() or merge() is not appropriate either as they will also hit the DB. The only nice option is to call lock(object, LockMode.NONE). This won't schedule an INSERT or UPDATE against the DB.
  • Another big disadvantage is that objects stored in the HTTP session may change after a rolled back transaction. Even if the underlying object/row in the DB won't be touched, any operation on the HTTP session object cannot of course be rolled back, so you would have to clean the mess yourself.
  • The other option is to only store in the session "light objects", that are not Hibernate managed instances. This has some nice advantages and should be considered as a solution every time.
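
A "light object" can be as simple as an immutable, serializable holder for the id and the few values the pages need. The sketch below uses an invented UserSummary class; which fields to keep is of course application-specific.

```java
import java.io.Serializable;

// Hypothetical light object: not mapped, not managed by Hibernate, safe to keep in the HTTP session.
final class UserSummary implements Serializable {
    private static final long serialVersionUID = 1L;

    private final Long id;            // enough to reload the real entity when needed
    private final String displayName; // copied value, no lazy association behind it

    UserSummary(Long id, String displayName) {
        this.id = id;
        this.displayName = displayName;
    }

    Long getId() {
        return id;
    }

    String getDisplayName() {
        return displayName;
    }

    public static void main(String[] args) {
        UserSummary summary = new UserSummary(42L, "alice");
        System.out.println(summary.getId() + " " + summary.getDisplayName());
    }
}
```

Since the object is immutable and holds no association, there is nothing to reattach and nothing that a rolled back transaction could leave in an inconsistent state.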

Casting an object to one of its subclasses

  • To understand why this is not possible directly in Hibernate, consider the fact that this is not possible in the underlying layer (the JVM). In Java you cannot change dynamically the class of an object at runtime. The same holds for Hibernate: there is no easy way to change the class of a persistent object (even to one of its subclasses).
  • So, in order to actually achieve this, proceed as you would with a Java object: create a new object and copy the properties from the previous object to this new one. In Hibernate you also need to delete the old object (generally flushing the session before saving the new one, because of unique constraints). Wrap this in a transaction, as you want to roll back and avoid any change to the DB should an exception occur.
  • This kind of situation should be avoided (or rarely used). Once set, an object should not change its class. If need be, redesign your data model.
  • Note that such a cast can be however accomplished easily using HQL / SQL... If you use a table per hierarchy mapping strategy, you can easily change the type of the object by changing the column keeping the class name. Other difficulties may arise though, you may need to update other fields as well.
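
The copy step itself is plain Java. Here is a sketch with two invented classes (CatalogItem and its subclass DownloadableItem); the Hibernate part (delete the old object, flush, save the new one, all inside a transaction) is deliberately left out. Not copying the id is an assumption of this sketch: the new object is expected to get its own identifier when saved.

```java
import java.math.BigDecimal;

// Hypothetical parent class.
class CatalogItem {
    private Long id;
    private String name;
    private BigDecimal price;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public BigDecimal getPrice() { return price; }
    public void setPrice(BigDecimal price) { this.price = price; }
}

// Hypothetical subclass the object should be "cast" to.
class DownloadableItem extends CatalogItem {
    private String downloadUrl;

    public String getDownloadUrl() { return downloadUrl; }
    public void setDownloadUrl(String downloadUrl) { this.downloadUrl = downloadUrl; }

    // Builds a new subclass instance from an existing parent-class object.
    static DownloadableItem copyFrom(CatalogItem source, String downloadUrl) {
        DownloadableItem target = new DownloadableItem();
        target.setName(source.getName());
        target.setPrice(source.getPrice());
        target.setDownloadUrl(downloadUrl);
        // id intentionally not copied (assumption: the new row gets a fresh identifier)
        return target;
    }

    public static void main(String[] args) {
        CatalogItem old = new CatalogItem();
        old.setName("Album");
        old.setPrice(new BigDecimal("9.99"));
        DownloadableItem converted = copyFrom(old, "http://example.invalid/album");
        System.out.println(converted.getName() + " " + converted.getPrice());
    }
}
```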

Older remarks

  • If you have an object and a many-to-one mapping on this object (such as Item that belongs to a Category), and you happen to have the ID of the Category but not the associated Java object, the fastest way to store the item in the database is to create a new virtual Category and call setID(known_id) on this newly created object. Then, once you save the object, the virtual Category will not be saved (updated) as it already exists. For this to happen (i.e. for the object not to be overwritten), you must use the ID of the object in the equals() method. This is not recommended. Note that by default Hibernate distinguishes between objects by using their addresses in memory.
  • Update: The previous remark would work, but is not the recommended way. The correct way is to use session.load(Category.class, category_id) and use a proxy. This means the object won't actually be fetched from the DB, but you'll be able to use it as a Java object. This is nice. By default, a mapping uses a proxy.

Cascade

Concept and Notes

  • The concept of cascade implies somehow that one object "belongs" to another one. When saving, updating, or deleting the parent object, the child will also be saved, updated or deleted.
  • Apparently, it is not possible to cascade in the other direction, that is, from the child to the parent. Saving a child when the parent is not yet saved won't persist the parent.
  • If you use cascading deletes in a circular way (e.g. A has a link to B, which links to C, which links back to A), you may get into problems. For example, if you want to delete A, hoping that all Bs and Cs will get deleted too, you will run into an exception. By the time C is deleted, A will already be transient, and Hibernate will complain. You thus have to delete C manually before deleting A in this case. Another possibility would be to use, for example, the Grails event system, with a beforeDelete event on B to first remove the Cs.
  • The "merge" cascade option seems to not always work as expected. Normally merge() returns a new instance, so I would have expected that the associated (referenced) objects merge() cascades to would also be new instances. However it does not seem to be always the case (bug maybe); i.e. the referenced object from the new instance is the same object (in memory) as the one referenced from the initial instance merge() was called upon.

Cascade Options

  • Here is the list of all possible cascade options:
    • none: no automatic action on the referenced object takes place. This is the default if no cascade behaviour is set.
    • persist: Cascade any persist() operation across this relationship. Note that there is an error in the reference manual where this is called create.
    • merge: Cascade any merge() operation across this relationship.
    • lock: Cascade any lock() operation across this relationship.
    • evict: Cascade any evict() operation across this relationship.
    • replicate: Cascade any replicate() operation across this relationship.
    • refresh: Cascade any refresh() operation across this relationship.
    • save-update: If save(), update() or saveOrUpdate() is called on the referencing object, automatically call saveOrUpdate() on all referenced objects.
    • delete: automatically delete the referenced object(s) when delete() is called on the referencing object. Note that, if the referencing object is not deleted but merely removes its reference to the referenced object, then this option will not do anything and, potentially, a garbage (or orphan) object will be left in the database.
    • delete-orphan: automatically delete any object that has been removed from a collection. This option is only available for collections, not associations, and implies that there are no shared references to the object removed from the collection. This effect can happen only when removing an object from a collection.
    • all: cascade all operations, but do not take the action of delete-orphan.
    • all-delete-orphan: cascade all operations, and take the action of delete-orphan as well.

Querying

Criteria

  • Criteria offer a very object oriented querying API. In particular, it can easily query collections (eg, test if an element of a collection matches certain conditions). The Grails criteria builder is also very convenient.
  • Note that when using a nested criteria (to define a constraint on a collection), you cannot use simple object equality. You cannot simply express: find all the instances of this class that have this element in their collection. You must use idEq, which uses the object id.

HQL

  • HQL is similar to SQL but much more adapted to Hibernate. HQL can do almost anything: the need to revert to existing SQL should be extremely, extremely rare. When you cannot use Criteria (complex query where you may need to use operators such as coalesce or such), use HQL.
  • Here is an example of an inner join, with a select clause (select fetches a property of the initial domain class, via the join):
"select decorations from LayoutElement as element inner join 
element.decorations as decorations where decorations.shopFront = ? and element = ?"
  • You can access Maps or other collections values by using the [] notation in HQL, but this works only in the where clause.
"from Decoration decoration where decoration.myMap['hello'] = 30"
  • You can refer to the index (or key) of a collection by using the index() HQL function:
"from Decoration decoration where index(decoration.myMap) = 20"
  • Unfortunately, bug #1930 apparently prevents the use of index() in a with clause on a join.

Subqueries

  • count() cannot be used directly on a subquery. Instead you need code such as:
"from com.shoopz.shopengine.ReferenceItem as reference where (select count(*) from ShopItem as item where item.reference = reference) > 1"

Query API

  • If you want to bind a parameter corresponding to an Enum class, use setParameter(), not setEntity().

Performance Tuning

  • In Hibernate, lazy loading is on by default for collections. This means the collections are fetched from the database only when they are used (never if they are not used). Warning: if the loading is triggered after the Hibernate session has been closed, it will fail.

Connection Pooling

  • Hibernate is intended to be used with a third party connection pool manager. A connection pool manager manages JDBC connections to the database. Hibernate's built-in pool manager is not very performant (according to the documentation).
  • Open source connection managers include commons-dbcp (Apache), C3P0 (apparently better with very busy sites) and Proxool. If used with Spring, you can configure data source beans that will be later referenced by the session factory.

Referential Integrity

Database Requirements

  • The InnoDB engine supports referential integrity (database consistency) through foreign key constraints. MyISAM does not support this feature. When a FK constraint is in place, the database will prevent you from deleting some rows (or will prevent wrong UPDATEs etc). Normally you can tell Hibernate which FK constraint to use.
  • Note that if you convert your DB from MyISAM to InnoDB Hibernate won't recreate all the foreign key constraints automatically. You need to update them manually (or to drop all the tables and let Hibernate recreate them). I am not entirely sure of this; it seems in some cases the constraints are recreated.

Creation of database structure by Hibernate

  • If used with an underlying DB engine that supports this, Hibernate creates some foreign key constraints when started up. It appears the names of these constraints are based on the fully-qualified name of the mapped class. This means that if the class's package changes, or its name changes (without a remapping to another table), the foreign keys will be recreated automatically by Hibernate, and the previous ones (based on the old name) will be left around. To keep the DB clean you should remove the old constraints.

Transactions

  • Real database transactions are only possible if the underlying DB supports them. This means that with MySQL MyISAM tables, transactions are very limited. In a Hibernate context, nothing is written to the DB until the session is flushed. When it is, however, everything is written and cannot be rolled back later.
  • Bottom-line: use InnoDB if you need transactions (and transactions are almost mandatory when using an ORM I would say).

Hibernate & EJB3 Annotations

  • The EJB specification indicates whether the property should be accessed by field or by getter method. This depends on where you put the annotation. You should put it before the field if you want field access, and before the getter method to get property access.
  • The best documentation for Hibernate annotations is the reference Hibernate documentation, but also the javax.persistence API which documents all possible official EJB annotations.
  • Here is an example of a (Groovy) annotated class:
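import javax.persistence.*

@Entity
@Table(name="shop_category")
class ShopCategory
{
	@Id
	Long id

	// Owning side of the parent/child link (holds the foreign key)
	@ManyToOne
	ShopCategory parentCategory

	String name

	// Inverse side: ShopItem.category owns the association
	@OneToMany(targetEntity = ShopItem.class, mappedBy="category")
	Set items

	// Inverse side of parentCategory; all operations cascade to the sub-categories
	@OneToMany(cascade = [CascadeType.ALL], targetEntity = ShopCategory.class, mappedBy="parentCategory")
	Set subCategories
}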

Bidirectional One-to-one

  • If the association is mandatory, use optional = false on the @OneToOne annotation. This will make the foreign key column not null (no need to add nullable=false on the @JoinColumn annotation). It will also set a UNIQUE constraint on the foreign key column (this is strange, I think it should always be set, not only for mandatory associations - you can add it via unique=true if the association is optional).
@OneToOne(cascade = [CascadeType.ALL], optional = false)
@JoinColumn(name = "passport_id")
PassportAnnotated passport

Warnings

  • "keys" is a reserved SQL keyword. Thus no variable of a persistent class (or embedded component) should be named "keys" (or at the very least, you need to use the @Column annotation to specify a column name).