March 12, 2010
In recent performance tuning of some EJBQL search queries, I’ve had a lot of discussions with other developers on database pagination. There are some definite nuances that you have to be aware of when using Hibernate’s pagination feature, so I thought I would explain them here.
Quick Introduction to Database Pagination with JPA
Database pagination allows you to step through a result set in manageable chunks (say 5 at a time). This is an important feature when a result set is large. Imagine if the user selects the first result of 1000. Essentially 999 out of 1000 results were wasted. This is wasteful in terms of CPU cycles on the database server, network usage, CPU cycles on the application server, and memory allocation. On the other hand, if we only loaded 10 results into memory, we’ve only wasted 9 results. As the result set grows, this problem becomes more important to address.
Database pagination with JPA is quite simple through the javax.persistence.Query. The following method invocations retrieve the first 10 results for the query:
javax.persistence.Query query =
em.createQuery("select order from Order as order "
+ "left join order.customer as customer "
+ "where customer.name like '%' || :name || '%'");
query.setParameter("name", name);
query.setFirstResult(0);
query.setMaxResults(10);
// returns 10 or fewer orders
List<Order> orders = query.getResultList();
The max number of results to retrieve at one time can be any number you choose. As the user pages through the data, we alter the setFirstResult(int) to retrieve the next set of results.
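As a rough sketch of the paging arithmetic described above, the offset for a zero-based page index is the index multiplied by the page size. The PageWindow class and its method names are hypothetical helpers for illustration, not part of JPA or Hibernate:

```java
// Hypothetical helper (not part of JPA): converts a zero-based page index
// into the values passed to Query.setFirstResult(int) and Query.setMaxResults(int).
final class PageWindow {
    final int firstResult; // row offset for setFirstResult
    final int maxResults;  // page size for setMaxResults

    PageWindow(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        this.firstResult = Math.multiplyExact(pageIndex, pageSize);
        this.maxResults = pageSize;
    }

    // Number of pages needed to show totalRows rows (rounds up).
    static int pageCount(long totalRows, int pageSize) {
        if (totalRows < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and pageSize > 0");
        }
        return (int) ((totalRows + pageSize - 1) / pageSize);
    }
}
```

With this helper, the third page of 10 results (page index 2) would translate to query.setFirstResult(20) and query.setMaxResults(10).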
Query Tuning with Fetch Joins
When paging through a result-set, you may be interested in performing fetch joins to enhance query performance. This avoids the N+1 select problem when walking lazy relationships for displaying data. For example, let’s say we are working with an order management system. This order management system allows users to search for orders that have been placed by customers. Our domain would look something like the following, where an Order has one Customer and a Customer can be associated to many Orders.

This relationship could be described in the Order entity as:
@Entity
public class Order
{
// ... ...
@ManyToOne
private Customer customer;
// ... ...
}
In the search results, the users want to see both Order and Customer information on the page. Lazily loading the Customer results in a query being executed to retrieve the Customer for each Order displayed. To avoid this, we can perform a fetch join on the Customer when retrieving the Order results. Here is the resulting EJBQL:
select order from Order as order
left join fetch order.customer as customer
where customer.name like '%' || :name || '%'
This ensures that only a single query is executed to load both the Order and the Customer results. An example of what the SQL result set might look like in this case would be:
| order_id | cust_id | cust_name |
----------------------------------------
| 1 | 1 | Jacob Orshalick |
| 2 | 2 | Nirav Assar |
| 3 | 3 | John Doe |
As you can see, each Order is associated with a single Customer, which ensures a unique result set. In this case we are guaranteed that limiting the result set to 5 will always yield 5 or fewer unique Order results. This is generally the right solution for a @OneToOne or a @ManyToOne relationship.
Fetching One-to-many or Many-to-many Relationships
Fetching one-to-many or many-to-many relationships gets a bit tricky. The moment you introduce a fetch join for a one-to-many or many-to-many relationship, Hibernate will load all results into memory and then only return you the max number of results you requested. This is due to the semantics of SQL queries.
Going back to our example, we will likely have a list of LineItem entries for each order that tell us what Products the Customer purchased on the Order.

And the Order entity would now look like:
@Entity
public class Order
{
// ... ...
@ManyToOne
private Customer customer;
@OneToMany
private List<LineItem> lineItems;
// Getters and Setters
}
The users request that we display the LineItem entries below each Order in the search results. So we can just do another fetch join and load this data as well, right? Here is the resulting EJBQL:
select distinct order from Order as order
left join fetch order.customer as customer
left join fetch order.lineItems
where customer.name like '%' || :name || '%'
Once you introduce this additional fetch into the query, Hibernate will present the following message in the log:
[org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults
specified with collection fetch; applying in memory!
This message is telling you that Hibernate is retrieving all results from the database, and then only returning the first 10 results (or the number of max results you specified). So why does Hibernate do this? Let’s have a look at an example of what the SQL result set generated from this query might look like.
| order_id | cust_id | cust_name | line_id | product_sku |
----------------------------------------------------------------
| 1 | 1 | Jacob Orshalick | 1 | 1403-1209 |
| 1 | 1 | Jacob Orshalick | 2 | 1405-1333 |
| 2 | 2 | Nirav Assar | 3 | 1300-1222 |
| 3 | 3 | John Doe | 4 | 1400-3029 |
| 3 | 3 | John Doe | 5 | 1401-1000 |
| 3 | 3 | John Doe | 6 | 1200-1000 |
Each database has its own SQL syntax for limiting the result set, but assuming we limit the result set to 5 results on the database side, we would only get the first 5 rows. As you can see, the result set duplicates the Order and Customer information for each LineItem on the Order. Thanks to the way Hibernate processes these results, we would still see the 3 expected orders (order_id = 1, 2, 3), but the database would only return 2 of the 3 LineItem entries for John Doe’s order. This is an incorrect result from the user’s point of view.
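The truncation can be reproduced without a database. The following is only an illustrative simulation, assuming the six joined rows from the table above; the JoinedRowLimitDemo class is hypothetical and does not reflect how Hibernate hydrates results internally:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class JoinedRowLimitDemo {
    // (order_id, line_id) pairs taken from the joined result set shown above.
    static final int[][] JOINED_ROWS = {
        {1, 1}, {1, 2}, {2, 3}, {3, 4}, {3, 5}, {3, 6}
    };

    // Applies a row limit the way a database would (to joined rows, not to orders),
    // then groups the surviving rows into order_id -> list of line_ids.
    static Map<Integer, List<Integer>> lineItemsByOrder(int[][] rows, int rowLimit) {
        Map<Integer, List<Integer>> byOrder = new LinkedHashMap<>();
        int kept = Math.min(rowLimit, rows.length);
        for (int i = 0; i < kept; i++) {
            byOrder.computeIfAbsent(rows[i][0], id -> new ArrayList<>()).add(rows[i][1]);
        }
        return byOrder;
    }
}
```

Calling lineItemsByOrder(JOINED_ROWS, 5) still yields three orders, but order 3 ends up with line items 4 and 5 only; line item 6 is cut off by the row limit.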
Knowing this, Hibernate rightfully retrieves all results in this case and then returns you the 3 Order results with all associated LineItem entries. But, to ensure correctness, you lose the value of pagination. So will we always face the N+1 select problem when using pagination with @OneToMany or @ManyToMany relationships? Not if you consider other options from a user experience perspective.
Other Options for one-to-many Relationships
There are a number of ways to enhance performance without losing the advantages of database pagination.
Display LineItem Entries only when Requested
Technology combinations like RichFaces and Seam make this simple. Basically, you can walk the lazy relationship only when the user requests this information through an AJAX request. Through use of a <rich:togglePanel>, a link can be provided to expand the Order data for the user. Because Seam allows an EntityManager to span requests, lazily loading this data is simple.
Another simple option is using REST and JSON to retrieve the LineItem entries through an AJAX request when accessed by the user. A simple RESTful invocation (http://my-server/order/1/lineItems) allows the LineItem entries to be retrieved for an Order and we can then parse the results and display them back to the user. RESTEasy makes this simple for any Java application.
Display the LineItem Entries on a Details Page
This is the easiest and most obvious solution. Just display high-level Order information on the search results, and the user can then access a details page for the rest. This is the solution I generally push users toward for simplicity.
Display High-level LineItem Summary Information
Another option is to give high-level information (e.g. number of LineItem entries) on the search page, and then display all information on a detail page. With the flexibility of EJBQL, you can use aggregate functions (e.g. count(lineItem.id) ) with a group-by clause to avoid the issues with a one-to-many. But, this also generally requires introduction of DTOs to hold the query result data or additional parsing of the result set.
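As a minimal sketch of the "additional parsing of the result set" mentioned above: a multi-select query returns each row as an Object[] from getResultList(), which can be copied into a small DTO. The OrderSummary class and the query in the comment are assumptions made for illustration, not code from the application described here:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical DTO holding one row of an aggregate search query such as:
//   select o.id, count(li.id) from Order o left join o.lineItems li group by o.id
// (the query text is an assumption made for this sketch).
final class OrderSummary {
    final long orderId;
    final long lineItemCount;

    OrderSummary(long orderId, long lineItemCount) {
        this.orderId = orderId;
        this.lineItemCount = lineItemCount;
    }

    // Each row is assumed to have the shape {order id, line item count}.
    // Values are read as Number so that Long or Integer columns both work.
    static List<OrderSummary> fromRows(List<Object[]> rows) {
        List<OrderSummary> summaries = new ArrayList<>(rows.size());
        for (Object[] row : rows) {
            long orderId = ((Number) row[0]).longValue();
            long lineItemCount = ((Number) row[1]).longValue();
            summaries.add(new OrderSummary(orderId, lineItemCount));
        }
        return summaries;
    }
}
```

Because the aggregate query returns one row per Order, setFirstResult and setMaxResults can be applied on the database side without the in-memory fallback.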
Performance Tuning Always has Trade-offs
As I always say when discussing performance tuning, there are always trade-offs. Whether it’s additional complexity or changes to user experience, we always have to consider the implications of tuning our applications.
Posted in Hibernate, JPA, Java
August 4, 2009
It seems common for developers to look for faults in established technologies when an issue is encountered. As a developer struggling to learn an unfamiliar technology, it is easy to make mistakes. These mistakes lead to frustration which, in turn, can lead the developer to blame the technology rather than taking the time to analyze the problem and identify the real issue.
Anyone can fall into this trap and I still feel myself being pulled by this temptation at times. When you are working with an established technology, it’s always necessary to realize that the problem is almost undoubtedly in your code, not the technology you are using. As we learn the technology and continue to increase our depth of knowledge, this temptation lessens with each problem encountered. Let’s take driving as an example.
When I was learning to drive, I was a poor driver (most teenagers are). To the frustration of my parents, I had several accidents, fortunately all were minor. After having an accident I was frustrated with myself for the mistakes I made while operating the vehicle. I didn’t have the gall to blame the vehicle, claiming faulty brakes or the steering went out… Sure, these were possibilities, but unlikely. It was much more likely that my inexperience in operating the vehicle led me to make poor driving decisions.
With this mindset I continued to drive and improve my skills through experience behind the wheel. Over time I built a kind of intuition while driving that helped me avoid those mistakes. At some point I may realize that a different vehicle is easier to drive, more suited to accident avoidance, safer, etc., but I can only come to that realization as an experienced driver.
So by taking this same stance when learning an unfamiliar technology, we can become better developers. Struggling through the mistakes and the frustration while gaining experience helps build the intuition of a top-notch developer. So, the next time you find yourself blaming the unfamiliar technology you are working with out of frustration, stop and think about it. Are you really frustrated by your inexperience or is the technology really to blame?
So let’s review some techniques that can help us when attempting to solve problems encountered with an unfamiliar technology. I also covered several of the Agile practices mentioned here in my previous article, 5 steps to improving your development process.
Read the Manual
I can’t tell you how many times I have solved a problem by simply reading the portion of the manual describing the API I am attempting to use. This may seem like common sense, but unfortunately time constraints often seem to get the best of developers leading them to copy-and-paste Programming by Coincidence.
Isolate the Problem
While this may be an obvious approach to problem-solving, it may be easier said than done depending on your approach to development. TDD (Test-driven development) is a very effective technique for isolating the problem. If you know what you have already done is working as designed, you can be reasonably sure that the issue is isolated to the code just added. Now try to write a test that proves the issue with the newly added code, which can drive you to the root cause.
Commit Regularly
Committing regularly to your VCS (Version Control System) is a necessity to obtain the benefits of continuous integration. This technique is also effective for allowing you to revert to a working revision if you find yourself too far down the rabbit hole. I often use this approach to test various hypotheses for solving the problem. If a hypothesis proves incorrect, simply roll back and try the next one.
Logging is your Friend
I read an article recently on DZone by Bharath Ganesh that reviewed this basic principle. Look at the logs! While stack traces don’t always provide the most obvious indication of the issue, they at least give you something to search for. For an established technology, simply Google’ing the exception message will often provide at least an inkling of what the issue is related to.
Maintain a Positive Attitude
Sometimes this is easier said than done, but it is important to remain determined to solve the problem. This may require taking a break, discussing the problem with a co-worker, sleeping on it, etc.
What techniques do you recommend for problem solving with an unfamiliar technology?
Posted in Agile
April 6, 2009
I keep reading discussions regarding the performance of Seam applications. These discussions are generally centered around the performance overhead of the interception techniques used by Seam. While this is definitely a valid issue in certain scenarios, see this excellent forum discussion started by Tobias Hill, many tend to blame Seam too quickly for their performance issues. If it is taking many seconds or even minutes to load a page, in most cases your application is more likely to blame than Seam.
In my experience, most performance issues stem from data access. Improperly tuned queries (a common culprit) and not using the second-level cache of your ORM provider when appropriate can lead to some serious performance implications in your application. While second-level caching is nothing new, here I will describe why it is important to a Seam application and how you can improve performance using Hibernate’s second-level cache provider.
Before I go any further, note that second-level caching is not the only caching solution you have available if you are using Seam. Seam provides a multi-layer caching solution that allows you to cache page fragments and objects easily while abstracting away the details. You can read all about Seam’s multi-layer caching solution in Chapter 34 of Seam Framework: Experience the Evolution of Java EE.
Loading Reference Data
Seam provides an elegant solution to the common problem of associating entities based on a dropdown selection. Take the common booking example with Seam. We are attempting to book a Hotel and we need to input credit card information. The type of credit card is likely to be a dropdown, but that dropdown is going to need to associate to a CreditCardType entity.
@Entity
public class CreditCardType implements Serializable
{
@Id
private Long providerId;
private String description;
// ... ...
}
Our Booking class then needs a reference to the CreditCardType class.
@Entity
public class Booking implements Serializable
{
@Id
private Long id;
// ... ...
@ManyToOne
private CreditCardType creditCard;
// ... ...
}
To make this task simple, Seam provides the <s:convertEntity /> component which ensures that the user selection is converted to an entity for association with your object.
<h:selectOneMenu id="creditCard" value="#{booking.creditCard}"
required="true">
<s:selectItems noSelectionLabel="" var="type"
value="#{creditCardTypes}"
label="#{type.description}" />
<s:convertEntity />
</h:selectOneMenu>
As you can see this is quite simple, but we need to load the creditCardTypes into the conversation context in order to associate an instance to our entity. This is because the creditCardTypes need to be managed instances in the conversation-scoped persistence context. It is quite simple to accomplish this through a @Factory method scoped to the conversation.
@Name("bookingAction")
@Scope(CONVERSATION)
public class BookingAction implements Serializable {
// ... ...
@In private EntityManager entityManager;
@Factory("creditCardTypes")
public List<CreditCardType> loadCreditCardTypes()
{
return entityManager.createQuery("select c from " +
"CreditCardType as c order by c.description").getResultList();
}
// ... ...
}
Great, so now we can load our entities into the context and associate them using a dropdown, so what’s the catch? The factory method only executes once, right? The problem is that the query that loads the CreditCardType instances into the conversation context executes every time a new conversation requests the dropdown list. This can cause the initial page load to lag.
This may not be a problem in this simple case as we only have this one dropdown, but what if we have many dropdowns on the screen? Even further, what if this dropdown list is used by several conversations? Doesn’t it seem wasteful to hit the database every time we need it? We can avoid the database hit and still achieve the same benefits by using second-level caching.
Second-level caching with Hibernate
Second-level caching is intended for data that is read-mostly. It allows you to store the entity and query data in-memory so that this data can be retrieved without the overhead of returning to the database. You can configure the cache expiration policy, which determines when the data will be refreshed in the cache (e.g. 1 hour, 2 hours, 1 day, etc.) according to the requirements for that entity. An entity like CreditCardType is certainly read-mostly so it is definitely a good candidate for the second-level cache.
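To make the expiration policy concrete, here is a conceptual sketch of a read-through cache with a time-to-live. It is emphatically not Hibernate's implementation or API; the ExpiringCache class, its constructor arguments, and the injected clock are all hypothetical names used only to illustrate the idea:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Conceptual sketch only: NOT Hibernate's second-level cache implementation.
final class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long timeToLiveMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised in tests

    ExpiringCache(long timeToLiveMillis, LongSupplier clock) {
        this.timeToLiveMillis = timeToLiveMillis;
        this.clock = clock;
    }

    // Returns the cached value, calling loader (e.g. a database query) only when
    // the entry is missing or at least timeToLiveMillis old. Two threads racing
    // on an expired key may both invoke the loader; that is acceptable for
    // read-mostly reference data in this sketch.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAtMillis >= timeToLiveMillis) {
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

In production code the clock would typically be System::currentTimeMillis; a real cache provider such as Ehcache additionally handles eviction, memory limits, and concurrency strategies that this sketch ignores.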
Using Hibernate, it is quite simple to cache an entity by using the @Cache annotation.
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_ONLY)
public class CreditCardType implements Serializable {
// ... ...
}
We then need to include the jars necessary for a second-level cache provider. I tend to use Ehcache as I find it simple to use and it is fully supported by Seam’s multi-layered caching solution.
Once you include the appropriate jars, you must configure Hibernate to use second-level caching. In your persistence.xml file, add the following properties for your persistence-unit definition.
<persistence-unit name="myBookingDS">
... ...
<properties>
<property name="hibernate.cache.provider_class"
value="org.hibernate.cache.EhCacheProvider" />
<property name="hibernate.cache.use_second_level_cache"
value="true" />
<property name="hibernate.cache.use_query_cache"
value="true" />
... ...
</properties>
</persistence-unit>
The hibernate.cache.provider_class should be specific to the cache provider you are using. Hibernate supports a number of implementations as described in the reference documentation.
Notice that we also set hibernate.cache.use_query_cache to true. This allows us to take the caching a step further by caching the query itself and not just the entities. In order to cache the query, we can take two approaches: use the Hibernate Session API or the Hibernate @NamedQuery annotation. Let’s look at the Hibernate Session API approach first. Our factory method above changes to the following:
@Name("bookingAction")
@Scope(CONVERSATION)
public class BookingAction implements Serializable {
// ... ...
@In private EntityManager entityManager;
@Factory("creditCardTypes")
public List<CreditCardType> loadCreditCardTypes()
{
Session session = (Session) entityManager.getDelegate();
Query query = session.createQuery("select c from " +
"CreditCardType as c order by c.description");
query.setCacheable(true);
return query.list();
}
// ... ...
}
Now you will notice in the logs that once the creditCardTypes have been loaded, even a new conversation does not cause a database call the next time these entities are requested. The query and the entities are loaded directly from the second-level cache in-memory.
The other approach is to use the Hibernate @NamedQuery annotation which gives the option to cache your query.
@Entity
@NamedQuery(name="getCreditCardTypes",
query="select c from CreditCardType as c " +
"order by c.description",
cacheable=true)
public class CreditCardType implements Serializable
{
@Id
private Long id;
private String description;
// ... ...
}
The @NamedQuery can then be retrieved through the createNamedQuery() method in the EntityManager API.
While we are only showing one scenario here, there are many cases where second-level caching can be applied in your application.
No silver bullet
By no means am I claiming here that second-level caching is the solution for every scenario. Performance tuning is somewhat of an art. It is definitely handy to know the various potential hot spots when tuning an application, but a solution that works in one case may not work in others. Simply read up on the various approaches and techniques to tune your application so that you can apply each technique when the time is right.
Posted in JBoss Seam
March 31, 2009
Michael Yuan and I will be answering questions about the book and Seam in general at JavaRanch this week in the JBoss forum. If you would like to ask us a question feel free to stop by! They will be selecting four random posters in the forum to win a free copy of the book provided by Prentice Hall. We look forward to a good week of questions and hope to see you there!
Posted in JBoss Seam
February 23, 2009
As a follow-up to the Core Seam Refcard, DZone has now released my companion reference for using Seam with JSF. The Seam UI Refcard has now been released through the DZone Refcardz site and includes:
- Simplifying JSF
- Page Navigation
- JSF Component Annotations
- JSF Component Tags
- Hot Tips and more…
So download the Seam UI Refcard here and please send your comments and feedback to refcardz@dzone.com. For in-depth coverage of Seam 2.1, you can also purchase the just released Seam Framework: Experience the Evolution of Java EE.
In a related story, JavaLobby posted an interview with me to coincide with the release of the reference card. Check it out!
Posted in JBoss Seam, JBoss Seam 2E News
February 19, 2009
I just got my hands on a hard copy of my book, Seam Framework: Experience the Evolution of Java EE. For those who have been awaiting the paperback release, you can order your copy today from Amazon or Barnes and Noble. The book is the second edition of the best-selling JBoss Seam covering the latest and greatest features of Seam 2.1 and Web Beans (JSR-299). Check it out!
Posted in JBoss Seam, JBoss Seam 2E News
February 4, 2009
Improving your development process takes time and effort, but always pays off in code quality and professionalism. When consulting for an organization, I always recommend the following 5 steps that I feel are essential to any development process. You may already be using these recommendations in practice, as they are all well-known techniques and tools, so I certainly applaud anyone who can check off each step. Note that each step requires learning, practice, and discipline, so if you are introducing something new, take it a step at a time. Process improvement doesn’t happen overnight.
1. Write tests for your code, then write more tests

I won’t get into the religious battles generally surrounding Test Driven Development (TDD), but regardless of what you believe you should be writing tests for your code. Professional software developers write tests and run the test suite after making logic changes. As Robert C. Martin put it in Clean Code, “Code is never clean unless it has tests.” Enough said.
What it requires:
- A unit testing tool such as JUnit or TestNG. I also recommend a code coverage analysis tool for measuring how well you’re doing with your testing such as Cobertura or Emma.
- Learning to write testable code (e.g. small methods that generally have a single responsibility).
2. Don’t let your code changes become stale, commit regularly

Don’t hang onto your changes locally for an extended period of time. For continuous integration to flush out issues early, you must integrate your code with other developers often. If you follow the technique of continuous commits, it will reinforce itself on other members of your development team over time.
Merging is a pain to say the least. In a continuous commit environment, the longer you hold onto changes, the more likely you will have to merge. Other developers will soon realize that if they do not follow this approach, you are going to make their life miserable.
What it requires:
- Breaking your work into smaller chunks that can be completed in reasonable time periods.
- Developing the habit of committing as you finish a logical chunk of code.
3. Set up a Continuous Integration (CI) server

Continuous integration (CI) is nothing new, but I’m always surprised at how many environments I encounter that are still not using this vital development technique. Setting up a CI server is easier than ever thanks to some outstanding open-source projects.
What it requires:
- A build that is autonomous, i.e. doesn’t rely on external input such as compiling the source files in Eclipse, to achieve the goal of building the project artifact(s).
- A CI server implementation and a build machine to install the server on. Popular implementation choices include Hudson and CruiseControl (the original).
4. Automate scheduled releases to a production environment clone

Aka: Release Early, Release Often. The CI server can deploy releases to a production environment clone based on the current stable code in the VCS. If possible, this deployment should be fully automated and should generally occur at a specific time.
To avoid downtime for user testing it is not recommended that this occur on each commit. As the size of the team grows this can become quite a hindrance with continuous commits. Deploying a release somewhere between nightly and weekly at a time when usage is low is generally reasonable.
What it requires:
- A way to deploy the application from the CI server to the production environment clone. With the hot deployment features of JBoss, I generally use SFTP and Hudson provides a great plugin for this.
5. Add a project artifact repository

This is probably the most controversial of the steps outlined here. The concept of an artifact repository is probably familiar to most people given the broadened use of Maven, but may still be unfamiliar to some. Maven repositories allow you to store project artifacts including POMs, jars, wars, ears, etc. in a structured format that incorporates naming conventions and versioning. This allows you to archive versions of your software products over time while automating dependency management as dependencies and versions change.
What it requires:
- A server to install the artifact repository with the disk space necessary to store archives over time. Popular server choices include Artifactory and Nexus.
- Integration of dependency management into your build. If you are using Ant, the Ant Tasks for Maven are useful or you can use Maven itself. Ivy is also a popular choice.
So what would you recommend? What processes and tools do you feel are essential to all development teams?
Posted in Agile
November 30, 2008
All early access chapters for Seam Framework: Experience the Evolution of Java EE have now been released through Safari Rough Cuts. The only book specifically covering Seam 2.1 now includes new chapters on Seam 2.1 security, an introduction to Web Beans (JSR-299), multi-layer caching, using Maven with Seam, and much more! You can download the source code or learn more here.
These chapters are still undergoing copy-editing, so if you would like to pre-order the final print edition, you can order from Amazon today. The final edition is scheduled for print release in February. Check it out!
Posted in JBoss Seam, JBoss Seam 2E News
November 24, 2008
Have you been searching for a quick reference for setting up and configuring your Seam applications? Well search no more! The Core Seam Refcard has now been released through the DZone Refcardz site and includes:
- Component annotations
- Seam-gen commands and configuration
- Conversation management
- Common components.xml configuration
- Seam security
- The Seam application framework
- Hot tips and more…
So download the Core Seam Refcard here and please send your comments and feedback to refcardz@dzone.com. For in-depth coverage of Seam 2.1, you can also purchase the upcoming Seam Framework: Experience the Evolution of Java EE. Enjoy!
Posted in JBoss Seam, JBoss Seam 2E News
September 29, 2008
Just a quick note to let you know that I will be speaking next week on October 8 at the Java MUG in Dallas. I will be talking about how Seam has simplified JEE web development and influenced the revolutionary Web Beans specification (JSR-299). Hope to see you there!
[Presentation Slides]
Posted in JBoss Seam, Web Beans