For some time now, the main JPA implementations, like Hibernate, EclipseLink, OpenJPA or DataNucleus, have offered ways to generate database schema objects. These include the generation of tables, primary keys, foreign keys, indexes and other objects. Unfortunately, these mechanisms are not standard across implementations, which becomes a problem when dealing with multiple environments. Schema Generation was only standardized in the latest JPA 2.1 specification.
From now on, if you are using Java EE 7, you don’t have to worry about the differences between the providers. Just use the new standard properties and you are done. Of course, you might be thinking that these are not needed at all, since database schemas for the different environments should not be managed like this. Still, they are very useful for development and testing purposes.
Schema Generation
Properties
If you wish to use the new standards for Schema Generation, just add any of the following properties to the properties section of your persistence.xml:
| Property | Description | Values |
| --- | --- | --- |
| javax.persistence.schema-generation.database.action | Specifies the action to be taken regarding the database schema. Possible values are self-explanatory. If this property is not specified, no actions are performed on the database. | none, create, drop-and-create, drop |
| javax.persistence.schema-generation.create-source | Specifies how the database schema should be created: from the annotation metadata specified in the application entities, by executing a SQL script, or a combination of both (you can also define the order). This property does not need to be specified for schema generation to occur; the default value is metadata. Be careful when combining create sources: the resulting actions may generate unexpected behaviour in the database schema and lead to failure. | metadata, script, metadata-then-script, script-then-metadata |
| javax.persistence.schema-generation.drop-source | Same as javax.persistence.schema-generation.create-source, but for drop actions. | metadata, script, metadata-then-script, script-then-metadata |
| javax.persistence.schema-generation.create-script-source, javax.persistence.schema-generation.drop-script-source | Specifies the location of the SQL script file to execute on create or drop of the database schema. | String with the file URL to execute |
| javax.persistence.sql-load-script-source | Specifies the location of the SQL file to load data into the database. | String with the file URL to execute |
Additionally, it’s possible to generate the SQL scripts corresponding to the Schema Generation actions:

| Property | Description | Values |
| --- | --- | --- |
| javax.persistence.schema-generation.scripts.action | Specifies which SQL scripts should be generated. Scripts are only generated if the corresponding generation location targets are specified. | none, create, drop-and-create, drop |
| javax.persistence.schema-generation.scripts.create-target, javax.persistence.schema-generation.scripts.drop-target | Specifies the target locations where the SQL script files that create or drop the database schema are generated. | String with the file URL to write to |
Samples
The following sample drops and creates the database schema objects needed by the JPA application. It relies on the annotation metadata of the entities and also executes an arbitrary SQL file named load.sql.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1"
             xmlns="http://xmlns.jcp.org/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence
                                 http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
    <persistence-unit name="MyPU" transaction-type="JTA">
        <properties>
            <property name="javax.persistence.schema-generation.database.action" value="drop-and-create"/>
            <property name="javax.persistence.schema-generation.create-source" value="metadata"/>
            <property name="javax.persistence.schema-generation.drop-source" value="metadata"/>
            <property name="javax.persistence.sql-load-script-source" value="META-INF/load.sql"/>
        </properties>
    </persistence-unit>
</persistence>
```
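The referenced load.sql is just a plain SQL script, executed after the schema is created, that seeds the database. Its contents depend entirely on your entity mappings; as a minimal illustration (the PERSON table and its columns here are hypothetical), it could hold a couple of INSERT statements:

```sql
-- Hypothetical seed data: the table and column names must match the
-- generated schema, which depends on your entity mappings.
INSERT INTO PERSON (ID, NAME) VALUES (1, 'Roberto Cortez');
INSERT INTO PERSON (ID, NAME) VALUES (2, 'John Doe');
```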
Another sample that generates the database schema objects to be created and dropped in the target locations:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1"
             xmlns="http://xmlns.jcp.org/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence
                                 http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
    <persistence-unit name="MyPU" transaction-type="JTA">
        <properties>
            <property name="javax.persistence.schema-generation.scripts.action" value="drop-and-create"/>
            <property name="javax.persistence.schema-generation.scripts.create-target" value="file:/tmp/create.sql"/>
            <property name="javax.persistence.schema-generation.scripts.drop-target" value="file:/tmp/drop.sql"/>
        </properties>
    </persistence-unit>
</persistence>
```
Both samples can also be combined to drop and create the database objects and generate the corresponding scripts that perform these operations. You can find these and other samples in the Java EE Samples project hosted on GitHub.
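As a sketch of that combination (property values taken straight from the two samples above), the two sets of properties simply live side by side in the same persistence unit:

```xml
<properties>
    <!-- Perform the actions directly on the database... -->
    <property name="javax.persistence.schema-generation.database.action" value="drop-and-create"/>
    <property name="javax.persistence.schema-generation.create-source" value="metadata"/>
    <property name="javax.persistence.schema-generation.drop-source" value="metadata"/>
    <!-- ...and also generate the scripts that perform the same operations. -->
    <property name="javax.persistence.schema-generation.scripts.action" value="drop-and-create"/>
    <property name="javax.persistence.schema-generation.scripts.create-target" value="file:/tmp/create.sql"/>
    <property name="javax.persistence.schema-generation.scripts.drop-target" value="file:/tmp/drop.sql"/>
</properties>
```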
Limitations
As I mentioned before, I recommend that you use these properties for development or testing purposes only. A wrong setting might easily destroy or mess up your production database.
There are no actions to update or just validate the schema. I couldn’t find out why these didn’t make it into the specification, but here is an issue with the improvement suggestion.
The database schema actions are only performed on application deployment in a Java EE environment. For development, you might want to perform the actions on every server restart.
Support
Both Hibernate and EclipseLink, which are bundled with WildFly and GlassFish respectively, support these properties.
OpenJPA currently does not support these properties, but I’ve been working on OpenJPA support for standard Schema Generation. If you’re curious or want to follow the progress, check my GitHub repo here. This was actually my main motivation to write this post, since I’m a bit involved in the implementation of this feature.
I hope you enjoyed the post 🙂
In one way or another, every developer has come into contact with an API, whether integrating a major system for a big corporation, producing some fancy charts with the latest graph library, or simply interacting with their favorite programming language. The truth is that APIs are everywhere! They represent a fundamental building block of today’s Internet, playing a central role in the data exchange that takes place between different systems and devices. From the simple weather widget on your mobile phone to a credit card payment you perform in an online shop, none of it would be possible if those systems didn’t communicate with each other by calling one another’s APIs.
With the ever-growing ecosystem of heterogeneous devices connected to the Internet, APIs face a new set of demanding challenges. While they must continue to perform in a reliable and secure manner, they must also be compatible with all these devices, which can range from a wristwatch to the most advanced server in a data center.
REST to the rescue
One of the most widely used technologies for building such APIs is the so-called REST API. REST APIs aim to provide a generic and standardized way of communication between heterogeneous systems. Because they rely heavily on standard communication protocols and data representations – like HTTP, XML or JSON – it’s quite easy to provide client-side implementations in most programming languages, thus making them compatible with the vast majority of systems and devices.
So while these REST APIs can be compatible with most devices and technologies out there, they must also evolve. And the problem with evolution is that you sometimes have to maintain backward compatibility with old client versions.
Let’s build up an example.
Let’s imagine an appointment system with an API to create and retrieve appointments. To simplify things, let’s model an appointment with just a date and a guest name. Something like this:
```java
public class AppointmentDTO {
    public Long id;
    public Date date;
    public String guestName;
}
```
A very simple REST API would look like this:
| @Path("/api/appointments") public class AppointmentsAPI { @GET @Path("/{id}") public AppointmentDTO getAppointment(@PathParam("id") String id) { ... } @POST public void createAppointment(AppointmentDTO appointment) { ... } } |
Let’s assume this plain simple API works and is being used on mobile phones, tablets and various websites that allow for booking and displaying appointments. So far so good.
At some point, you decide it would be very interesting to start gathering some statistics about your appointment system. To keep things simple, you just want to know who booked the most appointments. For this you need to correlate guests with each other, so you decide to add a unique identifier to each guest. Let’s use the email address. The object model would now look something like this:
```java
public class AppointmentDTO {
    public Long id;
    public Date date;
    public GuestDTO guest;
}

public class GuestDTO {
    public String email;
    public String name;
}
```
So our object model changed slightly, which means we will have to adapt the business logic of our API.
The Problem
While adapting the API to store and retrieve the new object types should be a no-brainer, the problem is that all your current clients are using the old model and will continue to do so until they update. One can argue that you shouldn’t have to worry about this, and that customers should update to the newer version, but the truth is that you can’t really force an update overnight. There will always be a time window where you have to keep both models running, which means your API must be backward compatible.
This is where your problems start.
Back to our example, this means that our API will have to handle both object models and be able to store and retrieve those models depending on the client. So let’s add the guestName back to our object to maintain compatibility with the old clients:
```java
public class AppointmentDTO {
    public Long id;
    public Date date;

    @Deprecated // For backward compatibility purposes
    public String guestName;

    public GuestDTO guest;
}
```
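How the server reconciles the two models is up to your business logic. As a minimal sketch (the normalize helper below is my own illustration, not part of the original example), the create path could fold the deprecated field into the new one and vice versa:

```java
// Hypothetical helper called at the start of createAppointment: it
// normalizes incoming DTOs so the rest of the business logic only
// ever deals with the new GuestDTO-based model.
private void normalize(AppointmentDTO appointment) {
    if (appointment.guest == null && appointment.guestName != null) {
        // Old client: only guestName was sent, so build a GuestDTO from it.
        GuestDTO guest = new GuestDTO();
        guest.name = appointment.guestName;
        appointment.guest = guest;
    } else if (appointment.guest != null) {
        // New client: keep the deprecated field populated so old clients
        // reading this appointment still see the guest name.
        appointment.guestName = appointment.guest.name;
    }
}
```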
Remember, a good rule of thumb for API objects is that you should never delete fields. Adding new ones usually won’t break any client implementations (assuming they follow the equally good rule of thumb of ignoring unknown fields), but removing fields is usually a road to nightmares.
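On the client side, that “ignore unknown fields” rule is often a one-liner. For instance, assuming a Jackson-based client (an assumption on my part, not something the example prescribes), it could look like this:

```java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;

// Fields added by future API versions are silently skipped during
// deserialization instead of raising an error.
@JsonIgnoreProperties(ignoreUnknown = true)
public class AppointmentDTO {
    public Long id;
    public String guestName;
}
```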
Now, to keep the API compatible, there are a few different options. Let’s look at some of the alternatives:
- Duplication: pure and simple. Create a new method for the new clients and keep the old ones using the existing one.
- Query parameters: introduce a flag to control the behavior. Something like useGuests=true.
- API Versioning: Introduce a version in your URL path to control which method version to call.
So all these alternatives have their pros and cons. While duplication can be plain simple, it can easily turn your API classes into a bowl of duplicated code.
Query parameters can (and should) be used for behavior control (for example, to add pagination to a listing), but we should avoid using them for actual API evolutions, since these are usually permanent and therefore shouldn’t be optional for the consumer.
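To make that distinction concrete, here is a minimal JAX-RS sketch of behavior-control query parameters (the listAppointments method and its parameters are my own illustration):

```java
@Path("/api/appointments")
public class AppointmentsAPI {

    // Pagination only tweaks how results are returned; the contract itself
    // does not change, so optional query parameters are a good fit here.
    @GET
    public List<AppointmentDTO> listAppointments(
            @QueryParam("page") @DefaultValue("0") int page,
            @QueryParam("size") @DefaultValue("20") int size) { ... }
}
```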
Versioning seems like a good idea. It allows for a clean way to evolve the API, it keeps old clients separated from new ones, and it provides a generic base for all kinds of changes that will occur during your API’s lifespan. On the other hand, it also introduces a bit of complexity, especially if you end up with different calls at different versions. Your clients would end up having to manage your API evolution themselves by upgrading individual calls instead of the API as a whole. It’s as if, instead of upgrading a library to the next version, you upgraded only a certain class of that library. This can easily turn into a version nightmare…
To overcome this we must ensure that our versions cover the whole API. This means that I should be able to call every available method on /v1 using /v2. Of course, if a newer version of a given method exists in v2, it should be run on the /v2 call. However, if a given method hasn’t changed in v2, I expect the v1 version to be seamlessly called.
Inheritance-based API Versioning
To achieve this we can take advantage of Java’s polymorphism. We can build API versions in a hierarchical way, so that older version methods can be overridden by newer ones, and calls to a newer version of an unchanged method seamlessly fall back to its earlier version.
So, back to our example, we could build a new version of the create method so that the API looks like this:
```java
// We add a version to our base path...
@Path("/api/v1/appointments")
public class AppointmentsAPIv1 { // ...and to our API classes

    @GET
    @Path("/{id}")
    public AppointmentDTO getAppointment(@PathParam("id") String id) { ... }

    @POST
    public void createAppointment(AppointmentDTO appointment) {
        // Your old way of creating appointments, only with names
    }
}

// New API class that extends the previous version
@Path("/api/v2/appointments")
public class AppointmentsAPIv2 extends AppointmentsAPIv1 {

    @POST
    @Override
    public void createAppointment(AppointmentDTO appointment) {
        // Your new way of creating appointments, with guests
    }
}
```
So now we have two working versions of our API. While all the old clients that haven’t yet upgraded to the new version will continue to use v1 – and will see no changes – all your new consumers can use the latest v2. Note that all these calls are valid:
| Call | Result |
| --- | --- |
| GET /api/v1/appointments/123 | Will run getAppointment on the v1 class |
| GET /api/v2/appointments/123 | Will run getAppointment on the v1 class |
| POST /api/v1/appointments | Will run createAppointment on the v1 class |
| POST /api/v2/appointments | Will run createAppointment on the v2 class |
This way any consumers that want to start using the latest version will only have to update their base URLs to the corresponding version, and all of the API will seamlessly shift to the most recent implementations, while keeping the old unchanged ones.
Caveat
For the keen eye there is an immediate caveat with this approach. If your API consists of tens of different classes, a newer version implies duplicating them all to an upper version, even those with no actual changes. It’s a bit of boilerplate code that can be mostly auto-generated. Still annoying, though.
Although there is no quick way to overcome this, the use of interfaces can help. Instead of creating a new implementation class, you can simply create a new @Path-annotated interface and have it implemented by your current implementing class. You would still have to create one interface per API class, but it is a bit cleaner. It helps a little, but it’s still a caveat.
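As a rough sketch of that idea (the names are mine, and support for class-level @Path on interfaces varies between JAX-RS implementations, so treat this as illustrative rather than guaranteed portable):

```java
// A thin interface carries the new version's path and signatures for a
// resource that has no actual changes in v2.
@Path("/api/v2/appointments")
public interface AppointmentsAPIv2 {

    @GET
    @Path("/{id}")
    AppointmentDTO getAppointment(@PathParam("id") String id);
}

// The existing class exposes the new version simply by implementing it,
// instead of a whole new subclass being created.
@Path("/api/v1/appointments")
public class AppointmentsAPIv1 implements AppointmentsAPIv2 {

    @Override
    public AppointmentDTO getAppointment(String id) { ... }
}
```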
Final thoughts
API versioning seems to be a hot topic at the moment. Lots of different angles and opinions exist, but there seems to be a lack of standard best practices. While this post doesn’t aim to provide those, I hope it helps you achieve a better API structure and contributes to its maintainability.
A final word goes to Roberto Cortez for encouraging and allowing this post on his blog. This is actually my first blog post so load the cannons and fire at will. 😉