Monthly archives "October 2014"

Configure JBoss / Wildfly Datasource with Maven

posted by Roberto Cortez on

Most Java EE applications use database access in their business logic, so developers are often faced with the need to configure drivers and database connection properties in the application server. In this post, we are going to automate that task for JBoss / Wildfly and a Postgre database using Maven. The work is based on my World of Warcraft Auctions Batch application from the previous post.

Maven Configuration

Let’s start by adding the following to our pom.xml:

We are going to use the Wildfly Maven Plugin to execute scripts with commands in the application server. Note that we also added a dependency to the Postgre driver. This is for Maven to download the dependency, because we are going to need it later to add it to the server. There is also a ${cli.file} property that is going to be assigned to a profile. This is to indicate which script we want to execute.

Let’s also add the following to the pom.xml:

With the Resources Maven Plugin we are going to filter the script files contained in the src/main/resources/scripts and replace them with the properties contained in ${basedir}/src/main/resources/configuration.properties file.

Finally lets add a few Maven profiles to the pom.xml, with the scripts that we want to run:

Wildfly Script Files

Add Driver

The scripts with the commands to add a Driver:

wildfly-install-postgre-driver.cli

Database drivers are added to Wildfly as a module. In this was, the driver is widely available to all the applications deployed in the server. With ${settings.localRepository} we are pointing into the database driver jar downloaded to your local Maven repository. Remember the dependency that we added into the Wildfly Maven Plugin? It’s to download the driver when you run the plugin and add it to the server. Now, to run the script we execute (you need to have the application server running):

mvn process-resources wildfly:execute-commands -P "install-driver"

The process-resources lifecycle is needed to replace the properties in the script file. In my case ${settings.localRepository} is replaced by /Users/radcortez/.m3/repository/. Check the target/scripts folder. After running the command, you should see the following output in the Maven log:

And on the server:

wildfly-remove-postgre-driver.cli

This script is to remove the driver from the application server. Execute mvn wildfly:execute-commands -P "remove-driver". You don’t need process-resources if you already executed the command before, unless you change the scripts.

Add Datasource

wow-auctions-install.cli
The scripts with the commands to add a Datasource:

We also need a a file to define the properties:

configuration.properties

Default Java EE 7 Datasource

Java EE 7, specifies that the container should provide a default Datasource. Instead of defining a Datasource with the JNDI name java:/datasources/WowAuctionsDS in the application, we are going to point our newly created datasource to the default one with /subsystem=ee/service=default-bindings:write-attribute(name="datasource", value="${datasource.jndi}"). In this way, we don’t need to change anything in the application. Execute the script with mvn wildfly:execute-commands -P "install-wow-auctions". You should get the following Maven output:

And on the server:

wow-auctions-remove.cli

This is the script to remove the Datasource and revert the Java EE 7 default Datasource. Run it by executing mvn wildfly:execute-commands -P "remove-wow-auctions"

Conclusion

This post demonstrated how to automate add / remove Drivers to Wildfly instances and also add / remove Datasources. This is useful if you want to switch between databases or if you’re configuring a server from the ground up. Think about CI environments. These scripts are also easily adjustable to other drivers.

You can get the code from the WoW Auctions Github repo, which uses this setup. Enjoy!

Java EE 7 Batch Processing and World of Warcraft – Part 1

posted by Roberto Cortez on
tags: ,

This was one of my sessions at the last JavaOne. This post is going to expand the subject and look into a real application using the Batch JSR-352 API. This application integrates with the MMORPG World of Warcraft.

Since the JSR-352 is a new specification in the Java EE world, I think that many people don’t know how to use it properly. It may also be a challenge to identify the use cases to which this specification apply. Hopefully this example can help you understand better the use cases.

Abstract

World of Warcraft is a game played by more than 8 million players worldwide. The service is offered by region: United States (US), Europe (EU), China and Korea. Each region has a set of servers called Realm that you use to connect to be able to play the game. For this example, we are only looking into the US and EU regions.

World of Warcraft Horde Auction House

One of the most interesting features about the game is that allows you to buy and sell in-game goods called Items, using an Auction House. Each Realm has two Auction House’s. On average each Realm trades around 70.000 Items. Let’s crunch some numbers:

  • 512 Realm’s (US and EU)
  • 70 K Item’s per Realm
  • More than 35 M Item’s overall

The Data

Another cool thing about World of Warcraft is that the developers provide a REST API to access most of the in-game information, including the Auction House’s data. Check here the complete API.

The Auction House’s data is obtained in two steps. First we need to query the correspondent Auction House Realm REST endpoint to get a reference to a JSON file. Next we need to access this URL and download the file with all the Auction House Item’s information. Here is an example:

http://eu.battle.net/api/wow/auction/data/aggra-portugues

The Application

Our objective here is to build an application that downloads the Auction House’s, process it and extract metrics. These metrics are going to build a history of the Items price evolution through time. Who knows? Maybe with this information we can predict price fluctuation and buy or sell Items at the best times.

The Setup

For the setup, we’re going to use a few extra things to Java EE 7

Jobs

The main work it’s going to be performed by Batch JSR-352 Jobs. A Job is an entity that encapsulates an entire batch process. A Job will be wired together via a Job Specification Language. With JSR-352, a Job is simply a container for the steps. It combines multiple steps that belong logically together in a flow.

We’re going to split the business login into three jobs:

  • Prepare – Creates all the supporting data needed. List Realms, create folders to copy files.
  • Files – Query realms to check for new files to process.
  • Process – Downloads the file, process the data, extract metrics.

The Code

Back-end – Java EE 7 with Java 8

Most of the code is going to be in the back-end. We need Batch JSR-352, but we are also going to use a lot of other technologies from Java EE: like JPA, JAX-RS, CDI and JSON-P.

Since the Prepare Job is only to initialize application resources for the processing, I’m skipping it and dive into the most interesting parts.

Files Job

The Files Job is an implementation of AbstractBatchlet. A Batchlet is the simplest processing style available in the Batch specification. It’s a task oriented step where the task is invoked once, executes, and returns an exit status. This type is most useful for performing a variety of tasks that are not item-oriented, such as executing a command or doing file transfer. In this case, our Batchlet is going to iterate on every Realm make a REST request to each one and retrieve an URL with the file containing the data that we want to process. Here is the code:

A cool thing about this is the use of Java 8. With parallelStream() invoking multiple REST request at once is easy as pie! You can really notice the difference. If you want to try it out, just run the sample and replace parallelStream() with stream() and check it out. On my machine, using parallelStream() makes the task execute around 5 or 6 times faster.

Update
Usually, I would not use this approach. I’ve done it, because part of the logic involves invoking slow REST requests and parallelStreams really shine here. Doing this using batch partitions is possible, but hard to implement. We also need to pool the servers for new data every time, so it’s not terrible if we skip a file or two. Keep in mind that if you don’t want to miss a single record a Chunk processing style is more suitable. Thank you to Simon Martinelli for bringing this to my attention.

Since the Realms of US and EU require different REST endpoints to invoke, these are perfect to partitioned. Partitioning means that the task is going to run into multiple threads. One thread per partition. In this case we have two partitions.

To complete the job definition we need to provide a JoB XML file. This needs to be placed in the META-INF/batch-jobs directory. Here is the files-job.xml for this job:

In the files-job.xml we need to define our Batchlet in batchlet element. For the partitions just define the partition element and assign different properties to each plan. These properties can then be used to late bind the value into the LoadAuctionFilesBatchlet with the expressions #{partitionPlan['region']} and #{partitionPlan['target']}. This is a very simple expression binding mechanism and only works for simple properties and Strings.

Process Job

Now we want to process the Realm Auction Data file. Using the information from the previous job, we can now download the file and do something with the data. The JSON file has the following structure:

The file has a list of the Auction’s from the Realm it was downloaded from. In each record we can check the item for sale, prices, seller and time left until the end of the auction. Auction’s are algo aggregated by Auction House type: Alliance and Horde.

For the process-job we want to read the JSON file, transform the data and save it to a database. This can be achieved by Chunk Processing. A Chunk is an ETL (Extract – Transform – Load) style of processing which is suitable for handling large amounts of data. A Chunk reads the data one item at a time, and creates chunks that will be written out, within a transaction. One item is read in from an ItemReader, handed to an ItemProcessor, and aggregated. Once the number of items read equals the commit interval, the entire chunk is written out via the ItemWriter, and then the transaction is committed.

ItemReader

The real files are so big that they cannot be loaded entirely into memory or you may end up running out of it. Instead we use JSON-P API to parse the data in a streaming way.

To open a JSON Parse stream we need Json.createParser and pass a reference of an inputstream. To read elements we just need to call the hasNext() and next() methods. This returns a JsonParser.Event that allows us to check the position of the parser in the stream. Elements are read and returned in the readItem() method from the Batch API ItemReader. When no more elements are available to read, return null to finish the processing. Note that we also implements the method open and close from ItemReader. These are used to initialize and clean up resources. They only execute once.

ItemProcessor

The ItemProcessor is optional. It’s used to transform the data that was read. In this case we need to add additional information to the Auction.

ItemWriter

Finally we just need to write the data down to a database:

The entire process with a file of 70 k record takes around 20 seconds on my machine. I did notice something very interesting. Before this code, I was using an injected EJB that called a method with the persist operation. This was taking 30 seconds in total, so injecting the EntityManager and performing the persist directly saved me a third of the processing time. I can only speculate that the delay is due to an increase of the stack call, with EJB interceptors in the middle. This was happening in Wildfly. I will investigate this further.

To define the chunk we need to add it to a process-job.xml file:

In the item-count property we define how many elements fit into each chunk of processing. This means that for every 100 the transaction is committed. This is useful to keep the transaction size low and to checkpoint the data. If we need to stop and then restart the operation we can do it without having to process every item again. We have to code that logic ourselves. This is not included in the sample, but I will do it in the future.

Running

To run a job we need to get a reference to a JobOperator. The JobOperator provides an interface to manage all aspects of job processing, including operational commands, such as start, restart, and stop, as well as job repository related commands, such as retrieval of job and step executions.

To run the previous files-job.xml Job we execute:

Note that we use the name of job xml file without the extension into the JobOperator.

Next Steps

We still need to aggregate the data to extract metrics and display it into a web page. This post is already long, so I will describe the following steps in a future post. Anyway, the code for that part is already in the Github repo. Check the Resources section.

Resources

You can clone a full working copy from my github repository and deploy it to Wildfly. You can find instructions there to deploy it.

World of Warcraft Auctions

Check also the Java EE samples project, with a lot of batch examples, fully documented.

Java One 2014 – Create the Future


JavaOne
I spent the last week in San Francisco to attend JavaOne 2014. This was my third time attending JavaOne, so I was already familiarized with the conference. Anyway, this year was different since I was going as a speaker for the first time.

Create the Future

“Create the Future” was the theme of JavaOne this year. The last few years have been very exciting for the Java community. After many years without evolution, we see now Java 8 with lambdas and streams, Java EE 7 with new specifications and simplifications)and a huge effort to unify and support Java for embeddable devices. Java 9 is already in the pipeline which promises modular Java (project Jigsaw). Java EE 8 is going to improve a lot of specifications and bring new ones like MVC, JSON-B and the much awaited JCache. Now it’s the time to contribute by Adopting a JSR.

During the last few years we heard a lot of voices claiming that Java is dead. Looking at what’s happening now, it doesn’t seem that way. The platform is evolving, a lot of new developers are joining the JVM ecosystem, and the conference was vibrating with energy. By the way, Java is turning 20 years in 2015. Let’s see what is going to happen in 20 years from now. Let’s hope that this blog is still around!

Keynote

The opening Keynote was a recap on what’s happening in the last few years. You can find all the videos here. Just a few notes:

  • Coimbra JUG shows on the map of the new JUG’s:

    JavaOne - Coimbra JUG

  • The technical Keynote was interrupted because of lack of time. This also happened to me in one of my sessions. I understand that there is a time frame, but this was not the best way to kick out the conference. I’m pretty sure that most attendees would prefer to shorten up the Strategy Keynote for the Technical one.
  • I was referenced in the Community Keynote, because of my work at the Java EE 7 Hackergarten. Thank you Heather VanCura. Count me in with future contributions!

Venue

The event was split between the Moscone Center, The Hilton Hotel and the Parc 55 Hotel. I’m not from the time where JavaOne was completely held in the Moscone Center, so I can’t compare. Because of the layout of the hotels, you need to run sometimes from session to session and the corridors are not the best place to have groups of people chatting. A few of the rooms also have columns in the middle which makes difficult for the attendees and the speaker to be aware of everything.

In my session Development Horror Stories [BOF4223] I had to run with Simon Maple, to get there on time. The problem was that the previous slot sessions were held at the Hilton and then moved to the Moscone, which is a 15 minutes walk. By the way, no taxi wanted to take us because it was too close.

Food

Not even going to comment about it. Yeah the lunch sucked, and yeah I’m weird with the food.

Sessions

There is so much stuff going on, that it’s impossible to attend every session that you want to go. I probably only attended half of the sessions that I’ve signed up for. I had to split some of my time between the sessions, the Demogrounds, the Hackergarten and also a bit of personal time for the last details of my sessions. Not all sessions had video recording, but all of them should have audio and be available via Parleys.

These are my top 3 sessions (from the ones I have attended):

My Sessions

I’m relatively happy with my performance delivering the sessions, but I can improve much more. I do have to say, that I didn’t feel any nervousness. I guess that I’m feeling more comfortable on public speaking, plus preparing everything with a few weeks in advance also helped. Moving forward!

Development Horror Stories [BOF4223]

with Simon Maple
We had around 150+ people signed up, but only 50 or so showed up. I think this was related to the switch venues problem I described earlier. At the same time there was also an Oracle Tech Party with food, drinks and music. I guess that didn’t help either.

Anyway, me and Simon kicked out the BOF with a few of our own stories where things went terribly wrong. The crowd was really into it, so our plan to ask people for the audience to share their own stories worked perfectly. We probably had around 10+ people stepping up the stage. In the end we had a Java 8 In Action book give away signed by the author, for the best story voted by the audience. The winning story belong to Jan when he wrote a few scripts to clear and insert data into a database for tests. Unfortunately he executed it in a production environment by accident!

Development Horror Stories BOF

I think people enjoyed the BOF and this can work in pretty much everywhere. I’ll submit it in the future to other conferences. BOF’s don’t really need slides, but we did some anyway:

Java EE 7 Batch Processing in the Real World [CON2818]

with Ivan Ivanov
This session was the first one of the day at 8.30 in the morning and was packed with people. It was surprising to see so many so early. Me and Ivan started the session with an introduction on Batch, origins, applications and so on. Next we went through the JSR-352 API to prepare for our demo at the end. The demo is based around World of Warcraft and we used the Batch API to download, process and extract metrics from the game Auction House’s (they are like eBay in the game). Stay tuned for a future post describing the entire sample.

Batch Processing Real World Session

Unfortunately we run out of time and we couldn’t show everything that we wanted, or at least go into more details about the demo. We allowed people to ask questions anytime, and we had a lot o them. I’m not complaining about it. I prefer doing it this way, since it makes the session more interactive. On the other hand, you end up using more time and is not very predictable. We will reorganize the session to perform the demo in the middle and everything should be fine like that.

And the check the session code here.

CON4255 – The 5 people in your organization that grow legacy code

I’m pretty happy with how this session go. Considering that it was the last day of the conference and also one of the last sessions of the day, I had probably around 80+ people. I’m also happy because it was video recorded, so I can check it properly later.

Legacy Code Session

I’m not going to spoil the content, but I think the attendees really enjoyed the session and had many moments to laugh about the content. I’ll just leave you with the slides:

Final Words

The event was huge, so I’m probably writing another post about it, since I don’t want to write a very long boring post. Next one is going to focus a little more on other sessions, activities and community!

I would like to thank everyone that attended my sessions and send a few specials ones: to Reza Rahman for helping me in the submission process, to Heather VanCura for the Hackergarten invite and for my co-speakers Ivan Ivanov and Simon Maple. Thanks everyone!