Batch processing refers to processing a series of jobs without manual intervention. This is in contrast with OLTP processes, which require some input from users to initiate a process. All input parameters are predefined through scripts, job control language, control files, etc. These tasks (jobs) often process large amounts of data from a range of sources.
The Java EE Batch Processing Framework (JSR-352) provides the batch execution infrastructure common to all batch applications. This lets us, the developers, concentrate on the business logic of the batch processes. The framework consists of an XML-based job specification language, a set of batch annotations and interfaces for implementing the business logic, a batch container that manages the batch jobs, and a set of APIs for interacting with the batch container.
In this first JSR-352 post, I will concentrate on the Batchlet. I have chosen this because this is the simplest way of getting up and running with the JSR-352 Batch Processing Framework. We will use a Timer to trigger the Batchlet and a RESTful Web Service to create the Timer that will trigger the Batchlet. Let's start with the Service first.
This is what the service looks like:
    package com.mybatchproject;

    import java.util.Calendar;
    import java.util.Date;
    import java.util.GregorianCalendar;

    import javax.ejb.EJB;
    import javax.ejb.Timer;
    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    /**
     * http://localhost:8080/BatchJobServer/resource/test/timerservice
     *
     * @author Raza Abidi
     */
    @Path("test")
    public class BatchJobService {

        @EJB
        MyEJBTimer timerEJB;

        @GET
        @Path("/timerservice")
        @Produces(MediaType.TEXT_XML)
        public TimerInfo generateTimeStamp() {
            Date schedule;
            try {
                // Add a delay of one minute
                Calendar cl = new GregorianCalendar();
                cl.add(Calendar.MINUTE, 1);
                schedule = cl.getTime();
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }

            TimerInfo to = new TimerInfo();
            to.setJobName("my-test-batch-job"); // same as the xml name
            to.setJobSchedule(schedule);
            to.setJobMessage("Job scheduled sucessfully");

            Timer timer;
            try {
                timer = timerEJB.createNewTimer(schedule, to);
            } catch (Exception e) {
                to.setJobMessage("Failed to create timer");
                e.printStackTrace();
                return to;
            }

            to = (TimerInfo) timer.getInfo();
            System.out.println("Timer created successfully: " + to);
            return to;
        }
    }
All we are doing here is preparing a simple POJO, TimerInfo, and populating it with some data. We then create a Timer using the MyEJBTimer helper EJB and store the POJO in the Timer. This illustrates that a Timer can carry data that is available when the Timer expires. It will make more sense later on.
This is what the TimerInfo POJO looks like:
    package com.mybatchproject;

    import java.io.Serializable;
    import java.util.Date;

    import javax.xml.bind.annotation.XmlRootElement;

    /**
     * @author Raza Abidi
     * @date 13 Nov 2014 15:45:22
     */
    @XmlRootElement(name="TimerInfo")
    public class TimerInfo implements Serializable {

        private static final long serialVersionUID = -6478588792874137803L;

        private String jobName;
        private Date jobSchedule;
        private String jobMessage;

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder();
            sb.append("\n").append("Name: ").append(jobName);
            sb.append("\n").append("Schedule: ").append(jobSchedule);
            sb.append("\n").append("Message: ").append(jobMessage).append("\n");
            return sb.toString();
        }

        public String getJobName() {
            return jobName;
        }

        public void setJobName(String jobName) {
            this.jobName = jobName;
        }

        public Date getJobSchedule() {
            return jobSchedule;
        }

        public void setJobSchedule(Date jobSchedule) {
            this.jobSchedule = jobSchedule;
        }

        public String getJobMessage() {
            return jobMessage;
        }

        public void setJobMessage(String jobMessage) {
            this.jobMessage = jobMessage;
        }
    }
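TimerInfo implements Serializable because the timer service accepts only a Serializable info object, which allows the container to store it along with the timer. The following plain Java SE sketch shows what such a round trip does to an object. The Payload class is a hypothetical, reduced stand-in for TimerInfo, and the in-memory byte array only imitates whatever store a real container may use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class TimerPayloadSketch {

    /** Hypothetical stand-in for TimerInfo, reduced to a single field. */
    static final class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        final String jobName;

        Payload(String jobName) {
            this.jobName = jobName;
        }
    }

    /** Serializes the value to bytes and reads it back, returning the copy. */
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Payload original = new Payload("my-test-batch-job");
        Payload copy = roundTrip(original);
        System.out.println("Same data: " + copy.jobName.equals(original.jobName));
        System.out.println("Same instance: " + (copy == original));
    }
}
```

The copy carries the same data but is a different instance, which is one reason to read the info back with timer.getInfo() rather than rely on the object that was passed in.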
Once the POJO is populated, we call the timerEJB.createNewTimer(schedule, to) method of our helper EJB to create a timer. The method takes two parameters, the schedule and the info. The schedule is a Date object representing the date and time at which the timer expires, and the info is an instance of the TimerInfo class holding the data that we want to retrieve from the timer when it expires. Note that the schedule was created with a delay of one minute, so this timer will expire one minute after it is created.
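The one-minute schedule can also be computed with the java.time API instead of Calendar. This is only an alternative sketch, not what the service above does; the class and method names are made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Date;

final class ScheduleSketch {

    /** Returns the Date that lies the given delay after the given instant. */
    static Date expiryAfter(Instant from, Duration delay) {
        return Date.from(from.plus(delay));
    }

    public static void main(String[] args) {
        // One minute from now, the same delay the service uses
        Date schedule = expiryAfter(Instant.now(), Duration.ofMinutes(1));
        System.out.println("Timer would expire at: " + schedule);
    }
}
```

The resulting Date can be passed to createNewTimer(schedule, to) unchanged, since the timer API in this post works with java.util.Date.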
Let’s see what the MyEJBTimer helper class looks like:
    package com.mybatchproject;

    import java.util.Date;

    import javax.annotation.Resource;
    import javax.ejb.Singleton;
    import javax.ejb.Timeout;
    import javax.ejb.Timer;
    import javax.ejb.TimerService;

    /**
     * @author Raza Abidi
     * @date 13 Nov 2014 15:36:58
     */
    @Singleton
    public class MyEJBTimer {

        @Resource
        private TimerService ts;

        public Timer createNewTimer(Date date, TimerInfo timerInfo) {
            Timer timer = ts.createTimer(date, timerInfo);
            return timer;
        }

        @Timeout
        public void timerExpired(Timer timer) {
            System.out.println("Timer Expired @ :" + new Date());

            TimerInfo to = (TimerInfo) timer.getInfo();
            to.setJobMessage("Timer Executed Successfully");
            System.out.println("Timer Info: " + to);

            BatchJobHandler jobHandler = new BatchJobHandler();
            jobHandler.startJob(to);
        }
    }
This is a simple EJB that implements what needs to be done when a Timer expires. The TimerService is injected into this EJB as a Resource. One method uses it to create Timers, and the other is called back when a Timer created this way expires.
The first method, createNewTimer(Date date, TimerInfo timerInfo), is the one called by the RESTful service to create a Timer. It takes two parameters and uses the TimerService resource to create the Timer in the system. Upon successful creation of a Timer, it returns the Timer object to the caller.
The second method, timerExpired(Timer timer), is annotated with @Timeout and is the one that gets called when a timer expires. This is where we trigger the Batchlet, using the BatchJobHandler helper class. Let’s see what that class looks like:
    package com.mybatchproject;

    import java.util.Properties;

    import javax.batch.operations.JobOperator;
    import javax.batch.runtime.BatchRuntime;

    /**
     * @author Raza Abidi
     * @date 13 Nov 2014 15:40:11
     */
    public class BatchJobHandler {

        public void startJob(TimerInfo to) {
            String jobName = to.getJobName();
            JobOperator jobOperator = BatchRuntime.getJobOperator();
            Properties jp = new Properties();
            long executionId = jobOperator.start(jobName, jp);
            System.out.println("Job Started: " + jobName + " Execution ID:" + executionId);
        }
    }
Here we get a reference to the JobOperator from the BatchRuntime and then call jobOperator.start(jobName, jp) to start the batch job. The method returns an execution ID, which identifies this particular execution of the job and can be used later to query, stop, restart or abandon it using the methods provided by the Batch API.
Note that we used the TimerInfo object to get the name of the job that we want to start.
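The handler above passes an empty Properties object to the job, but the same object can carry job parameters. Here is a minimal sketch of building such parameters from the values held in the timer info. The helper class and the keys "jobName" and "jobMessage" are my own choice for illustration; nothing in the job shown in this post reads them.

```java
import java.util.Properties;

final class JobParametersSketch {

    /** Copies the given values into a Properties object under hypothetical keys. */
    static Properties toJobParameters(String jobName, String jobMessage) {
        Properties jp = new Properties();
        jp.setProperty("jobName", jobName);
        jp.setProperty("jobMessage", jobMessage);
        return jp;
    }

    public static void main(String[] args) {
        Properties jp = toJobParameters("my-test-batch-job", "Timer Executed Successfully");
        // In BatchJobHandler this object would take the place of the empty one:
        // long executionId = jobOperator.start(jobName, jp);
        System.out.println(jp.getProperty("jobMessage"));
    }
}
```

Parameters passed this way can be read back later with jobOperator.getParameters(executionId), or referenced in the job XML with an expression such as #{jobParameters['jobMessage']}.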
Now we need to create the actual Batch Job. Batch Jobs are defined in XML, and the job definition files must be located in the META-INF/batch-jobs folder of your application (for a WAR file, that is WEB-INF/classes/META-INF/batch-jobs). I created an XML document describing my Batch Job at:
META-INF/batch-jobs/my-test-batch-job.xml
Notice the name of the file: apart from the .xml extension, it is exactly the same as the name we set with to.setJobName("my-test-batch-job"). The Batch Runtime uses the name of the XML file to identify which job to start when you invoke jobOperator.start(jobName, jp).
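The naming convention can be written down as a small helper. This is only a sketch of the convention described above, not of how any batch container actually performs the lookup; the class and method names are hypothetical.

```java
final class JobXmlLocator {

    /** Folder in which job definition files are expected. */
    static final String JOB_XML_DIR = "META-INF/batch-jobs/";

    /** Maps a job name to the classpath resource path of its XML definition. */
    static String resourcePathFor(String jobName) {
        return JOB_XML_DIR + jobName + ".xml";
    }

    /** Checks whether a definition for the job is visible to the current class loader. */
    static boolean isOnClasspath(String jobName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null && cl.getResource(resourcePathFor(jobName)) != null;
    }

    public static void main(String[] args) {
        String jobName = "my-test-batch-job";
        System.out.println(resourcePathFor(jobName));
        System.out.println("Found on classpath: " + isOnClasspath(jobName));
    }
}
```

A check like isOnClasspath could be used to fail early with a clear message when the job name and the file name have drifted apart.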
This is what the my-test-batch-job.xml file contains:
    <?xml version="1.0" encoding="UTF-8"?>
    <job id="my-test-batch-job" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0">
        <step id="my-test-batch-step">
            <batchlet ref="MyBatchJobBatchlet"/>
        </step>
    </job>
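Now we need to create the class that does the actual work of the Batch Job. This class must implement the Batchlet interface, which declares the process() and stop() methods; the work itself goes into process(). (Extending AbstractBatchlet instead would let you override only process().) This is what my class looks like: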
    package com.mybatchproject;

    import javax.batch.api.Batchlet;
    import javax.enterprise.context.Dependent;
    import javax.inject.Named;

    /**
     * @author Raza Abidi
     * @date 13 Nov 2014 15:50:56
     */
    @Dependent
    @Named("MyBatchJobBatchlet")
    public class MyBatchJobBatchlet implements Batchlet {

        @Override
        public String process() throws Exception {
            System.out.println("Starting the Batch Job");
            for (int i = 0; i < 5; i++) {
                try {
                    Thread.sleep(1000);
                } catch (Exception e) {
                    e.printStackTrace();
                }
                System.out.println("Processing the Batch Job : " + i);
            }
            System.out.println("Finished the Batch Job");
            return null;
        }

        @Override
        public void stop() throws Exception {
        }
    }
Notice the @Named annotation and the ref attribute of the batchlet tag in the XML file. Both refer to the same name, MyBatchJobBatchlet, which is what binds the job definition to the class that implements it.
This Batchlet is not doing anything useful really; it simply prints a message and then waits for one second before printing the next one, in a loop that iterates 5 times. This is only to illustrate that a job can run for a long time without any manual intervention.
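The stop() method of the Batchlet above is empty, so a stop request would not interrupt the loop. One common way to honour it is a flag that process() checks on every iteration, because the runtime calls stop() on a different thread from the one running process(). The sketch below shows that pattern in plain Java SE. To keep it compilable without the Java EE API, the class only mirrors the two Batchlet methods and does not implement the interface, and the exit status strings "STOPPED" and "COMPLETED" are arbitrary choices.

```java
final class StoppableWorkSketch {

    private final int steps;
    private final long sleepMillis;

    // volatile so that a write from the stopping thread is visible to the working thread
    private volatile boolean stopRequested = false;

    StoppableWorkSketch(int steps, long sleepMillis) {
        this.steps = steps;
        this.sleepMillis = sleepMillis;
    }

    /** Mirrors Batchlet.process(): does the work and returns an exit status. */
    String process() {
        for (int i = 0; i < steps; i++) {
            if (stopRequested) {
                return "STOPPED";
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "STOPPED";
            }
            System.out.println("Processing the Batch Job : " + i);
        }
        return "COMPLETED";
    }

    /** Mirrors Batchlet.stop(): only records the request and returns quickly. */
    void stop() {
        stopRequested = true;
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableWorkSketch work = new StoppableWorkSketch(5, 1000);
        Thread worker = new Thread(
                () -> System.out.println("Exit status: " + work.process()));
        worker.start();
        Thread.sleep(2500); // let a couple of iterations run
        work.stop();        // request a stop from a different thread
        worker.join();
    }
}
```

In the real Batchlet, the same flag and check would go into MyBatchJobBatchlet, with stop() setting the flag and process() returning early once it is set.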
When you type the URL for this service into a browser, you will get the following response:
    <?xml version="1.0" encoding="UTF-8" standalone="true"?>
    <TimerInfo>
        <jobMessage>Job scheduled sucessfully</jobMessage>
        <jobName>my-test-batch-job</jobName>
        <jobSchedule>2014-11-13T17:52:16.475Z</jobSchedule>
    </TimerInfo>
And this is what you should see in the console output of your application server:
    INFO: Initiating Jersey application, version Jersey: 2.0 2013-05-14 20:07:34...
    INFO: Timer created successfully:
    Name: my-test-batch-job
    Schedule: Thu Nov 13 17:22:24 GMT 2014
    Message: Job scheduled sucessfully

    INFO: Timer Expired @ :Thu Nov 13 17:22:24 GMT 2014
    INFO: Timer Info:
    Name: my-test-batch-job
    Schedule: Thu Nov 13 17:22:24 GMT 2014
    Message: Timer Executed Successfully

    INFO: Job Started: my-test-batch-job Execution ID:48
    INFO: Starting the Batch Job
    INFO: Processing the Batch Job : 0
    INFO: Processing the Batch Job : 1
    INFO: Processing the Batch Job : 2
    INFO: Processing the Batch Job : 3
    INFO: Processing the Batch Job : 4
    INFO: Finished the Batch Job
As you can see, the service created a Timer and sent a response back to the browser straight away. After one minute, the timer expired and started the Batch Job.
Last but not least, if you are using Maven to build your project, you need to add the Maven dependency for JSR-352 to your Java EE project:
    <dependency>
        <groupId>javax.batch</groupId>
        <artifactId>javax.batch-api</artifactId>
        <version>1.0</version>
        <scope>provided</scope>
    </dependency>
If you are using GlassFish as your application server, you can use the admin console to view the Batch Jobs executing in the server. You can find them in the "Monitoring Data" area, under the "Batch" tab of the Monitoring Data screen. At the moment you can only view the status of the jobs running in the system, but hopefully later versions will offer options to interact with the jobs.
You can also use the methods provided by the Batch Runtime API to interact with the batch runtime and build your own admin interface for the Batch Jobs if you need to. It is not as difficult as it sounds really. If I manage to find some time, I may write another post on how to create a management interface for the Batch Runtime. :)