Getting Started with Spring Boot Batch Processing

July 7, 2021

Spring Batch is a lightweight, open-source framework for developing scalable batch processing applications. Batch processing is mostly used by applications that process a large quantity of data at a given time. For example, payroll systems use batch processing to send out payments to employees at a given time of the month.

Spring Batch does not include a built-in scheduling framework. It can be paired with schedulers such as Quartz or Control-M to process data at a scheduled time.

In this tutorial, we will be developing a Spring Boot application that reads data from a CSV file and stores it in an SQL database (H2 database).

Table of contents

Prerequisites

  1. Java Development kit (JDK) installed on your computer.
  2. Some knowledge of Spring Boot.

Application setup

  • On your browser, navigate to Spring Initializr.
  • Set the project name to springbatch.
  • Add lombok, spring web, h2 database, spring data jpa, and spring batch as the project dependencies.
  • Click on generate to download the generated project zip file.
  • Decompress the downloaded file and open it on your preferred IDE.

Data layer

  • Create a new package named domain in the root project package.
  • In the domain package created above, create a Java class named Customer and add the code below.
@Entity(name = "customer")
@Getter // Lombok annotation to generate getters for the fields
@Setter // Lombok annotation to generate setters for the fields
@AllArgsConstructor // Lombok annotation to generate a constructor with all of the fields in the class
@NoArgsConstructor // Lombok annotation to generate an empty constructor for the class
@EntityListeners(AuditingEntityListener.class)
public class Customer {
    @Id // Sets the id field as the primary key in the database table
    @Column(name = "id") // Sets the column name for the id property
    @GeneratedValue(strategy = GenerationType.AUTO) // States that the id field should be autogenerated
    private Long id;

    @Column(name = "last_name")
    private String lastName;
    @Column(name = "first_name")
    private String firstName;

    // Returns firstName and lastName when an object of the class is logged
    @Override
    public String toString() {
        return "firstName: " + firstName + ", lastName: " + lastName;
    }
}

The class above has an id field for the primary key in the database, and lastName and firstName fields that we will read from the data.csv file.
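For the reader we configure later, a data.csv file is expected in the src/main/resources directory. A minimal sample is shown below (the names are hypothetical; any comma-separated firstName,lastName pairs will work):

```csv
jane,doe
john,smith
mary,public
mike,otieno
anna,wanjiru
```

Note that the file contains no header row, since the reader maps columns by position to the firstName and lastName fields.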

Repository layer

  • Create a new package named repositories in the root project package.
  • In the repositories package created above, create an interface named CustomerRepository and add the code below.
// The interface extends JpaRepository that has the CRUD operation methods
public interface CustomerRepository extends JpaRepository<Customer, Long> {
}

Processor

  • Create a new package named processor in the root project package.
  • In the processor package, create a new Java file named CustomerProcessor then add the code below.
public class CustomerProcessor implements ItemProcessor<Customer, Customer> {
    // Creates a logger
    private static final Logger logger = LoggerFactory.getLogger(CustomerProcessor.class);

    // This method transforms data from one form to another
    @Override
    public Customer process(final Customer customer) throws Exception {
        final String firstName = customer.getFirstName().toUpperCase();
        final String lastName = customer.getLastName().toUpperCase();
        // Creates a new Customer instance with the upper-cased names.
        // The constructor arguments follow the field declaration order: id, lastName, firstName
        final Customer transformedCustomer = new Customer(customer.getId(), lastName, firstName);
        // Logs the transformation to the application logs
        logger.info("Converting (" + customer + ") into (" + transformedCustomer + ")");
        return transformedCustomer;
    }
}

The class above transforms data from one form to another. The ItemProcessor<I, O> takes in the input data (I), transforms it, then returns the result as the output data (O).

In our case, we have declared the Customer entity as both the input and output, meaning our data form is maintained.
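To illustrate how ItemProcessor<I, O> can change the data form, here is a minimal, framework-free sketch. The interface is re-declared locally so the snippet compiles on its own, and the Customer record is a simplified stand-in for our entity; in the real application you would implement org.springframework.batch.item.ItemProcessor:

```java
public class ProcessorDemo {
    // Minimal stand-in for Spring Batch's ItemProcessor interface
    interface ItemProcessor<I, O> {
        O process(I item) throws Exception;
    }

    // Simplified customer with only the two CSV fields
    record Customer(String firstName, String lastName) {}

    // A processor that changes the data form: Customer in, String out
    static class CustomerToLineProcessor implements ItemProcessor<Customer, String> {
        @Override
        public String process(Customer customer) {
            // Produce a CSV-style line instead of an entity
            return customer.firstName() + "," + customer.lastName();
        }
    }

    public static void main(String[] args) throws Exception {
        ItemProcessor<Customer, String> processor = new CustomerToLineProcessor();
        System.out.println(processor.process(new Customer("Jane", "Doe")));
        // prints "Jane,Doe"
    }
}
```

Spring Batch calls process() once per item read, so the processor never needs to know about the file or the database; it only maps one input item to one output item.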

Configuration layer

  • Create a new package named config in the root project package. This package will contain all of our configurations.
  • In the config package, create a new Java file named BatchConfiguration and add the code below.
@Configuration // Informs Spring that this class contains configurations
@EnableBatchProcessing // Enables batch processing for the application
public class BatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    @Lazy
    public CustomerRepository customerRepository;

    // Reads the data.csv file and creates an instance of the Customer entity for each record in the .csv file
    @Bean
    public FlatFileItemReader<Customer> reader() {
        return new FlatFileItemReaderBuilder<Customer>()
                .name("customerReader")
                .resource(new ClassPathResource("data.csv"))
                .delimited()
                .names(new String[]{"firstName", "lastName"})
                .fieldSetMapper(new BeanWrapperFieldSetMapper<>() {{
                    setTargetType(Customer.class);
                }})
                .build();
    }

    // Creates the writer, configuring the repository and the method that will be used to save the data into the database
    @Bean
    public RepositoryItemWriter<Customer> writer() {
        RepositoryItemWriter<Customer> writer = new RepositoryItemWriter<>();
        writer.setRepository(customerRepository);
        writer.setMethodName("save");
        return writer;
    }

    // Creates an instance of CustomerProcessor that converts one data form to another. In our case, the data form is maintained
    @Bean
    public CustomerProcessor processor() {
        return new CustomerProcessor();
    }

    // Batch jobs are built from steps. A step contains the reader, the processor, and the writer
    @Bean
    public Step step1(ItemReader<Customer> itemReader, ItemWriter<Customer> itemWriter)
            throws Exception {
        return this.stepBuilderFactory.get("step1")
                .<Customer, Customer>chunk(5)
                .reader(itemReader)
                .processor(processor())
                .writer(itemWriter)
                .build();
    }

    // Builds the job that saves the data from the .csv file into the database
    @Bean
    public Job customerUpdateJob(JobCompletionNotificationListener listener, Step step1)
            throws Exception {
        return this.jobBuilderFactory.get("customerUpdateJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener)
                .start(step1)
                .build();
    }
}
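The chunk(5) call above means items are read and processed one at a time, but written in groups of five. The loop below is a framework-free sketch of this read-process-write cycle (simplified; the real Step also handles transactions and restartability):

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkDemo {
    // Reads items one at a time, processes each, and "writes" them in chunks
    static List<List<String>> runChunks(List<String> source, int chunkSize) {
        List<String> chunk = new ArrayList<>();
        List<List<String>> writes = new ArrayList<>();
        for (String item : source) {               // reader: one item at a time
            String processed = item.toUpperCase(); // processor: transform the item
            chunk.add(processed);
            if (chunk.size() == chunkSize) {       // writer: flush a full chunk
                writes.add(List.copyOf(chunk));
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {                    // flush the final partial chunk
            writes.add(List.copyOf(chunk));
        }
        return writes;
    }

    public static void main(String[] args) {
        // Hypothetical names standing in for the CSV records
        List<String> records = List.of("jane", "john", "mary", "mike", "anna", "paul", "lucy");
        System.out.println(runChunks(records, 5));
        // 7 items with a chunk size of 5 produce two writes: 5 items, then 2
    }
}
```

Batching writes this way reduces the number of round trips to the database, which is why chunk size is one of the main tuning knobs in a batch job.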
  • In the config package, create another Java class named JobCompletionNotificationListener and add the code below.
@Component
public class JobCompletionNotificationListener extends JobExecutionListenerSupport {
    // Creates an instance of the logger
    private static final Logger log = LoggerFactory.getLogger(JobCompletionNotificationListener.class);
    private final CustomerRepository customerRepository;

    @Autowired
    public JobCompletionNotificationListener(CustomerRepository customerRepository) {
        this.customerRepository = customerRepository;
    }

    // Callback method from the Spring Batch JobExecutionListenerSupport class that is executed when the batch process completes
    @Override
    public void afterJob(JobExecution jobExecution) {
        // When the batch process completes, the customers in the database are retrieved and logged
        if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
            log.info("!!! JOB COMPLETED! verify the results");
            customerRepository.findAll()
                    .forEach(customer -> log.info("Found (" + customer + ") in the database."));
        }
    }
}

Controller layer

  • Create a new package named controllers in the root project package.
  • In the controllers package created above, create a Java class named BatchController and add the code snippet below.
@RestController
@RequestMapping(path = "/batch") // Root path
public class BatchController {
    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private Job job;

    // Accepts a GET request that invokes the batch process and returns the message "Batch Process started!!" as the response
    @GetMapping(path = "/start") // Start batch process path
    public ResponseEntity<String> startBatch() {
        // A unique "startAt" parameter ensures a new job instance is created on every request
        JobParameters parameters = new JobParametersBuilder()
                .addLong("startAt", System.currentTimeMillis())
                .toJobParameters();
        try {
            jobLauncher.run(job, parameters);
        } catch (JobExecutionAlreadyRunningException | JobRestartException
                | JobInstanceAlreadyCompleteException | JobParametersInvalidException e) {
            e.printStackTrace();
        }
        return new ResponseEntity<>("Batch Process started!!", HttpStatus.OK);
    }
}

Application configuration

In the resource directory, add the code below in the application.properties file.

# Sets the server port from where we can access our application
server.port=8080
# Disables our batch process from automatically running on application startup
spring.batch.job.enabled=false
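Optionally, you can enable the H2 web console to inspect the saved records after a run (this is a standard Spring Boot property; the console is served at /h2-console):

```properties
# Enables the H2 web console for inspecting the in-memory database (optional)
spring.h2.console.enabled=true
```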

Testing

Open Postman and send a GET request to http://localhost:8080/batch/start to start the batch process.

Postman GET request

After sending the GET request, we can see the batch process running in the application logs.

Batch process running

Conclusion

Now that you have learned how to execute batch processes, try configuring the application we have developed to use the Spring Boot scheduler to run jobs automatically at a given time, rather than starting them with an HTTP call.

You can download the complete source code here.

Happy coding!


Peer Review Contributions by: Odhiambo Paul


About the author

Okelo Violet

Violet is an undergraduate student pursuing a degree in Electrical and Electronics Engineering. Violet loves developing web applications, technical writing, and following up on UI/UX trends.

This article was contributed by a student member of Section's Engineering Education Program. Please report any errors or inaccuracies to enged@section.io.