Spring Batch MultiResourceItemReader example

In this tutorial, we will show you how to read items from multiple resources (multiple CSV files) and write them into a single CSV file.

Tools and libraries used

  1. Maven 3
  2. Eclipse 4.2
  3. JDK 1.6
  4. Spring Core 3.2.2.RELEASE
  5. Spring Batch 2.2.0.RELEASE

P.S. In this example, three CSV files (reader) are combined into a single CSV file (writer).

1. Project Directory Structure

Review the final project structure, a standard Maven project.

(Project structure screenshot: spring-batch-MultiResourceItemReader-example)

2. Multiple CSV Files

There are 3 CSV files; later, we will use MultiResourceItemReader to read them one by one (a quick check of the file pattern follows the listings below).

csv/inputs/domain-1-3-2013.csv

1,facebook.com
2,yahoo.com
3,google.com

csv/inputs/domain-2-3-2013.csv

200,mkyong.com
300,stackoverflow.com
400,oracle.com

csv/inputs/domain-3-3-2013.csv

999,eclipse.org
888,baidu.com
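
Under the hood, Spring resolves the wildcard file:csv/inputs/domain-*.csv into a Resource array and hands it to the MultiResourceItemReader, which by default sorts the resources by file name and reads them one after another. If you want to check which files the pattern actually picks up, the following standalone snippet does the same resolution (this class is not part of the original project; the name ResourceCheck is made up for illustration).

ResourceCheck.java

package com.mkyong;

import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;

public class ResourceCheck {

  public static void main(String[] args) throws Exception {

	// resolve the same wildcard pattern that multiResourceReader uses
	Resource[] resources = new PathMatchingResourcePatternResolver()
			.getResources("file:csv/inputs/domain-*.csv");

	for (Resource resource : resources) {
		System.out.println(resource.getFilename());
	}
  }
}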

3. Spring Batch Jobs

A job to read all resources that match the pattern csv/inputs/domain-*.csv and write them into a single CSV file, domain.all.csv.

resources/spring/batch/jobs/job-read-files.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:batch="http://www.springframework.org/schema/batch"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/batch
        http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">

  <import resource="../config/context.xml" />

  <!-- declared as a prototype so that BeanWrapperFieldSetMapper gets a
       fresh Domain instance for every line it maps -->
  <bean id="domain" class="com.mkyong.Domain" scope="prototype" />

  <job id="readMultiFileJob" xmlns="http://www.springframework.org/schema/batch">
    <step id="step1">
      <tasklet>
        <chunk reader="multiResourceReader" writer="flatFileItemWriter"
               commit-interval="1" />
      </tasklet>
    </step>
  </job>

  <bean id="multiResourceReader"
        class="org.springframework.batch.item.file.MultiResourceItemReader">
    <property name="resources" value="file:csv/inputs/domain-*.csv" />
    <property name="delegate" ref="flatFileItemReader" />
  </bean>

  <bean id="flatFileItemReader"
        class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="lineMapper">
      <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
        <property name="lineTokenizer">
          <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
            <property name="names" value="id, domain" />
          </bean>
        </property>
        <property name="fieldSetMapper">
          <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
            <property name="prototypeBeanName" value="domain" />
          </bean>
        </property>
      </bean>
    </property>
  </bean>

  <bean id="flatFileItemWriter"
        class="org.springframework.batch.item.file.FlatFileItemWriter">
    <property name="resource" value="file:csv/outputs/domain.all.csv" />
    <property name="appendAllowed" value="true" />
    <property name="lineAggregator">
      <bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
        <property name="delimiter" value="," />
        <property name="fieldExtractor">
          <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
            <property name="names" value="id, domain" />
          </bean>
        </property>
      </bean>
    </property>
  </bean>

</beans>

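Both the reader's BeanWrapperFieldSetMapper and the writer's BeanWrapperFieldExtractor bind the two columns (id and domain) onto the com.mkyong.Domain bean declared above. The class itself is not listed in this article; a minimal sketch, assuming nothing beyond those two properties, could look like this:

Domain.java

package com.mkyong;

public class Domain {

  private int id;
  private String domain;

  public int getId() {
	return id;
  }

  public void setId(int id) {
	this.id = id;
  }

  public String getDomain() {
	return domain;
  }

  public void setDomain(String domain) {
	this.domain = domain;
  }

}
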
resources/spring/batch/config/context.xml

<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">

  <!-- stores the job meta-data in memory -->
  <bean id="jobRepository"
        class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
    <property name="transactionManager" ref="transactionManager" />
  </bean>

  <bean id="transactionManager"
        class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

  <bean id="jobLauncher"
        class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
  </bean>

</beans>
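
If you prefer to see the reader side in plain Java rather than XML, the configuration above corresponds roughly to the following sketch. This is illustrative only; the original tutorial uses XML configuration, and the class name ReaderWiring is made up. Note that setTargetType(Domain.class) creates a new Domain per line, which has the same effect as the prototype "domain" bean used in the XML.

ReaderWiring.java (sketch)

package com.mkyong;

import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;

public class ReaderWiring {

  public static MultiResourceItemReader<Domain> multiResourceReader() throws Exception {

	// tokenize each line into the "id" and "domain" columns
	DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
	tokenizer.setNames(new String[] { "id", "domain" });

	// map each tokenized line onto a fresh Domain instance
	BeanWrapperFieldSetMapper<Domain> fieldSetMapper = new BeanWrapperFieldSetMapper<Domain>();
	fieldSetMapper.setTargetType(Domain.class);

	DefaultLineMapper<Domain> lineMapper = new DefaultLineMapper<Domain>();
	lineMapper.setLineTokenizer(tokenizer);
	lineMapper.setFieldSetMapper(fieldSetMapper);

	// the delegate reader that handles one file at a time
	FlatFileItemReader<Domain> delegate = new FlatFileItemReader<Domain>();
	delegate.setLineMapper(lineMapper);

	// resolve the wildcard pattern into the actual files
	Resource[] resources = new PathMatchingResourcePatternResolver()
			.getResources("file:csv/inputs/domain-*.csv");

	MultiResourceItemReader<Domain> reader = new MultiResourceItemReader<Domain>();
	reader.setResources(resources);
	reader.setDelegate(delegate);
	return reader;
  }
}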

4. Run It

Create a Java class and run the batch job.

App.java

package com.mkyong;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class App {

  public static void main(String[] args) {
	App obj = new App();
	obj.run();
  }

  private void run() {

	String[] springConfig = { "spring/batch/jobs/job-read-files.xml" };

	ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);

	JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
	Job job = (Job) context.getBean("readMultiFileJob");

	try {

		JobExecution execution = jobLauncher.run(job, new JobParameters());
		System.out.println("Exit Status : " + execution.getStatus());

	} catch (Exception e) {
		e.printStackTrace();
	}

	System.out.println("Done");

  }

}

Output. The content of the three CSV files is read and combined into a single CSV file (see the note on re-running the job after the output).

csv/outputs/domain.all.csv

1,facebook.com
2,yahoo.com
3,google.com
200,mkyong.com
300,stackoverflow.com
400,oracle.com
999,eclipse.org
888,baidu.com
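
One thing to be aware of when re-running this job: jobLauncher.run(job, new JobParameters()) always launches with the same (empty) set of job parameters, so a second launch of readMultiFileJob with identical parameters (for example, within the same application run, or once a database-backed job repository replaces the in-memory one) fails with JobInstanceAlreadyCompleteException. A common workaround is to add a unique parameter, such as a timestamp, to every launch; a sketch of the change inside App.run():

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;

// give each launch a unique parameter so that a new JobInstance is
// created instead of re-using the completed one
JobParameters params = new JobParametersBuilder()
		.addLong("run.time", System.currentTimeMillis())
		.toJobParameters();

JobExecution execution = jobLauncher.run(job, params);
System.out.println("Exit Status : " + execution.getStatus());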

Download Source Code


