In this example, we will learn how to read data from multiple files and load it into a target system. There is often a need to read data from multiple source systems (CSV, XML, XLSX, a relational database, or NoSQL) and load all of it into a target system (which could again be CSV, XML, XLSX, a relational database, or NoSQL).
Maven Dependency: All the required dependencies are listed below. We used the latest Spring Boot 2.2.x release available at the time of writing (the console output at the end of this example shows v2.2.2.RELEASE).
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-batch</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
    <dependency>
        <groupId>com.h2database</groupId>
        <artifactId>h2</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.18.2</version>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.batch</groupId>
        <artifactId>spring-batch-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
JobConfig: This is the main configuration class, where the batch job is configured.
MultiResourceItemReader: Reads items from multiple resources sequentially. The resource list is given by setResources(Resource[]), and the actual reading is delegated to the reader passed to setDelegate(ResourceAwareItemReaderItemStream). Input resources are ordered using setComparator(Comparator) to make sure resource ordering is preserved between job runs in a restart scenario.
FlatFileItemReader: Restartable ItemReader that reads lines from the input set with setResource(Resource). A line is defined by setRecordSeparatorPolicy(RecordSeparatorPolicy) and mapped to an item using setLineMapper(LineMapper). If an exception is thrown during line mapping, it is rethrown as a FlatFileParseException that adds information about the problematic line and its line number.
ItemWriter: Basic interface for generic output operations. A class implementing this interface is responsible for serializing objects as necessary. Generally, it is the responsibility of the implementing class to decide which technology to use for mapping and how it should be configured. The write method is responsible for making sure that any internal buffers are flushed. If a transaction is active, it will also usually be necessary to discard the output on a subsequent rollback. The resource to which the writer is sending data should normally be able to handle this itself.
Step: Batch domain interface representing the configuration of a step. As with a Job, a Step is meant to explicitly represent a developer's configuration of a step, as well as the ability to execute it.
Job: Batch domain object representing a job. A Job is an explicit abstraction representing the configuration of a job specified by a developer. Note that the restart policy is applied to the job as a whole and not to a step.
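The point about resource ordering for MultiResourceItemReader becomes visible once file names contain numbers of different lengths. The sketch below is plain Java with no Spring classes; it assumes resources are ordered by file name, and the class ResourceOrderSketch, the helper byNumericSuffix, and the file name customer10.csv are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ResourceOrderSketch {

    // Hypothetical helper: orders names such as "customer10.csv" by their numeric part.
    static Comparator<String> byNumericSuffix() {
        return Comparator.comparingInt(name -> Integer.parseInt(name.replaceAll("\\D", "")));
    }

    // Returns a sorted copy so the caller's list is left untouched.
    static List<String> sorted(List<String> names, Comparator<String> order) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("customer2.csv", "customer10.csv", "customer1.csv");
        // Plain string ordering puts customer10.csv before customer2.csv.
        System.out.println(sorted(names, Comparator.naturalOrder()));
        // Ordering by the numeric part gives 1, 2, 10.
        System.out.println(sorted(names, byNumericSuffix()));
    }
}
```

With only customer1.csv and customer2.csv, as in this example, both orderings agree. A comparator along these lines, written against Resource instead of String, is the kind of thing that could be passed to setComparator if the file count grows past nine.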
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;

import com.example.domain.Customer;
import com.example.mapper.CustomerFieldSetMapper;

@Configuration
public class JobConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Value("classpath*:/data/customer*.csv")
    private Resource[] inputFiles;

    @Bean
    public MultiResourceItemReader<Customer> multiResourceItemreader() {
        MultiResourceItemReader<Customer> reader = new MultiResourceItemReader<>();
        reader.setDelegate(customerItemReader());
        reader.setResources(inputFiles);
        return reader;
    }

    @Bean
    public FlatFileItemReader<Customer> customerItemReader() {
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setNames(new String[] { "id", "firstName", "lastName", "birthdate" });

        DefaultLineMapper<Customer> customerLineMapper = new DefaultLineMapper<>();
        customerLineMapper.setLineTokenizer(tokenizer);
        customerLineMapper.setFieldSetMapper(new CustomerFieldSetMapper());
        customerLineMapper.afterPropertiesSet();

        FlatFileItemReader<Customer> reader = new FlatFileItemReader<>();
        reader.setLineMapper(customerLineMapper);
        return reader;
    }

    @Bean
    public ItemWriter<Customer> customerItemWriter() {
        return items -> {
            for (Customer customer : items) {
                System.out.println(customer.toString());
            }
        };
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .<Customer, Customer>chunk(10)
                .reader(multiResourceItemreader())
                .writer(customerItemWriter())
                .build();
    }

    @Bean
    public Job job() {
        return jobBuilderFactory.get("job")
                .start(step1())
                .build();
    }
}
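The step above is configured with chunk(10): items are read one at a time and handed to the writer in groups of ten. The following plain-Java sketch illustrates that read-then-write-in-chunks loop under simplifying assumptions. It is not Spring Batch's implementation (there are no transactions, listeners, or restart state), and the class ChunkSketch and its process method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class ChunkSketch {

    // Reads items one by one and passes them to the writer in groups of at most chunkSize.
    // Returns the number of chunks handed to the writer.
    static <T> int process(Iterator<T> reader, int chunkSize, Consumer<List<T>> writer) {
        int chunks = 0;
        List<T> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            buffer.add(reader.next());
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer));
                buffer.clear();
                chunks++;
            }
        }
        if (!buffer.isEmpty()) {
            // Final, partially filled chunk.
            writer.accept(new ArrayList<>(buffer));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 20; i++) {
            ids.add(i);
        }
        int chunks = process(ids.iterator(), 10,
                chunk -> System.out.println("writing " + chunk.size() + " items"));
        System.out.println(chunks + " chunks");
    }
}
```

With the 20 customer records used in this example and a chunk size of 10, the writer in this sketch is called twice.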
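Customer: Domain class that holds customer details. It implements ResourceAware, so each item records the file it was read from (visible in the resource field of the console output).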
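import java.time.LocalDateTime;

import org.springframework.batch.item.ResourceAware;
import org.springframework.core.io.Resource;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@AllArgsConstructor
@NoArgsConstructor
@Builder
@Data
public class Customer implements ResourceAware {
    private Long id;
    private String firstName;
    private String lastName;
    private LocalDateTime birthdate;
    private Resource resource;
}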
CustomerFieldSetMapper: Implements the FieldSetMapper interface, which is used to map data obtained from a FieldSet into an object.
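import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

import com.example.domain.Customer;

public class CustomerFieldSetMapper implements FieldSetMapper<Customer> {

    private static final DateTimeFormatter DT_FORMAT = DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");

    @Override
    public Customer mapFieldSet(FieldSet fieldSet) throws BindException {
        // @formatter:off
        return Customer.builder()
                .id(fieldSet.readLong("id"))
                .firstName(fieldSet.readString("firstName"))
                .lastName(fieldSet.readString("lastName"))
                .birthdate(LocalDateTime.parse(fieldSet.readString("birthdate"), DT_FORMAT))
                .build();
        // @formatter:on
    }
}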
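The date pattern in the mapper is day-first, which is easy to misread when comparing the CSV values with the printed LocalDateTime values. The small standalone sketch below reuses the same pattern outside Spring; the class name BirthdateParseSketch and its parse helper are hypothetical and exist only for this illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class BirthdateParseSketch {

    // Same pattern as DT_FORMAT in CustomerFieldSetMapper: day, then month, then year.
    private static final DateTimeFormatter DT_FORMAT = DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");

    static LocalDateTime parse(String raw) {
        return LocalDateTime.parse(raw, DT_FORMAT);
    }

    public static void main(String[] args) {
        // 05-07-1985 is 5 July 1985, not 7 May 1985.
        System.out.println(parse("05-07-1985 17:10:00")); // prints 1985-07-05T17:10
        System.out.println(parse("10-10-1952 10:10:10")); // prints 1952-10-10T10:10:10
    }
}
```

LocalDateTime.toString() omits the seconds when they are zero, which is why one record in the console output appears as 1985-07-05T17:10 while the others include seconds.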
customer1.csv: Sample data to be read.
1,John,Doe,10-10-1952 10:10:10
2,Amy,Eugene,05-07-1985 17:10:00
3,Laverne,Mann,11-12-1988 10:10:10
4,Janice,Preston,19-02-1960 10:10:10
5,Pauline,Rios,29-08-1977 10:10:10
6,Perry,Burnside,10-03-1981 10:10:10
7,Todd,Kinsey,14-12-1998 10:10:10
8,Jacqueline,Hyde,20-03-1983 10:10:10
9,Rico,Hale,10-10-2000 10:10:10
10,Samuel,Lamm,11-11-1999 10:10:10
customer2.csv: Sample data to be read.
11,Robert,Coster,10-10-1972 10:10:10
12,Tamara,Soler,02-01-1978 10:10:10
13,Justin,Kramer,19-11-1951 10:10:10
14,Andrea,Law,14-10-1959 10:10:10
15,Laura,Porter,12-12-2010 10:10:10
16,Michael,Cantu,11-04-1999 10:10:10
17,Andrew,Thomas,04-05-1967 10:10:10
18,Jose,Hannah,16-09-1950 10:10:10
19,Valerie,Hilbert,13-06-1966 10:10:10
20,Patrick,Durham,12-10-1978 10:10:10
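Each line of these files is split on commas and the values are bound to the names id, firstName, lastName and birthdate before the field set mapper runs. The sketch below imitates that step in plain Java. It is a simplified stand-in and not the DelimitedLineTokenizer itself (for example, it does not handle quoted values), and the class LineTokenizeSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LineTokenizeSketch {

    // Same field names, in the same order, as configured on the tokenizer in JobConfig.
    private static final String[] NAMES = { "id", "firstName", "lastName", "birthdate" };

    // Splits one CSV line on commas and binds each value to its field name.
    static Map<String, String> tokenize(String line) {
        String[] values = line.split(",", -1);
        if (values.length != NAMES.length) {
            throw new IllegalArgumentException(
                    "Expected " + NAMES.length + " fields but found " + values.length + ": " + line);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], values[i].trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("1,John,Doe,10-10-1952 10:10:10"));
    }
}
```

The sketch rejects a line whose field count does not match the configured names. Each record therefore needs to sit on its own line, because two records run together would be seen as a single line with too many fields.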
Conclusion: We were able to read multiple files successfully and print their data to the console, as the following output shows.
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::        (v2.2.2.RELEASE)
2019-12-31 11:47:37.900 INFO 12656 --- [ main] c.example.MultipleFlatFilesApplication : Starting MultipleFlatFilesApplication on DESKTOP-NQ639DU with PID 12656 (E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes started by pc in E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles)
2019-12-31 11:47:37.905 INFO 12656 --- [ main] c.example.MultipleFlatFilesApplication : No active profile set, falling back to default profiles: default
2019-12-31 11:47:38.711 INFO 12656 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2019-12-31 11:47:38.876 INFO 12656 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
2019-12-31 11:47:38.969 INFO 12656 --- [ main] o.s.b.c.r.s.JobRepositoryFactoryBean : No database type set, using meta data indicating: H2
2019-12-31 11:47:39.064 INFO 12656 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
2019-12-31 11:47:39.165 INFO 12656 --- [ main] c.example.MultipleFlatFilesApplication : Started MultipleFlatFilesApplication in 1.62 seconds (JVM running for 2.655)
2019-12-31 11:47:39.167 INFO 12656 --- [ main] o.s.b.a.b.JobLauncherCommandLineRunner : Running default command line with: [--spring.output.ansi.enabled=always]
2019-12-31 11:47:39.238 INFO 12656 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=job]] launched with the following parameters: [{-spring.output.ansi.enabled=always}]
2019-12-31 11:47:39.291 INFO 12656 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step1]
Customer(id=1, firstName=John, lastName=Doe, birthdate=1952-10-10T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=2, firstName=Amy, lastName=Eugene, birthdate=1985-07-05T17:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=3, firstName=Laverne, lastName=Mann, birthdate=1988-12-11T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=4, firstName=Janice, lastName=Preston, birthdate=1960-02-19T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=5, firstName=Pauline, lastName=Rios, birthdate=1977-08-29T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=6, firstName=Perry, lastName=Burnside, birthdate=1981-03-10T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=7, firstName=Todd, lastName=Kinsey, birthdate=1998-12-14T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=8, firstName=Jacqueline, lastName=Hyde, birthdate=1983-03-20T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=9, firstName=Rico, lastName=Hale, birthdate=2000-10-10T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=10, firstName=Samuel, lastName=Lamm, birthdate=1999-11-11T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer1.csv])
Customer(id=11, firstName=Robert, lastName=Coster, birthdate=1972-10-10T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=12, firstName=Tamara, lastName=Soler, birthdate=1978-01-02T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=13, firstName=Justin, lastName=Kramer, birthdate=1951-11-19T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=14, firstName=Andrea, lastName=Law, birthdate=1959-10-14T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=15, firstName=Laura, lastName=Porter, birthdate=2010-12-12T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=16, firstName=Michael, lastName=Cantu, birthdate=1999-04-11T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=17, firstName=Andrew, lastName=Thomas, birthdate=1967-05-04T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=18, firstName=Jose, lastName=Hannah, birthdate=1950-09-16T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=19, firstName=Valerie, lastName=Hilbert, birthdate=1966-06-13T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
Customer(id=20, firstName=Patrick, lastName=Durham, birthdate=1978-10-12T10:10:10, resource=file [E:\Spring_Batch\spring-batch-latest\Minella\Spring-Batch-by-Michael-Minella\multipleFlatFiles\target\classes\data\customer2.csv])
2019-12-31 11:47:39.477 INFO 12656 --- [ main] o.s.batch.core.step.AbstractStep : Step: [step1] executed in 186ms
2019-12-31 11:47:39.483 INFO 12656 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=job]] completed with the following parameters: [{-spring.output.ansi.enabled=always}] and the following status: [COMPLETED] in 214ms
2019-12-31 11:47:39.488 INFO 12656 --- [extShutdownHook] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...
2019-12-31 11:47:39.490 INFO 12656 --- [extShutdownHook] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.