Missing documentation about MultiResourceItemWriter not creating empty files when no data goes through delegates #4645

Closed
javaHelper opened this issue Aug 14, 2024 · 4 comments
Labels
for: backport-to-5.1.x Issues that will be back-ported to the 5.1.x line in: documentation type: enhancement
javaHelper commented Aug 14, 2024

Bug description
Spring Batch does not create an empty output file when no data flows through a Classifier branch whose writer is a MultiResourceItemWriter.

Environment
Spring Boot v2.7.1, Java version - 11

Steps to reproduce
Here is the code

Employee.java

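package com.example;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

@AllArgsConstructor
@NoArgsConstructor
@Data
@Builder
public class Employee {
    private String empId;
    private String firstName;
    private String lastName;
    private String role;

    // Comma-separated form, written as-is by PassThroughLineAggregator.
    @Override
    public String toString() {
        return empId + "," + firstName + "," + lastName + "," + role;
    }
}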

EmployeeClassifier.java

package com.example;

import org.springframework.batch.item.ItemWriter;
import org.springframework.classify.Classifier;

import lombok.Setter;


@Setter
public class EmployeeClassifier implements Classifier<Employee, ItemWriter<? super Employee>> {
    private static final long serialVersionUID = 1L;
    private ItemWriter<Employee> javaDeveloperFileItemWriter;
    private ItemWriter<Employee> pythonDeveloperFileItemWriter;
    private ItemWriter<Employee> cloudDeveloperFileItemWriter;
    
    public EmployeeClassifier() {
    	
    }

    public EmployeeClassifier(ItemWriter<Employee> javaDeveloperFileItemWriter,
                              ItemWriter<Employee> pythonDeveloperFileItemWriter,
                              ItemWriter<Employee> cloudDeveloperFileItemWriter) {
        this.javaDeveloperFileItemWriter = javaDeveloperFileItemWriter;
        this.pythonDeveloperFileItemWriter = pythonDeveloperFileItemWriter;
        this.cloudDeveloperFileItemWriter = cloudDeveloperFileItemWriter;
    }

    @Override
    public ItemWriter<? super Employee> classify(Employee employee) {
        // Constant-first equals() avoids a NullPointerException when role is null.
        if ("Java Developer".equals(employee.getRole())) {
            return javaDeveloperFileItemWriter;
        }
        else if ("Python Developer".equals(employee.getRole())) {
            return pythonDeveloperFileItemWriter;
        }
        // Every other role falls through to the cloud developer writer.
        return cloudDeveloperFileItemWriter;
    }
}

EmployeeFieldSetMapper.java

package com.example;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class EmployeeFieldSetMapper implements FieldSetMapper<Employee> {
    @Override
    public Employee mapFieldSet(FieldSet fieldSet) throws BindException {
        return Employee.builder()
                .empId(fieldSet.readRawString("empId"))
                .firstName(fieldSet.readRawString("firstName"))
                .lastName(fieldSet.readRawString("lastName"))
                .role(fieldSet.readRawString("role"))
                .build();
    }
}

MyJobConfig.java

package com.example;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
import org.springframework.batch.item.file.builder.MultiResourceItemWriterBuilder;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.PassThroughLineAggregator;
import org.springframework.batch.item.support.ClassifierCompositeItemWriter;
import org.springframework.batch.item.support.builder.ClassifierCompositeItemWriterBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.classify.Classifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class MyJobConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public FlatFileItemReader<Employee> itemReader() {
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setNames("empId", "firstName", "lastName", "role");

        DefaultLineMapper<Employee> employeeLineMapper = new DefaultLineMapper<>();
        employeeLineMapper.setLineTokenizer(tokenizer);
        employeeLineMapper.setFieldSetMapper(new EmployeeFieldSetMapper());
        employeeLineMapper.afterPropertiesSet();

        return new FlatFileItemReaderBuilder<Employee>()
                .name("flatFileReader")
                .linesToSkip(1)
                .resource(new ClassPathResource("employee.csv"))
                .lineMapper(employeeLineMapper)
                .build();
    }

    @Bean
    public ClassifierCompositeItemWriter<Employee> classifierCompositeItemWriter() throws Exception {
        Classifier<Employee, ItemWriter<? super Employee>> classifier =
                new EmployeeClassifier(javaDeveloperItemWriter(), pythonDeveloperItemWriter(), cloudDeveloperItemWriter());
        return new ClassifierCompositeItemWriterBuilder<Employee>()
                .classifier(classifier)
                .build();
    }

    @Bean
    public ItemWriter<Employee> javaDeveloperItemWriter() {
        FlatFileItemWriter<Employee> itemWriter = new FlatFileItemWriterBuilder<Employee>()
                .lineAggregator(new PassThroughLineAggregator<>())
                .name("itemsWriter")
                .build();

        return new MultiResourceItemWriterBuilder<Employee>()
                .name("javaDeveloperItemWriter")
                .delegate(itemWriter)
                .resource(new FileSystemResource("javaDeveloper-employee.csv"))
                .itemCountLimitPerResource(2)
                .resourceSuffixCreator(index -> "-" + index)
                .build();
    }

    @Bean
    public ItemWriter<Employee> pythonDeveloperItemWriter() {
        FlatFileItemWriter<Employee> itemWriter = new FlatFileItemWriterBuilder<Employee>()
                .lineAggregator(new PassThroughLineAggregator<>())
                .name("itemsWriter")
                .build();

        return new MultiResourceItemWriterBuilder<Employee>()
                .name("pythonDeveloperItemWriter")
                .delegate(itemWriter)
                .resource(new FileSystemResource("pythonDeveloper-employee.csv"))
                .itemCountLimitPerResource(2)
                .resourceSuffixCreator(index -> "-" + index)
                .build();
    }

    @Bean
    public ItemWriter<Employee> cloudDeveloperItemWriter() {
        FlatFileItemWriter<Employee> itemWriter = new FlatFileItemWriterBuilder<Employee>()
                .lineAggregator(new PassThroughLineAggregator<>())
                .name("itemsWriter")
                .build();

        return new MultiResourceItemWriterBuilder<Employee>()
                .name("cloudDeveloperItemWriter")
                .delegate(itemWriter)
                .resource(new FileSystemResource("cloudDeveloper-employee.csv"))
                .itemCountLimitPerResource(2)
                .resourceSuffixCreator(index -> "-" + index)
                .build();
    }

    @Bean
    public Step step() throws Exception {
        return stepBuilderFactory.get("step")
                .<Employee, Employee>chunk(1)
                .reader(itemReader())
                .writer(classifierCompositeItemWriter())
                .build();
    }

    @Bean
    public Job job() throws Exception {
        return jobBuilderFactory.get("job")
                .start(step())
                .build();
    }
}

employee.csv

empId,firstName,lastName,role
1,John ,Doe,Java Developer
2,Jane ,Doe,Python Developer

SpringBatchMultipleFilesWithCompositeApplication.java

package com.example;

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

@EnableBatchProcessing
@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})
public class SpringBatchMultipleFilesWithCompositeApplication {

	public static void main(String[] args) {
		SpringApplication.run(SpringBatchMultipleFilesWithCompositeApplication.class, args);
	}
}

Expected behavior
Even though there is no data for the cloud developer role, I expected Spring Batch to create an empty file with headers for the cloud developer.

Or am I missing anything?

I was hoping to see a cloudDeveloper-1.csv file.

@javaHelper javaHelper added status: waiting-for-triage Issues that we did not analyse yet type: bug labels Aug 14, 2024
@fmbenhassine
Contributor

Expected behavior
Even though there is no data for the role cloud developer, expectation from Spring Batch to create an empty file with headers for the cloud developer.

This is because you are wrapping the FlatFileItemWriter in a MultiResourceItemWriter, which is designed to create an output file only when there are items to write. In your example, no items are routed to the cloud developer writer, so no file is created for that role.

This detail seems to be missing in the Javadocs of MultiResourceItemWriter, so I will plan a clarification of that in the next release.
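
The behavior described above can be illustrated without Spring Batch. The following is only a minimal sketch of the general "create the file on first write" pattern; `LazyFileWriter` is a hypothetical stand-in and not the actual MultiResourceItemWriter implementation, and the file names are illustrative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class LazyFileCreationSketch {

    /** Hypothetical stand-in for a writer that creates its file on the first write only. */
    static final class LazyFileWriter {
        private final Path target;

        LazyFileWriter(Path target) {
            this.target = target; // no file is created here
        }

        void write(List<String> lines) throws IOException {
            // CREATE makes the file on first use; APPEND keeps earlier lines.
            Files.write(target, lines, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-writer");
        Path javaFile = dir.resolve("javaDeveloper-employee.csv");
        Path cloudFile = dir.resolve("cloudDeveloper-employee.csv");

        LazyFileWriter javaWriter = new LazyFileWriter(javaFile);
        LazyFileWriter cloudWriter = new LazyFileWriter(cloudFile); // never written to

        javaWriter.write(List.of("1,John ,Doe,Java Developer"));

        System.out.println("java file exists:  " + Files.exists(javaFile));
        System.out.println("cloud file exists: " + Files.exists(cloudFile));
    }
}
```

In this sketch only the writer that receives an item ends up with a file on disk, which mirrors the situation reported in this issue: a writer that is configured but never receives an item leaves no output behind.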

@fmbenhassine fmbenhassine added in: documentation for: backport-to-5.1.x Issues that will be back-ported to the 5.1.x line type: enhancement and removed status: waiting-for-triage Issues that we did not analyse yet type: bug labels Sep 5, 2024
@fmbenhassine fmbenhassine added this to the 5.2.0 milestone Sep 5, 2024
@javaHelper
Author

OK, thanks. Please update the docs in detail to make this clear.
As a workaround, I developed a tasklet that scans the target directories (where the output files are created) and, if a particular writer did not receive any data, creates the missing file in code.
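
A workaround along these lines could look roughly like the sketch below. It is plain Java rather than the tasklet mentioned above, which is not shown in this thread; the class name `EmptyOutputFileCreator`, the method `createMissing`, and the expected file names are all assumptions made for illustration. In a real job, the same logic could be called from a tasklet step that runs after the chunk-oriented step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class EmptyOutputFileCreator {

    /**
     * Creates every expected file that does not exist yet, writing only the header line.
     * Existing files are left untouched. Returns the files that had to be created.
     */
    static List<Path> createMissing(Path outputDir, List<String> expectedFileNames, String header)
            throws IOException {
        Files.createDirectories(outputDir);
        List<Path> created = new ArrayList<>();
        for (String name : expectedFileNames) {
            Path file = outputDir.resolve(name);
            if (Files.notExists(file)) {
                Files.write(file, List.of(header));
                created.add(file);
            }
        }
        return created;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("batch-output");
        // Simulate a writer that did receive data.
        Files.write(dir.resolve("javaDeveloper-employee.csv"), List.of("1,John ,Doe,Java Developer"));

        List<Path> created = createMissing(
                dir,
                List.of("javaDeveloper-employee.csv", "cloudDeveloper-employee.csv"),
                "empId,firstName,lastName,role");

        System.out.println("Created: " + created);
    }
}
```

The helper takes the list of expected file names as input because the names actually produced depend on the configured resource and suffix creator, so the caller is the one who knows which files should exist.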

I still believe it would be nice to have an option (true/false) to create empty files when no item is available for that writer.

Would you mind providing some guidance on: https://stackoverflow.com/questions/78891040/spring-batch-issue-with-multiresourceitemwriter-and-classifiercompositeitemwrite

@javaHelper
Author

@fmbenhassine - Would you mind answering this question: https://stackoverflow.com/questions/78891040/spring-batch-issue-with-multiresourceitemwriter-and-classifiercompositeitemwrite

It's not clear whether keeping the chunk size as 0 has any impact on performance. If we use a size like 1000 or 2000, I definitely see it end up writing more records into each file, so it doesn't seem to behave well.
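
One factor that may be relevant here: as far as the MultiResourceItemWriter Javadoc describes it, new resources are created only at chunk boundaries, so a resource can hold more items than the configured limit when the chunk size is larger than that limit. Whether this accounts for what is reported above is not established in this thread. The sketch below is a plain-Java simulation of that rule under this assumption, not Spring Batch code; `itemsPerResource` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkBoundaryRolloverSketch {

    /** Simulates how many items land in each resource when rollover happens only after a whole chunk. */
    static List<Integer> itemsPerResource(int totalItems, int chunkSize, int itemCountLimitPerResource) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Integer> counts = new ArrayList<>();
        int current = 0;
        int remaining = totalItems;
        while (remaining > 0) {
            int chunk = Math.min(chunkSize, remaining);
            current += chunk;   // the whole chunk goes into the current resource
            remaining -= chunk;
            if (current >= itemCountLimitPerResource) {
                counts.add(current); // limit is checked only after the chunk is written
                current = 0;
            }
        }
        if (current > 0) {
            counts.add(current); // last, partially filled resource
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(itemsPerResource(10, 1, 2)); // prints [2, 2, 2, 2, 2]
        System.out.println(itemsPerResource(10, 4, 2)); // prints [4, 4, 2]
    }
}
```

With a chunk size of 1 the simulated limit of 2 is respected exactly, while a chunk size of 4 puts up to 4 items in a resource despite the same limit.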

@fmbenhassine
Contributor

@javaHelper I added an answer: https://stackoverflow.com/a/78968181/5019386. This seems like a bug in Spring Batch to me, but I need to validate that with an example. Please open a separate issue for that case.

Note: Please do not add comments on an issue to ask for support on a different issue. Thank you for your understanding.

@fmbenhassine fmbenhassine modified the milestones: 5.2.0, 5.2.0-RC1 Sep 16, 2024
@fmbenhassine fmbenhassine changed the title from "Classifier and MultiResourceItemWriter not creating an empty files when no data goes through that" to "MultiResourceItemWriter not creating an empty files when no data goes through that" Oct 23, 2024
@fmbenhassine fmbenhassine changed the title from "MultiResourceItemWriter not creating an empty files when no data goes through that" to "Missing documentation about MultiResourceItemWriter not creating empty files when no data goes through delegates" Oct 23, 2024