Add support for quotes in DelimitedLineAggregator #1139

Closed
spring-projects-issues opened this issue Jan 4, 2016 · 3 comments

Comments

@spring-projects-issues
Collaborator

Doug Breaux opened BATCH-2463 and commented

I can't imagine why org.springframework.batch.item.file.transform.DelimitedLineAggregator doesn't already support a quote character like the delimited tokenizer does. I can see that the StringUtils class being used doesn't provide this capability either, so I'm guessing it just wasn't trivially easy to add, but it seems like a necessary capability for working with delimited files.

I know I need it and am going to have to implement it some other way instead. (As far as I can tell, I can't simply extend an existing Batch class to add a quote delimiter around an individual field value as needed.)
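For illustration, the requested behavior (wrap each field in a quote character, then join with the delimiter) might look roughly like the sketch below. `QuotingLineJoiner` is a hypothetical stand-alone helper written only for this example; it is not a Spring Batch class, and it does not escape quote characters embedded in the values.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Spring Batch.
class QuotingLineJoiner {

    private final String delimiter;
    private final String quoteCharacter;

    QuotingLineJoiner(String delimiter, String quoteCharacter) {
        this.delimiter = delimiter;
        this.quoteCharacter = quoteCharacter;
    }

    // Wraps every field in the quote character and joins the results with the delimiter.
    // Null fields are written as an empty quoted value. Embedded quotes are NOT escaped.
    String join(Object[] fields) {
        return Arrays.stream(fields)
                .map(f -> quoteCharacter + (f == null ? "" : f.toString()) + quoteCharacter)
                .collect(Collectors.joining(delimiter));
    }

    public static void main(String[] args) {
        QuotingLineJoiner joiner = new QuotingLineJoiner(",", "\"");
        System.out.println(joiner.join(new Object[] {"Smith, John", 42, null}));
    }
}
```

Quoting every field keeps the logic trivial, at the cost of also quoting numeric values that do not need it.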


2 votes, 9 watchers

@spring-projects-issues
Collaborator Author

Kiichi Kuramoto commented

I agree with this issue, too.

The CSV format is not standardized by an official specification,
but I think RFC 4180 is the de facto standard in general.

RFC 4180 (Common Format and MIME Type for CSV Files) states the following (items 5 and 6):

  • Each field may or may not be enclosed in double quotes.
  • Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.

In addition, some libraries support this, for example:

Handling quoted entries with embedded carriage returns (i.e. entries that span multiple lines).

Another reason is that FlatFileItemReader already supports reading delimited files whose fields are enclosed in double quotes.
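The RFC 4180 rules quoted above can be sketched in plain Java as follows. This is a minimal illustration, not Spring Batch code: the class and method names are made up for the example, a comma is assumed as the delimiter, and embedded double quotes are escaped by doubling them, which is what item 7 of the RFC prescribes.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical names, for illustration only.
class Rfc4180Sketch {

    // Encloses a field in double quotes only when it contains a comma,
    // a double quote, CR or LF. Embedded double quotes are doubled.
    static String escapeField(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\r") || field.contains("\n");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins the escaped fields with a comma to form one record.
    static String toLine(String... fields) {
        return Arrays.stream(fields)
                .map(Rfc4180Sketch::escapeField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(toLine("plain", "with,comma", "say \"hi\""));
    }
}
```

Quoting only when needed keeps simple values unquoted, so existing consumers that do not expect quotes are affected as little as possible.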

@spring-projects-issues
Collaborator Author

Doug Breaux commented

Any update?

@desprez

desprez commented Mar 10, 2021

If it helps, I wrote this class based on BeanWrapperFieldExtractor:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.springframework.batch.item.file.transform.FieldExtractor;
import org.springframework.beans.BeanWrapper;
import org.springframework.beans.BeanWrapperImpl;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.util.Assert;

public class CSVQuotingBeanWrapperFieldExtractor<T> implements FieldExtractor<T>, InitializingBean {

	private String[] names;

	/**
	 * @param names field names to be extracted by the {@link #extract(Object)}
	 *              method.
	 */
	public void setNames(final String[] names) {
		Assert.notNull(names, "Names must be non-null");
		this.names = Arrays.asList(names).toArray(new String[names.length]);
	}

	/**
	 * @see org.springframework.batch.item.file.transform.FieldExtractor#extract(java.lang.Object)
	 */
	@Override
	public Object[] extract(final T item) {
		final List<Object> values = new ArrayList<>();

		final BeanWrapper bw = new BeanWrapperImpl(item);
		for (final String propertyName : this.names) {
			// Only properties whose declared type can hold a String are candidates for quoting.
			if (bw.getPropertyType(propertyName).isAssignableFrom(String.class)) {
				values.add(doublequoteIfString(bw.getPropertyValue(propertyName)));
			} else {
				values.add(bw.getPropertyValue(propertyName));
			}
		}
		return values.toArray();
	}

	@Override
	public void afterPropertiesSet() {
		Assert.notNull(names, "The 'names' property must be set.");
	}

	// Wraps the value in double quotes; embedded double quotes are doubled
	// so the output stays parseable (RFC 4180 style escaping).
	private String quote(final String str) {
		return str != null ? "\"" + str.replace("\"", "\"\"") + "\"" : null;
	}

	private Object doublequoteIfString(final Object obj) {
		return obj instanceof String ? quote((String) obj) : obj;
	}
}

cppwfs added a commit to cppwfs/spring-batch that referenced this issue May 13, 2021
resolves spring-projects#1139

We may need to discuss escaping the quotes embedded in the elements.
I chose the triple-quote method to resolve this. For example, fo"o would be replaced with "fo"""o"
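To make the escaping question concrete, here is a small sketch of one reading of the commit message, where each embedded quote becomes three quotes before the field is wrapped, next to the doubling that RFC 4180 describes. The class and method names are hypothetical and this is not the code from the referenced commit.

```java
// Hypothetical names, for illustration only; not taken from the referenced commit.
class EmbeddedQuoteEscaping {

    // One reading of the commit message: every embedded quote is replaced
    // by three quotes, then the whole field is wrapped in quotes.
    static String tripleQuote(String field) {
        return "\"" + field.replace("\"", "\"\"\"") + "\"";
    }

    // RFC 4180 style, for comparison: every embedded quote is doubled.
    static String doubleQuote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(tripleQuote("fo\"o")); // prints "fo"""o"
        System.out.println(doubleQuote("fo\"o")); // prints "fo""o"
    }
}
```

The two strategies produce different files, so whichever one the writer uses, the reader side has to apply the matching unescaping rule.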
@fmbenhassine fmbenhassine added has: votes Issues that have votes and removed status: waiting-for-triage Issues that we did not analyse yet labels Sep 15, 2023
@fmbenhassine fmbenhassine added this to the 5.1.0-M3 milestone Sep 15, 2023
@fmbenhassine fmbenhassine changed the title DelimitedLineAggregator support string quote characters [BATCH-2463] Add support for quotes in DelimitedLineAggregator Sep 15, 2023

3 participants