In our project we need to parse a CSV file, but the provided sample is a little weird: every header item and every value is wrapped in double quotes. The following simple example shows the problem I encountered.
"Date","Name",...
"25 Mar","Test",...
I created a
I have tried the following two approaches to read the CSV file.
DataFrame
    .readDelim(
        reader = BufferedReader(InputStreamReader(ByteArrayInputStream(data))),
        format = DEFAULT.format.withQuote('"')
    )
or
DataFrame
    .readCSV(ByteArrayInputStream(data))
Both attempts ended with an exception like this:
Can not find column `Date` in DataFrame
Replies: 1 comment 5 replies
I found this very weird; in fact, the parsing itself is OK. I then tried to use the following fragment to convert the result to a data class.
DataFrame
    .readCSV(ByteArrayInputStream(data))
    .convertTo<MyData>{}
The date field (annotated with a
I have tried adding a custom conversion using the following two approaches. But the column name
DataFrame
    .readCSV(ByteArrayInputStream(data))
    .convert { "Date"<String>() }.with { MonthDay.parse(....) } // 1. this still caused the exception `Can not find column `Date` in DataFrame`.
    .convertTo<MyData> {
        parser { MonthDay.parse(....) } // 2. this parser has no effect; every date value in the resulting data classes is null
    }
Could it be that your original CSV file has some invisible characters in the header? readCSV reads everything as is, so the problem must come either from the CSV itself or from a bug in CSV parsing. You can work around it by providing a list of column names: DataFrame.readCSV(..., header = listOf("name", "date"))
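One way to check the "invisible characters" theory is to print the first line of the raw bytes with every non-printable character escaped. The sketch below uses only the Kotlin standard library. The helpers `describeHeader` and `stripUtf8Bom` are hypothetical names, not part of Kotlin DataFrame, and the UTF-8 byte order mark is only an assumed culprit: nothing in this thread confirms that it is what the original file contains.

```kotlin
// Hypothetical helpers for diagnosing a header; not part of Kotlin DataFrame.

/** Returns the first line of [data], with every character outside printable ASCII shown as \uXXXX. */
fun describeHeader(data: ByteArray): String {
    val firstLine = data.toString(Charsets.UTF_8).lineSequence().first()
    return firstLine.map { ch ->
        if (ch in ' '..'~') ch.toString() else "\\u%04X".format(ch.code)
    }.joinToString("")
}

/** Drops a leading UTF-8 byte order mark (bytes EF BB BF) if present; otherwise returns [data] unchanged. */
fun stripUtf8Bom(data: ByteArray): ByteArray {
    val hasBom = data.size >= 3 &&
        data[0] == 0xEF.toByte() &&
        data[1] == 0xBB.toByte() &&
        data[2] == 0xBF.toByte()
    return if (hasBom) data.copyOfRange(3, data.size) else data
}

fun main() {
    // Assumed sample: the two lines from the question, preceded by a UTF-8 BOM.
    val bom = byteArrayOf(0xEF.toByte(), 0xBB.toByte(), 0xBF.toByte())
    val data = bom + "\"Date\",\"Name\"\n\"25 Mar\",\"Test\"\n".toByteArray(Charsets.UTF_8)

    println(describeHeader(data))               // prints \uFEFF"Date","Name"
    println(describeHeader(stripUtf8Bom(data))) // prints "Date","Name"
}
```

If the escaped header shows something like `\uFEFF` in front of `"Date"`, the first column name may not be exactly `Date`, which would be consistent with the `Can not find column` error. In that case the cleaned bytes could be passed on with `DataFrame.readCSV(ByteArrayInputStream(stripUtf8Bom(data)))`.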