error:Invalid multibyte sequence #438

Closed
lixinyao opened this issue Jun 16, 2016 · 5 comments

Comments

@lixinyao

Hi,
I am a Chinese user of readr. Can you help me? Thanks!

> mydata1 = read_csv("/Users/lixinyao/Desktop/20160502.InputContract.SE.ByWeek.csv",
+ locale = locale(encoding="gbk"))
Error: Invalid multibyte sequence

But

> mydata = read.csv("/Users/lixinyao/Desktop/20160502.InputContract.SE.ByWeek.csv",
+                   header = TRUE,
+                   sep = ",",
+                   stringsAsFactors = FALSE,
+                   fileEncoding = "gbk")
Warning messages:
1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  invalid input found on input connection '/Users/lixinyao/Desktop/20160502.InputContract.SE.ByWeek.csv'
2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  EOF within quoted string

The file contains some special symbols. What should I do? Thank you very much!

@hadley
Member

hadley commented Jun 16, 2016

Can you please make a minimal version of that file available online somewhere?

@lixinyao
Author

lixinyao commented Jun 17, 2016

Thank you very much for your reply. I just sent an email. Thanks!

@hadley
Member

hadley commented Jun 17, 2016

It doesn't appear to be in GBK encoding but in GB18030.

This code worked for me:

read_csv("~/Desktop/20160502.InputContract.SE.ByWeek.csv", locale = locale(encoding = "GB18030"))
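
For anyone who hits the same error and is unsure which encoding a file uses, one option is to try a few candidates and see which ones readr can read. The following is only a rough sketch under stated assumptions: the sample file, the `candidates` vector, and the `ok` variable are hypothetical and made up for illustration, and this is not necessarily how the encoding was identified above. The sample text here is valid in both encodings, so it only demonstrates the approach; a real file containing characters outside GBK would be expected to fail for that candidate.

```r
library(readr)

# Hypothetical sample file: a small CSV written out as GB18030 bytes.
path <- tempfile(fileext = ".csv")
txt <- "name,value\n\u4e2d\u6587,1\n"
writeBin(iconv(txt, from = "UTF-8", to = "GB18030", toRaw = TRUE)[[1]], path)

# Candidate encodings to try (an assumption; extend as needed).
candidates <- c("GBK", "GB18030")

# For each candidate, record whether read_csv() finishes without an error.
ok <- vapply(candidates, function(enc) {
  tryCatch({
    read_csv(path,
             locale = locale(encoding = enc),
             col_types = cols(.default = col_character()))
    TRUE
  }, error = function(e) FALSE)
}, logical(1))

print(ok)
```

A candidate that reads without error is not guaranteed to be the right one, so it is still worth checking the resulting text by eye. `readr::guess_encoding(path)` can also suggest likely encodings, although its guesses are heuristic.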

BTW when I print that dataset out, the column names don't seem to align with the columns. I think that's probably a bug in tibble. Is it ok to share the column names in another bug report? (but not the values of the columns)

@hadley hadley closed this as completed Jun 17, 2016
@hadley
Member

hadley commented Jun 17, 2016

Oh, I just google translated them and they seem pretty innocuous - I think I can probably share "completion date" and "contract entry date"!

@hadley
Member

hadley commented Jun 17, 2016

Filed issue here tidyverse/tibble#100

@lock lock bot locked and limited conversation to collaborators Sep 25, 2018