Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert unicode support #1151

Merged
merged 1 commit into from
Dec 3, 2021

Conversation

garazdawi
Copy link
Contributor

The support to render unicode broke the latin1 support so until we fix both we fall back to latin1 so that we can open such files...

People really should update their files to be unicode!

For anyone wondering how to update latin1 files to unicode:

convert(Filename) ->
  {ok, Latin1} = file:read_file(Filename),
  file:write_file(Filename, unicode:characters_to_binary(re:replace(Latin1,"coding: latin-1",""), latin1, unicode)).

The support to render unicode broke the latin1 support
so until we fix both we fall back to latin1 so that we
can open such files...

People really should update their files to be unicode!
@robertoaloi robertoaloi merged commit e12f90c into erlang-ls:main Dec 3, 2021
@robertoaloi
Copy link
Member

Thanks!

@garazdawi garazdawi deleted the lukas/revert-unicode-support branch December 3, 2021 10:48
@robertoaloi
Copy link
Member

robertoaloi commented Dec 3, 2021

@garazdawi Maybe we could use unicode by default and provide a configuration option in erlang_ls.config that would allow users to specify a custom encoding. WDYT?

@garazdawi
Copy link
Contributor Author

To be honest I think that erlang-ls should only support utf-8 source files. Support for it was added in 2013 and tools such as erlfmt assume utf-8 encoding. There are more valuable things to spend time on to make work, and it will be quite a lot of work if you want to do it properly.

@robertoaloi
Copy link
Member

Long term, I agree. I was thinking short-term: we could have your original code ({encoding, unicode}) and, if a different encoding was specified in the users settings, pass that one in els_stdio.

@garazdawi
Copy link
Contributor Author

If you introduce such an option, won't there be some expectation for latin-1 to work? An option may be a good solution in the very short-term, but if we ever do any more work to get better unicode support, then the latin1 support will have to make way.

Right now it seems that both unicode and latin-1 works mostly by accident. As I mentioned erlfmt does not handle latin-1, so els_parser:parse* does not work on those files. There are also other calls (to re for example) that need to be fixed to handle unicode properly. Maybe the next thing I'll do for erlang_ls will be to add many testcases with great function names such as '어디든 가치가 있는 곳으로 가려면 지름길은 없다'.

@alvinlindstam
Copy link

Given that Erlang supports per-file encoding options, I would guess that proper support for file encodings would require per-file awareness of the encoding (based on that spec) rather than a project config.

I'm not yet familiar with the code for this project, does the file encoding have to be global?

@jesperes
Copy link

jesperes commented Dec 3, 2021

It would be sad to see latin1-support go. We want to move to utf8, but ocean-liners don't turn on a dime. :)

@garazdawi
Copy link
Contributor Author

It would be sad to see latin1-support go. We want to move to utf8, but ocean-liners don't turn on a dime. :)

If you start turning now, maybe you will be done turning by the time we've implemented proper unicode support :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants