Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

utf8proc -> libmojibake #7917

Merged
merged 1 commit into from
Aug 11, 2014
Merged

utf8proc -> libmojibake #7917

merged 1 commit into from
Aug 11, 2014

Conversation

stevengj
Copy link
Member

@stevengj stevengj commented Aug 8, 2014

Closes #7656, fixes #7582.

cc: @jiahao

@jiahao jiahao added this to the 0.4 milestone Aug 8, 2014
@jiahao
Copy link
Member

jiahao commented Aug 8, 2014

I can't afford to tag a lower milestone. ;)

@JeffBezanson
Copy link
Member

I admit it feels a bit sketchy, and others might be against it, but I'd kind of like to merge this. I see it as a bug fix.

@staticfloat
Copy link
Member

I think it's a bugfix that's got an awful lot of new functionality in it. I'm -1 to merging; I'd like to just do what we can with the serialization stuff and then see if no other bugs fall out so we can release.

@jiahao
Copy link
Member

jiahao commented Aug 9, 2014

Actually, the only new functionality (to the extent that this can be considered new) is in the updated Unicode 7.0.0 code point assignments. (JuliaStrings/utf8proc#1) The character width stuff hasn't been put in yet.

@ViralBShah
Copy link
Member

I am also -1 to merging now. This could be done in 0.3.1.

@Keno
Copy link
Member

Keno commented Aug 9, 2014

Yeah, -1 from me too. I agree with Viral, let's get 0.3 out the door, then we can merge this and if it holds up we can have a minor release of 0.3 that includes this.

@JeffBezanson
Copy link
Member

Ok, I figured most people would feel that way.

@nalimilan
Copy link
Member

From a packaging POV this would be relatively painful to get in 0.3 or even in 0.3.1. I'd rather get to use libmojibake only in the development version for some time, hopefully upstream will eventually respond so that the fixes can be merged back into utf8proc.

@stevengj
Copy link
Member Author

stevengj commented Aug 9, 2014

@nalimilan, what's the packaging difficulty? libmojibake.a (just like libutf8proc.a) is linked directly into julia, so it does not need to be distributed as a separate library file.

(I don't think it's reasonable to wait for an upstream merge, from the sound of things.)

@tknopp
Copy link
Contributor

tknopp commented Aug 9, 2014

This is kind of related to the discussion on LLVM 3.3 vs 3.5 taken place in #6757. If I understand it correctly, LLVM is dynamically linked in the linux distribution packages. I don't know how this is handled with utf8proc and libuv.

@ViralBShah
Copy link
Member

We can link LLVM statically too. In general, distros do not like packages linking static versions of libraries that are available as packages. Giving upstream some more time and/or time for distro maintainers to package libmojibake is also a good reason to make this part of 0.3.1 or later.

@stevengj
Copy link
Member Author

libmojibake is a tiny library, so it's not nearly as bad as shipping our own LLVM.

If we wait for libmojibake to be merged upstream and/or packaged by distros, then we could easily be looking a 6-18 months before users see Unicode 6/7 support in Julia.

JeffBezanson added a commit that referenced this pull request Aug 11, 2014
@JeffBezanson JeffBezanson merged commit 366367e into JuliaLang:master Aug 11, 2014
@nalimilan
Copy link
Member

Woops, I thought I had replied but apparently my comment did not get posted. Here it is (now you have replied again, I must say that I agree we shouldn't wait for too long if upstream does not react):

@stevengj The problem is not technical, it's more about distro policies. To use a library in Fedora, you either need to create a new package (and get it reviewed, which takes some time, and does not really make sense for libmojibake), or to get a bundling exception granted. I've obtained temporary exceptions for libuv and Rmath, but Fedora people may start thinking Julia is not yet stable enough if I ask for another exception.

I hope upstream will acknowledge the fact that they are not maintaining utf8proc anymore and let you work on it instead. I packaged utf8proc in Fedora only to use it for Julia, I'd rather not maintain two separate packages just because upstream does not reply...

@stevengj
Copy link
Member Author

If you are the maintainer of the utf8proc package for Fedora, one option would be to merge the updated data tables from libmojibake as a bugfix patch.

@nalimilan
Copy link
Member

@stevengj Yeah, I considered that too. But then I need to carry a patch against Julia to keep using utf8proc (although very simple, it will need rebasing from time to time).

@tknopp
Copy link
Contributor

tknopp commented Aug 11, 2014

@nalimilan Well until now this PR is merged to master and not to release-0.3

@stevengj
Copy link
Member Author

Upstream of utf8proc is not an individual, but a non-profit organization, probably with bylaws that make it difficult for them to officially relinquish control over utf8proc even if they wanted to. For example, they require copyright assignments for patches to even be considered. However, they have generally been supportive of us, and have linked libmojibake as a "development fork" from the utf8proc web page.

stevengj added a commit that referenced this pull request Aug 11, 2014
@stevengj
Copy link
Member Author

@nalimilan, I just pushed a patch so that you can make USE_SYSTEM_UTF8PROC=1 if you want to continue linking to -lutf8proc.

stevengj added a commit that referenced this pull request Aug 11, 2014
@nalimilan
Copy link
Member

@stevengj Thanks, this is very helpful!

@quinnj quinnj mentioned this pull request Aug 21, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
unicode Related to unicode characters and encodings
Projects
None yet
Development

Successfully merging this pull request may close these issues.

update utf8proc -> libmojibake utf8proc doesn't support unicode 6
8 participants