Apostrophe becomes ' #81

Closed
X-Xadro opened this issue Sep 25, 2023 · 7 comments

Comments


X-Xadro commented Sep 25, 2023

Whenever you change the Series field to a word containing an apostrophe ( ' ), the apostrophe changes into an escaped XML entity (such as &apos;).

Luckily, as far as I can tell, this only happens in the Series field; if you do the same in the Title field, the apostrophe is picked up as it should be.

I just found out that the typographic apostrophe ( ’ ) does work as intended, just not the straight one ( ' ).

@benchen71
Owner

I can confirm that this was intentional. All fields in the OPF file follow the specifications (see https://www.dublincore.org/specifications/dublin-core/dcmes-xml/). If you view the OPF file (using the button in the Advanced Tasks panel) for an EPUB with an apostrophe in the book title, you will see that the apostrophe is encoded as an XML entity (such as &apos;) in the title as well. It's just that whatever EPUB viewing software you are using correctly parses this string and shows you the single character.
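As a rough illustration of that point (in Python, not the program's own VB code): an XML parser decodes the entity back to a plain apostrophe when it reads the element. The OPF fragment and the book title below are made up for the example, and `&apos;` is assumed as the entity form.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPF fragment; the title is invented for illustration.
opf = (
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Anna&apos;s Book</dc:title>"
    "</metadata>"
)

root = ET.fromstring(opf)

# The parser resolves &apos; while reading, so the text holds a real apostrophe.
title = root.find("{http://purl.org/dc/elements/1.1/}title").text
print(title)
```

Any reader that goes through a real XML parser gets the decoded character for free; code that slices the raw file text does not.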

However, the "series" field is not a standard part of the specifications (I'm just following how Calibre implemented series). So it is arguable that my program should not follow the specifications for encoding special characters.

I don't know what the right thing to do is in this case. At this point, I am not going to change the program unless someone can provide further guidance on the matter. I will close this issue in a month if no guidance is forthcoming.

@benchen71
Owner

Since there has been no further discussion, I am closing this issue.


crimsonidol commented May 27, 2024

I'm not entirely sure whether it's the same thing X-Xadro reported, but I noticed a similar problem.

When I edit an EPUB to add the series, any special XML character that needs escaping is properly escaped when saved into the OPF file. But when the file is reopened, the series value is not unescaped: the entity does not turn back into ' and is displayed as the literal entity text inside the Series field.
If you don't notice it, the value gets escaped a second time (the entity's leading & becomes &amp;) when you save after performing another action. For one file it looks like this for me:
[Screenshot: unescaped_apostrophe]
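The compounding described above can be reproduced with a small Python sketch. This is not the program's code: `save_field` is a hypothetical stand-in for whatever the program does on save, the series name is invented, and `&apos;` is assumed as the entity form.

```python
from xml.sax.saxutils import escape

def save_field(text: str) -> str:
    """Hypothetical save step: escape XML special characters, apostrophe included."""
    return escape(text, {"'": "&apos;"})

# First save: the apostrophe is escaped, which is correct for the file.
stored = save_field("Ender's Saga")

# Buggy reload: the field shows the stored text verbatim, with no unescaping.
shown = stored

# Second save: the '&' of the entity is itself escaped, so the damage compounds.
resaved = save_field(shown)

print(stored)
print(resaved)
```

Each further open-and-save cycle would add another `amp;`, which is why the problem gets worse the longer it goes unnoticed.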

> However, the "series" field is not a standard part of the specifications (I'm just following how Calibre implemented series). So it is arguable that my program should not follow the specifications for encoding special characters.

The way you do it with the element belongs-to-collection is part of the standard, so everything's fine the way you write it to the file:
https://www.w3.org/TR/epub-33/#sec-belongs-to-collection
w3c/epub-specs#1356

@benchen71
Owner

Hmm, I might need to check this out. Maybe the best answer is simply not to use the OPF specification for that field.

benchen71 reopened this May 27, 2024

@crimsonidol

I think there's a little misunderstanding. Using that field is fine; what's missing is that, when the field is read, its value should be passed through XMLInput:

TextBox15.Text = Mid(metadatafile, seriestitlepos, endpos - seriestitlepos)

For calibre:series and other fields (like dc:author) the assigned value looks like this:

TextBox15.Text = XMLInput(Mid(metadatafile, startpos + lenheader, endpos - startpos - lenheader))
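The idea behind that fix, sketched in Python rather than the program's VB: if every read decodes exactly what every write encodes, the value stays stable across any number of open-and-save cycles. `write_field` and `read_field` are hypothetical names; the thread does not show how `XMLInput` is actually implemented, so `read_field` only plays the same role.

```python
from xml.sax.saxutils import escape, unescape

def write_field(text: str) -> str:
    """Hypothetical save step: escape &, <, > and the apostrophe."""
    return escape(text, {"'": "&apos;"})

def read_field(stored: str) -> str:
    """Hypothetical load step: decode the same entities that write_field produces."""
    return unescape(stored, {"&apos;": "'"})

original = "Ender's Saga & More"
value = original

# Simulate opening and saving the file several times.
for _ in range(3):
    on_disk = write_field(value)
    value = read_field(on_disk)

print(on_disk)
print(value)
```

The on-disk form stays singly escaped and the field text never changes, which is the behavior the other fields already have.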

@benchen71
Owner

Yes, it looks like you've found a bug! I'll try and get a new version out soon...

@benchen71
Owner

Hopefully fixed in 1.9.7.
