Epic: Editor Shortcodes #1541
Comments
This promises several new possibilities, such as controlling the spell checker via language and country codes, e.g. However, I would like to mention that angle brackets, i.e. a pseudo-XML syntax, offer the advantage that you can use the sax.ContentHandler for parsing, or possibly other XML tools out of the box to check the well-formedness with minimal effort. I'm currently working on this myself, |
Angle brackets are already used for the auto-complete feature, so they cannot be used for anything else since they are free form.
The shortcodes are parsed by RegEx in novelWriter too, so if you want to ensure compatibility, you can find them in the nwRegEx class in: https://github.com/vkbo/novelWriter/blob/main/novelwriter/constants.py
The current ones are:
FMT_EI = r"(?<![\w\\])(_)(?![\s_])(.+?)(?<![\s\\])(\1)(?!\w)"
FMT_EB = r"(?<![\w\\])([\*]{2})(?![\s\*])(.+?)(?<![\s\\])(\1)(?!\w)"
FMT_ST = r"(?<![\w\\])([~]{2})(?![\s~])(.+?)(?<![\s\\])(\1)(?!\w)"
FMT_SC = r"(?i)(?<!\\)(\[[\/\!]?(?:i|b|s|u|sup|sub)\])"
FMT_SV = r"(?<!\\)(\[(?i)(?:fn|footnote):)(.+?)(?<!\\)(\])"
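For anyone who wants to try one of these patterns outside novelWriter, here is a minimal sketch using Python's `re` module. The pattern is copied verbatim from the list above; that `re` treats it exactly like novelWriter's own regex engine is an assumption, and `find_shortcodes` is a hypothetical helper name, not part of novelWriter.

```python
import re

# Copied from the list above. Assumption: Python's `re` behaves like
# the regex engine novelWriter uses for this particular pattern.
FMT_SC = r"(?i)(?<!\\)(\[[\/\!]?(?:i|b|s|u|sup|sub)\])"


def find_shortcodes(text):
    """Return every unescaped formatting shortcode found in text, in order."""
    return [match.group(1) for match in re.finditer(FMT_SC, text)]


sample = r"A [b]bold[/b] word, an escaped \[i] and H[SUB]2[/sub]O"
print(find_shortcodes(sample))
```

The negative lookbehind skips the backslash-escaped `\[i]`, and the `(?i)` flag makes `[SUB]` and `[/sub]` match regardless of case.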
|
I see. In most cases, I translated shortcodes with replacement lists (*). The only pitfall, e.g. when converting into ODT format, was the case where shortcode-tagged passages spanned several paragraphs (yWriter allows this).
* Just for the record, a code example:
odtReplacements = [
    ('[i]', '<text:span text:style-name="Emphasis">'),
    ('[/i]', '</text:span>'),
    ('[b]', '<text:span text:style-name="Strong_20_Emphasis">'),
    ('[/b]', '</text:span>'),
]
for yw, od in odtReplacements:
    text = text.replace(yw, od)
This is preceded by a routine that closes shortcode tags before line breaks and reopens them afterwards, like so:
#--- Process markup reaching across linebreaks.
tags = ['i', 'b']
newlines = []
lines = text.split('\n')
isOpen = {}
opening = {}
closing = {}
for tag in tags:
    isOpen[tag] = False
    opening[tag] = f'[{tag}]'
    closing[tag] = f'[/{tag}]'
for line in lines:
    for tag in tags:
        if isOpen[tag]:
            # The tag was still open at the end of the previous line: reopen it here.
            line = f'{opening[tag]}{line}'
            isOpen[tag] = False
        while line.count(opening[tag]) > line.count(closing[tag]):
            # Close the tag before the line break and remember to reopen it.
            line = f'{line}{closing[tag]}'
            isOpen[tag] = True
        while line.count(closing[tag]) > line.count(opening[tag]):
            line = f'{opening[tag]}{line}'
        # Drop empty tag pairs.
        line = line.replace(f'{opening[tag]}{closing[tag]}', '')
    newlines.append(line)
text = '\n'.join(newlines)
Perhaps not the most efficient solution, but it ensures that the generated XML code will be well-formed as far as formatting is concerned. |
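To see the effect of the line-break routine, the two snippets above can be wrapped into functions and the result checked for well-formedness with `xml.sax`, as suggested earlier in the thread. This is only a sketch: the function names, the `<fragment>` wrapper element, and the choice of one `text:p` per line are assumptions made for illustration, and the input text is assumed not to contain raw `<` or `&` characters.

```python
import xml.sax

# OpenDocument "text" namespace, declared on a hypothetical wrapper element.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

ODT_REPLACEMENTS = [
    ("[i]", '<text:span text:style-name="Emphasis">'),
    ("[/i]", "</text:span>"),
    ("[b]", '<text:span text:style-name="Strong_20_Emphasis">'),
    ("[/b]", "</text:span>"),
]


def close_tags_per_line(text, tags=("i", "b")):
    """Close open shortcodes before each line break and reopen them after it."""
    is_open = {tag: False for tag in tags}
    new_lines = []
    for line in text.split("\n"):
        for tag in tags:
            opening, closing = f"[{tag}]", f"[/{tag}]"
            if is_open[tag]:
                line = opening + line
                is_open[tag] = False
            while line.count(opening) > line.count(closing):
                line += closing
                is_open[tag] = True
            while line.count(closing) > line.count(opening):
                line = opening + line
            line = line.replace(opening + closing, "")
        new_lines.append(line)
    return "\n".join(new_lines)


def to_odt_fragment(text, fix_linebreaks=True):
    """Convert shortcodes to ODT spans, one text:p element per line."""
    if fix_linebreaks:
        text = close_tags_per_line(text)
    for yw, od in ODT_REPLACEMENTS:
        text = text.replace(yw, od)
    paragraphs = "".join(f"<text:p>{line}</text:p>" for line in text.split("\n"))
    return f'<fragment xmlns:text="{TEXT_NS}">{paragraphs}</fragment>'


def is_well_formed(xml_text):
    """Return True if xml_text parses without a SAX error."""
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), xml.sax.ContentHandler())
    except xml.sax.SAXParseException:
        return False
    return True


sample = "[i]Black\nGull[/i] console"
print(is_well_formed(to_odt_fragment(sample)))                        # with the routine
print(is_well_formed(to_odt_fragment(sample, fix_linebreaks=False)))  # without it
```

Without the routine, the italic span opened in the first paragraph is closed in the second one, so the paragraph and span elements are mismatched and the parser rejects the result.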
Just a thought when re-reading this. How do you handle nested formats? That was my own first implementation, and if I recall correctly, the Open Document Standard allows it, but LibreOffice, which I use as reference implementation, doesn't seem to. The HTML converter in novelWriter uses a simple lookup table, since HTML handles nested formatting. Because of this issue with LibreOffice, I wrote a rather complex algorithm (which I've later rewritten to a much simpler form) that detects text fragments where all characters have the same format, and uses an "edge detection" approach to check where the aggregated format changes (I use binary masks to track this) and generates a new T1 format key for ODT on each unique format. I'm pretty sure this is how LibreOffice does it internally too. This is incidentally also how the text layout in Qt is implemented, so it's actually quite easy to serialise text formats to and from a rich text implementation using Qt. I wrote one in C++ a couple of years ago that serialised a rich text document into JSON. |
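The "edge detection" idea described above can be sketched roughly as follows. This is not novelWriter's code: the bit values, the reduced set of shortcodes, and the `split_runs` name are assumptions chosen for illustration. Each opening code sets a bit, each closing code clears it, and a new fragment starts wherever the combined mask changes.

```python
import re

# Hypothetical bit assignments; novelWriter's real constants may differ.
FMT_B = 0x01  # bold
FMT_I = 0x02  # italic
FMT_S = 0x04  # strikethrough
BITS = {"b": FMT_B, "i": FMT_I, "s": FMT_S}

SHORTCODE = re.compile(r"\[(/?)(b|i|s)\]")


def split_runs(text):
    """Split text into (fragment, mask) runs with a constant format mask."""
    runs = []
    mask = 0
    pos = 0

    def emit(fragment, fragment_mask):
        if not fragment:
            return
        if runs and runs[-1][1] == fragment_mask:
            # No edge here: extend the previous run.
            runs[-1] = (runs[-1][0] + fragment, fragment_mask)
        else:
            runs.append((fragment, fragment_mask))

    for match in SHORTCODE.finditer(text):
        emit(text[pos:match.start()], mask)
        bit = BITS[match.group(2)]
        if match.group(1):
            mask &= ~bit  # a closing code clears the bit
        else:
            mask |= bit   # an opening code sets the bit
        pos = match.end()
    emit(text[pos:], mask)
    return runs


for fragment, mask in split_runs("the [i]Black [b]Gull's[/i] flight[/b] console"):
    print(f"{mask:#04x} {fragment!r}")
```

Because every fragment carries its complete mask, nested and overlapping shortcodes are handled the same way, and each fragment can be written as a single flat span without nesting.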
Yes, when combining italics and bold, LibreOffice (and OpenOffice) create new character styles in the When parsing the ODT format, I first create lookup tables from all automatic style formats, to see which ones contain bold or italics. Then I translate them into Strong or Emphasis, where This is a shortcode-formatted text created with yWriter:
The nested shortcode seemed to be no problem for my ywriter-ODT conversion, since OpenOffice accepts code like this:
Hal Spacejock was sitting at the <text:span text:style-name="Emphasis">Black
<text:span text:style-name="Strong_20_Emphasis">Gull</text:span>'s
</text:span> flight console
Update:
Hal Spacejock was sitting at the <em>Black <strong>Gull</strong>'s</em> flight console
As a side note: With my project, I use my own xml dialect with formatting tags similar to xhtml.
2nd Edit: Another update: here is an example where bold and italics shortcoded passages overlap:
After conversion to ODT, the first shortcode tag "wins" due to the unspecific xml closing tags:
Hal Spacejock was sitting at the <text:span text:style-name="Emphasis">Black
<text:span text:style-name="Strong_20_Emphasis">Gull's
</text:span> flight</text:span> console
Thus, the result looks like in the "nesting" example shown above.
Conclusion: Nesting ODT xml spans works well. Converting "overlapping" shortcode formatting |
Interesting. I'm pretty sure I tried and failed to do something similar. Could be I made another mistake, or they've updated since then. In either case, novelWriter generates styles in the same way LibreOffice does, so it works. It can potentially generate a large number, that's true. I use the Python xml module to build the element tree, which requires some quirky code when using |
Sorry, I was just updating my former comment again, adding an example for overlapping shortcode.
Do you mean the elementtree module? For parsing the ODT content.xml file, I prefer an event-based sax parser. As a side note: |
I don't parse ODT, only write it. |
Ah, I see. So you build the ODT DOM tree with elementtree? |
Yes, exactly. I don't use any third party tools or libraries. Everything is built from scratch as either a flat fodt XML or the various files needed for a zipped ODT file. I only support the formatting tags actually needed by novelWriter, so it's a subset of the Open Document standard. It's still a fair bit of code: https://github.com/vkbo/novelWriter/blob/dev/novelwriter/core/toodt.py Since each formatting tag is assigned a binary bit, I can just join them with an or operation, producing an integer number for each combination of formats. They are created on first use with an incremental novelWriter/novelwriter/core/toodt.py Lines 716 to 743 in 2480c7a
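A rough sketch of the scheme described above, with one bit per format, an OR per combination, and a style key created on first use. The constant values, the `TextStyles` class, and the `T1`, `T2`, ... key format are assumptions for illustration and are not taken from toodt.py.

```python
# Hypothetical bit assignments; the real constants in toodt.py may differ.
FMT_B = 0x01  # bold
FMT_I = 0x02  # italic
FMT_U = 0x04  # underline


class TextStyles:
    """Hand out one automatic style key per unique format combination."""

    def __init__(self):
        self._keys = {}  # format mask -> style key

    def key_for(self, mask):
        """Return the key for mask, creating it the first time it is seen."""
        if mask not in self._keys:
            self._keys[mask] = f"T{len(self._keys) + 1}"
        return self._keys[mask]


styles = TextStyles()
print(styles.key_for(FMT_B | FMT_I))  # first combination seen
print(styles.key_for(FMT_B))          # a new combination
print(styles.key_for(FMT_I | FMT_B))  # same mask as the first call, same key
```

Since the mask is a plain integer, it works directly as a dictionary key, and the order in which the formats were applied does not matter.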
|
Very nice, all these hex constants and bitmasks. When I was a student, it was said: "A real programmer writes Fortran code in any language". What is it today? ;-) |
When I first looked at the novelWriter code, I said to myself: "Java programmer". This time, I'd rather guess "C". |
Never touched the stuff!
Sure, a bit. I started on AMOS (basic for Amiga) and Visual Basic, then worked with PHP a lot, and then a fair bit of C++ and Fortran. Not a lot of C. I also did a lot of Matlab for a while. Been working with Python for years now.
C++ is so much more verbose than Python, so it takes a lot longer to write the code. I sometimes look in on the code, but while there is so much work to be done on novelWriter, I doubt I can handle another project. Maybe when I retire in 20 years! |
Looking at the issues here, with all the feature requests, I can well believe that novelWriter will become your life's work. |
I like Python. It's what I do for my day job as well. I prefer writing object oriented code, which I guess is why you suspected Java. I have given golang a go as well (no pun intended) and various other languages, but Python really is good at a lot of things. Since I have a computational physics background, it was always too slow for work I've done in the past. Only Fortran, C and C++ would do. So I'm happy that I can now work with something more straightforward. It's probably why I put in so much time on this project. A lot of the things I want to do I can achieve (at a first draft level) in a couple of hours. |
If I'm not mistaken, I saw somewhere getter and setter methods. The "Pythonic" approach would be properties with decorators. But so what? To come back to the topic: |
A lot of my code subclasses Qt objects, which are written in C++, so I tend to follow the Qt code style for consistency. That's why the code is also camelCased. Qt uses a pattern of access methods for properties and set methods for setters. I do use Python properties a lot for internal variables, though, especially in the core data classes that are not inherited from Qt C++ objects. I never particularly liked the Python setter decorator, as I sometimes have setters that take multiple related values, and I don't want to mix the two styles.
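To illustrate the mixed style described here, a small sketch with read-only Python properties combined with a Qt-style set method that takes several related values. `PageLayout` and its members are hypothetical and not taken from novelWriter.

```python
class PageLayout:
    """Hypothetical data class: properties for reading, set method for writing."""

    def __init__(self):
        self._width = 210.0
        self._height = 297.0

    @property
    def width(self):
        return self._width

    @property
    def height(self):
        return self._height

    def setSize(self, width, height):
        """Set both dimensions in one call, which a property setter cannot do."""
        if width <= 0 or height <= 0:
            raise ValueError("width and height must be positive")
        self._width = float(width)
        self._height = float(height)


layout = PageLayout()
layout.setSize(148, 210)
print(layout.width, layout.height)
```

A `@width.setter` could only ever receive a single value, so a validation that depends on both dimensions fits more naturally in one set method.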
Yes, this is the part that mimics LibreOffice. The paragraph styles work a little differently since I don't have bitmasks for those, so they do lookups based on sha256 hashes of the string representation of the data dicts. It's a quick and dirty method, but it works. I occasionally think about improving that part.
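The hash-based lookup could look roughly like this. It is a sketch under assumptions: the `ParagraphStyles` class, the `P1`, `P2`, ... keys, and sorting the items before hashing are illustrative choices, not a description of the actual implementation.

```python
import hashlib


class ParagraphStyles:
    """Look up paragraph style keys by a hash of the style's data dict."""

    def __init__(self):
        self._keys = {}  # sha256 hex digest -> style key

    @staticmethod
    def digest(style):
        # Assumption: sort the items so that insertion order does not
        # change the string representation, and therefore the hash.
        text = repr(sorted(style.items()))
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def key_for(self, style):
        """Return the key for this style dict, creating it on first use."""
        digest = self.digest(style)
        if digest not in self._keys:
            self._keys[digest] = f"P{len(self._keys) + 1}"
        return self._keys[digest]


styles = ParagraphStyles()
first = styles.key_for({"margin-top": "0.5cm", "text-align": "center"})
same = styles.key_for({"text-align": "center", "margin-top": "0.5cm"})
other = styles.key_for({"margin-top": "0.5cm", "text-align": "left"})
print(first, same, other)
```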
I've considered adding support for ODT imports. I see the task as a little daunting, because I have a lot less control over what subset of the standard I support than when I write the doc. It would still be a nice feature even if I only parse a subset of formatting.
Is that something to perhaps implement in novelWriter? A way to support adding text in a different language? I don't necessarily intend to support spell checking in multiple languages, but it may be useful to allow excluding regions from spell checking.
|
Well, it's something I often need in my writing because I don't want to populate my user dictionaries with foreign expressions or dialect. But that's why I am developing my own writing program. It has a user interface inspired by Scrivener, a data model that has its origins in yWriter, and as an editor it uses OO/LO Writer which is way better than all I can ever create myself. A plot grid is realized via ODS export and reimport. I also support a few plotting concepts that I like at DramaQueen, and there's a connection to Zim for world building. Plus synchronization with two different timeline programs. As you can see, a fundamentally different approach to that of novelWriter. I saw novelWriter starting out as a lean application with a small learning curve, and am amazed at how you are gradually building it into a full-blown word processing program. However, the concept of plain text with markup/markdown also sets certain limits. Only you can know where they lie, but if you exceed them, the original advantages turn into disadvantages. |
It certainly is a limitation if you expect full rich text capabilities. What annoys me with most rich text editors is that there are so many editing options that it becomes incredibly hard to wrangle it into what you want. Most office apps have the same problem. I found Scrivener annoying on this point too. It's why I've used LaTeX for my academic work, and really liked Wordpad back when I used Windows. I found FocusWriter for Linux that fit that niche, but it lacks the project capabilities I wanted for writing fiction. The reason novelWriter ended up as Markdown is partially that I actually wanted to just type my meta data directly into the text. For me that is a lot easier than dealing with forms and tables. That can of course be done in rich text too, but Python + Qt becomes a little sluggish for full-on rich text, so I decided not to. My initial approach was, as you also seem to suggest, that (strong) emphasis is all that's strictly needed. The challenge now is that people are requesting rich text features, so I pulled in the concept of shortcodes that I know from back when I posted on discussion forums. They are a decent extension, and easy to parse. The bonus here is that these features are not at all in the way in the editor when you don't use them. It is why I am willing to add a number of rich text features for those that do need them. You still have to prefer the typed formatting approach over the point and click approach to use novelWriter. That will not work for everyone, but based on feedback, a lot of people do prefer this. So I guess it fits a certain niche of users. Which is perfectly fine. I am considering a hybrid solution where the editor is limited rich text (like Wordpad was), but where you can populate meta data by just typing as well as apply simple formatting. This really needs to be done in C++ to be snappy and responsive. So my idea was to basically recreate novelWriter's project approach around a rich text editor. |
Yes, I understand all that. My first word processor was WordPerfect, and I loved the parallel window with the markup. Of course I wanted to have something like that too, so I made a plugin with a plain text editor that allows me to edit the xml code directly. Similar to the early HTML editors, I have menu entries and keyboard shortcuts to insert or toggle the most common format tags. Hitting Here a screenshot for inspiration. I am using the tkinter text box; probably you can do better with Qt: |
I like the multi-window approach. A limitation of novelWriter is the small project view, a single editor and a single viewer. I've considered doing something differently in my other project with a project view and independent document editors. I don't like tabs, but dockable independent windows are an option. |
Inspired by yWriter, I equipped my editor plugin with its own window manager so that I can open any number of sections from the main program. Instead of opening a section twice, an already open window is brought to the foreground and focused if required. However, I have never needed this feature in practice. In fact, I mainly use this editor to split up sections, which is another feature. |
There are a number of feature requests that would require some form of short code format to implement. This is an Epic tracking the implementation of shortcodes for extended feature support in the novelWriter editor.
The shortcode syntax has the following form:
[x]
[/x]
[x:value]
Inline commands (self-closing) will have limited use, but for footnotes defined elsewhere, they are useful. I don't propose to add a special self-closing syntax. So, it will behave like HTML not XML/XHTML.
The syntax is compatible with the syntax already added for page breaks and vertical space. These formats can be modified to conform to the rules without breaking backwards compatibility.
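As a rough illustration of how easily the three forms can be recognised, here is a minimal sketch. The regular expression, the `parse_shortcodes` name, and the `open`/`close`/`value` labels are assumptions for this example only. A real implementation would presumably accept only a fixed list of known shortcode names, so that ordinary bracketed text is left alone.

```python
import re

# Matches [x], [/x] and [x:value], unless the bracket is backslash-escaped.
SHORTCODE = re.compile(r"(?<!\\)\[(/?)([a-z]+)(?::([^\]]*))?\]", re.IGNORECASE)


def parse_shortcodes(text):
    """Return a (kind, name, value) tuple for every shortcode in text."""
    result = []
    for match in SHORTCODE.finditer(text):
        name = match.group(2).lower()
        value = match.group(3)
        if match.group(1):
            kind = "close"
        elif value is not None:
            kind = "value"
        else:
            kind = "open"
        result.append((kind, name, value))
    return result


for item in parse_shortcodes("Some [b]bold[/b] text with a note[footnote:fn1]."):
    print(item)
```

The value form has no closing counterpart here, in line with the description above of inline commands without a special self-closing syntax.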
Rationale
These are easy to implement in the syntax highlighter, and they are also very easy to parse. The syntax is also not too obscure, since short codes have been used for plain text formatting online for quite some time. Since novelWriter is a fiction writing app, they are also not frequently needed. However, a context or special menu for inserting short codes is probably a good idea, because most people don't want to memorise them.
Shortcode Features