Handle ANSI color codes #140

Closed
mgeisler opened this issue May 22, 2018 · 7 comments · Fixed by #179

Comments

@mgeisler
Owner

Textwrap currently doesn't know about ANSI color codes for colored terminal output. It simply treats these escape codes as any other character and this can mess up the output as shown here: clap-rs/clap#1246.

The custom width functionality of #128 could perhaps be used to solve this problem, but I feel it would be nice if this was a builtin feature (perhaps an optional feature depending on how invasive it will be to handle this).

@laanwj

laanwj commented Sep 1, 2018

Yes! This would be very useful for me too, though in my case the codes are not ANSI color codes but other game-specific formatting.

A general way to mark parts of the input text as color/formatting would be nice.

Such spans would be ignored for purposes of width estimation, formatting and breaking, but would be emitted in the output in the same place as they were in the input.
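A minimal sketch of that idea, assuming a hypothetical `Span` type (it is not part of textwrap): markup spans contribute nothing to the measured width, yet they are emitted at their original position. Counting `char`s as one column each is a simplification that ignores wide and combining characters.

```rust
/// A piece of input: either visible text or formatting to pass through.
/// `Span` is a hypothetical type for this sketch, not a textwrap API.
#[derive(Debug, Clone, PartialEq)]
enum Span {
    Text(String),
    Markup(String),
}

/// Width used for wrapping decisions: only visible text counts.
/// One column per `char` is an assumption made for brevity.
fn visible_width(spans: &[Span]) -> usize {
    spans
        .iter()
        .map(|span| match span {
            Span::Text(t) => t.chars().count(),
            Span::Markup(_) => 0,
        })
        .sum()
}

/// Emit every span in input order, so markup stays where it was.
fn render(spans: &[Span]) -> String {
    spans
        .iter()
        .map(|span| match span {
            Span::Text(s) | Span::Markup(s) => s.as_str(),
        })
        .collect()
}
```

The markup payload is an opaque string here, so it could just as well hold game-specific formatting codes as ANSI color codes.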

@mgeisler
Owner Author

mgeisler commented Sep 3, 2018

Hey @laanwj, thanks for the comment! It hadn't occurred to me that ignoring certain spans could be useful in other ways than simply ignoring the traditional color codes.

So we should be able to have a user-defined function tell us where the spans of text are and where the spans of ignored data are. I just tried to see if it would be possible to word-wrap Markdown text for output in a console, and it's fairly complicated:

extern crate pulldown_cmark;
extern crate textwrap;

use pulldown_cmark::{Event, Parser, Tag};
use std::io::Write;

#[derive(Debug)]
enum Span<'a> {
    Text(String),
    Markup(&'a [u8]),
}

const RESET: &'static [u8] = b"\x1b[m";
const BOLD: &'static [u8] = b"\x1b[1m";
const ITALIC: &'static [u8] = b"\x1b[3m";
const UNDERLINE: &'static [u8] = b"\x1b[4m";

fn main() {
    let s = r#"**Markdown** is a [lightweight *markup* language](https://en.wikipedia.org/wiki/Lightweight_markup_language) with plain text formatting syntax."#;

    let mut text = String::new();
    let mut spans = vec![];
    let mut markup_stack = vec![];

    println!("//// events:");
    for event in Parser::new(s) {
        println!("{:?}", event);
        match event {
            Event::Text(t) => {
                text.push_str(&t);
                spans.push(Span::Text(t.into_owned()));
            }
            Event::Start(Tag::Strong) => {
                markup_stack.push(BOLD);
                spans.push(Span::Markup(BOLD));
            }
            Event::Start(Tag::Emphasis) => {
                markup_stack.push(ITALIC);
                spans.push(Span::Markup(ITALIC));
            }
            Event::Start(Tag::Link(_, _)) => {
                markup_stack.push(UNDERLINE);
                spans.push(Span::Markup(UNDERLINE));
            }
            Event::End(Tag::Strong) | Event::End(Tag::Emphasis) | Event::End(Tag::Link(_, _)) => {
                markup_stack.pop();
                spans.push(Span::Markup(RESET));
                spans.extend(markup_stack.iter().map(|m| Span::Markup(m)));
            }
            _ => {}
        }
    }
    println!();

    println!("//// without markup:");
    let wrapper = textwrap::Wrapper::with_splitter(30, textwrap::NoHyphenation);
    let filled = wrapper.fill(&text);
    assert_eq!(filled.len(), text.len(), "fill should not change length");
    println!("{}", filled);
    println!();

    println!("//// with markup:");
    let mut offset = 0;
    let mut out = std::io::stdout();
    for span in &spans {
        match span {
            Span::Markup(m) => {
                out.write_all(m).expect("write failed!");
            }
            Span::Text(t) => {
                // This uses the fact that `filled` has the same
                // length as `text`, except that some ' ' have been
                // replaced with newlines '\n'. We can thus share
                // offsets between the two strings.
                print!("{}", &filled[offset..offset + t.len()]);
                offset += t.len();
            }
        }
    }
    println!();
}

This shows output like this:

[image: screenshot of the terminal output]

This kind of code only works if there is no hyphenation going on and it probably also only works if the input text has no repeated spaces.
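The assumption behind the offset sharing can be checked explicitly. The following is a small sketch with a hypothetical helper, `offsets_are_shared`, that tests whether a filled string differs from the original only by spaces turned into newlines. It returns false when hyphenation inserted a character or when the wrapper dropped a repeated space.

```rust
/// Returns true if `filled` has the same byte length as `text` and
/// differs from it only where a ' ' was replaced by a '\n'.
/// Hypothetical helper for illustration, not part of textwrap.
fn offsets_are_shared(text: &str, filled: &str) -> bool {
    text.len() == filled.len()
        && text
            .bytes()
            .zip(filled.bytes())
            .all(|(a, b)| a == b || (a == b' ' && b == b'\n'))
}
```

A check like this could replace the length-only `assert_eq!` in the program above, since equal lengths alone do not guarantee that the offsets line up.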

I'm already thinking of changing textwrap to work on a list of Word and Space tokens, and maybe that will make it easy to add a Markup token to the stream, which can then be passed through the wrapping machinery unharmed.
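A rough sketch of what such a token stream could look like, assuming a hypothetical `Token` enum and a greedy `wrap_tokens` function (neither is textwrap's actual design). Markup tokens have zero width and are copied through. A space is dropped when the line breaks at it.

```rust
/// Hypothetical token type for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Word(String),
    Space,
    Markup(String),
}

/// Greedy wrapping over tokens. Width is counted in `char`s, which is
/// a simplification. Words longer than `width` get a line of their own.
fn wrap_tokens(tokens: &[Token], width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut line = String::new();
    let mut line_width = 0; // visible columns used on the current line
    // Spaces and markup seen since the last word, held back until we
    // know whether the next word stays on this line.
    let mut pending: Vec<&Token> = Vec::new();

    for token in tokens {
        match token {
            Token::Space | Token::Markup(_) => pending.push(token),
            Token::Word(word) => {
                let spaces = pending
                    .iter()
                    .filter(|t| matches!(t, Token::Space))
                    .count();
                let word_width = word.chars().count();
                let fits = line_width == 0
                    || line_width + spaces + word_width <= width;
                if !fits {
                    lines.push(std::mem::take(&mut line));
                    line_width = 0;
                }
                for held in pending.drain(..) {
                    match held {
                        // Markup always passes through, in order.
                        Token::Markup(m) => line.push_str(m),
                        // Spaces survive only inside a line.
                        Token::Space if fits && line_width > 0 => {
                            line.push(' ');
                            line_width += 1;
                        }
                        _ => {}
                    }
                }
                line.push_str(word);
                line_width += word_width;
            }
        }
    }
    // Trailing markup (for example a final reset) is kept.
    for held in pending {
        if let Token::Markup(m) = held {
            line.push_str(m);
        }
    }
    if !line.is_empty() {
        lines.push(line);
    }
    lines
}
```

Because the width bookkeeping only looks at `Word` and `Space` tokens, this approach would not depend on the text being unchanged by wrapping, unlike the offset trick above.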

@laanwj

laanwj commented Sep 8, 2018

Oh wow, hadn't thought of that: as long as textwrap doesn't change the text, it doesn't have to know about the formatting at all, because the string is divided in the same places!

It's unfortunate that this doesn't extend to hyphenation or multiple spaces. I guess even that could be fixed up after the fact with a reconciliation pass, though a solution that works on tokens, as you say, would be more elegant.

@mgeisler
Owner Author

I hadn't thought of it before you gave me the idea :-D I think you could take the above idea pretty far, but it would be much nicer if there were some builtin support. I'll see what I can come up with...

@vmedea

vmedea commented Nov 2, 2018

It might be useful to take inspiration from how cursive handles this for its TextViews, as it implements more or less exactly the above.

They have a StyledString = SpannedString<Style>, which consists of spans: https://github.com/gyscos/Cursive/blob/master/src/utils/markup/mod.rs#L20. These are parametrized on the struct to be used as the style; cursive itself uses https://github.com/gyscos/Cursive/blob/master/src/theme/style.rs#L8
e.g.

    let mut styled = StyledString::plain("Isn't ");
    styled.append(StyledString::styled("that ", Color::Dark(BaseColor::Red)));
    styled.append(StyledString::styled(
        "cool?",
        Style::from(Color::Light(BaseColor::Blue)).combine(Effect::Bold),
    ));

Their word wrapper, which operates on these spans, is here: https://github.com/gyscos/Cursive/blob/master/src/utils/lines/spans/lines_iterator.rs#L13

zwilias added a commit to zwilias/elm-json that referenced this issue Apr 25, 2019
- add bolding for packages so they jump out
- refer to "root" as "this project"
- special case unknown packages

Bolding currently means the linewrapping is messed up. Hopefully
mgeisler/textwrap#140 will get sorted out and we can
add more liberal use of visual hints without messing up the wrapping! 🤞
@pksunkara

@mgeisler Any update on this? Would like to get it fixed on clap.

mgeisler added a commit that referenced this issue Apr 15, 2020
ANSI escape sequences are typically used for colored text. The
sequences start with a so-called CSI, followed by some "parameter
bytes" before ending with a "final byte".

We now handle these escape sequences by simply skipping over the
bytes. This works well for escape sequences that change colors since
they don't take up space and since they continue to work across any
line breaks we insert.

See https://en.wikipedia.org/wiki/ANSI_escape_code for details.

Fixes: #140.
@mgeisler
Owner Author

Hi @pksunkara, thanks for reminding me. I haven't looked more into this since I wrote the Markdown example above. However, I took a look now and found that it was relatively easy to add support for this; please see #179.
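The skipping described in the commit message can be sketched roughly as follows. This is an illustration and not the code from #179: `width_without_ansi` is a hypothetical name, and counting one column per `char` is an assumption. A CSI sequence starts with ESC followed by `[`, and everything up to and including the final byte (0x40 to 0x7E) is skipped.

```rust
/// Visible width of `text`, skipping ANSI CSI escape sequences.
/// Hypothetical helper: one column per `char` is a simplification.
fn width_without_ansi(text: &str) -> usize {
    let mut width = 0;
    let mut chars = text.chars().peekable();
    while let Some(ch) = chars.next() {
        if ch == '\x1b' && chars.peek() == Some(&'[') {
            chars.next(); // consume the '[' that completes the CSI
            // Skip parameter and intermediate bytes, stopping after
            // the final byte in the range 0x40..=0x7E.
            for c in chars.by_ref() {
                if ('\x40'..='\x7e').contains(&c) {
                    break;
                }
            }
        } else {
            width += 1;
        }
    }
    width
}
```

Since the escape bytes stay in the string and only the width calculation ignores them, a color that is active at a line break simply continues on the next line.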

mgeisler added a commit that referenced this issue Apr 19, 2020