Macro construction without `syn` #556

Plecra · 2021-11-02T16:17:58Z

Motivation
Macros are notoriously slow at compile time to include, it'd be nice to remove the tradeoff for users

Solution
The uuid macro can be implemented with const fn (Playground). This makes the decision easier for users - just move the call to a const context.

If we wish, we can also have the uuid macro to force the parsing to be const.

The text was updated successfully, but these errors were encountered:

QnnOkabayashi · 2021-11-02T19:50:45Z

I think this comes down to preference. On one hand, a proc-macro allows us to have extremely fine grain diagnostics, which I think are important. On the other, 95% of the time these string literals will be correct and so diagnostics aren't important. That being said, keep in mind most projects using uuid are probably using proc-macros from other crates (serde, tracing...), so the added time is marginal.

I'm in support of const fns once const panic is stabilized but until then, I think the tradeoff for a marginally increased compile time is worth it for the substantially better error diagnostics.

Nugine · 2021-11-03T02:26:46Z

I have optimized my const parser implementation. It's now 2x faster than Uuid::parse_str at runtime. In exchange, error details are dropped.
I can not easily port this algorithm to uuid crate now as it will break a lot of things.

https://github.com/Nugine/const-str/blob/80037ecd47e2e2d77d1f68881d9f343a5aa2f2fb/const-str/src/__ctfe/uuid.rs

use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
use uuid::Uuid;

#[inline(never)]
fn parse_v1(s: &str) -> Uuid {
    Uuid::parse_str(s).unwrap()
}

#[inline(never)]
fn parse_v2(s: &str) -> Uuid {
    const_str::__ctfe::parse_uuid(s)
}

pub fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("uuid_parser");

    let inputs = [
        ("simple", "67e5504410b1426f9247bb680e5fe0c8"),
        ("hyphenated", "67e55044-10b1-426f-9247-bb680e5fe0c8"),
        ("guid", "{67e55044-10b1-426f-9247-bb680e5fe0c8}"),
        ("urn", "urn:uuid:67e55044-10b1-426f-9247-bb680e5fe0c8"),
    ];

    for (tag, input) in inputs {
        group.bench_with_input(BenchmarkId::new("uuid-latest", tag), input, |b, s| {
            b.iter(|| parse_v1(s))
        });
        group.bench_with_input(BenchmarkId::new("const_str-latest", tag), input, |b, s| {
            b.iter(|| parse_v2(s))
        });
    }
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

GitHub
const-str/uuid.rs at 80037ecd47e2e2d77d1f68881d9f343a5aa2f2fb · Nugine/const-str
compile-time string operations. Contribute to Nugine/const-str development by creating an account on GitHub.

Nugine · 2021-11-03T02:29:59Z

I'm in support of const fns once const panic is stabilized but until then, I think the tradeoff for a marginally increased compile time is worth it for the substantially better error diagnostics.

I think we don't have to wait for const panic because the parser should never panic.

uuid/fuzz/fuzz_targets/fuzz_target_parse.rs

Lines 7 to 12 in 4b30c32

    
           fuzz_target!(|data: &[u8]| { 
        
               if let Ok(uuid) = str::from_utf8(data) { 
        
                   // Ensure the parser doesn't panic 
        
                   let _ = Uuid::parse_str(uuid); 
        
               } 
        
           });

KodrAus · 2021-11-03T02:34:36Z

I can not easily port this algorithm to uuid crate now as it will break a lot of things.

What kinds of things do you imagine would break @Nugine?

QnnOkabayashi · 2021-11-03T02:44:58Z

I'm in support of const fns once const panic is stabilized but until then, I think the tradeoff for a marginally increased compile time is worth it for the substantially better error diagnostics.

I think we don't have to wait for const panic because the parser should never panic.

uuid/fuzz/fuzz_targets/fuzz_target_parse.rs

Lines 7 to 12 in 4b30c32

fuzz_target!(|data: &[u8]| {

if let Ok(uuid) = str::from_utf8(data) {

// Ensure the parser doesn't panic

let _ = Uuid::parse_str(uuid);

}

});

The fuzzer never panics because the parser returns a Result (if I'm reading that correctly). The whole point is that the macro returns the Uuid type, or fails to compile, which involves panicking in a const environment.

Nugine · 2021-11-03T02:53:52Z

I can not easily port this algorithm to uuid crate now as it will break a lot of things.

What kinds of things do you imagine would break @Nugine?

Error definition, parser tests, ui tests and merge uuid_macro into uuid. I'm not sure whether a big change is suitable.

KodrAus · 2021-11-03T03:02:16Z

I can see a const Uuid::parse_str as complimentary to a uuid! macro. There are benefits to the global scoping of macros (I still use #[macro_use] extern crate x; readily myself). Since uuid! would use the same parser implementation we wouldn't end up with multiple implementations to maintain either.

The main loss holding us back at the moment would be nice diagnostics in the error and their reporting to end-users, but I don't think making parse_str const would necessarily have to be considered a big change. Eventually it really should be const, along with the formatting APIs too.

Nugine · 2021-11-03T03:02:38Z

I'm in support of const fns once const panic is stabilized but until then, I think the tradeoff for a marginally increased compile time is worth it for the substantially better error diagnostics.

I think we don't have to wait for const panic because the parser should never panic.

uuid/fuzz/fuzz_targets/fuzz_target_parse.rs

Lines 7 to 12 in 4b30c32

fuzz_target!(|data: &[u8]| {

if let Ok(uuid) = str::from_utf8(data) {

// Ensure the parser doesn't panic

let _ = Uuid::parse_str(uuid);

}

});

The fuzzer never panics because the parser returns a Result (if I'm reading that correctly). The whole point is that the macro returns the Uuid type, or fails to compile, which involves panicking in a const environment.

You are right. I missed the point. But there is a way to emit a compile error without feature(const_panic). That means we don't have to bump MSRV to 1.57.

https://docs.rs/static_assertions/1.1.0/src/static_assertions/const_assert.rs.html#52-57

const_assert.rs.html -- source
Source to the Rust file `src/const_assert.rs`.

QnnOkabayashi · 2021-11-03T03:12:47Z

What do the diagnostics look like? I'm thinking we use this implementation in a const parser to return a Result<[u8; 16], Error>, where the proc macro can then use the Error diagnostic info. This would give us great runtime parsing, and maintain good diagnostics. Thoughts?

Nugine · 2021-11-03T03:22:43Z

The const parser provides ambiguous information. Good diagnostics require a second parsing to generate more detailed error reports.

#[derive(Debug, PartialEq, Eq)]
pub struct Error(ErrorKind);

#[derive(Debug, PartialEq, Eq)]
enum ErrorKind {
    InvalidLength,
    InvalidCharacter,
}

const fn invalid_length() -> Error {
    Error(ErrorKind::InvalidLength)
}

const fn invalid_character() -> Error {
    Error(ErrorKind::InvalidCharacter)
}

KodrAus · 2021-11-03T03:29:50Z

Running a second pass on error, even just in the macro and not in Uuid::parse_str would probably be a reasonable compromise if it were still possible to share some of the implementation.

Is it possible to write const parsers without needing any additional dependencies?

Nugine · 2021-11-03T03:44:07Z

Running a second pass on error, even just in the macro and not in Uuid::parse_str would probably be a reasonable compromise if it were still possible to share some of the implementation.

Is it possible to write const parsers without needing any additional dependencies?

Yes. The only necessary dependency is rustc. We need to test MSRV in CI (currently 1.46).

KodrAus · 2021-11-03T04:02:53Z

If the MSRV needs to bump too that is something we can do. As a guide I tend to look at whatever the current version of rustc available as a package for the current LTS of Ubuntu is, which is 1.51.0, released in March. I think we could safely consider bumping to it if need be.

Plecra · 2021-11-03T15:00:38Z

The nice thing about this is that the feature flag can be kept, but doesn't need to change the API - Users can use uuid!("...") whenever they want compile-time construction, and when the macros feature is enabled, we can transparently use a proc_macro implementation with better errors.

(I'd also say that removing syn will make the biggest impact on compile times, and won't make parsing such a simple macro any harder)

QnnOkabayashi · 2021-11-03T21:00:26Z

I really like the idea of having a second parser that can go back over and provide fine-grain diagnostics. I think it's very reasonable for developers to have use cases where they know it won't fail at runtime and speed is more important, as well as cases where if something goes wrong, the developer doesn't particularly care what because they have their own error handling.

I'm currently working on making some very minor modifications to the const_str optimized version that will provide us with hints (like index of invalid byte) so that if we opt in for more detailed diagnostics, that function can be a bit faster.

Of course, the core parsing function will be entirely const, and we can have another const fn that can do the unwrap const equivalent and panic at compile time if we want.

Nugine · 2021-11-04T08:16:20Z

I try to find the fastest way to parse a uuid text. So I write another implementation using AVX2.
The parsing takes less than 10ns, which is 11x~13x faster than Uuid::parse_str.

#[inline]
pub fn parse_uuid(s: &str) -> Uuid {
    match try_parse_uuid(s) {
        Ok(b) => Uuid::from_bytes(b),
        Err(_) => panic!(),
    }
}

#[inline]
pub fn try_parse_uuid(s: &str) -> Result<[u8; 16], Error> {
    let mut s = s.as_bytes();
    let n = s.len();
    if n == 32 {
        return unsafe { parse_uuid_simple(s) };
    }
    match n {
        // hyphenated UUID
        36 => {}
        // URN prefixed UUID
        45 => match s.strip_prefix(b"urn:uuid:") {
            Some(val) => s = val,
            None => return Err(invalid_length()),
        },
        // Microsoft GUID
        38 => match s {
            [b'{', xs @ .., b'}'] => s = xs,
            _ => return Err(invalid_length()),
        },
        _ => return Err(invalid_length()),
    }
    unsafe { parse_uuid_hyphenated(s) }
}

unsafe fn parse_uuid_simple(s: &[u8]) -> Result<[u8; 16], Error> {
    parse(_mm256_loadu_si256(s.as_ptr().cast()))
}

unsafe fn parse_uuid_hyphenated(s: &[u8]) -> Result<[u8; 16], Error> {
    // b'-' is 0x2d
    let x: u32 = mem::transmute([
        *s.get_unchecked(8),
        *s.get_unchecked(13),
        *s.get_unchecked(18),
        *s.get_unchecked(23),
    ]);
    if x != 0x2d2d2d2d {
        return Err(invalid_character());
    }

    #[repr(C, align(32))]
    struct Buf {
        p1: u64,
        p2: u32,
        p3: u32,
        p4: u32,
        p5: u32,
        p6: u64,
    }

    let base: *const u8 = s.as_ptr();
    let buf = Buf {
        p1: base.cast::<u64>().read_unaligned(),
        p2: base.add(9).cast::<u32>().read_unaligned(),
        p3: base.add(14).cast::<u32>().read_unaligned(),
        p4: base.add(19).cast::<u32>().read_unaligned(),
        p5: base.add(24).cast::<u32>().read_unaligned(),
        p6: base.add(28).cast::<u64>().read_unaligned(),
    };
    parse(_mm256_load_si256(&buf as *const Buf as *const _))
}

unsafe fn parse(a: __m256i) -> Result<[u8; 16], Error> {
    if !hex_check(a) {
        return Err(invalid_character());
    }
    let ans = hex_decode(a);
    Ok(mem::transmute(ans))
}

/// Rewritten from <https://github.com/nervosnetwork/faster-hex/blob/0114ee6bcf8a02718bc19e769f9de076ce119726/src/decode.rs#L108-L143>
unsafe fn hex_check(a: __m256i) -> bool {
    let m_0 = _mm256_set1_epi8((b'0' - 1) as i8);
    let ge_0 = _mm256_cmpgt_epi8(a, m_0); // a > b'0' - 1
    let m_9 = _mm256_set1_epi8((b'9' + 1) as i8);
    let le_9 = _mm256_cmpgt_epi8(m_9, a); // b'9' + 1 > a
    let in_0_9 = _mm256_and_si256(ge_0, le_9);

    let m_ua = _mm256_set1_epi8((b'A' - 1) as i8);
    let ge_ua = _mm256_cmpgt_epi8(a, m_ua); // a > b'A - 1
    let m_uf = _mm256_set1_epi8((b'F' + 1) as i8);
    let le_uf = _mm256_cmpgt_epi8(m_uf, a); // b'F' + 1 > a
    let in_ua_uf = _mm256_and_si256(ge_ua, le_uf);

    let m_la = _mm256_set1_epi8((b'a' - 1) as i8);
    let ge_la = _mm256_cmpgt_epi8(a, m_la); // a > b'a' - 1
    let m_lf = _mm256_set1_epi8((b'f' + 1) as i8);
    let le_lf = _mm256_cmpgt_epi8(m_lf, a); // b'f' + 1 > a
    let in_la_lf = _mm256_and_si256(ge_la, le_lf);

    let ans = _mm256_or_si256(_mm256_or_si256(in_la_lf, in_ua_uf), in_0_9);
    _mm256_movemask_epi8(ans) as u32 == 0xffffffff
}

unsafe fn hex_decode(a: __m256i) -> __m128i {
    // epi16
    // a1 = c & 0x0f0f;
    // a2 = a1 >> 4;
    // a3 = a1 + a2;
    // a4 = c & 0x4040;
    // a5 = a4 >> 6;
    // a6 = a4 >> 10;
    // a7 = a5 + a6;
    // a8 = a7 * 9;
    // a9 = a3 + a8;
    // a10 = a9 & 0x00ff

    #[repr(align(32))]
    struct Align32I8x32([i8; 32]);
    const SHUFFLE_MASK: Align32I8x32 = Align32I8x32([
        1, 0, 3, 2, 5, 4, 7, 6, //
        9, 8, 11, 10, 13, 12, 15, 14, //
        17, 16, 19, 18, 21, 20, 23, 22, //
        25, 24, 27, 26, 29, 28, 31, 30, //
    ]);

    let m = _mm256_load_si256(SHUFFLE_MASK.0.as_ptr().cast());
    let a = _mm256_shuffle_epi8(a, m); // big endian -> little endian

    let a1 = _mm256_and_si256(a, _mm256_set1_epi16(0x0f0f));
    let a2 = _mm256_srai_epi16::<4>(a1);
    let a3 = _mm256_add_epi16(a1, a2);

    let a4 = _mm256_and_si256(a, _mm256_set1_epi16(0x4040));
    let a5 = _mm256_srai_epi16::<6>(a4);
    let a6 = _mm256_srai_epi16::<10>(a4);
    let a7 = _mm256_add_epi16(a5, a6);
    let a8 = _mm256_mullo_epi16(a7, _mm256_set1_epi16(0x9));

    let a9 = _mm256_add_epi16(a3, a8);
    let a10 = _mm256_and_si256(a9, _mm256_set1_epi16(0x00ff));

    avx2_mm256_cvtepi16_epi8(a10)
}

#[inline(always)]
unsafe fn avx2_mm256_cvtepi16_epi8(a: __m256i) -> __m128i {
    let a1 = _mm256_castsi256_si128(a);
    let a2 = _mm256_extractf128_si256::<1>(a);
    _mm_packus_epi16(a1, a2)
}

KodrAus · 2021-11-04T12:02:39Z

Wow that’s really awesome @Nugine! 😁 I wonder if it’s possible to write with std::simd, even if it turns out a little slower. I’ll have to spend a bit more time studying it, maybe by trying to port to Neon or WASM (neither of which support 256bit wide vectors AFAIK).

Nugine · 2021-11-04T12:47:32Z

Wow that’s really awesome @Nugine! 😁 I wonder if it’s possible to write with std::simd, even if it turns out a little slower. I’ll have to spend a bit more time studying it, maybe by trying to port to Neon or WASM (neither of which support 256bit wide vectors AFAIK).

I believe it's possible to write with std::simd. But that is unstable now. The faster-hex crate inspires me.

QnnOkabayashi · 2021-11-04T15:38:07Z

That's awesome! Is this architecture specific though? Also, is there any way we can make this const friendly?

I also did some more thinking, and now I propose that we have some ultra-fast parsing function that returns something like Result<[u8; 16], ParseError<'a> where the err variant is a wrapper struct containing the original string, which is always guaranteed to contain an invalid uuid string. Another function can accept a ParseError<'a> and obtain some basic diagnostics for the error. Finally, a third function contained purely in the proc macro would accept a ParseError<'a> and provide super detailed diagnostics, like exactly which bytes are invalid so we can have nice error highlighting.

Nugine · 2021-11-04T16:35:01Z

That's awesome! Is this architecture specific though? Also, is there any way we can make this const friendly?

AVX2 is widely used but not all CPUs support it.
The feature can be detected at compile-time or runtime.

Unfortunately, SIMD operations are not available at const context now.

KodrAus · 2021-11-04T21:59:07Z

We did have some designs for fallbacks in std::simd that should be naturally const friendly, but am not sure where it all landed yet.

Nugine · 2021-11-07T15:08:00Z

https://github.com/Nugine/uuid-simd
Probably the fastest uuid parser in the world 🚀

Add a feature simd. When simd and std are enabled, use runtime cpu feature detection to dispatch the parser. When simd is enabled, use conditional compilation to dispatch the parser. Otherwise, use the fallback parser (it's const).
The uuid! macro can use the proc-macro parser when the arg is literal and use the const parser when the arg is expr.

Drawbacks

It's too complex. Maybe we don't need so many parsers.
Uuid::parse_str can not be const if it internally calls simd functions.

GitHub
GitHub - Nugine/uuid-simd: My SIMD playground
My SIMD playground. Contribute to Nugine/uuid-simd development by creating an account on GitHub.

QnnOkabayashi mentioned this issue Nov 11, 2021

Improved parser & diagnostics #560

Merged

Nugine mentioned this issue Nov 16, 2021

Provide uuid macro without dependencies #566

Merged

KodrAus closed this as completed in #566 Nov 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Macro construction without `syn` #556

Macro construction without `syn` #556

Plecra commented Nov 2, 2021

QnnOkabayashi commented Nov 2, 2021

Nugine commented Nov 3, 2021 •

edited by unfurl-links bot

Loading

Nugine commented Nov 3, 2021

KodrAus commented Nov 3, 2021

QnnOkabayashi commented Nov 3, 2021

Nugine commented Nov 3, 2021

KodrAus commented Nov 3, 2021

Nugine commented Nov 3, 2021 •

edited by unfurl-links bot

Loading

QnnOkabayashi commented Nov 3, 2021

Nugine commented Nov 3, 2021

KodrAus commented Nov 3, 2021

Nugine commented Nov 3, 2021 •

edited

Loading

KodrAus commented Nov 3, 2021

Plecra commented Nov 3, 2021

QnnOkabayashi commented Nov 3, 2021 •

edited

Loading

Nugine commented Nov 4, 2021 •

edited

Loading

KodrAus commented Nov 4, 2021

Nugine commented Nov 4, 2021

QnnOkabayashi commented Nov 4, 2021 •

edited

Loading

Nugine commented Nov 4, 2021

KodrAus commented Nov 4, 2021

Nugine commented Nov 7, 2021 •

edited by unfurl-links bot

Loading

Macro construction without syn #556

Macro construction without syn #556

Comments

Plecra commented Nov 2, 2021

QnnOkabayashi commented Nov 2, 2021

Nugine commented Nov 3, 2021 • edited by unfurl-links bot Loading

Nugine commented Nov 3, 2021

KodrAus commented Nov 3, 2021

QnnOkabayashi commented Nov 3, 2021

Nugine commented Nov 3, 2021

KodrAus commented Nov 3, 2021

Nugine commented Nov 3, 2021 • edited by unfurl-links bot Loading

QnnOkabayashi commented Nov 3, 2021

Nugine commented Nov 3, 2021

KodrAus commented Nov 3, 2021

Nugine commented Nov 3, 2021 • edited Loading

KodrAus commented Nov 3, 2021

Plecra commented Nov 3, 2021

QnnOkabayashi commented Nov 3, 2021 • edited Loading

Nugine commented Nov 4, 2021 • edited Loading

KodrAus commented Nov 4, 2021

Nugine commented Nov 4, 2021

QnnOkabayashi commented Nov 4, 2021 • edited Loading

Nugine commented Nov 4, 2021

KodrAus commented Nov 4, 2021

Nugine commented Nov 7, 2021 • edited by unfurl-links bot Loading

Macro construction without `syn` #556

Macro construction without `syn` #556

Nugine commented Nov 3, 2021 •

edited by unfurl-links bot

Loading

Nugine commented Nov 3, 2021 •

edited by unfurl-links bot

Loading

Nugine commented Nov 3, 2021 •

edited

Loading

QnnOkabayashi commented Nov 3, 2021 •

edited

Loading

Nugine commented Nov 4, 2021 •

edited

Loading

QnnOkabayashi commented Nov 4, 2021 •

edited

Loading

Nugine commented Nov 7, 2021 •

edited by unfurl-links bot

Loading