Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions crates/oxc_transformer/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ compact_str = { workspace = true }
cow-utils = { workspace = true }
indexmap = { workspace = true }
itoa = { workspace = true }
memchr = { workspace = true }
rustc-hash = { workspace = true }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
Expand Down
32 changes: 21 additions & 11 deletions crates/oxc_transformer/src/jsx/comments.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
use std::borrow::Cow;

use memchr::memchr;

use oxc_ast::Comment;

use crate::{JsxOptions, JsxRuntime, TransformCtx, TypeScriptOptions};
Expand Down Expand Up @@ -96,20 +98,26 @@ enum PragmaType {
fn find_jsx_pragma(mut comment_str: &str) -> Option<(PragmaType, &str, &str)> {
let pragma_type;
loop {
// Search for `@jsx`.
let mut at_sign_index = None;
for (index, next4) in comment_str.as_bytes().windows(4).enumerate() {
if next4 == b"@jsx" {
at_sign_index = Some(index);
break;
}
// Search for `@`.
// Note: Using `memchr::memmem::Finder` to search for `@jsx` is slower than only using `memchr`
// to find `@` characters, and then checking if `@` is followed by `jsx` separately.
let at_sign_index = memchr(b'@', comment_str.as_bytes())?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might also be faster to use memchr_iter rather than a loop and having the searcher re-compiled again. (it also might make no difference depending on how the compiler is optimizing it)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think memchr (unlike memchr::memmem::Finder) compiles a searcher type - it's a straight function call. If it's inlined, the "needle" (@) should be statically known to compiler, which might allow it to generate more optimal code than a more complex type.

I've taken some shortcuts here - there are some bounds checks which could be eliminated, and other stuff that could be a little tighter. But my assumption was that almost all the cost is in searching for @, so everything after that was not worthwhile optimizing.

But still, yes, memchr_iter might be faster, especially if the searcher was created outside of find_jsx_pragma and passed in so it can be re-used. Want to try it?


// Check `@` is start of `@jsx`.
// Note: Checking 4 bytes including leading `@` is faster than checking the 3 bytes after `@`,
// because 4 bytes is a `u32`.
let next4 = comment_str.as_bytes().get(at_sign_index..at_sign_index + 4)?;
if next4 != b"@jsx" {
// Not `@jsx`. Trim off up to and including `@` and search again.
// SAFETY: Byte at `at_sign_index` is `@`, so `at_sign_index + 1` is either within string
// or end of string, and on a UTF-8 char boundary.
comment_str = unsafe { comment_str.get_unchecked(at_sign_index + 1..) };
continue;
}
// Exit if not found
let at_sign_index = at_sign_index?;

// Trim `@jsx` from start of `comment_str`.
// Trim `@jsx` and everything before it from start of `comment_str`.
// SAFETY: 4 bytes starting at `at_sign_index` are `@jsx`, so `at_sign_index + 4` is within string
// or end of string, and must be on a UTF-8 character boundary
// or end of string, and must be on a UTF-8 character boundary.
comment_str = unsafe { comment_str.get_unchecked(at_sign_index + 4..) };

// Get rest of keyword e.g. `Runtime` in `@jsxRuntime`
Expand Down Expand Up @@ -230,6 +238,8 @@ mod tests {
("@jsxX @jsx h @jsxX", &[(PragmaType::Jsx, "h")]),
("@jsxMoon @jsx h @jsxMoon", &[(PragmaType::Jsx, "h")]),
("@jsx @jsx h", &[(PragmaType::Jsx, "@jsx")]),
// Multiple `@` signs
("@@@@@jsx h", &[(PragmaType::Jsx, "h")]),
];

let prefixes = ["", " ", "\n\n", "*\n* "];
Expand Down
Loading