Close #506: Implement preprocessor concatenation #510

hdamron17 · 2020-08-16T18:03:10Z

Will close #506. So far I have collected the tokens before and after ## and passed them to a concat function. Now I just need to figure out how to concatenate them.

Collected before and after tokens and passed them to a concat function

hdamron17 · 2020-08-16T18:04:31Z

Also, I'm probably not working with replacements and pending properly in replace().

Add active flag to HashHash so it does not get replaced indefinitely Ignore test which relies on jyn514#513 - stringify out of order

hdamron17 · 2020-08-17T19:49:29Z

I'm assuming you'll want more tests...

jyn514

So, this looks good to me, I don't see anything wrong. However I'm having doubts about how maintainable this all is ... in particular I don't know how this handles empty preprocessor arguments:

If an argument consists of no preprocessing tokens, the parameter is replaced by a placemarker preprocessing token instead.

I think currently this will give an error, which is not correct. But fixing it sounds really painful and will need lots of rearchitecting.

An idea I came up with was to reuse some of the ideas from https://github.com/eddyb/cprep/. The benefits there are:

The algorithm is much more clear, the one in saltwater is a bit of a mess
It runs before the lexer, which I've wanted to do for a while
It's already written :P

What do you think about merging this as-is (with nits fixed), but eventually transitioning to using cprep instead?

cc @eddyb

jyn514 · 2020-08-22T21:20:45Z

src/data/error.rs

+    HashHashMissingParameter(bool),
+
+    /// The result of '##' is not a valid token
+    #[error("pasting formed '{0}{1}', an invalid preprocessing token")]


Suggested change

#[error("pasting formed '{0}{1}', an invalid preprocessing token")]

#[error("token pasting formed '{0}{1}', an invalid preprocessing token")]

jyn514 · 2020-09-06T13:11:26Z

src/lex/cpp.rs

+        assert_concat(r#""x""#, r#""y""#, None);
+        assert_concat("0b1", "6", None);
+        assert_concat("/", "/", None); // Not a comment
+    }


Suggested change

}

}

#[test]

fn gnu_weirdness() {

// #506

let program = "

#define __GLIBC_USE(F) __GLIBC_USE_ ## F

# define __GLIBC_USE_IEC_60559_TYPES_EXT 1

# define __GLIBC_USE_ISOC2X 1

#if __GLIBC_USE (IEC_60559_BFP_EXT) || __GLIBC_USE (ISOC2X)

int main() {}

#endif

";

assert_same(program, "int main() {}");

}

jyn514 · 2020-09-06T13:18:55Z

src/data/lex.rs

-    Hash,        // #, used for preprocessing
+    StructDeref,    // ->
+    Hash,           // #, used for preprocessing
+    HashHash(bool), // ##, used for preprocessing (the bool is true unless it is created by `# ## #`)


It's not clear what the bool is where you use it in the code, might be better to say explicitly (the comment is fine though).

Suggested change

HashHash(bool), // ##, used for preprocessing (the bool is true unless it is created by `# ## #`)

HashHash { operator: bool }, // ##, used for preprocessing (the bool is true unless it is created by `# ## #`)

jyn514 · 2020-09-06T13:19:06Z

src/data/lex.rs

-    StructDeref, // ->
-    Hash,        // #, used for preprocessing
+    StructDeref,    // ->
+    Hash,           // #, used for preprocessing
+    HashHash(bool), // ##, used for preprocessing (the bool is true unless it is created by `# ## #`)


nit: I'd mildly prefer to either have inconsistent alignment or to put the comment on the line above. Otherwise it's more work next time you need to update the comments.

jyn514 · 2020-09-06T13:30:48Z

src/lex/replace.rs

+        match token {
+            Ok(Locatable {
+                data: ref succeeding_tok,
+                ..
+            }) if pending_hashhash.is_some() => {
+                if matches!(succeeding_tok, Token::Whitespace(_)) {
+                    continue;
+                }
+                let pending_hashhash = pending_hashhash.take().unwrap(); // We just checked that it's some


Better to write this as

match (token, pending_hashhash) { (Ok(..), Some(hashhash)) => {

That way you don't have an extra unwrap.

jyn514 · 2020-09-06T13:35:08Z

src/lex/replace.rs

+                        None | Some(Err(_)) => {
+                            return wrap_error(&location, CppError::HashHashMissingParameter(true))
+                        }


Hmm, in the Some(Err) case I think we should return the original error instead of swallowing it.

jyn514 · 2020-09-06T13:37:23Z

src/lex/replace.rs

@@ -439,3 +488,21 @@ fn stringify(args: Vec<Token>) -> Token {
        ret.trim()
    ))]))
 }
+
+fn concat(x: &Token, y: &Token, location: &Location) -> Option<Locatable<Token>> {
+    let mut lexer = Lexer::new(location.file, format!("{}{}", x, y), false);


You said doing this means we can get rid of the LiteralParser trait, right? Do you want to do that here or in a follow-up PR?

Initial WIP on preprocessor-hashhash

28aaca8

Collected before and after tokens and passed them to a concat function

Implement concat with a temporary lexer

1f32b12

hdamron17 force-pushed the preprocessor-hashhash branch from 6866968 to 1f32b12 Compare August 17, 2020 18:16

hdamron17 added 2 commits August 17, 2020 14:50

Add more tests for hashhash

4f0e7d6

Used pending tokens in replace with ##

6396e51

Add active flag to HashHash so it does not get replaced indefinitely Ignore test which relies on jyn514#513 - stringify out of order

hdamron17 marked this pull request as ready for review August 17, 2020 19:48

hdamron17 requested a review from jyn514 August 17, 2020 19:48

clippy knows best

d128286

jyn514 added enhancement New feature or request preprocessor Issue in the preprocessor (probably cycle detection) labels Sep 6, 2020

jyn514 reviewed Sep 6, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Close #506: Implement preprocessor concatenation #510

Close #506: Implement preprocessor concatenation #510

hdamron17 commented Aug 16, 2020

hdamron17 commented Aug 16, 2020

hdamron17 commented Aug 17, 2020

jyn514 left a comment

jyn514 Aug 22, 2020

jyn514 Sep 6, 2020

jyn514 Sep 6, 2020

jyn514 Sep 6, 2020

jyn514 Sep 6, 2020

jyn514 Sep 6, 2020

jyn514 Sep 6, 2020

	#[error("pasting formed '{0}{1}', an invalid preprocessing token")]
	#[error("token pasting formed '{0}{1}', an invalid preprocessing token")]

-    }
+    }
+    #[test]
+    fn gnu_weirdness() {
+        // #506
+        let program = "
+#define __GLIBC_USE(F)	__GLIBC_USE_ ## F
+# define __GLIBC_USE_IEC_60559_TYPES_EXT 1
+# define __GLIBC_USE_ISOC2X 1
+#if __GLIBC_USE (IEC_60559_BFP_EXT) || __GLIBC_USE (ISOC2X)
+	int main() {}
+#endif
+        ";
+        assert_same(program, "int main() {}");
+    }

	HashHash(bool), // ##, used for preprocessing (the bool is true unless it is created by `# ## #`)
	HashHash { operator: bool }, // ##, used for preprocessing (the bool is true unless it is created by `# ## #`)

Close #506: Implement preprocessor concatenation #510

Are you sure you want to change the base?

Close #506: Implement preprocessor concatenation #510

Conversation

hdamron17 commented Aug 16, 2020

hdamron17 commented Aug 16, 2020

hdamron17 commented Aug 17, 2020

jyn514 left a comment

Choose a reason for hiding this comment

jyn514 Aug 22, 2020

Choose a reason for hiding this comment

jyn514 Sep 6, 2020

Choose a reason for hiding this comment

jyn514 Sep 6, 2020

Choose a reason for hiding this comment

jyn514 Sep 6, 2020

Choose a reason for hiding this comment

jyn514 Sep 6, 2020

Choose a reason for hiding this comment

jyn514 Sep 6, 2020

Choose a reason for hiding this comment

jyn514 Sep 6, 2020

Choose a reason for hiding this comment