-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Remove redundant typenames and use standard naming for typenames. #58
base: main
Are you sure you want to change the base?
refactor: Remove redundant typenames and use standard naming for typenames. #58
Conversation
Co-authored-by: Lin Zhihao <[email protected]>
…call succeeds in NFA's serialize. Co-authored-by: Lin Zhihao <[email protected]>
…on classes when they are initialized in their constructor.
…d transitions instead of emplace back.
…han one leaving an NFA state.
…egativeTaggedTransition classes into their own files.
…just an id. This object is created and owned by the capture AST, and other AST and NFA states point to these tags.
…that was accidentally removed.
…viously untouched files.
…Should fix this so the ordering is determinstic.
…ename tag members to make it clear they are positions.
typename
s.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
src/log_surgeon/finite_automata/RegexDFAState.hpp (2)
37-37
: Preferfalse == expression
over!expression
According to the coding guidelines, prefer using
false == expression
rather than!expression
. In line 37, consider changing!m_matching_variable_ids.empty()
tofalse == m_matching_variable_ids.empty()
.
68-68
: Preferfalse == expression
over!expression
As per the coding guidelines, it's recommended to use
false == expression
instead of!expression
. In line 68, please consider changing!result->empty()
tofalse == result->empty()
.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (10)
CMakeLists.txt
(1 hunks)examples/intersect-test.cpp
(1 hunks)src/log_surgeon/LALR1Parser.cpp
(1 hunks)src/log_surgeon/finite_automata/RegexDFA.hpp
(1 hunks)src/log_surgeon/finite_automata/RegexDFA.tpp
(0 hunks)src/log_surgeon/finite_automata/RegexDFAState.hpp
(1 hunks)src/log_surgeon/finite_automata/RegexDFAStatePair.hpp
(1 hunks)src/log_surgeon/finite_automata/RegexDFAStateType.hpp
(1 hunks)src/log_surgeon/finite_automata/RegexNFA.hpp
(0 hunks)src/log_surgeon/finite_automata/RegexNFAState.hpp
(1 hunks)
💤 Files with no reviewable changes (2)
- src/log_surgeon/finite_automata/RegexNFA.hpp
- src/log_surgeon/finite_automata/RegexDFA.tpp
✅ Files skipped from review due to trivial changes (1)
- src/log_surgeon/finite_automata/RegexNFAState.hpp
🧰 Additional context used
📓 Path-based instructions (6)
examples/intersect-test.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/RegexDFAStateType.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/LALR1Parser.cpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/RegexDFA.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/RegexDFAStatePair.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
src/log_surgeon/finite_automata/RegexDFAState.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (4)
src/log_surgeon/finite_automata/RegexDFAStateType.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
CMakeLists.txt (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
src/log_surgeon/finite_automata/RegexDFA.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/RegexDFAState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
🔇 Additional comments (7)
src/log_surgeon/finite_automata/RegexDFA.hpp (2)
40-52
: Efficient implementation of new_state
method
The updated new_state
method efficiently creates new DFA states and correctly handles matching variable IDs. The use of std::make_unique
ensures proper memory management.
54-72
: Well-structured get_intersect
method
The revised get_intersect
method is well-organized and effectively computes the intersection of DFAs. The use of sets for tracking visited and unvisited pairs enhances readability and performance.
src/log_surgeon/finite_automata/RegexDFAStatePair.hpp (1)
1-84
: Clear and logical implementation of RegexDFAStatePair
The RegexDFAStatePair
class is well-designed, with appropriate methods for managing DFA state pairs. The operator overloading and reachability logic are correctly implemented.
src/log_surgeon/finite_automata/RegexDFAStateType.hpp (1)
1-13
: Correct definition of RegexDFAStateType
enumeration
The RegexDFAStateType
enum is properly defined and will facilitate state specialization based on type. This enhances code clarity and maintainability.
src/log_surgeon/LALR1Parser.cpp (1)
11-14
: LGTM: Constructor initialization is properly formatted
The initialization list follows good practices with clear member initialization.
examples/intersect-test.cpp (1)
45-45
: LGTM: Proper usage of unique_ptr
The change to use dfa2.get()
correctly passes the raw pointer while maintaining ownership semantics of the unique_ptr.
CMakeLists.txt (1)
100-102
: LGTM: Improved organization of RegexDFA components
The separation into distinct header files (RegexDFAState.hpp
, RegexDFAStatePair.hpp
, and RegexDFAStateType.hpp
) improves code organization and maintainability.
Let's verify that all necessary header files are included:
✅ Verification successful
All RegexDFA-related header files are properly included in CMakeLists.txt
The comparison shows that all header files in the finite_automata
directory, including the newly added RegexDFA component files (RegexDFAState.hpp
, RegexDFAStatePair.hpp
, and RegexDFAStateType.hpp
), are correctly listed in CMakeLists.txt. The file list is complete and up-to-date.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that all necessary header files are included in CMakeLists.txt
# Check for any .hpp files in the finite_automata directory that might be missing
fd -e hpp -e tpp . src/log_surgeon/finite_automata/ | sort > /tmp/actual_files.txt
rg -o 'src/log_surgeon/finite_automata/[^[:space:]"]+\.(hpp|tpp)' CMakeLists.txt | sort > /tmp/listed_files.txt
diff /tmp/actual_files.txt /tmp/listed_files.txt
Length of output: 201
Script:
#!/bin/bash
# Let's try a different approach to list files
echo "=== Files in finite_automata directory ==="
fd . src/log_surgeon/finite_automata/ -t f
echo -e "\n=== Files listed in CMakeLists.txt ==="
rg "src/log_surgeon/finite_automata/" CMakeLists.txt
Length of output: 1841
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
src/log_surgeon/finite_automata/RegexDFAState.hpp (1)
62-64
: Consider following the coding guideline for boolean expressions.The condition
RegexDFAStateType::Byte == state_type
already follows the guideline of putting the constant on the left side of the comparison. However, for consistency, consider applying the same pattern throughout the method.Apply this diff:
if constexpr (RegexDFAStateType::Byte == state_type) { return m_bytes_transition[character]; } else { if (character < cSizeOfByte) { return m_bytes_transition[character]; } std::unique_ptr<std::vector<typename Tree::Data>> result = m_tree_transitions.find(Interval(character, character)); assert(result->size() <= 1); - if (false == result->empty()) { + if (true == result->empty()) { + return nullptr; + } - return result->front().m_value; - } - return nullptr; + return result->front().m_value; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
src/log_surgeon/finite_automata/RegexDFAState.hpp
(2 hunks)src/log_surgeon/finite_automata/RegexDFAStatePair.hpp
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/log_surgeon/finite_automata/RegexDFAStatePair.hpp
🧰 Additional context used
📓 Path-based instructions (1)
src/log_surgeon/finite_automata/RegexDFAState.hpp (1)
Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}
: - Prefer false == <expression>
rather than !<expression>
.
📓 Learnings (1)
src/log_surgeon/finite_automata/RegexDFAState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
🔇 Additional comments (3)
src/log_surgeon/finite_automata/RegexDFAState.hpp (3)
22-22
: LGTM! The template parameter renaming improves consistency.
The change from stateType
to state_type
follows standard C++ naming conventions, and the simplified type alias improves code readability.
Also applies to: 25-25
55-59
: LGTM! Well-documented conditional type declaration.
The member variable declarations are clear and well-documented. The use of std::conditional_t
for compile-time type selection is appropriate, and the comment effectively explains the empty tuple case.
43-43
: LGTM! Method signatures are clear and consistent.
The simplified parameter types in method signatures improve readability while maintaining type safety.
Also applies to: 51-51
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This is a comment for myself): since this PR has many different types of fixes, I think we need to have a multi-line commit message when we merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There're similar cases in LALR1Parser.hpp
:
[[nodiscard]] auto token_cast(uint32_t i) const -> Token* {
assert(i < cSizeOfAllChildren);
return &std::get<Token>(NonTerminal::m_all_children[m_children_start + i]);
}
/**
* Return the ith child's (body of production) MatchedSymbol as a
* NonTerminal. Note: only children are needed (and stored) for performing
* semantic actions (for the AST)
* @param i
* @return NonTerminal*
*/
[[nodiscard]] auto non_terminal_cast(uint32_t i) const -> NonTerminal* {
assert(i < cSizeOfAllChildren);
return &std::get<NonTerminal>(NonTerminal::m_all_children[m_children_start + i]);
}
We could simplify NonTerminal::m_all_children
to m_all_children
@@ -27,10 +27,10 @@ class RegexDFAStatePair { | |||
m_state2(state2) {}; | |||
|
|||
/** | |||
* Used for ordering in a set by considering the states' addresses | |||
* Used for ordering in a set by considering the states' addresses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Used for ordering in a set by considering the states' addresses. |
I think we can drop this description. Otherwise we need to make sure it matches the rule of doc string descriptions, sth like Orders two states required by std::set.
) const -> void { | ||
// TODO: Handle UTF-8 (multi-byte transitions) as well | ||
for (uint32_t i = 0; i < cSizeOfByte; i++) { | ||
auto next_state1 = m_state1->next(i); | ||
auto next_state2 = m_state2->next(i); | ||
if (next_state1 != nullptr && next_state2 != nullptr) { | ||
RegexDFAStatePair<DFAState> reachable_pair{next_state1, next_state2}; | ||
RegexDFAStatePair reachable_pair{next_state1, next_state2}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
RegexDFAStatePair reachable_pair{next_state1, next_state2}; | |
RegexDFAStatePair const reachable_pair{next_state1, next_state2}; |
I understand this is out of the PR's scope, but I don't mind if we also include fixes like this
It's up to u whether to commit this change within this PR
Add `dest_state` to `m_bytes_transitions` if all values in interval are a byte, otherwise add | ||
`dest_state` to `m_tree_transitions`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add `dest_state` to `m_bytes_transitions` if all values in interval are a byte, otherwise add | |
`dest_state` to `m_tree_transitions`. | |
* Add `dest_state` to `m_bytes_transitions` if all values in interval are a byte, otherwise add | |
* `dest_state` to `m_tree_transitions`. |
References
Description
stateType
tostate_type
.Validation performed
Previously existing tests still succeed.
Summary by CodeRabbit
RegexNFA
class.