diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index ec85a16..e4f2e7c 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -10,14 +10,13 @@ Following `justfile` commands are helpful for development: - `just develop`: compiles everything and installs the latest compiled state of `sqlquerypp` into the current python virtual environment - which is located at `.venv/` at the repository root. Please note that + which is located in `.venv/` at the repository root. Please note that you might need to activate it manually using `source .venv/bin/activate`. - `just lint` checks whether all coding conventions (as defined in `pyproject.toml` and `rustfmt.toml`) are fulfilled. -- `just format` autoformats code according to coding conventions - as much as possible. +- `just format` autoformats as much code as possible according to coding conventions. - `just test` runs all lints and tests. @@ -32,39 +31,31 @@ This package is mainly separated into two components: in `sqlquerypp.compiler.Compiler` and its subclasses. - Rust API: `src/` - - - `lib.rs` is the main entrypoint to look at. It constructs a module with - the full-qualified name `sqlquerypp.sqlquerypp`. It is internal to the + - `lib.rs` is the main entrypoint to look at. It constructs a module with + the fully qualified name `sqlquerypp.sqlquerypp`. It is internal to the Python API and exposes internally used, fast SQL preprocessor - implementations. Its python interface declaration is located in + implementations. Its Python interface declaration is located in `python/sqlquerypp/sqlquerypp.pyi`. - - - `error.rs`, `lex.rs`, `scanner.rs` and `types.rs` should be quite self- - explaining. - - - The code within `parser/` is responsible for parsing nodes (i.e. - representations of `sqlquerypp` directives) and generating codes - for them. - - - `ParserState` is a state automaton based parser implementation - which does the "magic" transforming `sqlquerypp` code strings + - `error.rs`, `lex.rs`, `scanner.rs` and `types.rs` should be self-explanatory. + - The code within `parser/` is responsible for parsing nodes (i.e. + representations of `sqlquerypp` directives) and generating codes + for them. + - `ParserState` is a state automaton based parser implementation + that handles the "magic" of transforming `sqlquerypp` code strings into internal data structures (in terms of compiler construction, called "nodes" in abstract syntax tree, although `sqlquerypp` does not provide a correct, academic-style AST-oriented implementation). - - - For example, while parsing `combined_result` instructions are + - For example, while parsing `combined_result`, instructions are reflected as `CombinedResultNode` instances (`src/parser/nodes/combined_result.rs`). These node objects are obviously very low-level and stateful (many public and optional fields). - - - When generating code, it's most recommended to use + - When generating code, it is recommended to use `CompleteCombinedResultNode` objects. This strategy applies to all nodes `sqlquerypp` supports. See also: - - `ParserState::finalize()` - - `FinalParserState` - - - `codegen/` provides common structs, traits and functions for + - `ParserState::finalize()` + - `FinalParserState` + - `codegen/` provides common structs, traits and functions for generating valid SQL statements from a `FinalParserState`. ## Manual release workflow @@ -72,11 +63,9 @@ This package is mainly separated into two components: - `source .venv/bin/activate` - `maturin build --release` - - - if successful, returns output like "Built wheel for CPython 3.13 to 'PATH'" + - if successful, returns output like "Built wheel for CPython 3.13 to 'PATH'" - `maturin upload ` (use 'PATH' from last command) - - - **NOTE**: This requires token-based authentication. As this is just a + - **NOTE**: This requires token-based authentication. As this is just a quick-and-dirty solution which should not be necessary for long, I won't document this further. diff --git a/README.md b/README.md index 526f4f1..3c47675 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ for both maintainability and high performance. Currently, only MySQL 8.4 syntax is supported. -## Why preprocessing SQL queries? +## Why preprocess SQL queries? SQL (Structed Query Language) follows a declarative paradigm, i.e. a query explains "what should be done" not "how should it be done". This stands in @@ -16,7 +16,7 @@ contrast to imperative programming, which expresses the "how should a certain task be fulfilled" aspect. Database systems' internals are responsible for maintaining this aspect. -But, however, for certain and large data structures, writing down "naive" +However, for certain and large data structures, writing down "naive" queries sometimes result in poor performance. ## Supported performance optimizations @@ -25,7 +25,7 @@ queries sometimes result in poor performance. Consider the following original query: - ``` + ```sql SELECT entity_b.* FROM entity_b INNER JOIN entity_a @@ -33,13 +33,13 @@ Consider the following original query: AND entity_a.criteria = 1337; ``` -This is a very simplified example, but if you assume `entity_b` contains very -many items, even correct index conditions may exhaust any DBMS' join buffer. +This is a very simplified example, but if you assume `entity_b` contains +a multitude of items, even correct index conditions may exhaust any DBMS' join buffer. -An alternative approach might be doing a loop at application side (Python -pseudocode), if network overhead is acceptable: +If network overhead is acceptable, a fitting alternative approach could +be a loop on the application side (Python pseudocode): - ``` + ```python all_matches_in_entity_b = [] for entity_a_id in [rec.id for rec in mysql_query("SELECT id FROM entity_a " @@ -49,15 +49,15 @@ pseudocode), if network overhead is acceptable: all_matches_in_entity_b += inner_result ``` -The following statement, being no valid SQL, translates to a MySQL +The following statement, which is invalid SQL, translates to a MySQL native construct of `Recursive Common Table Expression` and `UNION` -fragments when being compiled by `sqlquerypp`. This allows for maximal +fragments when compiled by `sqlquerypp`. This allows for maximal query performance, because the inner query with reduced complexity is still taken into account. At the same time, it grants minimal I/O overhead as only one query is executed on the database: - ``` + ```text combined_result (SELECT id FROM entity_a WHERE criteria = 1337) AS $id { SELECT * FROM entity_b WHERE entity_a_id = $id; } - ``` \ No newline at end of file + ``` diff --git a/flake.lock b/flake.lock index 88bf936..f177107 100644 --- a/flake.lock +++ b/flake.lock @@ -20,16 +20,16 @@ }, "nixpkgs": { "locked": { - "lastModified": 1755274400, - "narHash": "sha256-rTInmnp/xYrfcMZyFMH3kc8oko5zYfxsowaLv1LVobY=", + "lastModified": 1758701979, + "narHash": "sha256-c7DUti3XM1aga8oVgaPnrVmEeCFtN9PaBxyNuqx8jPc=", "owner": "NixOS", "repo": "nixpkgs", - "rev": "ad7196ae55c295f53a7d1ec39e4a06d922f3b899", + "rev": "e2642aa7d5a15eae586932a56f4294934f959c14", "type": "github" }, "original": { "owner": "NixOS", - "ref": "nixos-25.05", + "ref": "nixpkgs-unstable", "repo": "nixpkgs", "type": "github" } diff --git a/flake.nix b/flake.nix index b76f303..3513da0 100644 --- a/flake.nix +++ b/flake.nix @@ -1,6 +1,6 @@ { inputs = { - nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.05"; + nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable"; flake-utils.url = "github:numtide/flake-utils"; rust-overlay = { url = "github:oxalica/rust-overlay"; @@ -8,7 +8,7 @@ }; }; - outputs = { self, nixpkgs, rust-overlay, flake-utils }: + outputs = { self, nixpkgs, rust-overlay, flake-utils}: flake-utils.lib.eachDefaultSystem (system: let overlays = [ (import rust-overlay) ]; @@ -30,6 +30,7 @@ ])) just rust + mado ]; }; } diff --git a/justfile b/justfile index 41f853c..4f51c05 100644 --- a/justfile +++ b/justfile @@ -14,6 +14,7 @@ lint: mypy --check cargo fmt --check cargo clippy --all-targets --all-features -- --deny warnings + mado check format: ruff format diff --git a/mado.toml b/mado.toml new file mode 100644 index 0000000..32d6edb --- /dev/null +++ b/mado.toml @@ -0,0 +1,45 @@ +[lint] +rules = [ + "MD001", + "MD002", + "MD003", + "MD004", + "MD005", + "MD006", + # Ist etwas instabil bei nested lists + # "MD007", + "MD009", + "MD010", + "MD012", + "MD013", + "MD014", + "MD018", + "MD019", + "MD020", + "MD021", + "MD022", + "MD023", + "MD024", + "MD025", + "MD026", + "MD027", + "MD028", + "MD029", + "MD030", + "MD031", + "MD032", + "MD033", + "MD034", + "MD035", + "MD036", + "MD037", + "MD038", + "MD039", + "MD040", + "MD041", + "MD046", + "MD047", +] + +[lint.md026] +punctuation = ".,;:!"