Add tests for BufferedInput and fix couple of bugs #367

Closed
wants to merge 36 commits into from
36 commits
a595b1c
Fix an error in reading <!doctype > definition from `&[u8]` (in lower…
Mingun Feb 12, 2022
e240a6b
Add tests for `BufferedInput::read_bytes_until` and document it
Mingun Feb 18, 2022
e365d54
Advance position correctly in `BufferedInput` implementation for `&[u8]`
Mingun Feb 17, 2022
778eac6
Simplify `read_until_close`
Mingun Feb 16, 2022
e46193b
Add tests for `BufferedInput::read_bang_element` and document it
Mingun Mar 6, 2022
a885278
Factor out BangType and BangType creation
Mingun Feb 12, 2022
58c7661
Factor out checking of the end of `<!` syntax constructs into a funct…
Mingun Feb 12, 2022
9b4f0e3
Factor out bang error creation code to the function
Mingun Feb 12, 2022
6dedae2
Improve `UnexpectedBang` error, add the byte found
Mingun Feb 13, 2022
a7ad48c
Deduplicate `read_bang_element` code for borrowed and non-borrowed re…
Mingun Feb 12, 2022
6dd90c0
Move bang parsing into `fill_buf()` match because this code used only…
Mingun Feb 12, 2022
de982f6
Move updating position to the match arm
Mingun Feb 12, 2022
4882ed3
Add canary test for correct comment in `dashes_in_comments` test
Mingun Mar 7, 2022
e9434c0
Improve `test_comment_starting_with_gt` - assert comment content usin…
Mingun Mar 7, 2022
206e57f
Do not include `]]` service sequence to the buffer when parse CDATA c…
Mingun Mar 7, 2022
9582fe5
Add tests for #344
Mingun Mar 7, 2022
05814ea
Fix internal panic when parsing not fully forming comment, CDATA and …
Mingun Mar 6, 2022
6429311
Add tests for `BufferedInput::read_element` and document it
Mingun Feb 13, 2022
e36a0f9
Isolate loop in a closure
Mingun Feb 12, 2022
abefcbe
Use early-return instead of `break`
Mingun Feb 12, 2022
e32a8e0
Move end-of-loop processing outside of loop
Mingun Feb 12, 2022
1eadc6e
Convert loop into for-in
Mingun Feb 12, 2022
ec7c3b0
Move `buf` outside of closure
Mingun Feb 12, 2022
a8274f0
Move `done` outside of closure
Mingun Feb 12, 2022
4b3fbcc
Move `available` outside of closure
Mingun Feb 12, 2022
90eb538
Convert closure to member function of `State`
Mingun Feb 12, 2022
8493871
Add optimization hint, rename function and convert parameter to an in…
Mingun Feb 12, 2022
86277eb
Remove constant as it useless
Mingun Feb 12, 2022
7cf43ba
Use `Self` instead of an enum name
Mingun Feb 12, 2022
fcb9bc3
Deduplicate `read_element` code for borrowed and non-borrowed readers
Mingun Feb 12, 2022
6f4c7e9
Move state changing code into `fill_buf()` match because this code us…
Mingun Feb 12, 2022
d09b6fd
Move consuming inside of match arm
Mingun Feb 12, 2022
7b7ee2f
Drop the `done` variable
Mingun Feb 12, 2022
6691c39
Move updating position to the match arm
Mingun Feb 12, 2022
32feb13
Remove duplicated code
Mingun Feb 12, 2022
3bdcd71
Document `BufferedInput` and remove unused `input_borrowed` method. C…
Mingun Mar 6, 2022
Add tests for BufferedInput::read_element and document it
Mingun committed Mar 8, 2022
commit 6429311815dbf66c66fb8bcc12c2cf0c6e9ab515
196 changes: 186 additions & 10 deletions src/reader.rs
@@ -981,6 +981,28 @@ where
position: &mut usize,
) -> Result<Option<(BangType, &'r [u8])>>;

/// Read input until XML element is closed by approaching a `>` symbol.
/// Returns `Some(buffer)` that contains a data between `<` and `>` or
/// `None` if end-of-input was reached and nothing was read.
///
/// Derived from `read_until`, but modified to handle XML attributes
/// using a minimal state machine.
///
/// Attribute values are [defined] as follows:
/// ```plain
/// AttValue := '"' (([^<&"]) | Reference)* '"'
/// | "'" (([^<&']) | Reference)* "'"
/// ```
/// (`Reference` is something like `&quot;`, but we don't care about
/// escaped characters at this level)
///
/// # Parameters
/// - `buf`: Buffer that could be filled from an input (`Self`) and
/// from which [events] could borrow their data
/// - `position`: Will be increased by amount of bytes consumed
///
/// [defined]: https://www.w3.org/TR/xml11/#NT-AttValue
/// [events]: crate::events::Event
fn read_element(&mut self, buf: B, position: &mut usize) -> Result<Option<&'r [u8]>>;

fn skip_whitespace(&mut self, position: &mut usize) -> Result<()>;
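The "minimal state machine" mentioned in the documentation of `read_element` can be illustrated with a small standalone sketch. This is not the crate's implementation: the `Quote` enum and the functions `find_element_end` and `read_element_sketch` are hypothetical names introduced here, and the sketch works only on a plain `&[u8]` slice. It assumes that the only job at this level is to ignore any `>` that appears inside a single- or double-quoted attribute value. Unlike the real method, it returns `None` for unterminated input instead of reporting an error.

```rust
/// Hypothetical quoting state; an illustration, not the crate's own type.
#[derive(Clone, Copy)]
enum Quote {
    /// Not inside an attribute value
    Outside,
    /// Inside a `'...'` attribute value
    Single,
    /// Inside a `"..."` attribute value
    Double,
}

/// Returns the index of the first `>` that is not inside a quoted
/// attribute value, or `None` if there is no such byte.
fn find_element_end(bytes: &[u8]) -> Option<usize> {
    let mut state = Quote::Outside;
    for (i, &b) in bytes.iter().enumerate() {
        state = match (state, b) {
            // An unquoted `>` closes the element
            (Quote::Outside, b'>') => return Some(i),
            // An opening quote starts an attribute value
            (Quote::Outside, b'\'') => Quote::Single,
            (Quote::Outside, b'"') => Quote::Double,
            // Only the matching quote ends the value
            (Quote::Single, b'\'') | (Quote::Double, b'"') => Quote::Outside,
            // Every other byte leaves the state unchanged
            (s, _) => s,
        };
    }
    None
}

/// Mimics the documented contract on a plain slice: returns the bytes
/// before the closing `>`, consumes them together with the `>` itself,
/// and increases `position` by the number of bytes consumed.
fn read_element_sketch<'a>(input: &mut &'a [u8], position: &mut usize) -> Option<&'a [u8]> {
    let data: &'a [u8] = *input;
    let end = find_element_end(data)?;
    *input = &data[end + 1..];
    *position += end + 1;
    Some(&data[..end])
}
```

Under these assumptions, calling `read_element_sketch` on `tag>` yields `tag` and advances `position` by 4, which is the behavior the `normal` test in this commit checks for the real method.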
@@ -1092,16 +1114,6 @@ impl<'b, 'i, R: BufRead + 'i> BufferedInput<'b, 'i, &'b mut Vec<u8>> for R {
}
}

/// Derived from `read_until`, but modified to handle XML attributes using a minimal state machine.
/// [W3C Extensible Markup Language (XML) 1.1 (2006)](https://www.w3.org/TR/xml11)
///
/// Attribute values are defined as follows:
/// ```plain
/// AttValue := '"' (([^<&"]) | Reference)* '"'
/// | "'" (([^<&']) | Reference)* "'"
/// ```
/// (`Reference` is something like `&quot;`, but we don't care about escaped characters at this
/// level)
#[inline]
fn read_element(
&mut self,
@@ -2097,6 +2109,170 @@ mod test {
}
}

mod read_element {
use crate::reader::BufferedInput;

/// Checks that nothing was read from empty buffer
#[test]
fn empty() {
let buf = $buf;
let mut position = 0;
let mut input = b"".as_ref();
// ^= 0

assert_eq!(input.read_element(buf, &mut position).unwrap(), None);
assert_eq!(position, 0);
}

mod open {
use crate::reader::BufferedInput;

#[test]
fn empty_tag() {
let buf = $buf;
let mut position = 0;
let mut input = b">".as_ref();
// ^= 1

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b"".as_ref())
);
assert_eq!(position, 1);
}

#[test]
fn normal() {
let buf = $buf;
let mut position = 0;
let mut input = b"tag>".as_ref();
// ^= 4

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b"tag".as_ref())
);
assert_eq!(position, 4);
}

#[test]
fn empty_ns_empty_tag() {
let buf = $buf;
let mut position = 0;
let mut input = b":>".as_ref();
// ^= 2

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b":".as_ref())
);
assert_eq!(position, 2);
}

#[test]
fn empty_ns() {
let buf = $buf;
let mut position = 0;
let mut input = b":tag>".as_ref();
// ^= 5

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b":tag".as_ref())
);
assert_eq!(position, 5);
}

#[test]
fn with_attributes() {
let buf = $buf;
let mut position = 0;
let mut input = br#"tag  attr-1=">"  attr2  =  '>'  3attr>"#.as_ref();
// ^= 38

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(br#"tag  attr-1=">"  attr2  =  '>'  3attr"#.as_ref())
);
assert_eq!(position, 38);
}
}

mod self_closed {
use crate::reader::BufferedInput;

#[test]
fn empty_tag() {
let buf = $buf;
let mut position = 0;
let mut input = b"/>".as_ref();
// ^= 2

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b"/".as_ref())
);
assert_eq!(position, 2);
}

#[test]
fn normal() {
let buf = $buf;
let mut position = 0;
let mut input = b"tag/>".as_ref();
// ^= 5

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b"tag/".as_ref())
);
assert_eq!(position, 5);
}

#[test]
fn empty_ns_empty_tag() {
let buf = $buf;
let mut position = 0;
let mut input = b":/>".as_ref();
// ^= 3

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b":/".as_ref())
);
assert_eq!(position, 3);
}

#[test]
fn empty_ns() {
let buf = $buf;
let mut position = 0;
let mut input = b":tag/>".as_ref();
// ^= 6

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(b":tag/".as_ref())
);
assert_eq!(position, 6);
}

#[test]
fn with_attributes() {
let buf = $buf;
let mut position = 0;
let mut input = br#"tag  attr-1="/>"  attr2  =  '/>'  3attr/>"#.as_ref();
// ^= 41

assert_eq!(
input.read_element(buf, &mut position).unwrap(),
Some(br#"tag  attr-1="/>"  attr2  =  '/>'  3attr/"#.as_ref())
);
assert_eq!(position, 41);
}
}
}

mod issue_344 {
use crate::errors::Error;