Convenience functions for traversing DOM? #18
Comments
It seems like you already planned to select descendants: tl/src/queryselector/selector.rs, line 18 in 1b8fab3
The match case is just not implemented in |
Yep, though I'm not sure about the complexity of adding this to the query selector. It's probably worth mentioning somewhere in the docs that not all of the query selector API is fully implemented yet (
I agree, |
Oh right, the
That's great! If Node/HTMLTag has that functionality, that already gives us powerful composability. |
Have you considered putting a https://github.com/dist1ll/hltv-rust/blob/main/src/tl_extensions.rs I created a
#[derive(Clone, Copy)]
pub struct RichNode<'a> {
    pub d: &'a VDom<'a>,
    pub n: Option<NodeHandle>,
}
I've used this extension to create an ergonomic find function:
// Example of using .find and .get_attr
fn parse_team(h: RichNode, team_id: &str) -> Option<Team> {
    Some(Team {
        // get node as tag, and parse attribute to correct type
        id: h.get_attr(team_id).unwrap_or(None)?,
        // use .find to find children with the given class name
        name: h.find(team_id).find("matchTeamName").inner_text()?,
    })
}
|
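The wrapper idea above (a shared reference to the document plus an optional handle) can be sketched without the tl crate. This is a minimal illustration, not the code from tl_extensions.rs: `Dom`, `Elem`, `Rich`, and `sample` are hypothetical names, and nodes are modelled as plain indices into a `Vec`. It shows why an `Option` inside the wrapper lets `.find(..).find(..)` chain without unwrapping at every step.

```rust
/// Hypothetical flat document: every element lives in one Vec.
struct Dom {
    nodes: Vec<Elem>,
}

/// Hypothetical element: class name, text, and indices of direct children.
struct Elem {
    class: &'static str,
    text: &'static str,
    children: Vec<usize>,
}

/// Wrapper pairing the document with an optional node index.
/// A missing node is carried along as `None` instead of ending the chain.
#[derive(Clone, Copy)]
struct Rich<'a> {
    dom: &'a Dom,
    node: Option<usize>,
}

impl<'a> Rich<'a> {
    /// Returns a wrapper around the first descendant (in document order)
    /// with the given class, or an empty wrapper if there is none.
    fn find(self, class: &str) -> Rich<'a> {
        let node = self.node.and_then(|start| {
            // Explicit stack; children are pushed reversed to keep document order.
            let mut stack: Vec<usize> =
                self.dom.nodes[start].children.iter().rev().copied().collect();
            while let Some(i) = stack.pop() {
                let elem = &self.dom.nodes[i];
                if elem.class == class {
                    return Some(i);
                }
                stack.extend(elem.children.iter().rev().copied());
            }
            None
        });
        Rich { dom: self.dom, node }
    }

    /// Text of the wrapped node, or `None` if the wrapper is empty.
    fn inner_text(self) -> Option<&'a str> {
        self.node.map(|i| self.dom.nodes[i].text)
    }
}

/// Builds root > team1 > matchTeamName for demonstration purposes.
fn sample() -> Dom {
    Dom {
        nodes: vec![
            Elem { class: "root", text: "", children: vec![1] },
            Elem { class: "team1", text: "", children: vec![2] },
            Elem { class: "matchTeamName", text: "Example Team", children: vec![] },
        ],
    }
}
```

With `let dom = sample();` and `let root = Rich { dom: &dom, node: Some(0) };`, the call `root.find("team1").find("matchTeamName").inner_text()` walks two levels in one expression, and a failed lookup anywhere in the chain simply ends in `None`.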
I've thought of this before and I'm not sure if this would work (if I understood this correctly). This sounds like we would end up with self-referential structs if
struct Node<'input, 'parser> {
    raw: &'input [u8],
    children: Vec<NodeHandle<'input, 'parser>>,
}
struct NodeHandle<'input, 'parser> {
    idx: usize,
    parser: &'parser VDom<'input>,
}
struct Parser<'input> {
    nodes: Vec<Node<'input, '???>>, // can't annotate the 'parser ('self) lifetime
}
struct VDom<'input> {
    parser: Parser<'input>,
}
The reason why
struct Node<'input> {
    raw: &'input [u8],
    children: Vec<NodeHandle>,
}
struct NodeHandle {
    idx: usize,
}
struct Parser<'input> {
    nodes: Vec<Node<'input>>,
}
struct VDom<'input> {
    parser: Parser<'input>,
}
I agree that this makes chaining rather awkward and possibly unidiomatic, but I'm not sure there is a nice and easy solution.
yes, that module will be public in the next release |
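The second layout in the comment above (handles as plain indices, resolved by passing the parser in) can be shown compiling on its own. The sketch below only mirrors the struct shapes from that comment. `register` and `first_leaf` are hypothetical helpers added for illustration, `raw` is a `&str` instead of `&[u8]` for readability, and none of this is the crate's actual implementation.

```rust
/// A node borrows only from the input; it carries no parser lifetime.
struct Node<'input> {
    raw: &'input str,
    children: Vec<NodeHandle>,
}

/// A plain index with no lifetime, so it can be stored inside the
/// same arena it points into without creating a self-referential struct.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NodeHandle {
    idx: usize,
}

impl NodeHandle {
    /// Resolving a handle requires the parser to be passed explicitly.
    fn get<'p, 'input>(&self, parser: &'p Parser<'input>) -> Option<&'p Node<'input>> {
        parser.nodes.get(self.idx)
    }
}

/// The arena that owns all nodes.
struct Parser<'input> {
    nodes: Vec<Node<'input>>,
}

impl<'input> Parser<'input> {
    /// Hypothetical helper: stores a node and returns its handle.
    fn register(&mut self, raw: &'input str, children: Vec<NodeHandle>) -> NodeHandle {
        self.nodes.push(Node { raw, children });
        NodeHandle { idx: self.nodes.len() - 1 }
    }
}

/// Hypothetical helper: follows first children from `start` down to a leaf
/// and returns that leaf's raw text. Every hop needs `parser` again, which
/// is the chaining cost discussed above.
fn first_leaf<'p, 'input>(parser: &'p Parser<'input>, start: NodeHandle) -> Option<&'input str> {
    let mut node = start.get(parser)?;
    while let Some(child) = node.children.first() {
        node = child.get(parser)?;
    }
    Some(node.raw)
}
```

The trade-off is visible in `first_leaf`: the types stay simple and carry a single lifetime, but each step of a traversal has to thread the parser through by hand.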
Thanks for explaining your thoughts, I see now how that approach creates some problems. I really like the idea of a sink though |
0.6.0 is on crates.io with some of your requested features:
let dom = tl::parse(r#"
<div class="x">
<div class="y">
<div class="z"></div>
<div class="z"></div>
<div class="z"></div>
</div>
</div>
<div class="z"></div>
<div class="z"></div>
<div class="z"></div>
"#, Default::default()).unwrap();
let x_element = dom
    .get_elements_by_class_name("x")
    .next()
    .unwrap()
    .get(dom.parser())
    .unwrap()
    .as_tag()
    .unwrap();
let zs = x_element.query_selector(".z").unwrap();
assert_eq!(zs.count(), 3);
I have also refactored the HTMLTag children API a bit. Instead of having
|
wow good stuff! I think those are great improvements! I'll try out the new features, cheers! |
Assume we have the following structure
Let's say I want to get all nested divs of class z that are in x, but not the outside ones. Right now that's quite cumbersome, because query_selector() only works on VDom. Then I have to implement a DFS, using Node::children continuously.
Suggestion: NodeHandle should also have a query_selector method, so that we can more easily traverse nested divs. I think it makes sense from an API perspective, because a node essentially spawns a subtree, so it should have the same functions that a VDom has.
Alternative: An alternative solution would be to extend the syntax of the query_selector method to search for nested divs. Something like query_selector("div.x > div.y > div.z").
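For reference, the hand-written DFS that the request describes looks roughly like the sketch below. It does not use the tl API: `Arena`, `Element`, `Handle`, `descendants_with_class`, and `build_example` are hypothetical names, and the example tree is assumed to have three z elements nested under x and three more outside it.

```rust
/// Hypothetical index handle into the arena.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle(usize);

/// Hypothetical element: a class name plus handles of its direct children.
struct Element {
    class: &'static str,
    children: Vec<Handle>,
}

/// Hypothetical flat arena owning every element of the document.
struct Arena {
    nodes: Vec<Element>,
}

impl Arena {
    /// Adds an element and returns its handle.
    fn push(&mut self, class: &'static str, children: Vec<Handle>) -> Handle {
        self.nodes.push(Element { class, children });
        Handle(self.nodes.len() - 1)
    }

    /// Depth-first search below `root` (excluding `root` itself) that
    /// collects every descendant whose class equals `class`.
    fn descendants_with_class(&self, root: Handle, class: &str) -> Vec<Handle> {
        let mut found = Vec::new();
        // Explicit stack; children are pushed reversed to keep document order.
        let mut stack: Vec<Handle> =
            self.nodes[root.0].children.iter().rev().copied().collect();
        while let Some(handle) = stack.pop() {
            let node = &self.nodes[handle.0];
            if node.class == class {
                found.push(handle);
            }
            stack.extend(node.children.iter().rev().copied());
        }
        found
    }
}

/// Builds x > y > (z, z, z) followed by three top-level z elements,
/// and returns the arena together with the handle of x.
fn build_example() -> (Arena, Handle) {
    let mut arena = Arena { nodes: Vec::new() };
    let inner: Vec<Handle> = (0..3).map(|_| arena.push("z", vec![])).collect();
    let y = arena.push("y", inner);
    let x = arena.push("x", vec![y]);
    for _ in 0..3 {
        arena.push("z", vec![]);
    }
    (arena, x)
}
```

Starting the search at x finds only the three nested z elements, while a scan over the whole arena would find all six. A subtree-scoped query method would give the same result without this boilerplate.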