Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Much more efficient memory usage and smarter dynamic dispatch usage,... #210

Merged
merged 14 commits into from
Apr 15, 2024
Merged

Conversation

Vrtgs
Copy link
Owner

@Vrtgs Vrtgs commented Apr 13, 2024

…and a bunch of kinda nit-picky changes to docs (including typo fixes)

I ran the tests this time, I learned, but this time I only ran chrome, because I just could not get the gekodriver to work properly

…and a bunch of kinda nit-picky changes to docs (including typo fixes)
@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 13, 2024

oh yeah msvc is probably 1.75 if that matters too much I'm gonna have to fix that

Copy link
Collaborator

@stevepryde stevepryde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR.

I'm curious to know why you felt memory usage was an issue, especially for a web browser automation crate. What is your use-case? Neither performance nor memory usage are usually critical for this kind of software. That said, I think the ergonomics from a user perspective is mostly unaffected by the changes and in some cases ergonomics is slightly improved (such as element filtering predicates etc) so I'm happy to go ahead with it.

Since we're nit-picking, I've flagged the cases where you've changed the Australian spelling of certain words to American. I'm assuming it wasn't intentional. Normally I wouldn't care too much about the spelling of new contributions, but changing existing documentation to American spelling is a bit much😛.

use std::rc::Rc;
use std::sync::Arc;

pub trait IntoTransfer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why IntoTransfer? What about IntoArcStr or something more specific?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transfer, because were gonna transfer it to the other management task

IntoArcStr would be confusing because why not use Into<Arc> which I didn't use because many of the types added to the impl don't implement Into<Arc>

Copy link
Owner Author

@Vrtgs Vrtgs Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you have name suggestions I'm up for it I tried to come up with a good name for a good 10 minutes

@stevepryde
Copy link
Collaborator

oh yeah msvc is probably 1.75 if that matters too much I'm gonna have to fix that

Personally I don't mind, but some users might not like it.
How much work it is to support versions back 12 months or something like that? Somewhere around 1.69 or 1.70. That's still arbitrary but might be nicer for some users if they're stuck on an older version for whatever reason.

@stevepryde
Copy link
Collaborator

…and a bunch of kinda nit-picky changes to docs (including typo fixes)

I ran the tests this time, I learned, but this time I only ran chrome, because I just could not get the gekodriver to work properly

What OS? Wondering if there's any documentation updates we can add to fix this.
https://stevepryde.github.io/thirtyfour/contributing/testing.html

I am on Windows WSL but I've tried it on Linux previously. On Ubuntu you can just install the firefox snap and then run geckodriver from a terminal. Then in another terminal/tab you can run the tests using THIRTYFOUR_BROWSER=firefox cargo test.

@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 13, 2024

oh yeah msvc is probably 1.75 if that matters too much I'm gonna have to fix that

Personally I don't mind, but some users might not like it. How much work it is to support versions back 12 months or something like that? Somewhere around 1.69 or 1.70. That's still arbitrary but might be nicer for some users if they're stuck on an older version for whatever reason.

I think we can do that by just adding #[async_trait] to
thirtyfour::components::wrapper::resolver::sealed::Resolve

@Vrtgs Vrtgs closed this Apr 13, 2024
@Vrtgs Vrtgs reopened this Apr 13, 2024
@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 13, 2024

…and a bunch of kinda nit-picky changes to docs (including typo fixes)
I ran the tests this time, I learned, but this time I only ran chrome, because I just could not get the gekodriver to work properly

What OS? Wondering if there's any documentation updates we can add to fix this. stevepryde.github.io/thirtyfour/contributing/testing.html

I am on Windows WSL but I've tried it on Linux previously. On Ubuntu you can just install the firefox snap and then run geckodriver from a terminal. Then in another terminal/tab you can run the tests using THIRTYFOUR_BROWSER=firefox cargo test.

running geckodriver on the terminal displays a "Machine type Mismatch"

@codecov-commenter
Copy link

codecov-commenter commented Apr 13, 2024

Codecov Report

Attention: Patch coverage is 37.22826% with 231 lines in your changes are missing coverage. Please review.

Project coverage is 41.31%. Comparing base (f2b9a13) to head (47a11ef).

❗ Current head 47a11ef differs from pull request most recent head 45cbafc. Consider uploading reports for the commit 45cbafc to get more accurate results

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
Files Coverage Δ
thirtyfour/src/action_chain.rs 61.25% <ø> (ø)
thirtyfour/src/alert.rs 73.33% <ø> (ø)
thirtyfour/src/common/capabilities/firefox.rs 0.00% <ø> (ø)
thirtyfour/src/common/command.rs 78.21% <100.00%> (ø)
thirtyfour/src/common/requestdata.rs 47.61% <100.00%> (ø)
thirtyfour/src/extensions/query/poller.rs 100.00% <ø> (ø)
thirtyfour/src/web_element.rs 87.50% <100.00%> (ø)
thirtyfour/tests/components.rs 100.00% <ø> (ø)
thirtyfour-macros/src/lib.rs 0.00% <0.00%> (ø)
thirtyfour/src/common/capabilities/chromium.rs 10.12% <0.00%> (-1.37%) ⬇️
... and 16 more

@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 13, 2024

I'm curious to know why you felt memory usage was an issue, especially for a web browser automation crate. What is your use-case? Neither performance nor memory usage are usually critical for this kind of software

a bit of a dumb reason honestly what sparked my initial snooping in the code was I got mad that I had to wrap everything in Box::new and Box::pin for the resolver, and looked at the code

and honestly the thing that ticked me off the most to go read the entire codebase and do these changes was because I saw you storing Arc<ElementQueryFn> but ElementQueryFn was Box<dyn ...>, yes again I realize its a bit of a waste to care too much, I ended up taking it as a challenge for the most part, not that it was that bad for performance

@Vrtgs Vrtgs requested a review from stevepryde April 13, 2024 21:36
@stevepryde
Copy link
Collaborator

I'm curious to know why you felt memory usage was an issue, especially for a web browser automation crate. What is your use-case? Neither performance nor memory usage are usually critical for this kind of software

a bit of a dumb reason honestly what sparked my initial snooping in the code was I got mad that I had to wrap everything in Box::new and Box::pin for the resolver, and looked at the code

and honestly the thing that ticked me off the most to go read the entire codebase and do these changes was because I saw you storing Arc but ElementQueryFn was Box<dyn ...>, yes again I realize its a bit of a waste to care too much, I ended up taking it as a challenge for the most part, not that it was that bad for performance

Yep I like the changes you made in that area. The predicate code feels a bit nicer now.

@stevepryde
Copy link
Collaborator

…and a bunch of kinda nit-picky changes to docs (including typo fixes)
I ran the tests this time, I learned, but this time I only ran chrome, because I just could not get the gekodriver to work properly

What OS? Wondering if there's any documentation updates we can add to fix this. stevepryde.github.io/thirtyfour/contributing/testing.html
I am on Windows WSL but I've tried it on Linux previously. On Ubuntu you can just install the firefox snap and then run geckodriver from a terminal. Then in another terminal/tab you can run the tests using THIRTYFOUR_BROWSER=firefox cargo test.

running geckodriver on the terminal displays a "Machine type Mismatch"

That's odd. What OS? What arch? Are you downloading the correct version of geckodriver?

@stevepryde
Copy link
Collaborator

oh yeah msvc is probably 1.75 if that matters too much I'm gonna have to fix that

Personally I don't mind, but some users might not like it. How much work it is to support versions back 12 months or something like that? Somewhere around 1.69 or 1.70. That's still arbitrary but might be nicer for some users if they're stuck on an older version for whatever reason.

I think we can do that by just adding #[async_trait] to thirtyfour::components::wrapper::resolver::sealed::Resolve

I think this is worth doing

@stevepryde
Copy link
Collaborator

Once you're done making changes, please provide a summary of the changes and the reasons for them. I feel that these later changes probably should have been a separate PR. Let's not try to change the entire codebase in one commit 😅

Comment on lines +36 to +49
impl Display for Escaped<'_> {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
for (i, substring) in self.0.split('\"').enumerate() {
if i != 0 {
f.write_str(", '\"', ")?
}
write!(f, "\"{}\"", substring)?;
}
if self.0.ends_with('\"') {
f.write_str(", '\"'")?;
}
Ok(())
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could have just been a function 🤔 ?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, because I put escaped into a call of format!() and I don't think there is a way to access the formatter directly

I kind of use the same patter as any of the std::str::Escape*

@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 14, 2024

Once you're done making changes, please provide a summary of the changes and the reasons for them. I feel that these later changes probably should have been a separate PR. Let's not try to change the entire codebase in one commit 😅

hmm yeah, I should

@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 14, 2024

…and a bunch of kinda nit-picky changes to docs (including typo fixes)
I ran the tests this time, I learned, but this time I only ran chrome, because I just could not get the gekodriver to work properly

What OS? Wondering if there's any documentation updates we can add to fix this. stevepryde.github.io/thirtyfour/contributing/testing.html
I am on Windows WSL but I've tried it on Linux previously. On Ubuntu you can just install the firefox snap and then run geckodriver from a terminal. Then in another terminal/tab you can run the tests using THIRTYFOUR_BROWSER=firefox cargo test.

running geckodriver on the terminal displays a "Machine type Mismatch"

That's odd. What OS? What arch? Are you downloading the correct version of geckodriver?

windows x86_64, I downloaded geckodriver-v0.33.0-win64.zip

@stevepryde
Copy link
Collaborator

I appreciate the effort you've put in, but dumping a huge code change like this as a PR makes it really difficult to sift through and understand the intent and the implications.

I like the improvements to the predicate functions but a lot of other changes seem to increase the complexity of the codebase without offering a tangible real-world benefit either to the developer or to the runtime outcome. I'd even suggest there's a fair chunk that largely comes down to personal preference. This is the first I've seen anyone recommend Arc<str> instead of String, for example. I can see why you'd care about this for a performance critical application, but this isn't that. Instead many users of the library will probably find it confusing.

I think there's an argument for keeping things familiar and not increasing the complexity unnecessarily.

@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 14, 2024

I appreciate the effort you've put in, but dumping a huge code change like this as a PR makes it really difficult to sift through and understand the intent and the implications.

I like the improvements to the predicate functions but a lot of other changes seem to increase the complexity of the codebase without offering a tangible real-world benefit either to the developer or to the runtime outcome. I'd even suggest there's a fair chunk that largely comes down to personal preference. This is the first I've seen anyone recommend Arc<str> instead of String, for example. I can see why you'd care about this for a performance critical application, but this isn't that. Instead many users of the library will probably find it confusing.

I think there's an argument for keeping things familiar and not increasing the complexity unnecessarily.

hmm yea I see your point, and I think I really should make a list of all the changes and detail what how and why I did them

as for Arc<str> I think should be used here because we just want an immutable view to a string slice, and as can be see from the code since we clone the string slice a lot I think it would be useful to use an Arc<str> instead of a String, and I mean most of the changes that came with that were not at the site where the string was used, just where it was created and given a type

@stevepryde
Copy link
Collaborator

I appreciate the effort you've put in, but dumping a huge code change like this as a PR makes it really difficult to sift through and understand the intent and the implications.
I like the improvements to the predicate functions but a lot of other changes seem to increase the complexity of the codebase without offering a tangible real-world benefit either to the developer or to the runtime outcome. I'd even suggest there's a fair chunk that largely comes down to personal preference. This is the first I've seen anyone recommend Arc<str> instead of String, for example. I can see why you'd care about this for a performance critical application, but this isn't that. Instead many users of the library will probably find it confusing.
I think there's an argument for keeping things familiar and not increasing the complexity unnecessarily.

hmm yea I see your point, and I think I really should make a list of all the changes and detail what how and why I did them

as for Arc<str> I think should be used here because we just want an immutable view to a string slice, and as can be see from the code since we clone the string slice a lot I think it would be useful to use an Arc<str> instead of a String, and I mean most of the changes that came with that were not at the site where the string was used, just where it was created and given a type

I understand the reason for the Arc<str>. It's not a bad change. There is some benefit like you said. And I think you did it in a way that doesn't affect how the library is used. It seems to be an internal change only for the most part. So that's fine.

Don't worry about splitting it or anything. Let's just put a freeze on the feature changes and I'll have another look through the code and we can go from there.

If you can summarise the changes with a bullet point each that would be helpful too.

@Vrtgs
Copy link
Owner Author

Vrtgs commented Apr 14, 2024

I appreciate the effort you've put in, but dumping a huge code change like this as a PR makes it really difficult to sift through and understand the intent and the implications.
I like the improvements to the predicate functions but a lot of other changes seem to increase the complexity of the codebase without offering a tangible real-world benefit either to the developer or to the runtime outcome. I'd even suggest there's a fair chunk that largely comes down to personal preference. This is the first I've seen anyone recommend Arc<str> instead of String, for example. I can see why you'd care about this for a performance critical application, but this isn't that. Instead many users of the library will probably find it confusing.
I think there's an argument for keeping things familiar and not increasing the complexity unnecessarily.

hmm yea I see your point, and I think I really should make a list of all the changes and detail what how and why I did them
as for Arc<str> I think should be used here because we just want an immutable view to a string slice, and as can be see from the code since we clone the string slice a lot I think it would be useful to use an Arc<str> instead of a String, and I mean most of the changes that came with that were not at the site where the string was used, just where it was created and given a type

I understand the reason for the Arc<str>. It's not a bad change. There is some benefit like you said. And I think you did it in a way that doesn't affect how the library is used. It seems to be an internal change only for the most part. So that's fine.

Don't worry about splitting it or anything. Let's just put a freeze on the feature changes and I'll have another look through the code and we can go from there.

If you can summarise the changes with a bullet point each that would be helpful too.

here since I'm done with the first part for the first list of major changes you can read through while I write part 2

List of changes:

  • Some admittedly nitty spelling changes

lets start with thirtyfour-macros, the changes I made were:

as for thirtyfour it self:

  • in common/capabilities
    • I pulled remove_arg and args and has_arg from their specific browsers capabilities, and moved it to BrowserCapabilitiesHelper, because they are the exact same function
    • changed has_arg slightly, to avoid the one extra copy of the argument made to pass into contains, and instead used .iter().any() and checked for equality directly
  • in common/command:
    • I changed the Strings to Arc<str> to make the enclosing structs cheaply cloneable, and Arc<str> isn't any different from a normal String, so it shouldn't matter that much from an ergonomic standpoint, as you can use it in basically much the same way
    • removed unnecessary prefixes for serde_json::Value
  • in common/types:
    • much the same changes as before for Arc<str>
    • improvements on predicate functions
  • in /components/wrapper/resolver:
    • improvements to the way caching is implemented,
      the old system, there could be more than one access to resolve all of them see that there is no cached result and then both of them, will try to resolve it having many requests to the browser to resolve the element and then all of them start replacing the cached value (which also requires mutable access and requiring the use of a mutex),
      the new system I use tokio's async OnceCell, this makes sure that the query runs only once per time caching, and if it was already initialized there is a fast path that returns the already initialized value, I store this behind an ArcSwap which basically is an Arc that I can also atomically change the value it is pointing to, which allows the resolution of components to be nearly fully lock free
    • use the new style of predicates
    • reduce code duplication, by adding a trait to the things that can be resolved, and adding the proper validate and resolve methods onto those types, instead of duplicating the code again and again
    • made the resolution of lists of Elements concurrent, by making them check the status of many elements concurrently
  • in extensions/query/conditions:
    • changed to the new style of predicate
    • removed some code duplication by introducing a macro
  • in extensions/query/element_query and extensions/query/element_waiter:
    • changed to the new style of predicate
    • much the same changes as before for Arc<str>

other changes have either been discussed or are very minor

Copy link
Collaborator

@stevepryde stevepryde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the writeup and for the awesome work. It's much appreciated!

@stevepryde stevepryde merged commit 928b9cd into Vrtgs:main Apr 15, 2024
11 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants