-
Notifications
You must be signed in to change notification settings - Fork 29
Add search relevance stats API #63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7dc0410 to
6ec1481
Compare
|
There need to be additional stats added after this is merged to make use of this API. I added
As examples but there are probably other stats that are more vital to track. |
epugh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't really speak to how stats work (new to me) but did review some specific to search relevance items.
We do need to extend this out to the more intensive operations, like issueing queries while running an evaluation.
Lastly, I am a bit surprised how many Java classes are in the org.opensearch.searchrelevance.stats package... Should many of these classes be imported from another project? org.opensearch.stats or something like that? Or be part of the base of a plugin?
Thank you for the contribution and making it possible to ship this in 3.1!
src/main/java/org/opensearch/searchrelevance/judgments/ImportJudgmentsProcessor.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/searchrelevance/rest/RestSearchRelevanceStatsAction.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/searchrelevance/rest/RestSearchRelevanceStatsAction.java
Outdated
Show resolved
Hide resolved
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #63 +/- ##
==========================
==========================
☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/main/java/org/opensearch/searchrelevance/stats/events/EventStatName.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/searchrelevance/stats/info/InfoStatsManager.java
Outdated
Show resolved
Hide resolved
...ain/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsNodeRequest.java
Outdated
Show resolved
Hide resolved
...in/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsNodeResponse.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsRequest.java
Outdated
Show resolved
Hide resolved
...java/org/opensearch/searchrelevance/transport/stats/SearchRelevanceStatsTransportAction.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/searchrelevance/stats/SearchRelevanceStatsInput.java
Outdated
Show resolved
Hide resolved
|
Thanks for the comments all.
Yes, I'm still catching up on context here of what the common operations users of this plugin are doing. I was planning on implementing the framework first so that way the other maintainers with more context could implement stats using their best judgment. @epugh do you have some more examples of things that would be useful to understand how the plugin is being used?
We've discussed this before on neural side, having a common package would be ideal, but there's a lot of refactoring that has to be done to make that happen. Currently each plugin implements stats APIs with their own business logic indepedently. Given the turn-around time for initial release here in 3.1 here, we can implement like this for now and refactor in the future. |
Signed-off-by: Andy Qin <[email protected]> # Conflicts: # src/test/java/org/opensearch/searchrelevance/plugin/SearchRelevancePluginTests.java # Conflicts: # src/main/java/org/opensearch/searchrelevance/judgments/ImportJudgmentsProcessor.java # src/main/java/org/opensearch/searchrelevance/judgments/LlmJudgmentsProcessor.java # src/main/java/org/opensearch/searchrelevance/judgments/UbiJudgmentsProcessor.java # Conflicts: # src/main/java/org/opensearch/searchrelevance/judgments/ImportJudgmentsProcessor.java # src/main/java/org/opensearch/searchrelevance/judgments/LlmJudgmentsProcessor.java # src/main/java/org/opensearch/searchrelevance/judgments/UbiJudgmentsProcessor.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add entry to changelog. Othere than that PR looks good to me
Signed-off-by: Andy Qin <[email protected]>
Signed-off-by: Andy Qin <[email protected]>
|
@martin-gaievski @epugh @fen-qin updated changelog. Also added stats to track experiment type executions. Here's the result after running the demo.sh script: |
fen-qin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
|
Before merging let me also port the changes from opensearch-project/neural-search#1360 |
Signed-off-by: Andy Qin <[email protected]>
|
|
||
| private void recordStats(Experiment experiment) { | ||
| EventStatsManager.increment(EventStatName.EXPERIMENT_EXECUTIONS); | ||
| Optional.ofNullable(experimentTypeIncrementers.get(experiment.type())).ifPresent(Runnable::run); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this same pattern for other plugins, this total number of executions is effectively the sum of all metrics for individual types? This saves us efforts for aggregating them outside OS but increases numbers of KPIs we're collecting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's on a case by case basis but here it's the same pattern, the idea for this API design is to reduce external aggregations when possible. For hybrid query stats in neural search for example, we track the total number of normalization processor executions in addition to the normalization/combination technique breakdowns, see here
In this case the primary way users will be interacting with SRW is by running experiments. So I think have a coarse grained metric like this is worthwhile, even if we can aggregate the info in other ways
|
Ready to merge after approvals |
martin-gaievski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thank you!
* Add stats API Signed-off-by: Andy Qin <[email protected]> * Rename all stats references to search relevance Signed-off-by: Andy Qin <[email protected]> * Add experiment stats Signed-off-by: Andy Qin <[email protected]> * Add include category parameters Signed-off-by: Andy Qin <[email protected]> --------- Signed-off-by: Andy Qin <[email protected]>
Description
Adds stats API framework to search relevance backend. This is largely ported from the design in opensearch-project/neural-search#1196.
This stats API is default enabled.
Framework should be merged first, then additional stats can be added based on the examples.
API details
Path Parameters
nodes: specify node ids to retrieve stats from (default all)stats: specify stat names to retrieve (default all)Query Parameters
include_metadata: boolean, include recent_interval/stat_type/minutes_since_last_event (default false)flat_stat_paths: boolean, flatten the JSON response (default false)Example calls
Cluster level setting to disable Stats API/Collection
Example response:
Issues Resolved
Resolves #47
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.