-
Notifications
You must be signed in to change notification settings - Fork 686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GEODE-9602: QueryObserver improvements. #6874
base: develop
Are you sure you want to change the base?
Conversation
bf0c44c
to
2a617f7
Compare
- Make QueryObserverHolder thread-safe - Allow having an observer per query by means of setting the observer in the query at the start of the execution. - Invoke beforeIterationEvaluation and afterIterationEvaluation callbacks when query is using indexes.
2a617f7
to
d32b31c
Compare
This pull request fixes 2 alerts when merging d32b31c into d0113fc - view on LGTM.com fixed alerts:
|
verify(myQueryObserver, times(0)).beforeIterationEvaluation(any(), any()); | ||
verify(myQueryObserver, times(0)).afterIterationEvaluation(any()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also use never()
:
verify(myQueryObserver, never()).beforeIterationEvaluation(any(), any());
verify(myQueryObserver, never()).afterIterationEvaluation(any());
return _instance != NO_OBSERVER; | ||
} | ||
|
||
/** Return the current QueryObserver instance */ | ||
public static QueryObserver getInstance() { | ||
public static synchronized QueryObserver getInstance() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've wanted to make this class thread-safe too. There is one problem with doing so though. getInstance()
is in the path of every query execution and index evaluate call. If you make it synchronized
, then there will be a significant performance penalty inflicted on every query and index evaluation.
To see this, look at the call hierarchy of constructing a new ExecutionConext. Every new instance will invoke:
private QueryObserver observer = QueryObserverHolder.getInstance();
In order to make this class thread-safe, I think we need to solve the problem of how to get queries and index evaluations from hitting those synchronized
blocks on every invocation. I'm not as familiar with querying or indexing so you may need to chat about it further with Anil or others.
Another problem is that the new private QueryObserver observer
field in ExecutionContext
is itself not thread-safe, so the setter and getter you added are not thread-safe either.
I think your intention is to have a test invoke setObserver
for one or more direct invocations of a query. If that's true, then you're probably better off introducing a ThreadLocal
to ExecutionContext
for this purpose. Then the execution would check that ThreadLocal
for a non-null value and then use it. If it's null, the execution would fallback to using the usual observer
stored in the field.
Then you could make the field itself final
in the hopes that it would be used multiple times. But again, I suspect that every query invocation creates a new ExecutionContext
which will perform poorly because of the synchronized
in QueryObserverHolder
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is not going to achieve what you are looking for:
The QueryObserverHolder class is not thread-safe.
As Kirk pointed out, you have to make this as a ThreadLocal and set/reset while in use.
Other option could be making this as non-static...If the query context is available in all the places where query-observer is getting used, you can think of creating a new QueryObserver and setting it on the context.
QueryContext.setObserver(new QueryObserver()); And no static implementation/use of query observer.
I did not see any test changes where concurrent queries are using QueryObserver...It will show if the query observers are cross referenced...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've wanted to make this class thread-safe too. There is one problem with doing so though.
getInstance()
is in the path of every query execution and index evaluate call. If you make itsynchronized
, then there will be a significant performance penalty inflicted on every query and index evaluation.To see this, look at the call hierarchy of constructing a new ExecutionConext. Every new instance will invoke:
private QueryObserver observer = QueryObserverHolder.getInstance();In order to make this class thread-safe, I think we need to solve the problem of how to get queries and index evaluations from hitting those
synchronized
blocks on every invocation. I'm not as familiar with querying or indexing so you may need to chat about it further with Anil or others.
I would expect this to be a problem if the above calls are done many times per query. But in this PR, it is proposed to invoke it just once, to get the value and then set it in the ExecutionContext
, right before the query is executed. Later, when the query needs the value of the observer, it would get it from the ExecutionContext
using a non-synchronized call given that all the calls to get the observer have been changed from QueryObserverHolder.getInstance()
to context.getObserver()
.
Another problem is that the new
private QueryObserver observer
field inExecutionContext
is itself not thread-safe, so the setter and getter you added are not thread-safe either.
I am assuming that the ExecutionContext
is not shared between threads and therefore it does not need to be thread-safe. I am wrong here?
I think your intention is to have a test invoke
setObserver
for one or more direct invocations of a query. If that's true, then you're probably better off introducing aThreadLocal
toExecutionContext
for this purpose. Then the execution would check thatThreadLocal
for a non-null value and then use it. If it's null, the execution would fallback to using the usualobserver
stored in the field.
Actually, my intention was just to make this class thread-safe but not extend it to have new usages. Nevertheless, the class is documented and used so that the observer can be set from any thread and be available to any thread in the VM.
I have proposed the current solution but have also considered the use of a ThreadLocal
variable, not in the ExecutionContext
class (which did not have an observer
field) but in the QueryObserverHolder
class to be used in a way similar to what you are proposing.
But I think the lifecycle of the ThreadLocal
variable would be complex. It should be reset after the query execution. But who would do it?
Then you could make the field itself
final
in the hopes that it would be used multiple times. But again, I suspect that every query invocation creates a newExecutionContext
which will perform poorly because of thesynchronized
inQueryObserverHolder
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is not going to achieve what you are looking for:
The QueryObserverHolder class is not thread-safe.
As Kirk pointed out, you have to make this as a ThreadLocal and set/reset while in use.
Other option could be making this as non-static...If the query context is available in all the places where query-observer is getting used, you can think of creating a new QueryObserver and setting it on the context.
QueryContext.setObserver(new QueryObserver()); And no static implementation/use of query observer.
That's actually what I tried to do with this proposal. Am I missing something here?
I did not see any test changes where concurrent queries are using QueryObserver...It will show if the query observers are cross referenced...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me like you have that covered Alberto. By passing around context, it looks to me like you have addressed many if not all the thread safety issues. I tend to agree with Kirk about preferring and atomic to a synchronize, though.
@@ -54,26 +54,26 @@ | |||
/** | |||
* Set the given observer to be notified of query events. Returns the current observer. | |||
*/ | |||
public static QueryObserver setInstance(QueryObserver observer) { | |||
public static synchronized QueryObserver setInstance(QueryObserver observer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than using synchronized
, you'd be better off making _instance
a private static final AtomicReference<QueryObserver>
. The use of synchronized
will cause at least some JVM implementations to pin every thread hitting it to the same CPU.
Using atomics is generally always a better approach if you can use it.
try { | ||
if (ok && runtimeItr != null) { | ||
runtimeItr.setCurrent(o1); | ||
observer.beforeIterationEvaluation(iterOp, o1); | ||
ok = QueryUtils.applyCondition(iterOp, context); | ||
} | ||
} finally { | ||
observer.afterIterationEvaluation(ok); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should try to extract the blocks that look like this to their own private methods. There is some repetition involved that you may be able to reduce down to one method. The code calling these methods will also become more readable.
try { | ||
if (ok && runtimeItr != null) { | ||
observer.beforeIterationEvaluation(iterOps, value); | ||
ok = QueryUtils.applyCondition(iterOps, context); | ||
} | ||
} finally { | ||
observer.afterIterationEvaluation(ok); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class also has several try-finally blocks like that this may benefit by being extracted as a new private method both for readability and to hopefully reduce some of the repetition in these blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving this PR, but I share Kirk's concerns over query performance.
@@ -296,8 +319,11 @@ public void testTraceOnLocalRegionWithTracePrefixNoComments() throws Exception { | |||
assertTrue(((DefaultQuery) query).isTraced()); | |||
|
|||
SelectResults results = (SelectResults) query.execute(); | |||
assertTrue(QueryObserverHolder.getInstance() instanceof IndexTrackingQueryObserver); | |||
// The query should return all elements in region. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems to have been accidentally removed.
return _instance != NO_OBSERVER; | ||
} | ||
|
||
/** Return the current QueryObserver instance */ | ||
public static QueryObserver getInstance() { | ||
public static synchronized QueryObserver getInstance() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is not going to achieve what you are looking for:
The QueryObserverHolder class is not thread-safe.
As Kirk pointed out, you have to make this as a ThreadLocal and set/reset while in use.
Other option could be making this as non-static...If the query context is available in all the places where query-observer is getting used, you can think of creating a new QueryObserver and setting it on the context.
QueryContext.setObserver(new QueryObserver()); And no static implementation/use of query observer.
I did not see any test changes where concurrent queries are using QueryObserver...It will show if the query observers are cross referenced...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the use of Atomics is preferable to Synchronize. This takes one extra step elsewhere but encapsulates and manages synchronization challenges. Other than that I think this looks good.
@@ -366,7 +396,11 @@ public void testTraceOnLocalRegionWithSmallTracePrefixNoComments() throws Except | |||
assertTrue(((DefaultQuery) query).isTraced()); | |||
|
|||
SelectResults results = (SelectResults) query.execute(); | |||
assertTrue(QueryObserverHolder.getInstance() instanceof IndexTrackingQueryObserver); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests seem awfully repetitive. Is there a reason to check the test hook in every test? From my perspective it seems like a waste of CPU. Just my opinion though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is the way to check that the trace
directive in the query has taken effect when set or that trace is not being done when it is not specified.
return _instance != NO_OBSERVER; | ||
} | ||
|
||
/** Return the current QueryObserver instance */ | ||
public static QueryObserver getInstance() { | ||
public static synchronized QueryObserver getInstance() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me like you have that covered Alberto. By passing around context, it looks to me like you have addressed many if not all the thread safety issues. I tend to agree with Kirk about preferring and atomic to a synchronize, though.
This pull request introduces 1 alert and fixes 2 when merging 52fe699 into 19f55ad - view on LGTM.com new alerts:
fixed alerts:
|
52fe699
to
96a15bc
Compare
This pull request fixes 2 alerts when merging 96a15bc into cf1b35c - view on LGTM.com fixed alerts:
|
@@ -237,6 +238,9 @@ private DataCommandResult select(InternalCache cache, Object principal, String q | |||
if (queryObserver != null) { | |||
QueryObserverHolder.reset(); | |||
} | |||
if (tracedQuery.isTraced()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this block is already inside the if (tracedQuery.isTraced())
condition, is this if
necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not. This finally block is outside any if (tracedQuery.isTraced())
condition.
Maybe the way to go here is to rewrite |
I like that idea. The problem is that there are test cases that count on setting the observer globally and then running a query from the client. Take a look for example at: I have been able to change the test cases in I am also worried about the use done by the We could also go for having two variables in the |
@albertogpz @jhuynh1 It might even be possible to support multiple QueryObservers and invoke them all for each query. So here are some suggestions from Jason and me: Have the test set a global in the server(s). When the server starts a The query execution then adds the instance of Don't forget to call
It might even be possible to avoid the |
To add to Kirk's prior suggestion (some suggestions are probably duplicated as we are currently discussing) The main difference between the two is the use of vs possible removal of the threadlocal We might be able to avoid the thread local entirely if we add a method to the QueryObserver interface such as createObserver() where it is expected to pass back a unique instance of an observer. Improvements to reuses/pool these can be made to reduce garbage later. Where we currently do start query, perhaps call this new createObserver() method and stuff it into the execution context. ( geode/geode-core/src/main/java/org/apache/geode/cache/query/internal/DefaultQuery.java Line 342 in 7dd6143
Wherever we currently do QueryObserverHolder.getInstance(), instead do context.getObserver() There are a few locations of QueryObserverHolder.getInstance() like OrderbyComparator ( geode/geode-core/src/main/java/org/apache/geode/cache/query/internal/OrderByComparator.java Line 116 in 7dd6143
The execution context and observer should be gc'd eventually once the execution context is no longer being used. There is probably more I am missing but thought I'd put this down as a suggestion |
this PR appears to be abandoned, can it be closed? |
It is not yet abandoned. It is pending from a discussion with @kirklund about it. |
in the query at the start of the execution.
query is using indexes.
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
Has your PR been rebased against the latest commit within the target branch (typically
develop
)?Is your initial contribution a single, squashed commit?
Does
gradlew build
run cleanly?Have you written or updated unit tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?