Adds hashing algorithm for buffered producer#30012
Adds hashing algorithm for buffered producer#30012conniey merged 7 commits intoAzure:conniey/feature/buffered-producerfrom
Conversation
|
This pull request is protected by Check Enforcer. What is Check Enforcer?Check Enforcer helps ensure all pull requests are covered by at least one check-run (typically an Azure Pipeline). When all check-runs associated with this pull request pass then Check Enforcer itself will pass. Why am I getting this message?You are getting this message because Check Enforcer did not detect any check-runs being associated with this pull request within five minutes. This may indicate that your pull request is not covered by any pipelines and so Check Enforcer is correctly blocking the pull request being merged. What should I do now?If the check-enforcer check-run is not passing and all other check-runs associated with this PR are passing (excluding license-cla) then you could try telling Check Enforcer to evaluate your pull request again. You can do this by adding a comment to this pull request as follows: What if I am onboarding a new service?Often, new services do not have validation pipelines associated with them, in order to bootstrap pipelines for a new service, you can issue the following command as a pull request comment: |
| final Hashed hashed = computeHash(bytes, 0, 0); | ||
| final int i = hashed.getHash1() ^ hashed.getHash2(); | ||
|
|
||
| return Integer.valueOf(i).shortValue(); |
There was a problem hiding this comment.
Does it include a case of truncating of a large int? If so, we might want to make sure Java does same truncating as .NET in backend (likely so, but better to be sure).
PS: currently this PartitionResolver is not used?
|
LGTM. I'm curious why .NET choose this hashing algorithm and implements it in the code, it there a issue for it? |
liukun-msft
left a comment
There was a problem hiding this comment.
LGTM. Just add one comment needs to validate the results.
| final Hashed hashed = computeHash(bytes, 0, 0); | ||
| final int i = hashed.getHash1() ^ hashed.getHash2(); | ||
|
|
||
| return Integer.valueOf(i).shortValue(); |
There was a problem hiding this comment.
.NET cast uint to short, but here cast int to short. I afraid it will take the sign bit when it is a negative value, and the return value will be different?
However, the the return value will be calculated by Math.abs(hashValue % partitions.length), not sure eventually the index result is same.
There was a problem hiding this comment.
Yeah. I did some digging into this. Only matters if you're doing mathematical operations on the value. https://stackoverflow.com/a/9854205/4220757.
Hey Zejia, this is the algorithm that the service team told us they use.... We'd have to ask the service team why they chose this. |
|
/azp run java - eventhubs |
|
Azure Pipelines successfully started running 1 pipeline(s). |
* Adding implementation of PartitionResolver * Adding tests for PartitionResolver. * Using unsigned right shift operator. * Adding concurrent test cases. * Add CHANGELOG
* Adding initial Buffered Producer APIs with documentation. (#29525) * Adding skeleton classes. * Adding documentation to buffered producer models. * Adding minor implementation for Event HubBufferedProducerClientBuilder. * Adding some documentation to Event Hub Buffered Producer Client Builder. * Adding suppression for PagedFlux * Adds hashing algorithm for buffered producer (#30012) * Adding implementation of PartitionResolver * Adding tests for PartitionResolver. * Using unsigned right shift operator. * Adding concurrent test cases. * Add CHANGELOG * Adding internal classes to support buffered producer. (#30078) * Updating property name to getMaxEventBufferLengthPerPartition * Adds an aggregator that publishes events based on full batches and time. * Adding implementation for publishing events for a partition. * Adding license info. * Adding inline mock maker for final classes. * Fixing recursive subscription. * Do not get additional batches if the current instance is completed. * Rename to EventHubBufferedPartitionProducer * Applying retryWhen policy to Aggregator. * Remove unused EventData field * Adding documentation to mock helper. * Hack around the blocking call. * Adding tests for EventHubBufferedPartitionProducerTest * Populating PartitionProcessors for each partition id. * Creating private static class to hold PublishResults from buffered producer. * Adding tests to ensure upstream requests are respected. * Add suppression for fallthrough * Use --no-cone in pipeline sparse checkout script (#29905) Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com> * Adding comments for cases meant to fallthrough * Wrap synchronous lock. * Adding documentation to EventDataAggregatora Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com> * Connecting EventHubBufferedProducerClient to internal classes. (#30194) * Adds flush to PartitionProcessor. * Adds implementation to methods for async client. * Fixes build breaks. * Remove idempotent retries for next release. * Connect asynchronous client. * Closing producer client after test case. * Make default retry options package-private. * Adding integration test for BufferedProducerAsyncClient. * Adding partition id to EventDataAggregator * Adding defaults to builder and documentation. * Remove localizedBy usages. * Fixing switchMap logic. * Have PartitionProducer throw an UncheckedInterruptedException when a batch creation occurs while the thread is interrrupted. * Update PublishResultSubscriber to clear remaining queue items when closed. * Delay computation of EventDataAggregator to prevent multiple instances being created. * Adding support for flush(). * Add emitResult constant. * Complete EventDataAggregator when it is cancelled. * Implement flush and clean up logging. * Add implementation for sync-client. * Making non-implemented methods package-private. * Add changelog entry. Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com>
…#30224) * Adding initial Buffered Producer APIs with documentation. (Azure#29525) * Adding skeleton classes. * Adding documentation to buffered producer models. * Adding minor implementation for Event HubBufferedProducerClientBuilder. * Adding some documentation to Event Hub Buffered Producer Client Builder. * Adding suppression for PagedFlux * Adds hashing algorithm for buffered producer (Azure#30012) * Adding implementation of PartitionResolver * Adding tests for PartitionResolver. * Using unsigned right shift operator. * Adding concurrent test cases. * Add CHANGELOG * Adding internal classes to support buffered producer. (Azure#30078) * Updating property name to getMaxEventBufferLengthPerPartition * Adds an aggregator that publishes events based on full batches and time. * Adding implementation for publishing events for a partition. * Adding license info. * Adding inline mock maker for final classes. * Fixing recursive subscription. * Do not get additional batches if the current instance is completed. * Rename to EventHubBufferedPartitionProducer * Applying retryWhen policy to Aggregator. * Remove unused EventData field * Adding documentation to mock helper. * Hack around the blocking call. * Adding tests for EventHubBufferedPartitionProducerTest * Populating PartitionProcessors for each partition id. * Creating private static class to hold PublishResults from buffered producer. * Adding tests to ensure upstream requests are respected. * Add suppression for fallthrough * Use --no-cone in pipeline sparse checkout script (Azure#29905) Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com> * Adding comments for cases meant to fallthrough * Wrap synchronous lock. * Adding documentation to EventDataAggregatora Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com> * Connecting EventHubBufferedProducerClient to internal classes. (Azure#30194) * Adds flush to PartitionProcessor. * Adds implementation to methods for async client. * Fixes build breaks. * Remove idempotent retries for next release. * Connect asynchronous client. * Closing producer client after test case. * Make default retry options package-private. * Adding integration test for BufferedProducerAsyncClient. * Adding partition id to EventDataAggregator * Adding defaults to builder and documentation. * Remove localizedBy usages. * Fixing switchMap logic. * Have PartitionProducer throw an UncheckedInterruptedException when a batch creation occurs while the thread is interrrupted. * Update PublishResultSubscriber to clear remaining queue items when closed. * Delay computation of EventDataAggregator to prevent multiple instances being created. * Adding support for flush(). * Add emitResult constant. * Complete EventDataAggregator when it is cancelled. * Implement flush and clean up logging. * Add implementation for sync-client. * Making non-implemented methods package-private. * Add changelog entry. Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com>
* Remerge "Using new windowTimeout operator with backpressure support" (#30111) * Using new windowTimeout operator with backpressure support * Using Flux.class directly for windowTimeout lookup, using azure core libs defined to use v4.3.19 reactor-core * Improving reflection logic and moving it to ReactorShim * Replace inline reflection with ReactorShim * Adding ReactorShim test for backpressure aware windowTimeout. * Changelog update for ReactorShim * Using ConcurrentLinkedQueue to avoid ConcurrentModificationException while iterating array. Co-authored-by: Anu Thomas Chandy <anuamd@hotmail.com> * Merge buffered producer into release/azure-messaging-eventhubs (#30224) * Adding initial Buffered Producer APIs with documentation. (#29525) * Adding skeleton classes. * Adding documentation to buffered producer models. * Adding minor implementation for Event HubBufferedProducerClientBuilder. * Adding some documentation to Event Hub Buffered Producer Client Builder. * Adding suppression for PagedFlux * Adds hashing algorithm for buffered producer (#30012) * Adding implementation of PartitionResolver * Adding tests for PartitionResolver. * Using unsigned right shift operator. * Adding concurrent test cases. * Add CHANGELOG * Adding internal classes to support buffered producer. (#30078) * Updating property name to getMaxEventBufferLengthPerPartition * Adds an aggregator that publishes events based on full batches and time. * Adding implementation for publishing events for a partition. * Adding license info. * Adding inline mock maker for final classes. * Fixing recursive subscription. * Do not get additional batches if the current instance is completed. * Rename to EventHubBufferedPartitionProducer * Applying retryWhen policy to Aggregator. * Remove unused EventData field * Adding documentation to mock helper. * Hack around the blocking call. * Adding tests for EventHubBufferedPartitionProducerTest * Populating PartitionProcessors for each partition id. * Creating private static class to hold PublishResults from buffered producer. * Adding tests to ensure upstream requests are respected. * Add suppression for fallthrough * Use --no-cone in pipeline sparse checkout script (#29905) Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com> * Adding comments for cases meant to fallthrough * Wrap synchronous lock. * Adding documentation to EventDataAggregatora Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com> * Connecting EventHubBufferedProducerClient to internal classes. (#30194) * Adds flush to PartitionProcessor. * Adds implementation to methods for async client. * Fixes build breaks. * Remove idempotent retries for next release. * Connect asynchronous client. * Closing producer client after test case. * Make default retry options package-private. * Adding integration test for BufferedProducerAsyncClient. * Adding partition id to EventDataAggregator * Adding defaults to builder and documentation. * Remove localizedBy usages. * Fixing switchMap logic. * Have PartitionProducer throw an UncheckedInterruptedException when a batch creation occurs while the thread is interrrupted. * Update PublishResultSubscriber to clear remaining queue items when closed. * Delay computation of EventDataAggregator to prevent multiple instances being created. * Adding support for flush(). * Add emitResult constant. * Complete EventDataAggregator when it is cancelled. * Implement flush and clean up logging. * Add implementation for sync-client. * Making non-implemented methods package-private. * Add changelog entry. Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com> * Update CHANGELOG.md * Fix CHANGELOG. Co-authored-by: Anu Thomas Chandy <anuamd@hotmail.com> Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Ben Broderick Phillips <bebroder@microsoft.com>
Description
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines