[feat][io] AzureDataExplorer/Kusto Sink for Pulsar #22006

Merged Feb 13, 2024 (5 commits)

Conversation

asaharn
Contributor

@asaharn asaharn commented Jan 31, 2024

Motivation

This PR introduces an Azure Data Explorer (ADX, also known as Kusto) sink connector for Apache Pulsar, so that data can be ingested from Pulsar into ADX clusters.

Modifications

This PR adds a new Azure Data Explorer (ADX) sink connector for Apache Pulsar. It includes the sink implementation (`ADXSink`), its configuration class (`ADXSinkConfig`), supporting classes (`ADXPulsarEvent`, `ADXSinkUtils`), and unit and end-to-end tests.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change added tests and can be verified as follows:

  • Added end-to-end tests for the sink. To run them, set the following environment variables:
      export kustoAadAppId=""
      export kustoAadAppSecret=""
      export kustoAadAuthorityID="<tenantId>"
      export kustoDatabase="<dbname>"
      export kustoCluster="https://ingest-<cluster-name>.kusto.windows.net"
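
Before launching the end-to-end tests, it can help to confirm that all five variables listed above are set. The following is a minimal sketch of such a pre-flight check. The class `KustoE2EEnv` and its `missing` method are hypothetical names used for illustration and are not part of this PR; how the actual tests read these variables is not shown in this conversation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of the PR): reports which of the
// environment variables named in the PR description are unset or blank.
final class KustoE2EEnv {

    // Variable names copied from the PR description.
    static final List<String> REQUIRED = List.of(
            "kustoAadAppId",
            "kustoAadAppSecret",
            "kustoAadAuthorityID",
            "kustoDatabase",
            "kustoCluster");

    // Returns the required names that are absent or blank in the given
    // environment map, in the order they are declared above.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All Kusto E2E variables are set.");
        } else {
            System.out.println("Missing: " + String.join(", ", missing));
        }
    }
}
```

Taking the environment as a `Map` argument, rather than calling `System.getenv()` inside `missing`, keeps the check easy to exercise without changing the real process environment.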
    
    

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: asaharn#2

…tps://learn.microsoft.com/en-us/azure/data-explorer/) (#2)

* Added ADX Sink for Pulsar
* Added test cases and E2E tests
* Formatting and refactoring
---------

Co-authored-by: Abhishek Saharn <[email protected]>

@asaharn Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@github-actions bot added the doc-required label (Your PR changes impact docs and you will update later) and removed the doc-label-missing label on Jan 31, 2024
@asaharn changed the title from "[feat][io] New component. AzureDataEXplorer/Kusto sink for pulsar (https://learn.microsoft.com/en-us/azure/data-explorer/)" to "[feat][io] New component. AzureDataExplorer/Kusto sink for pulsar (https://learn.microsoft.com/en-us/azure/data-explorer/)" on Feb 2, 2024
@github-actions bot added the doc-not-needed label (Your PR changes do not impact docs) and removed the doc-required label on Feb 7, 2024
Member

@lhotari lhotari left a comment

Thanks for the contribution @asaharn ! Looks really good in general. A few small details to revisit.

Member

@lhotari lhotari left a comment

There's one more change needed for handling secrets. I'm sorry I missed that in the initial review.

Contributor

@david-streamlio david-streamlio left a comment

LGTM

@asaharn
Contributor Author

asaharn commented Feb 13, 2024

Thanks @david-streamlio.
@lhotari Made the changes for handling secrets.

Member

@lhotari lhotari left a comment

A few comments about .gitignore and the mockito-inline dependency that is no longer needed in Mockito 5.x.x.

@codecov-commenter

Codecov Report

Attention: 156 lines in your changes are missing coverage. Please review.

Comparison is base (1b4127a) 36.44% compared to head (6248a25) 73.58%.
Report is 21 commits behind head on master.

Additional details and impacted files

@@              Coverage Diff              @@
##             master   #22006       +/-   ##
=============================================
+ Coverage     36.44%   73.58%   +37.14%     
- Complexity    12369    32572    +20203     
=============================================
  Files          1727     1874      +147     
  Lines        131879   139220     +7341     
  Branches      14419    15260      +841     
=============================================
+ Hits          48057   102445    +54388     
+ Misses        77404    28862    -48542     
- Partials       6418     7913     +1495     
Flag Coverage Δ
inttests 24.67% <ø> (+0.48%) ⬆️
systests 24.36% <ø> (+0.46%) ⬆️
unittests 72.86% <13.81%> (+40.97%) ⬆️

Flags with carried forward coverage won't be shown.

Files Coverage Δ
...che/pulsar/io/azuredataexplorer/ADXSinkConfig.java 86.20% <86.20%> (ø)
...he/pulsar/io/azuredataexplorer/ADXPulsarEvent.java 0.00% <0.00%> (ø)
...ache/pulsar/io/azuredataexplorer/ADXSinkUtils.java 0.00% <0.00%> (ø)
...rg/apache/pulsar/io/azuredataexplorer/ADXSink.java 0.00% <0.00%> (ø)

... and 1447 files with indirect coverage changes

@lhotari
Member

lhotari commented Feb 13, 2024

Thanks for the great work @asaharn . Merging this.

@lhotari lhotari merged commit beed0cf into apache:master Feb 13, 2024
49 of 50 checks passed
@lhotari changed the title from "[feat][io] New component. AzureDataExplorer/Kusto sink for pulsar (https://learn.microsoft.com/en-us/azure/data-explorer/)" to "[feat][io] AzureDataExplorer/Kusto Sink for Pulsar" on Feb 13, 2024
Labels
doc-not-needed Your PR changes do not impact docs
5 participants