enhancement(azure_blob sink): update to latest azure SDK#13453

Merged
spencergilbert merged 4 commits into vectordotdev:master from yvespp:yvespp_update_to_latest_azure_sdk
Oct 3, 2022

Conversation

@yvespp
Contributor

@yvespp yvespp commented Jul 6, 2022

Update the Azure SDK to fix a hang when Azure Blob sink was used with managed identity: Azure/azure-sdk-for-rust#808

There were quite a few changes in the SDK recently, which required some code updates.
The conversion from the SDK's http-types::StatusCode to Vector's hyper::StatusCode in the error handling is a bit awkward and could perhaps be implemented more cleanly.

@netlify

netlify bot commented Jul 6, 2022

Deploy Preview for vector-project ready!

Name Link
🔨 Latest commit d7f2b05
🔍 Latest deploy log https://app.netlify.com/sites/vector-project/deploys/6336f56e6e461f00084592c6
😎 Deploy Preview https://deploy-preview-13453--vector-project.netlify.app

@github-actions github-actions bot added the domain: sinks Anything related to the Vector's sinks label Jul 6, 2022
match request {
    let resp: crate::Result<()> = match response {
        Ok(_) => Ok(()),
        Err(reason) => Err(match reason.downcast_ref::<HttpError>() {
You shouldn't need to downcast_ref to check the status code anymore, which should significantly simplify this code.

Check out the example missing_blob.rs that shows how to check the status code specifically.

Contributor Author

Thanks for the input. I tried to change the code but ran into a problem I don't know how to solve:

pub fn build_healthcheck(
    container_name: String,
    client: Arc<ContainerClient>,
) -> crate::Result<Healthcheck> {
    let healthcheck = async move {
        let response = client.get_properties().into_future().await;

        let resp: crate::Result<()> = match response {
            Ok(_) => Ok(()),
            Err(reason) => match reason.kind() {
                ErrorKind::HttpResponse {
                    status: azure_core::StatusCode::Forbidden,
                    ..
                } => Err(Box::new(HealthcheckError::InvalidCredentials)),
                ErrorKind::HttpResponse {
                    status: azure_core::StatusCode::NotFound,
                    ..
                } => Err(Box::new(HealthcheckError::UnknownContainer {
                    container: container_name,
                })),
                ErrorKind::HttpResponse { status, .. } => {
                    match StatusCode::from_u16(status.into()) {
                        Ok(status) => Err(Box::new(HealthcheckError::Unknown { status })),
                        Err(_) => Err("Unknown status code".into()),
                    }
                }
                _ => Err(reason.into()),
            },
        };
        resp
    };

    Ok(healthcheck.boxed())
}

The problem is that I can't convert the azure_core::StatusCode into the http::StatusCode required for HealthcheckError::Unknown. On the line StatusCode::from_u16(status.into()) I get an error that it can't be converted to u16.
I'm still new to Rust, and I'm surprised this doesn't work, as it does with http_types::StatusCode.
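Setting the conversion question aside, the branching itself can be sketched without the SDK types. In this dependency-free sketch the raw `u16` stands in for `azure_core::StatusCode` (how to extract that `u16` from the SDK type is exactly the open question above), and `HealthcheckError` is a simplified stand-in for Vector's actual error type:

```rust
// Simplified, dependency-free stand-ins for the types used in the snippet above.
#[derive(Debug, PartialEq)]
enum HealthcheckError {
    InvalidCredentials,
    UnknownContainer { container: String },
    Unknown { status: u16 },
}

// Map a raw status code to a healthcheck error, mirroring the match arms above:
// 403 => invalid credentials, 404 => unknown container, anything else => unknown.
fn classify(status: u16, container: &str) -> HealthcheckError {
    match status {
        403 => HealthcheckError::InvalidCredentials,
        404 => HealthcheckError::UnknownContainer {
            container: container.to_string(),
        },
        other => HealthcheckError::Unknown { status: other },
    }
}

fn main() {
    assert_eq!(classify(403, "logs"), HealthcheckError::InvalidCredentials);
    println!("{:?}", classify(500, "logs"));
}
```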

let result: Result<PutBlockBlobResponse, Self::Error> = blob
    .into_future()
    .inspect_err(|reason| {
        match reason.downcast_ref::<HttpError>() {

Similar to above, you don't need to use downcast_ref here, which will simplify this code.

@bmc-msft

bmc-msft commented Jul 6, 2022

I noticed Vector includes a global retry mechanism that appears to be used for all storage APIs.

With the Azure SDK's move to the pipeline architecture, the SDK will retry requests automatically using an exponential backoff policy: starting with an 800ms delay, retrying up to 3 times, with delays of up to 60s.
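For illustration, the delay schedule described above can be sketched in plain Rust. This is a sketch only: the real azure_core policy may also apply jitter, so the exact per-attempt delays here are an assumption, not the SDK's implementation.

```rust
use std::time::Duration;

// Sketch of an exponential backoff per the description above: 800ms initial
// delay, doubling on each attempt, capped at 60s. (The actual SDK policy may
// add jitter; treat these exact values as illustrative.)
fn backoff_delay(attempt: u32) -> Duration {
    let base = Duration::from_millis(800);
    let cap = Duration::from_secs(60);
    base.checked_mul(2u32.checked_pow(attempt).unwrap_or(u32::MAX))
        .unwrap_or(cap)
        .min(cap)
}

fn main() {
    // attempt 0 -> 800ms, 1 -> 1.6s, 2 -> 3.2s, large attempts saturate at 60s.
    for attempt in 0..4 {
        println!("attempt {attempt}: {:?}", backoff_delay(attempt));
    }
}
```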

@yvespp
Copy link
Contributor Author

yvespp commented Jul 7, 2022

I noticed Vector includes a global retry mechanism that appears to be used for all storage APIs.

With the Azure SDK's move to the pipeline architecture, the SDK will retry requests automatically using an exponential backoff policy: starting with an 800ms delay, retrying up to 3 times, with delays of up to 60s.

My guess is Vector wants to manage the retries itself to have more control over it. Maybe someone from the Vector team can comment on how to deal with that?
Can retry be disabled in the SDK?

@jszwedko
Collaborator

I noticed Vector includes a global retry mechanism that appears to be used for all storage APIs.
With the Azure SDK's move to the pipeline architecture, the SDK will retry requests automatically using an exponential backoff policy: starting with an 800ms delay, retrying up to 3 times, with delays of up to 60s.

My guess is Vector wants to manage the retries itself to have more control over it. Maybe someone from the Vector team can comment on how to deal with that? Can retry be disabled in the SDK?

Yeah, that's correct, we would like to retain control over the retry behavior to have it consistently implemented for all integrations. We do this, for example, with the AWS SDK by disabling its internal retry behavior. It would be great if we could do that here as well!

@bmc-msft

The Azure SDK's storage crates do not currently expose a mechanism to disable the retry behavior. Our intent is to add it before the next release.

@tobz tobz added the meta: blocked Anything that is blocked to the point where it cannot be worked on. label Aug 1, 2022
@tobz
Contributor

tobz commented Aug 1, 2022

Sounds like we're blocked then until the Azure SDK exposes the ability to control the retry behavior.

@tobz tobz self-assigned this Aug 1, 2022
@yvespp
Copy link
Contributor Author

yvespp commented Sep 27, 2022

The Azure SDK now provides the ability to disable retries. I've updated this pull request to the latest SDK, and it isn't blocked anymore.
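For reference, a hypothetical sketch of what disabling the SDK's internal retries might look like. `RetryOptions::none()` is the azure_core knob this refers to, but the builder call and the way options are threaded into the blob client depend on the SDK version pinned in Cargo.toml, so treat everything below as an assumption to verify against the SDK docs rather than a confirmed API:

```rust
use azure_core::{ClientOptions, RetryOptions};

// Hypothetical sketch: turn off the SDK's internal retries so Vector's own
// retry layer stays in control. `RetryOptions::none()` is the azure_core
// switch; the `.retry(...)` setter is an assumption and should be checked
// against the pinned SDK version.
fn client_options_without_retry() -> ClientOptions {
    ClientOptions::default().retry(RetryOptions::none())
}
```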

I had a problem with storage_account in azure_blob/config.rs being Option<SensitiveString>. In my testing the client tried to connect to **REDACTED**.blob.core.windows.net, which obviously doesn't work. I changed storage_account to be just Option<String>, and that works. In my opinion the storage account name isn't a sensitive value anyway.

Another problem was with the time dependency (see Cargo.toml for the error). It looks to me like cargo is confused by one- and two-digit patch versions (^0.3.4 excludes all 0.3.1x versions).


@spencergilbert spencergilbert added ci-condition: integration tests enable and removed meta: blocked Anything that is blocked to the point where it cannot be worked on. labels Sep 28, 2022

yvespp and others added 2 commits September 29, 2022 08:34
Signed-off-by: Yves Peter <ypwebstuff@gmail.com>
Co-authored-by: David Schneider <dsbrng25b@gmail.com>
Signed-off-by: yvespp <ypwebstuff@gmail.com>
@yvespp
Contributor Author

yvespp commented Sep 29, 2022

I rebased 20 minutes ago hoping to fix the datadog integration tests, but they still have the same problem: TEST_DATADOG_API_KEY required.
Don't know how my code could have broken this; maybe through a dependency update?

Now two new tests are also broken (elasticsearch and redis) but they look more like temporary failures.

@spencergilbert
Contributor

I rebased 20 minutes ago hoping to fix the datadog integration tests, but they still have the same problem: TEST_DATADOG_API_KEY required. Don't know how my code could have broken this; maybe through a dependency update?

Now two new tests are also broken (elasticsearch and redis) but they look more like temporary failures.

Sorry! That's on us: that secret doesn't get included in contributor PRs, so those jobs always fail. We're looking for a workaround that avoids polluting community PRs, but haven't been able to implement anything yet.

I'm looking to review this PR this week!

@spencergilbert spencergilbert self-assigned this Sep 29, 2022
@spencergilbert spencergilbert self-requested a review September 29, 2022 12:57
@github-actions

Soak Test Results

Baseline: f44a7d5
Comparison: c46153e
Total Vector CPUs: 4

Explanation

A soak test is an integrated performance test for vector in a repeatable rig, with varying configurations for vector. What follows is a statistical summary of a brief vector run for each configuration across the SHAs given above. The goal of these tests is to determine, quickly, whether and to what degree vector performance is changed by a pull request. Where appropriate, units are scaled per core.

The table below, if present, lists those experiments that have experienced a statistically significant change in their throughput performance between the baseline and comparison SHAs, with 90.0% confidence, OR have been detected as newly erratic. Negative values mean that the baseline is faster, positive that the comparison is. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table will be omitted if no interesting changes are observed.

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean >= ±8.87%:

Fine details of change detection per experiment.
| experiment | Δ mean | Δ mean % | confidence | baseline mean | baseline stdev | baseline stderr | baseline outlier % | baseline CoV | comparison mean | comparison stdev | comparison stderr | comparison outlier % | comparison CoV | erratic | declared erratic |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| http_text_to_http_json | 1.84MiB | 4.9 | 100.00% | 37.53MiB | 881.37KiB | 17.99KiB | 0 | 0.0229321 | 39.37MiB | 818.73KiB | 16.71KiB | 0 | 0.0203065 | False | False |
| syslog_loki | 656.86KiB | 4.68 | 100.00% | 13.72MiB | 466.23KiB | 9.53KiB | 0 | 0.0331833 | 14.36MiB | 723.41KiB | 14.71KiB | 0 | 0.0491882 | False | False |
| datadog_agent_remap_blackhole | 2.01MiB | 3.53 | 100.00% | 56.98MiB | 5.08MiB | 105.75KiB | 0 | 0.0890705 | 58.99MiB | 3.68MiB | 76.9KiB | 0 | 0.0624204 | False | False |
| socket_to_socket_blackhole | 620.48KiB | 2.79 | 100.00% | 21.72MiB | 563.04KiB | 11.49KiB | 0 | 0.0253053 | 22.33MiB | 528.56KiB | 10.79KiB | 0 | 0.0231107 | False | False |
| syslog_splunk_hec_logs | 379.86KiB | 2.35 | 100.00% | 15.75MiB | 804.17KiB | 16.38KiB | 0 | 0.0498381 | 16.13MiB | 727.35KiB | 14.83KiB | 0 | 0.0440402 | False | False |
| splunk_hec_route_s3 | 330.85KiB | 1.78 | 100.00% | 18.18MiB | 2.33MiB | 48.5KiB | 0 | 0.128105 | 18.5MiB | 2.22MiB | 46.35KiB | 0 | 0.119773 | False | False |
| datadog_agent_remap_datadog_logs_acks | 909.15KiB | 1.49 | 100.00% | 59.66MiB | 3.05MiB | 63.8KiB | 0 | 0.0511509 | 60.54MiB | 4.26MiB | 88.76KiB | 0 | 0.0704135 | False | False |
| syslog_humio_logs | 209.19KiB | 1.26 | 100.00% | 16.2MiB | 170.18KiB | 3.47KiB | 0 | 0.0102564 | 16.4MiB | 177.52KiB | 3.64KiB | 0 | 0.0105654 | False | False |
| datadog_agent_remap_datadog_logs | 582.59KiB | 0.93 | 100.00% | 61.15MiB | 410.32KiB | 8.4KiB | 0 | 0.00655126 | 61.72MiB | 4.27MiB | 88.94KiB | 0 | 0.0691955 | False | False |
| datadog_agent_remap_blackhole_acks | 236.75KiB | 0.39 | 92.74% | 59.9MiB | 5.11MiB | 106.48KiB | 0 | 0.0853488 | 60.13MiB | 3.71MiB | 77.71KiB | 0 | 0.0616895 | False | False |
| syslog_regex_logs2metric_ddmetrics | 44.06KiB | 0.36 | 99.90% | 12.01MiB | 458.9KiB | 9.35KiB | 0 | 0.0373106 | 12.05MiB | 467.7KiB | 9.53KiB | 0 | 0.0378904 | False | False |
| splunk_hec_to_splunk_hec_logs_noack | 20.3KiB | 0.08 | 89.74% | 23.82MiB | 511.93KiB | 10.44KiB | 0 | 0.0209862 | 23.84MiB | 330.42KiB | 6.75KiB | 0 | 0.013534 | False | False |
| http_to_http_acks | 8.59KiB | 0.05 | 2.84% | 17.39MiB | 8.27MiB | 172.94KiB | 0 | 0.475685 | 17.39MiB | 8.03MiB | 167.68KiB | 0 | 0.46158 | True | True |
| splunk_hec_indexer_ack_blackhole | 3.08KiB | 0.01 | 9.41% | 23.75MiB | 912.37KiB | 18.56KiB | 0 | 0.0375105 | 23.75MiB | 895.26KiB | 18.21KiB | 0 | 0.0368023 | False | False |
| enterprise_http_to_http | 992.78B | 0 | 10.49% | 23.84MiB | 253.76KiB | 5.18KiB | 0 | 0.0103908 | 23.85MiB | 255.4KiB | 5.22KiB | 0 | 0.0104573 | False | False |
| splunk_hec_to_splunk_hec_logs_acks | 362.79B | 0 | 1.15% | 23.75MiB | 853.03KiB | 17.35KiB | 0 | 0.0350687 | 23.75MiB | 856.62KiB | 17.42KiB | 0 | 0.0352159 | False | False |
| file_to_blackhole | -37.36KiB | -0.04 | 27.88% | 95.34MiB | 3.33MiB | 68.94KiB | 0 | 0.0348746 | 95.3MiB | 3.79MiB | 78.77KiB | 0 | 0.0397126 | False | False |
| http_to_http_json | -36.74KiB | -0.15 | 99.64% | 23.85MiB | 331.61KiB | 6.77KiB | 0 | 0.0135759 | 23.81MiB | 521.88KiB | 10.66KiB | 0 | 0.0213972 | False | False |
| http_pipelines_blackhole_acks | -2.49KiB | -0.2 | 61.47% | 1.2MiB | 106.48KiB | 2.17KiB | 0 | 0.0868507 | 1.19MiB | 92.0KiB | 1.87KiB | 0 | 0.0751895 | False | False |
| http_to_http_noack | -130.32KiB | -0.53 | 100.00% | 23.84MiB | 401.98KiB | 8.22KiB | 0 | 0.0164644 | 23.71MiB | 1.26MiB | 26.19KiB | 0 | 0.0530075 | False | False |
| fluent_elasticsearch | -463.52KiB | -0.57 | 100.00% | 79.47MiB | 52.19KiB | 1.05KiB | 0 | 0.000641205 | 79.02MiB | 4.91MiB | 100.7KiB | 0 | 0.0620625 | False | False |
| syslog_log2metric_splunk_hec_metrics | -169.09KiB | -0.97 | 100.00% | 17.04MiB | 710.71KiB | 14.5KiB | 0 | 0.0407215 | 16.88MiB | 1.04MiB | 21.63KiB | 0 | 0.0614758 | False | False |
| http_pipelines_blackhole | -16.91KiB | -1 | 100.00% | 1.66MiB | 58.23KiB | 1.19KiB | 0 | 0.0343507 | 1.64MiB | 126.24KiB | 2.57KiB | 0 | 0.0752135 | False | False |
| syslog_log2metric_humio_metrics | -204.98KiB | -1.57 | 100.00% | 12.76MiB | 235.25KiB | 4.8KiB | 0 | 0.0180074 | 12.56MiB | 577.45KiB | 11.75KiB | 0 | 0.0449061 | False | False |
| http_pipelines_no_grok_blackhole | -204.45KiB | -1.84 | 100.00% | 10.87MiB | 107.27KiB | 2.19KiB | 0 | 0.00963929 | 10.67MiB | 1.09MiB | 22.68KiB | 0 | 0.102133 | False | False |

Signed-off-by: Yves Peter <ypwebstuff@gmail.com>
@spencergilbert spencergilbert merged commit 05f57e4 into vectordotdev:master Oct 3, 2022
@yvespp yvespp deleted the yvespp_update_to_latest_azure_sdk branch October 3, 2022 15:33

Labels

domain: sinks Anything related to the Vector's sinks
