@steveloughran
Contributor

steveloughran commented Aug 14, 2018

This PR has been superseded by #22081

What changes were proposed in this pull request?

Increment the Kinesis client, producer, and transitive AWS SDK versions to more recent releases.

This is to help with the move off Bouncy Castle in #21146 and #22081; the goal is that moving up to the new SDK will allow a JVM with unlimited JCE but without Bouncy Castle to work with Kinesis endpoints.

Why this specific set of artifacts? It syncs up with the 1.11.271 AWS SDK used by Hadoop 3.0.3, Hadoop 3.1, and Hadoop 3.1.1; that has been stable for the uses there (S3, STS, DynamoDB).

How was this patch tested?

Ran all the external/kinesis-asl tests via Maven with Java 8u121 and unlimited JCE, without Bouncy Castle (#21146), against the default endpoint in us-west-2. Without this SDK update I was getting HTTP certificate validation errors; with it they went away.
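
The two JVM conditions named in this test setup (unlimited JCE, no Bouncy Castle) can be checked from code before a run. What follows is a minimal sketch and is not part of this patch: the class `JceCheck` and its method names are hypothetical. It assumes that "unlimited JCE" shows up as an AES key-length limit above 128 bits, and that Bouncy Castle, when installed as a security provider, is registered under the name "BC". It detects only a registered provider, not a Bouncy Castle JAR that merely sits on the classpath.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import javax.crypto.Cipher;

// Hypothetical helper for illustration; not part of Spark or this PR.
class JceCheck {

    /** True when the JVM's crypto policy permits AES keys longer than 128 bits. */
    static boolean hasUnlimitedJce() {
        try {
            // Restricted policy files cap AES at 128 bits;
            // the unlimited policy reports Integer.MAX_VALUE.
            return Cipher.getMaxAllowedKeyLength("AES") > 128;
        } catch (NoSuchAlgorithmException e) {
            // AES is a required algorithm, so this is not expected in practice.
            return false;
        }
    }

    /** True when a security provider named "BC" is registered in this JVM. */
    static boolean hasBouncyCastleProvider() {
        return Security.getProvider("BC") != null;
    }

    public static void main(String[] args) {
        System.out.println("unlimited JCE: " + hasUnlimitedJce());
        System.out.println("BC provider registered: " + hasBouncyCastleProvider());
    }
}
```

The configuration described above would correspond to `hasUnlimitedJce()` returning true and `hasBouncyCastleProvider()` returning false.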

This PR is not ready without:

  • Jenkins test runs to see what it is happy with
  • more testing: repeated runs, another endpoint
  • looking at the new deprecation warnings and selectively addressing them (the AWS SDKs are pretty aggressive about deprecation, but sometimes they increase the complexity of the client code or block some codepaths off completely)

…k to match.

Change-Id: Ic2d12a07d273bd1b6fc4c681075070f22ed1e44c
@steveloughran
Contributor Author

As noted in #22146, stripping out Bouncy Castle and upgrading the SDK worked. But a local test run of just this patch brought up the same error seen in #22081:

WithoutAggregationKinesisStreamSuite:
- KinesisUtils API
- RDD generation
- basic operation
- custom message handling *** FAILED ***
  The code passed to eventually never returned normally. Attempted 20 times over 2.092846262916667 minutes. Last failure message: collected.synchronized[Boolean](KinesisStreamTests.this.convertToEqualizer[scala.collection.mutable.HashSet[Int]](collected).===(modData.toSet[Int])(scalactic.this.Equality.default[scala.collection.mutable.HashSet[Int]])) was false
  Data received does not match data sent. (KinesisStreamSuite.scala:230)
- Kinesis read with custom configurations
- split and merge shards in a stream
- failure recovery *** FAILED ***
  The code passed to eventually never returned normally. Attempted 105 times over 2.0055098129 minutes. Last failure message: isCheckpointPresent was true, but 0 was not greater than 10. (KinesisStreamSuite.scala:398)

That wasn't a full clean build, so let's see what Jenkins says and do some more test runs tomorrow. It could just be that this is exposing some flakiness in the test case. At the very least, some more details on the failure would be good.
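
For context on the "Attempted N times over M minutes" messages: they come from ScalaTest's `eventually`, which re-runs a block until it succeeds or a timeout expires. The following is a rough Java analogue of that polling pattern, not ScalaTest's implementation; `EventuallySketch` and its parameter names are hypothetical and used only for illustration.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of retry-until-timeout; not ScalaTest code.
class EventuallySketch {

    /**
     * Re-evaluates the condition until it returns true or the timeout elapses.
     * Returns true on success, false if the deadline passed or the thread
     * was interrupted while waiting.
     */
    static boolean eventually(Duration timeout, Duration interval, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out: the condition never held
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        boolean ok = eventually(Duration.ofSeconds(2), Duration.ofMillis(10),
                () -> ++attempts[0] >= 3);
        System.out.println("succeeded=" + ok + " after " + attempts[0] + " attempts");
    }
}
```

With this pattern, a test that is merely slow still passes eventually, while a timeout such as the ones above means the condition never held within the whole window.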

@steveloughran
Contributor Author

@srowen @budde @ajfabbri

@SparkQA

SparkQA commented Aug 14, 2018

Test build #94728 has finished for PR 22099 at commit e79e5b9.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

srowen left a comment

Looks OK in principle, pending tests.

@SparkQA

SparkQA commented Aug 14, 2018

Test build #4246 has finished for PR 22099 at commit e79e5b9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 14, 2018

Test build #4247 has started for PR 22099 at commit e79e5b9.

@steveloughran
Contributor Author

Local Kinesis tests with -Phadoop-3.1, -Phadoop-2.7, and -Phadoop-3.1 -Dhadoop.version=3.1.1 are all working here (with Bouncy Castle, unlimited JCE in JVM).

I'm updating the #21146 PR with this patch to see what happens in Jenkins with the combination of no Bouncy Castle and updated Kinesis.

The test run failure here was org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.offset recovery from kafka; it is hard to see how that relates to this change.

@dongjoon-hyun
Member

Retest this please.

@SparkQA

SparkQA commented Aug 15, 2018

Test build #94776 has finished for PR 22099 at commit e79e5b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Aug 15, 2018

To be clear you think this passed because it still uses jets3t and that still brings in BC? Then we can maybe merge this and rebase the other change to find out. This update won't have changed that situation with strong crypto being required right?

@steveloughran
Contributor Author

> To be clear you think this passed because it still uses jets3t and that still brings in BC?

Correct.

> Then we can maybe merge this and rebase the other change to find out.

Correct.

> This update won't have changed that situation with strong crypto being required right?

Don't know. What it did do was stop my local test runs without Bouncy Castle from failing with errors about certificate validation.

This patch is a good thing to do anyway, because it's good to stay somewhat current with the AWS releases (more chance of issues being addressed, reduced cost of future migrations). So it can be merged in and then the problem of getting #22081's test run to work addressed after.

I reopened #21146 and applied this patch to it, to see what Jenkins did there. The overall test runs come out as failing, and it is hard to point to any related cause, but the Kinesis ones do all pass: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94769/testReport/org.apache.spark.streaming.kinesis/

I'm going to close that one again to avoid confusion about which of the "remove jets3t" patches people should be looking at. Once the Kinesis update is merged in, you'll need to retest your #22081 PR, and then let's see what Jenkins says there.

@srowen
Member

srowen commented Aug 15, 2018

Merged to master

asfgit closed this in 4d8ae0d Aug 15, 2018
@steveloughran
Contributor Author

thanks

steveloughran deleted the cloud/SPARK-25111-kinesis branch August 15, 2018 22:04