# [SPARK-48152][BUILD] Make spark-profiler as a part of release and publish to maven central repo (#46402)
```diff
@@ -31,6 +31,9 @@
   </properties>
   <packaging>jar</packaging>
   <name>Spark Profiler</name>
+  <description>
+    Enables code profiling of executors based on the async profiler.
+  </description>
   <url>https://spark.apache.org/</url>

   <dependencies>
@@ -44,7 +47,8 @@
     <dependency>
       <groupId>me.bechberger</groupId>
       <artifactId>ap-loader-all</artifactId>
-      <version>3.0-9</version>
+      <version>${ap-loader.version}</version>
+      <scope>provided</scope>
     </dependency>
   </dependencies>
 </project>
```

> **Member:** cc @parthchandra, too

> **Contributor:** It's great to include this feature in the Spark release. I feel, though, that if we are making it available in the release, the ap-loader dependency should be included as well. (Currently, if we build with the jvm-profiler profile, the dependency is included in the build.)
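The switch to `${ap-loader.version}` implies the version is now managed as a Maven property rather than hard-coded in the dependency. A hedged sketch of the property definition the reference assumes — the property name comes from the diff, while placing it in a `<properties>` block and the `3.0-9` default are assumptions based on the replaced value:

```xml
<!-- Hedged sketch: the ${ap-loader.version} reference assumes a property
     definition like this, e.g. in the <properties> block the first hunk
     touches. The 3.0-9 value is assumed from the string the diff replaced. -->
<properties>
  <ap-loader.version>3.0-9</ap-loader.version>
</properties>
```

With `<scope>provided</scope>`, Maven uses the dependency for compilation but leaves it out of the packaged assembly, which is consistent with the reviewer discussion about shipping ap-loader separately.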
```diff
@@ -31,7 +31,7 @@ export LC_ALL=C
 # NOTE: These should match those in the release publishing script, and be kept in sync with
 # dev/create-release/release-build.sh
 HADOOP_MODULE_PROFILES="-Phive-thriftserver -Pkubernetes -Pyarn -Phive \
-    -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud"
+    -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud -Pjvm-profiler"
 MVN="build/mvn"
 HADOOP_HIVE_PROFILES=(
     hadoop-3-hive-2.3
```

> **Member:** Do you know how we skip the Kafka module's dependency here?

> **Contributor (author):** Okay, I will remove it.

> **Contributor (author):** Let me investigate it.

> **Member:** Yes, it's identical to Apache Spark's Kafka module and the Apache Spark Hadoop Cloud module.

> **Contributor (author):** Yeah, let me add a detailed guide in some document.

> **Contributor (author):** I already know about it. It is implemented through
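The profile list in this script is just a whitespace-joined string of Maven `-P` flags that is later handed to `$MVN`. A minimal sketch of how the added flag threads through — variable names are copied from the script, while the suggestion that the flags end up in a `dependency:list`-style invocation is an assumption about how such dependency-audit scripts typically work:

```shell
# Compose the profile flags as in the diff above (now including -Pjvm-profiler).
HADOOP_MODULE_PROFILES="-Phive-thriftserver -Pkubernetes -Pyarn -Phive \
    -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud -Pjvm-profiler"
MVN="build/mvn"

# The real script would invoke something along the lines of:
#   $MVN $HADOOP_MODULE_PROFILES ... dependency:list
# Here we only show the composed command line:
echo "$MVN $HADOOP_MODULE_PROFILES"
```

Because the module profiles are additive, appending `-Pjvm-profiler` here is what pulls the profiler module into the dependency manifest that this script checks.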
```diff
@@ -117,6 +117,13 @@ where `spark-streaming_{{site.SCALA_BINARY_VERSION}}` is the `artifactId` as def

     ./build/mvn -Pconnect -DskipTests clean package

+## Building with JVM Profile support
+
+    ./build/mvn -Pjvm-profiler -DskipTests clean package
+
+**Note:** The `jvm-profiler` profile builds the assembly without including the dependency `ap-loader`;
+you can download it manually from the Maven Central repo and use it together with `spark-profiler_{{site.SCALA_BINARY_VERSION}}`.
+
 ## Continuous Compilation

 We use the scala-maven-plugin which supports incremental and continuous compilation. E.g.
```
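Taken together with the `provided` scope in the pom change, the doc note means a user builds with `-Pjvm-profiler` and then supplies ap-loader themselves, e.g. via a Maven coordinate passed to `spark-submit --packages`. A hedged sketch of assembling that coordinate — the group, artifact, and `3.0-9` version are taken from the pom diff in this PR, and the submit command is only shown in comments, not run:

```shell
# Not run here: the actual build and submit commands would look like
#   ./build/mvn -Pjvm-profiler -DskipTests clean package
#   ./bin/spark-submit --packages <coordinate> ...
# Below we only assemble the --packages coordinate from the pom diff values.
GROUP="me.bechberger"
ARTIFACT="ap-loader-all"
VERSION="3.0-9"   # assumed default; the diff manages it via ${ap-loader.version}
COORDINATE="$GROUP:$ARTIFACT:$VERSION"
echo "--packages $COORDINATE"
```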
> **Comment:** I'm not sure if we need to rename the file `connector/profiler/README.md` to `jvm-profiler-integration.md` and move it to `docs/jvm-profiler-integration.md`, while linking it in `docs/building-spark.md`. Do we need to do this? @dongjoon-hyun

> **Comment:** cc @parthchandra