
A Spark program to read from and write to ADLS (Azure Data Lake Store)

Here is how you can use Livy to submit jobs that process data in ADLS.

The flow is:

job submission via REST API --> Livy --> spark-dispatcher --> Mesos --> Spark job (read/write from ADLS)

After establishing an SSH tunnel to the Livy endpoint:

With credentials hard-coded in the job:

curl -v -H "Content-Type: application/json" -X POST -d '{ "file": "https:///adls-spark-examples-0.0.1-SNAPSHOT-jar-with-dependencies.jar", "className":"TestSparkWordCount" }' 'http://localhost/service/livy-spark2/batches'
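For reference, here is a minimal sketch of what a job like TestSparkWordCount might look like with the OAuth2 settings hard-coded. It assumes the standard hadoop-azure-datalake (adl://) configuration keys; the account name, client id, secret, and tenant URL are placeholders, not values from this repo, and the body is illustrative rather than the repo's actual code.

```scala
import org.apache.spark.sql.SparkSession

object TestSparkWordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("TestSparkWordCount").getOrCreate()

    // ADLS Gen1 OAuth2 client-credential settings, hard-coded (placeholders).
    val hadoopConf = spark.sparkContext.hadoopConfiguration
    hadoopConf.set("dfs.adls.oauth2.access.token.provider.type", "ClientCredential")
    hadoopConf.set("dfs.adls.oauth2.client.id", "<client-id>")
    hadoopConf.set("dfs.adls.oauth2.credential", "<client-secret>")
    hadoopConf.set("dfs.adls.oauth2.refresh.url",
      "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

    // Read a text file from ADLS, count words, write the result back.
    val lines  = spark.sparkContext.textFile("adl://<account>.azuredatalakestore.net/input/words.txt")
    val counts = lines.flatMap(_.split("\\s+")).map((_, 1)).reduceByKey(_ + _)
    counts.saveAsTextFile("adl://<account>.azuredatalakestore.net/output/wordcount")

    spark.stop()
  }
}
```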

With credentials passed as job arguments:

curl -v -H "Content-Type: application/json" -X POST -d '{ "file": "https://myjars.blob.core.windows.net/myjars/adls-spark-examples-0.0.1-SNAPSHOT-jar-with-dependencies.jar", "className":"TestHiveTable", "args": ["val for credential", "val for clientid", "https://login.microsoftonline.com/blah-blah/oauth2/token"] }' 'http://localhost/service/livy-spark2/batches'
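And a sketch of the parameterized variant, reading the same settings from the Livy "args" array in the order used by the curl call above (credential, client id, OAuth2 refresh URL). The class name matches the one in the curl call, but the ADLS path and Hive table name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object TestHiveTable {
  def main(args: Array[String]): Unit = {
    // args(0) = credential, args(1) = client id, args(2) = OAuth2 token refresh URL
    val Array(credential, clientId, refreshUrl) = args

    val spark = SparkSession.builder()
      .appName("TestHiveTable")
      .enableHiveSupport()
      .getOrCreate()

    val hadoopConf = spark.sparkContext.hadoopConfiguration
    hadoopConf.set("dfs.adls.oauth2.access.token.provider.type", "ClientCredential")
    hadoopConf.set("dfs.adls.oauth2.client.id", clientId)
    hadoopConf.set("dfs.adls.oauth2.credential", credential)
    hadoopConf.set("dfs.adls.oauth2.refresh.url", refreshUrl)

    // Read from ADLS and expose the data as a Hive table (placeholder names).
    val df = spark.read.text("adl://<account>.azuredatalakestore.net/input/words.txt")
    df.write.mode("overwrite").saveAsTable("adls_words")

    spark.stop()
  }
}
```

Keeping the secrets out of the jar means the same artifact can be reused across environments; the values travel only in the Livy batch request.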

(The ADLS jars are checked into src/main/resources because Maven Central does not publish them for Hadoop 2.x.)
