
Conversation

@nthanvi nthanvi commented Jul 22, 2016

Changes proposed in this pull request

  1. Added support for dynamic jar loading in Snappy. The idea is to support the existing Snappy command for loading a jar file: the lead node driver and the executor class loaders are modified so that they look for the jar in the snappy-store if it is not found on the regular classpath.
  2. Apart from the above change, there was another problem: once a job is submitted through the job server, if the user changes the jar for that application and resubmits it, the executors do not pick up the new jar. Being long-running, they keep using the old jar that is still on their classpath. To fix this, a new DynamicURLClassLoader implementation maintains one loader per jar.
  3. The current implementations of SnappySQLJob and JavaSnappySQLJob.java are inconsistent in which methods are overridden and what they return.
    Now only two methods are exposed to the user, runSnappyJob and isValidJob, with a single return type SnappyValidation, so the user has a small learning curve and limited entry points.
  4. All existing jobs are modified to use the new method implementations of SnappySQLJob, SnappyStreamingJob, etc.
  5. Code formatting for the existing jobs.
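The per-jar, child-first loading described in points 1 and 2 can be sketched roughly as follows. This is an illustrative, self-contained Java sketch under assumed names, not the PR's actual DynamicURLClassLoader; the class name and structure are stand-ins.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative sketch (not the PR's DynamicURLClassLoader): one child-first
// URLClassLoader per application jar. Classes are resolved from the jar
// first, so a re-submitted jar shadows stale classes that are still on the
// long-running executor's classpath; unknown classes fall back to the parent.
class ChildFirstJarLoader extends URLClassLoader {

    ChildFirstJarLoader(URL jarUrl, ClassLoader parent) {
        super(new URL[] { jarUrl }, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Child-first: try this jar before delegating upward.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Fall back to the regular classpath.
                    c = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

A parent-first variant would simply delegate to `super.loadClass` before consulting the jar; the commit messages below note that the PR supports both orderings.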

Patch testing

DUnit, JUnit, and a clean precheckin with store

ReleaseNotes.txt changes

No, but the related *.md files will be updated for the same.

Other PRs

TIBCOSoftware/snappy-spark-jobserver#1
TIBCOSoftware/snappy-spark#39
TIBCOSoftware/snappy-store#88

nthanvi added 14 commits July 13, 2016 13:08
Now users need to implement the isValidJob and runSnappyJob methods for all jobs
Added a contextClassLoader for the driver and client thread; it will enable
reading classes from the snappy-store repository.
It handles an executor class loader per jar file.
Still need to implement a client-first classloader
It supports both childFirst and parentFirst jar loading
Added basic test coverage
Added full JUnit coverage for DynamicURLClassLoader
With JUnit and DUnit coverage
SnappySQLJob can be used for the same purpose

Contributor

This override can also be added for IsolatedClientLoader.classLoader. It is used for any external JDBC URLs apart from our GemXD drivers (e.g. if the user wishes to use it to load data into GemXD). If it is not added there, we need docs on how the user should go about it (perhaps JDBC drivers have to be in SPARK_DIST_CLASSPATH while other parts will be fine with dynamic install-jar).

Contributor Author


I will remove the following line:
private val overwriteFiles = env.conf.getBoolean("spark.files.overwrite", false)

as it is not needed after the changes in the job server. The job server always creates a new jar file with our own suffix added to the name, so the executor knows that the jar belongs to the same job and can reload it. Executors can also copy this jar, since it is a new jar for them.
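The renaming scheme described above might look like the sketch below; the class, method, and suffix format are illustrative assumptions, not the job server's actual code.

```java
// Hypothetical sketch of the jar-renaming idea described above: the job
// server writes each uploaded jar under a fresh name by appending a suffix
// (here, the upload time in milliseconds) before the extension. An executor
// that cached "app.jar" then sees "app-1469180000000.jar" arrive for the
// same job and knows it must load the new jar instead of reusing the old
// classes.
class JarNaming {

    // Derive a versioned file name from the original jar name.
    static String versionedName(String jarName, long uploadTimeMillis) {
        int dot = jarName.lastIndexOf('.');
        if (dot < 0) {
            return jarName + "-" + uploadTimeMillis;
        }
        return jarName.substring(0, dot) + "-" + uploadTimeMillis
                + jarName.substring(dot);
    }
}
```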


sumwale commented Aug 11, 2016

@rishitesh Please review the Java job API changes in this.

public abstract class JavaSnappyStreamingJob implements SparkJobBase {

abstract public Object runJavaJob(JavaSnappyStreamingContext snc, Config jobConfig);
abstract public Object runSnappyJob(JavaSnappyStreamingContext snc, Config jobConfig);
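For context, a job written against the unified two-method contract might look like the self-contained sketch below. The base class and validation types here are stand-ins written for illustration (the PR's description names the single return type SnappyValidation); they are not the PR's exact classes.

```java
import java.util.Map;

// Stand-in validation hierarchy: a single return type for isValidJob.
interface SnappyValidation { }

final class SnappyJobValid implements SnappyValidation { }

final class SnappyJobInvalid implements SnappyValidation {
    final String reason;
    SnappyJobInvalid(String reason) { this.reason = reason; }
}

// The only two entry points a job author implements, per the PR description.
abstract class JavaJobBase<C> {
    abstract Object runSnappyJob(C context, Map<String, String> jobConfig);
    abstract SnappyValidation isValidJob(C context, Map<String, String> jobConfig);
}

// A trivial job: validation checks a required config key before running.
class EchoJob extends JavaJobBase<Object> {
    @Override
    Object runSnappyJob(Object context, Map<String, String> jobConfig) {
        return "ran with input=" + jobConfig.get("input");
    }

    @Override
    SnappyValidation isValidJob(Object context, Map<String, String> jobConfig) {
        return jobConfig.containsKey("input")
                ? new SnappyJobValid()
                : new SnappyJobInvalid("missing required config 'input'");
    }
}
```

Only the context type parameter changes between SQL and streaming jobs, which is the point made below about limited entry points.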

Contributor


Why this rename? I suppose the idea was to emphasize that the code needs to use the Java wrapper JavaSnappyStreamingContext (and not the regular SnappyStreamingContext). @rishitesh your thoughts?

Contributor


It was more to do with type resolution in the case of Scala generic types. The proposed API looks cleaner, provided it passes all the tests, since users only have to deal with two methods; only the input types change.

else new File(tempDir, jarName.format(System.currentTimeMillis()))
TestUtils.createJar(files1 ++ files2, jarFile)
}

Contributor

Since this is used only by tests, move it to io.snappydata.util.TestUtils.


sumwale commented Aug 11, 2016

Haven't looked at the tests yet, but looked through pretty much all of the rest of the code. Please ensure we have tests for:
a) install/replace/drop from the job server (is drop possible from the job server?)
b) install/replace/drop using the usual GemXD procedures
c) install/replace/drop of JDBC drivers

We also need docs on how a user should install/replace/drop jars as well as JDBC drivers, which is a common case when users try to import data into SnappyData. If c) is not possible, then we need to document and test that too (SPARK_DIST_CLASSPATH).


nthanvi commented Aug 24, 2016

Added another pull request, #337, after incorporating the review comments and the 2.0 merge.

For the remaining open issues with this PR, see the JIRA: https://jira.snappydata.io/browse/SNAP-999?filter=-1

@nthanvi nthanvi closed this Aug 24, 2016
@sumwale sumwale deleted the SNAP-293 branch December 5, 2016 22:33