Setting up a Schedoscope Project
Schedoscope is an internal Scala DSL for specifying views (Hive table partitions), their structure and dependencies, as well as the transformation logic required to compute views from other views. As a consequence, setting up a Schedoscope project means setting up a Scala project that uses Schedoscope as a library.
For this purpose, we provide a Maven POM template in this section. It is of course possible to use other build tools such as SBT or Ant/Ivy. You are also encouraged to take a look at the POM of the tutorial.
To run Schedoscope, the template uses the exec Maven plugin, which assembles a classpath from the Maven dependencies and launches the Schedoscope REST service directly from the project folder.
In real-world production deployments, you should probably follow a different [bundling and deployment](Bundling and Deploying) and [launching strategy](Starting Schedoscope).
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>my-projects-group-id</groupId>
<artifactId>my-project</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>Schedoscope POM template</name>
<description>A template making it easier for you to set up a Schedoscope project</description>
<dependencies>
<dependency>
<groupId>schedoscope</groupId>
<artifactId>schedoscope-core</artifactId>
<version>0.10.2</version>
</dependency>
<!-- If you need Oozie transformations, add the following dependency -->
<dependency>
<groupId>schedoscope</groupId>
<artifactId>schedoscope-transformation-oozie</artifactId>
<version>0.10.2</version>
</dependency>
<!-- If you need Pig transformations, add the following dependency -->
<dependency>
<groupId>schedoscope</groupId>
<artifactId>schedoscope-transformation-pig</artifactId>
<version>0.10.2</version>
</dependency>
<!-- If you need Spark transformations, add the following dependency -->
<dependency>
<groupId>schedoscope</groupId>
<artifactId>schedoscope-transformation-spark</artifactId>
<version>0.10.2</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.11</artifactId>
<version>2.2.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>hadoop-launcher</groupId>
<artifactId>hadoop-launcher</artifactId>
<version>0.1.1</version>
<scope>test</scope>
</dependency>
<!-- For tests of Oozie transformations, add this dependency -->
<dependency>
<groupId>minioozie</groupId>
<artifactId>minioozie</artifactId>
<version>1.2.4</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-source-plugin</artifactId>
<version>2.4</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.19</version>
<configuration>
<!-- Surefire is skipped; the scalatest-maven-plugin below runs the tests -->
<skipTests>true</skipTests>
</configuration>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.2</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
<goal>doc-jar</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>2.11.11</scalaVersion>
</configuration>
</plugin>
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>1.0</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<junitxml>.</junitxml>
<filereports>WDF TestSuite.txt</filereports>
<argLine>-Xmx1024m -XX:MaxPermSize=512M</argLine>
<environmentVariables>
<HADOOP_HOME>${project.build.directory}/hadoop</HADOOP_HOME>
</environmentVariables>
</configuration>
<executions>
<execution>
<id>test</id>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>1.4.0</version>
<executions>
<execution>
<goals>
<goal>java</goal>
</goals>
</execution>
</executions>
<configuration>
<mainClass>org.schedoscope.scheduler.rest.server.SchedoscopeRestService</mainClass>
<arguments>
<argument>--shell</argument>
</arguments>
<systemProperties>
<systemProperty>
<key>config.file</key>
<value>src/main/resources/schedoscope.conf</value>
</systemProperty>
</systemProperties>
<additionalClasspathElements>
<additionalClasspathElement>/etc/hadoop/conf</additionalClasspathElement>
<additionalClasspathElement>/etc/hive/conf</additionalClasspathElement>
<additionalClasspathElement>target/${project.build.finalName}-mapreduce.jar</additionalClasspathElement>
<additionalClasspathElement>target/${project.build.finalName}-hive.jar</additionalClasspathElement>
</additionalClasspathElements>
</configuration>
</plugin>
</plugins>
</build>
<repositories>
<repository>
<id>otto-bintray</id>
<url>https://dl.bintray.com/ottogroup/maven</url>
</repository>
</repositories>
</project>
If you want to use [JDBC exports](JDBC Exports), you should also add the JDBC driver of your database to your POM.
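As a rough illustration, the fragment below shows what such a driver dependency could look like inside the `<dependencies>` element of the template. It assumes a MySQL target database; the coordinates and version are an example only and are not prescribed by Schedoscope, so substitute the driver that matches your database.

```xml
<!-- Hypothetical example: JDBC driver for a MySQL export target.
     Replace groupId, artifactId, and version with those of your own database's driver. -->
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.38</version>
</dependency>
```

The dependency is left at the default `compile` scope in this sketch so that the exec Maven plugin includes the driver in the classpath it assembles when launching Schedoscope.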
As a reminder, a Maven project folder structure looks like this:
project
|
+-- src
|   |
|   +-- main
|   |   |
|   |   +-- scala
|   |   |
|   |   +-- resources
|   |
|   +-- test
|       |
|       +-- scala
|       |
|       +-- resources
|
+-- pom.xml