Commit 5ff71e5

Author: Chris Riccomini
Message: initial import.
0 parents, commit 5ff71e5

304 files changed (+25647, -0 lines changed)

.gitignore

+24
@@ -0,0 +1,24 @@
+project/boot/
+target/
+.classpath
+.project
+.scala_dependencies
+.settings/
+dist/
+record_timestamps.log*
+deployable.tgz
+.idea/
+.idea_modules/
+*.iml
+*.ipr
+*.iws
+*/.cache
+dashboard-deployable.tgz
+deployable.tar
+dist-dashboard
+docs/_site
+.gradle
+build
+**/bin
+samza-test/state
+docs/learn/documentation/0.7.0/api/javadocs

DISCLAIMER

+1
@@ -0,0 +1 @@
+Apache Samza is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

HEADER

+16
@@ -0,0 +1,16 @@
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.

LICENSE

+580
Large diffs are not rendered by default.

README.md

+55
@@ -0,0 +1,55 @@
+## What is Samza?
+
+Apache Incubator Samza is a distributed stream processing framework. It uses <a target="_blank" href="http://kafka.apache.org">Apache Kafka</a> for messaging, and <a target="_blank" href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">Apache Hadoop YARN</a> to provide fault tolerance, processor isolation, security, and resource management.
+
+* **Simple API:** Unlike most low-level messaging system APIs, Samza provides a very simple callback-based "process message" API that should be familiar to anyone who has used Map/Reduce.
+* **Managed state:** Samza manages snapshotting and restoration of a stream processor's state. Samza will restore a stream processor's state to a snapshot consistent with the processor's last read messages when the processor is restarted.
+* **Fault tolerance:** Samza will work with YARN to restart your stream processor if there is a machine or processor failure.
+* **Durability:** Samza uses Kafka to guarantee that messages will be processed in the order they were written to a partition, and that no messages will ever be lost.
+* **Scalability:** Samza is partitioned and distributed at every level. Kafka provides ordered, partitioned, re-playable, fault-tolerant streams. YARN provides a distributed environment for Samza containers to run in.
+* **Pluggable:** Though Samza works out of the box with Kafka and YARN, Samza provides a pluggable API that lets you run Samza with other messaging systems and execution environments.
+* **Processor isolation:** Samza works with Apache YARN, which supports processor security through Hadoop's security model, and resource isolation through Linux CGroups.
+
+Check out [Hello Samza](/startup/hello-samza/0.7.0) to try Samza. Read the [Background](/learn/documentation/0.7.0/introduction/background.html) page to learn more about Samza.
+
+### Building Samza
+
+To build Samza, run:
+
+    ./gradlew clean build
+
+#### Scala and YARN
+
+Samza builds with [Scala](http://www.scala-lang.org/) 2.9.2 and [YARN](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html) 2.0.5-alpha by default. Use the -PscalaVersion and -PyarnVersion switches to change versions. Samza supports building against Scala 2.8.1 or 2.9.2, and against YARN 2.0.3-alpha, 2.0.4-alpha, or 2.0.5-alpha.
+
+    ./gradlew -PscalaVersion=2.8.1 -PyarnVersion=2.0.3-alpha clean build
+
+YARN protocols are backwards incompatible, so you must pick the version that matches your YARN grid.
+
+### Testing Samza
+
+To run all tests:
+
+    ./gradlew clean test
+
+To run a single test:
+
+    ./gradlew clean :samza-test:test -Dtest.single=TestStatefulTask
+
+#### Maven
+
+Samza uses Kafka, which is not managed by Maven. To use Kafka as though it were a Maven artifact, Samza installs Kafka into a local repository using the `mvn install` command. You must have Maven installed to build Samza.
+
+### Developers
+
+To generate Eclipse projects, run:
+
+    ./gradlew eclipse
+
+For IntelliJ, run:
+
+    ./gradlew idea
+
+### Pardon our Dust
+
+Apache Samza is currently undergoing incubation at the [Apache Software Foundation](http://www.apache.org/).
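
The version switches described in the README can be sketched as a small wrapper. This is only an illustration and not part of the commit: the function name `samza_build_cmd` is hypothetical, and the accepted version lists are copied from the README text. It prints the Gradle command instead of running it, so it can be tried without a checkout.

```bash
#!/bin/sh
# Hypothetical helper (not part of this commit): prints the Gradle command
# for a Scala/YARN pair and rejects versions the README does not list.
samza_build_cmd() {
  scala_version="${1:-2.9.2}"        # default taken from the README
  yarn_version="${2:-2.0.5-alpha}"   # default taken from the README

  case "$scala_version" in
    2.8.1|2.9.2) ;;
    *) echo "unsupported Scala version: $scala_version" >&2; return 1 ;;
  esac

  case "$yarn_version" in
    2.0.3-alpha|2.0.4-alpha|2.0.5-alpha) ;;
    *) echo "unsupported YARN version: $yarn_version" >&2; return 1 ;;
  esac

  echo "./gradlew -PscalaVersion=$scala_version -PyarnVersion=$yarn_version clean build"
}

# Prints: ./gradlew -PscalaVersion=2.8.1 -PyarnVersion=2.0.3-alpha clean build
samza_build_cmd 2.8.1 2.0.3-alpha
```

Because the README notes that YARN protocols are backwards incompatible, failing early on an unlisted version is arguably safer than letting a mismatched build reach the grid.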

RELEASE.md

+16
@@ -0,0 +1,16 @@
+Validate that all Samza source files have proper license information in their header.
+
+    ./gradlew check
+
+Auto-generate all missing headers in files:
+
+    ./gradlew licenseFormatMain
+
+To release to a local Maven repository:
+
+    ./gradlew clean publishToMavenLocal
+    ./gradlew -PscalaVersion=2.8.1 clean publishToMavenLocal
+
+To generate test coverage reports:
+
+    ./gradlew clean jacocoTestReport
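
One way to chain the license check and the two local publish commands from RELEASE.md is sketched below. The wrapper names `run` and `samza_local_release`, the `DRY_RUN` variable, and the choice to run the check first are all assumptions made for illustration; RELEASE.md itself does not prescribe an order. With `DRY_RUN=1` the commands are printed rather than executed.

```bash
#!/bin/sh
# Hypothetical wrapper (not part of this commit) around RELEASE.md commands.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stops at the first failing step because the steps are joined with &&.
samza_local_release() {
  run ./gradlew check &&
  run ./gradlew clean publishToMavenLocal &&
  run ./gradlew -PscalaVersion=2.8.1 clean publishToMavenLocal
}

DRY_RUN=1
samza_local_release   # prints the three commands, one per line
```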

build.gradle

+182
@@ -0,0 +1,182 @@
+buildscript {
+  repositories {
+    mavenCentral()
+  }
+
+  apply from: file('gradle/buildscript.gradle'), to: buildscript
+}
+
+allprojects {
+  repositories {
+    // Required for Kafka. Kafka's 0.8.0-beta1 Maven Central
+    // POM is broken. Should go away in future releases.
+    maven {
+      url 'https://repository.apache.org/content/groups/public'
+    }
+    mavenCentral()
+    mavenLocal()
+  }
+}
+
+apply from: file('gradle/license.gradle')
+apply from: file('gradle/maven.gradle')
+apply from: file("gradle/dependency-versions.gradle")
+apply from: file("gradle/dependency-versions-scala-" + scalaVersion + ".gradle")
+
+subprojects {
+  group = "org.apache.samza"
+
+  apply plugin: 'jacoco'
+  apply plugin: 'eclipse'
+  apply plugin: 'idea'
+  apply plugin: 'project-report'
+}
+
+project(':samza-api') {
+  apply plugin: 'java'
+
+  dependencies {
+    testCompile "junit:junit:$junitVersion"
+  }
+}
+
+project(":samza-core_$scalaVersion") {
+  apply plugin: 'scala'
+
+  dependencies {
+    compile project(':samza-api')
+    compile "org.scala-lang:scala-library:$scalaVersion"
+    compile "org.clapper:grizzled-slf4j_$scalaVersion:$grizzledVersion"
+    compile "net.sf.jopt-simple:jopt-simple:$joptSimpleVersion"
+    compile "org.codehaus.jackson:jackson-jaxrs:$jacksonVersion"
+    testCompile "junit:junit:$junitVersion"
+  }
+}
+
+project(":samza-kafka_$scalaVersion") {
+  apply plugin: 'scala'
+
+  dependencies {
+    compile project(':samza-api')
+    compile project(":samza-core_$scalaVersion")
+    compile project(":samza-serializers_$scalaVersion")
+    compile "org.scala-lang:scala-library:$scalaVersion"
+    compile "org.clapper:grizzled-slf4j_$scalaVersion:$grizzledVersion"
+    compile "com.101tec:zkclient:$zkClientVersion"
+    compile "org.codehaus.jackson:jackson-jaxrs:$jacksonVersion"
+    // these can all go away when kafka is in maven
+    compile files("lib/kafka_$scalaVersion-" + kafkaVersion + ".jar")
+    compile "com.yammer.metrics:metrics-core:$metricsVersion"
+    compile "com.yammer.metrics:metrics-annotation:$metricsVersion"
+    // end these can all go away when kafka is in maven
+    testCompile "junit:junit:$junitVersion"
+    testCompile "org.mockito:mockito-all:$mockitoVersion"
+    // these can all go away when kafka is in maven
+    testCompile files("lib/kafka_$scalaVersion-$kafkaVersion-test.jar")
+    // end these can all go away when kafka is in maven
+  }
+}
+
+project(":samza-serializers_$scalaVersion") {
+  apply plugin: 'scala'
+
+  dependencies {
+    compile project(':samza-api')
+    compile project(":samza-core_$scalaVersion")
+    compile "org.scala-lang:scala-library:$scalaVersion"
+    compile "org.clapper:grizzled-slf4j_$scalaVersion:$grizzledVersion"
+    compile "org.codehaus.jackson:jackson-jaxrs:$jacksonVersion"
+    testCompile "junit:junit:$junitVersion"
+  }
+}
+
+project(":samza-yarn_$scalaVersion") {
+  apply plugin: 'scala'
+
+  jar {
+    classifier = "yarn-$yarnVersion"
+  }
+
+  dependencies {
+    compile project(':samza-api')
+    compile project(":samza-core_$scalaVersion")
+    compile "org.scala-lang:scala-library:$scalaVersion"
+    compile "org.scala-lang:scala-compiler:$scalaVersion"
+    compile "org.clapper:grizzled-slf4j_$scalaVersion:$grizzledVersion"
+    compile "org.codehaus.jackson:jackson-jaxrs:$jacksonVersion"
+    compile "commons-httpclient:commons-httpclient:$commonsHttpClientVersion"
+    compile "org.eclipse.jetty:jetty-webapp:$jettyVersion"
+    compile("org.apache.hadoop:hadoop-yarn-api:$yarnVersion") {
+      exclude module: 'slf4j-log4j12'
+    }
+    compile("org.apache.hadoop:hadoop-yarn-common:$yarnVersion") {
+      exclude module: 'slf4j-log4j12'
+    }
+    compile("org.apache.hadoop:hadoop-yarn-client:$yarnVersion") {
+      exclude module: 'slf4j-log4j12'
+    }
+    compile("org.apache.hadoop:hadoop-common:$yarnVersion") {
+      exclude module: 'slf4j-log4j12'
+      exclude module: 'servlet-api'
+      exclude module: 'jetty'
+      exclude module: 'jetty-util'
+    }
+    compile("org.scalatra:scalatra_$scalaVersion:$scalatraVersion") {
+      exclude module: 'scala-compiler'
+      exclude module: 'slf4j-api'
+    }
+    compile("org.scalatra:scalatra-scalate_$scalaVersion:$scalatraVersion") {
+      exclude module: 'scala-compiler'
+      exclude module: 'slf4j-api'
+    }
+    testCompile "junit:junit:$junitVersion"
+  }
+
+  repositories {
+    maven {
+      url "http://repo.typesafe.com/typesafe/releases"
+    }
+  }
+}
+
+project(":samza-shell") {
+  apply plugin: 'java'
+
+  task shellTarGz(type: Tar) {
+    compression = Compression.GZIP
+    classifier = 'dist'
+    from 'src/main/bash'
+  }
+}
+
+project(":samza-kv_$scalaVersion") {
+  apply plugin: 'scala'
+
+  dependencies {
+    compile project(':samza-api')
+    compile "org.scala-lang:scala-library:$scalaVersion"
+    compile "org.clapper:grizzled-slf4j_$scalaVersion:$grizzledVersion"
+    compile "org.fusesource.leveldbjni:leveldbjni-all:$leveldbVersion"
+    testCompile "junit:junit:$junitVersion"
+  }
+}
+
+project(":samza-test_$scalaVersion") {
+  apply plugin: 'scala'
+
+  dependencies {
+    compile project(':samza-api')
+    compile project(":samza-kv_$scalaVersion")
+    compile "org.scala-lang:scala-library:$scalaVersion"
+    compile "org.clapper:grizzled-slf4j_$scalaVersion:$grizzledVersion"
+    compile "net.sf.jopt-simple:jopt-simple:$joptSimpleVersion"
+    compile "javax.mail:mail:1.4"
+    compile files("../samza-kafka/lib/kafka_$scalaVersion-" + kafkaVersion + ".jar")
+    testCompile "junit:junit:$junitVersion"
+    testCompile files("../samza-kafka/lib/kafka_$scalaVersion-" + kafkaVersion + "-test.jar")
+    testCompile "com.101tec:zkclient:$zkClientVersion"
+    testCompile project(":samza-core_$scalaVersion")
+    testCompile project(":samza-kafka_$scalaVersion")
+  }
+}
+
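
The build file derives several names by interpolating `scalaVersion`, `yarnVersion`, and `kafkaVersion` into strings. The sketch below mirrors that interpolation so the resulting names are easy to see. The helper names are hypothetical, and the Kafka version in the usage lines is only a placeholder: the real value lives in `gradle/dependency-versions.gradle`, which is not shown in this diff.

```bash
#!/bin/sh
# Hypothetical helpers (not part of this commit) that mirror the string
# interpolation used in build.gradle.

# $1 = module base name, $2 = scalaVersion
samza_project_name() {
  echo "${1}_${2}"
}

# $1 = scalaVersion; path of the per-Scala-version dependency file
scala_versions_file() {
  echo "gradle/dependency-versions-scala-${1}.gradle"
}

# $1 = scalaVersion, $2 = kafkaVersion; the locally bundled Kafka jar
kafka_jar() {
  echo "lib/kafka_${1}-${2}.jar"
}

# $1 = yarnVersion; classifier applied to the samza-yarn jar
yarn_classifier() {
  echo "yarn-${1}"
}

samza_project_name samza-core 2.9.2   # prints samza-core_2.9.2
scala_versions_file 2.9.2             # prints gradle/dependency-versions-scala-2.9.2.gradle
kafka_jar 2.9.2 0.8.0-beta1           # placeholder Kafka version, for illustration only
yarn_classifier 2.0.5-alpha           # prints yarn-2.0.5-alpha
```

Since every Scala module name carries the Scala version as a suffix, changing `-PscalaVersion` changes the project paths themselves, not just a dependency coordinate.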

docs/README.md

+35
@@ -0,0 +1,35 @@
+## Setup
+
+Samza's documentation uses Jekyll to build a website out of markdown pages. To install Jekyll, run this command:
+
+    sudo gem install jekyll redcarpet
+
+To run the website locally, execute:
+
+    jekyll serve --watch --host 0.0.0.0
+
+To compile the website in the _site directory, execute:
+
+    jekyll build
+
+## Versioning
+
+The "Learn" section of this website is versioned. To add a new version, copy the folder at the version number-level (0.7.0 to 0.8.0, for example).
+
+All links between pages inside a versioned folder should be relative links, not absolute.
+
+## Javadocs
+
+To auto-generate the latest Javadocs, run:
+
+    _tools/generate-javadocs.sh <version>
+
+The version number is the number that will be used in the /docs/learn/documentation/<version>/api/javadocs path.
+
+## Release
+
+To build and publish the website to Samza's Apache SVN repository, run:
+
+    _tools/publish-site.sh 0.7.0 "updating welcome page" criccomini
+
+This command will re-build the Javadocs and website, checkout https://svn.apache.org/repos/asf/incubator/samza/site/ locally, copy the site into the directory, and commit the changes.
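
The versioning and Javadocs conventions in this README could be scripted roughly as follows. This is a sketch under stated assumptions: both function names are hypothetical, and the directory that holds the versioned folders is passed in as an argument because the README does not spell out its exact location.

```bash
#!/bin/sh
# Hypothetical helpers (not part of this commit).

# $1 = version; Javadocs output path relative to the repository root,
# following the pattern described in the README.
javadocs_path() {
  echo "docs/learn/documentation/${1}/api/javadocs"
}

# Copies one versioned docs folder to a new version without overwriting.
# $1 = directory holding the versioned folders (assumed location)
# $2 = existing version, $3 = new version
copy_docs_version() {
  if [ ! -d "$1/$2" ]; then
    echo "missing source folder: $1/$2" >&2
    return 1
  fi
  if [ -e "$1/$3" ]; then
    echo "target already exists: $1/$3" >&2
    return 1
  fi
  cp -R "$1/$2" "$1/$3"
}

javadocs_path 0.7.0   # prints docs/learn/documentation/0.7.0/api/javadocs
```

A call such as `copy_docs_version learn/documentation 0.7.0 0.8.0`, run from the docs directory, would correspond to the README's 0.7.0 to 0.8.0 example, assuming that is where the versioned folders live. Keeping links inside the folder relative, as the README requires, is what lets the copied pages work unchanged under the new version.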

docs/_config.yml

+5
@@ -0,0 +1,5 @@
+permalink: /:categories/:title
+name: Samza
+pygments: true
+markdown: redcarpet
+exclude: ['_notes']
