Team30 Project LeCloud

Run the cluster (play around with Docker Compose)

Note: Run the commands below from the directory that contains the docker-compose.yml file.

bring up the cluster in detached mode

docker-compose up -d

stop the cluster

docker-compose stop

restart the stopped cluster

docker-compose start

remove containers

docker-compose rm -f
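
If you prefer to drive these steps from a script, the four commands can be wrapped as below. This is a minimal sketch: the names `COMPOSE_ACTIONS`, `compose_command` and `run_compose` are made up for illustration and are not part of this project.

```python
import subprocess

# Maps a lifecycle step to the docker-compose arguments listed above.
COMPOSE_ACTIONS = {
    "up": ["up", "-d"],      # create and start containers in the background
    "stop": ["stop"],        # stop containers but keep them
    "start": ["start"],      # restart previously stopped containers
    "remove": ["rm", "-f"],  # delete stopped containers without prompting
}


def compose_command(action):
    """Return the full argument list for one lifecycle step."""
    if action not in COMPOSE_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return ["docker-compose"] + COMPOSE_ACTIONS[action]


def run_compose(action, cwd="."):
    """Run the step; cwd must be the directory holding docker-compose.yml."""
    return subprocess.run(compose_command(action), cwd=cwd, check=True)


print(compose_command("up"))
```

Calling `run_compose("up")` from the compose directory should be equivalent to typing `docker-compose up -d` there.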

Running Instructions

There are two modes to run the Producer and Consumer routines:

Single topic mode
Two-topic batch mode

Single-topic mode is the simpler of the two: the producer pushes all data into one Kafka topic ("aminer1"). In two-topic mode, the producer alternates between two topics ("aminer0" and "aminer1"), switching topics after each batch of the configured batch size.
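
The alternation in two-topic mode can be pictured with a small sketch. It only illustrates the idea and is not taken from producer_batch.py: the function name `topic_for` is hypothetical, and starting with "aminer0" is an assumption.

```python
def topic_for(index, batch_size, topics=("aminer0", "aminer1")):
    """Return the topic for the record at 0-based position `index`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch_number = index // batch_size          # which batch the record falls in
    return topics[batch_number % len(topics)]   # alternate topics per batch


# With a batch size of 3, records 0-2 go to one topic and records 3-5 to the other.
assignments = [topic_for(i, batch_size=3) for i in range(8)]
print(assignments)
```

Single-topic mode corresponds to passing a single-element `topics` tuple, in which case every record lands on the same topic.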

Regardless of the run mode, you must first spin up the containers.

A. Start the containers defined in the docker-compose file

docker-compose up
or 
docker-compose up -d

Single Topic Mode

B.i Producer Code: (make sure this file exists: project\kafka\data\aminer_papers_0.txt)

cd project\kafka
python producer.py
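
For orientation, a producer of this kind might look roughly like the sketch below. It is not the project's producer.py. It assumes the kafka-python package, a broker reachable at localhost:9092, and a data file with one record per line. The function names are hypothetical.

```python
def iter_records(lines):
    """Yield non-empty lines as UTF-8 bytes, ready to send to Kafka."""
    for line in lines:
        text = line.strip()
        if text:
            yield text.encode("utf-8")


def publish(records, send, topic="aminer1"):
    """Pass each record to send(topic, value) and return how many were sent."""
    count = 0
    for record in records:
        send(topic, record)
        count += 1
    return count


def run(path="data/aminer_papers_0.txt", servers="localhost:9092"):
    """Publish the file to Kafka (assumed broker address; needs kafka-python)."""
    from kafka import KafkaProducer  # imported here so the helpers work without it

    producer = KafkaProducer(bootstrap_servers=servers)
    with open(path, encoding="utf-8") as handle:
        sent = publish(iter_records(handle), producer.send)
    producer.flush()  # block until buffered messages have been delivered
    return sent
```

Passing `send` as a parameter keeps the file-reading logic testable without a running broker; `run()` is the only part that touches Kafka.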

C.i Consumer Code: Start the producer just before running the consumer, so that messages are being published to the Kafka queue.

Simple Consumer Test: Connect to the Spark master container and run

python /opt/spark/code/consumer.py

Spark Streaming Consumer:

docker exec spark-master bin/spark-submit --verbose --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.3.1 --master spark://spark-master:7077 /opt/spark/code/consumerSpark.py
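
The command above pulls in the spark-streaming-kafka-0-8 package, which provides `KafkaUtils` for PySpark on Spark 2.x. A consumer built on it could look like the sketch below. This is not the project's consumerSpark.py: the broker address "kafka:9092", the 5-second batch interval, the JSON message format and the function names are all assumptions.

```python
import json


def parse_paper(value):
    """Decode one Kafka message value as JSON; return None if it is malformed."""
    try:
        return json.loads(value)
    except (TypeError, ValueError):
        return None


def build_stream(topic="aminer1", brokers="kafka:9092", batch_seconds=5):
    """Create a direct Kafka stream that counts valid records per batch."""
    # Imported here so parse_paper can be used without a Spark installation.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="consumerSparkSketch")
    ssc = StreamingContext(sc, batch_seconds)
    stream = KafkaUtils.createDirectStream(
        ssc, [topic], {"metadata.broker.list": brokers}
    )
    # Each element is a (key, value) pair; keep only values that parse.
    papers = stream.map(lambda kv: parse_paper(kv[1])).filter(lambda p: p is not None)
    papers.count().pprint()  # print the number of valid records in each batch
    return ssc
```

A script using this would call `ssc = build_stream()`, then `ssc.start()` and `ssc.awaitTermination()`.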

Two-Topic Mode

For two-topic mode, copy producer_batch.py from the batch_mode folder to the kafka folder. Also copy consumerSpark.py and consumerSpark2.py from the batch_mode folder to the spark/code folder.

B.ii Producer Code: (make sure this file exists: project\kafka\data\aminer_papers_0.txt)

cd project\kafka
python producer_batch.py

C.ii Consumer Code: Start the producer just before running the consumers, so that messages are being published to the Kafka queue.

Open two separate terminal shells and, in each, go to the /spark/code folder.
Run Spark Streaming Consumer 1 in one terminal:

docker exec spark-master bin/spark-submit --verbose --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.3.1 --master spark://spark-master:7077  --executor-memory 1g --num-executors 2 --executor-cores 1 --total-executor-cores 2  /opt/spark/code/consumerSpark.py

Run Spark Streaming Consumer 2 in the other terminal:

docker exec spark-master bin/spark-submit --verbose --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.3.1 --master spark://spark-master:7077  --executor-memory 1g --num-executors 2 --executor-cores 1 --total-executor-cores 2  /opt/spark/code/consumerSpark2.py
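
The resource flags in these two commands determine how much of the cluster both consumers need at once. The arithmetic is sketched below, assuming the standalone master sizes each application from --total-executor-cores and --executor-cores. The function name is hypothetical, and --num-executors is left out because it is generally a YARN-side setting.

```python
def standalone_footprint(total_executor_cores, executor_cores, executor_memory_gb):
    """Return (executors, cores, memory_gb) that one application can claim at most."""
    executors = total_executor_cores // executor_cores
    return executors, executors * executor_cores, executors * executor_memory_gb


# Values taken from the two spark-submit commands above.
per_app = standalone_footprint(
    total_executor_cores=2, executor_cores=1, executor_memory_gb=1
)
apps = 2  # consumerSpark.py and consumerSpark2.py run side by side
print("per application:", per_app)
print("both consumers :", tuple(apps * x for x in per_app))
```

Under these assumptions the workers need about 4 free cores and 4 GB of executor memory for both consumers to run concurrently. With less, the second application may wait for resources.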

D. Visualization:

1. Run local http server
```
   cd project\guide
   python http-server.py
```
This serves the guide folder on localhost, on the port set in http-server.py (18001 in the URLs used here).
(Check) Try navigating to http://localhost:18001/AMiner.html



2. Connect to the Neo4j browser at http://localhost:7474/browser with username: neo4j and password: password
    After connecting, this loads the AMiner.html tutorial page above by default
    OR
    run this command in the query window
    ```
        :play http://localhost:18001/AMiner.html
    ```

Notes: If the port above is already in use and the URL cannot be opened, change the port in project\guide\http-server.py and launch the page from the Neo4j browser with the command above, using the new port ( :play http://localhost:/AMiner.html, with the new port after "localhost:" ). To have the page launch automatically, update docker\db\config\neo4j.conf and restart the container.
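
For reference, a local server like the one in step 1 can be sketched with the standard library alone. This is not the project's http-server.py. The `GuideHandler` and `make_server` names are hypothetical, and the CORS header is an assumption based on the Neo4j browser fetching the page from a different origin.

```python
import functools
import http.server


class GuideHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that also allows cross-origin requests."""

    def end_headers(self):
        # Assumption: the Neo4j browser loads the guide cross-origin.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_server(directory=".", port=18001):
    """Build an HTTP server that serves `directory` on localhost:`port`."""
    handler = functools.partial(GuideHandler, directory=directory)
    return http.server.ThreadingHTTPServer(("localhost", port), handler)
```

Running `make_server().serve_forever()` from the guide folder would serve AMiner.html on port 18001. To use another port, pass a different `port` value.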

Happy learning: Kafka (producer, consumer), Spark Streaming and Neo4j, with Docker images binding them together to enable scaling for distributed processing.
