Thesis (text mostly in Greek, with an abstract in English): link
- Go
- Docker
- Docker Compose
- kind - Kubernetes in Docker
- Install the project's requirements, preferably in a Python virtual env:

```shell
pip install -r requirements.txt
```
- Create a swarm cluster and clone the repo to a manager node
- Create a Kubernetes cluster and clone the repo to the master node
- Install the project's requirements in each cluster, preferably in a Python virtual env:

```shell
pip install -r requirements_production.txt
```
| Folder/File | Setting to adjust |
|---|---|
| /serrano_app/docker-compose.yaml | environment/KAFKA_ADVERTISED_LISTENERS |
| /serrano_app/flask-start.sh | flask run |
| /serrano_app/config/ | kafka_cfg/kubeconfig_yaml |
| /kubernetes_driver_app/config/ | kafka_cfg/bootstrap.servers |
| /kubernetes_driver_app/config/ | kafka_cfg/kubeconfig_yaml |
| /kubernetes_driver_app/config/ | kafka_cfg/broker |
| /swarm_driver_app/config/ | kafka_cfg/bootstrap.servers |
| /swarm_driver_app/config/ | kafka_cfg/broker |
- Run Docker Compose on a terminal:

```shell
cd serrano_app/
docker-compose up -d
```
- Check that everything is up and running:

```shell
docker-compose ps
```
- Establish the MongoDB Sink connection
- To start the serrano_app, from the /serrano_app folder:
  - Start the Faust agents for the Dispatcher, Resource Optimization Toolkit and Orchestrator components:

    ```shell
    <python3_path> ./src/__main__.py worker --loglevel=INFO
    ```

  - Start the Flask app to receive requests for an app deployment or termination:

    ```shell
    ./flask-start.sh
    ```
- Create an application request:
  - Deployment:

    ```shell
    <python3_path> ./deploy.py -yamlSpec <absolute path to a YAML file> -env <local or prod>
    ```

  - Termination:

    ```shell
    <python3_path> ./remove.py -requestUUID <requestUUID> -env <local or prod>
    ```
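Under the hood, deploy.py presumably posts the request to the Flask app. A stdlib-only sketch of building such a request is shown below; the endpoint path and payload field names are assumptions mirroring the CLI flags above, not the actual deploy.py code:

```python
import json
import urllib.request


def build_deploy_payload(yaml_spec_path: str, env: str) -> dict:
    """Mirror the deploy.py flags as a JSON body (assumed schema)."""
    with open(yaml_spec_path) as f:
        return {"yamlSpec": f.read(), "env": env}


def submit(payload: dict, url: str = "http://localhost:5000/deploy") -> bytes:
    """POST the request to the Flask app; the URL is a hypothetical endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```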
- Activate the Faust agent of the swarm or the Kubernetes driver to process the request:
  - For the swarm driver, connect to the manager node and run, from the swarm_driver_app folder:

    ```shell
    <python3_path> ./src/__main__.py worker --loglevel=INFO
    ```

  - For the Kubernetes driver, connect to the master node and run, from the kubernetes_driver_app folder:

    ```shell
    <python3_path> ./src/__main__.py worker --loglevel=INFO
    ```
- To check that data is published to the respective topics:

```shell
kafkacat -b localhost:9092 -t dispatcher
kafkacat -b localhost:9092 -t resource_optimization_toolkit
kafkacat -b localhost:9092 -t orchestrator
kafkacat -b localhost:9092 -t swarm
kafkacat -b localhost:9092 -t kubernetes
```
- Check created namespace, services and deployments in Kubernetes:

```shell
kubectl get namespaces
kubectl get services
kubectl get deployments
```
- Check stack creation in the swarm:

```shell
docker stack ls
```
| Folder/File | Description |
|---|---|
| /serrano_app | implementation of the Dispatcher, Resource Optimization Toolkit and Orchestrator components |
| /swarm_driver_app | implementation of the swarm driver component |
| /kubernetes_driver_app | implementation of the Kubernetes driver component |
| flask-start.sh | script to run the Flask app (make executable with chmod +x) |
| `src/__main__.py` | Faust application entrypoint |
| src/faust_app.py | creates an instance of the Faust application for stream processing |
| src/flask_app.py | creates an instance of the Flask application |
| models.py | Faust models describing the data in the streams |
| helpers.py | helper functions |
| agents.py | Faust async stream processors of the Kafka topics |
| swarm_driver.py | SwarmDriver class to connect to and interact with a swarm |
| kubernetes_driver.py | K8sDriver class to connect to and interact with a Kubernetes cluster |
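The agents in agents.py follow Faust's pattern of async processors chained through Kafka topics. Independently of Faust, the shape of one hop (e.g. the Dispatcher forwarding to the next stage) can be pictured with plain asyncio queues standing in for topics; the names and the enrichment step here are purely illustrative, not the project's actual agents:

```python
import asyncio
import json


async def dispatcher(in_q: asyncio.Queue, out_q: asyncio.Queue) -> None:
    """Toy stand-in for a Faust agent: consume from one 'topic', forward to the next."""
    while True:
        raw = await in_q.get()
        msg = json.loads(raw)
        msg["stage"] = "dispatched"  # illustrative processing step
        await out_q.put(json.dumps(msg))
        in_q.task_done()


async def demo() -> dict:
    in_q, out_q = asyncio.Queue(), asyncio.Queue()
    worker = asyncio.create_task(dispatcher(in_q, out_q))
    await in_q.put(json.dumps({"requestUUID": "42", "action": "deploy"}))
    result = json.loads(await out_q.get())
    worker.cancel()
    return result


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the real app, Faust replaces the queues with Kafka topics and runs each agent as part of the worker started with `worker --loglevel=INFO`.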
In this example, the MongoDB instance is hosted on MongoDB Atlas.
```shell
curl -X PUT http://localhost:8083/connectors/sink-mongodb-users/config -H "Content-Type: application/json" -d '{
  "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
  "tasks.max": "1",
  "topics": "db_consumer",
  "connection.uri": "mongodb://athina_kyriakou:[email protected]:27017,ntua-thesis-cluster-shard-00-01.xcgej.mongodb.net:27017,ntua-thesis-cluster-shard-00-02.xcgej.mongodb.net:27017/myFirstDatabase?ssl=true&replicaSet=atlas-xirr44-shard-0&authSource=admin&retryWrites=true&w=majority",
  "database": "thesisdb",
  "collection": "db_consumer",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "key.converter.schemas.enable": false,
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter.schemas.enable": false,
  "document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
  "document.id.strategy.partial.value.projection.list": "requestUUID",
  "document.id.strategy.partial.value.projection.type": "AllowList",
  "writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy"
}'
```
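The same registration can be driven from Python with only the standard library; this sketch simply rebuilds the PUT request against Kafka Connect's REST API, with the host, port and connector name taken from the curl call above:

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"


def connector_config_request(name: str, config: dict) -> urllib.request.Request:
    """Build the PUT that (re)registers a connector on Kafka Connect's REST API."""
    return urllib.request.Request(
        f"{CONNECT_URL}/connectors/{name}/config",
        data=json.dumps(config).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# To actually register, pass the full config dict from the curl call above:
# urllib.request.urlopen(connector_config_request("sink-mongodb-users", config))
```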
Detailed info here
Create a topic:

```shell
docker-compose exec broker kafka-topics \
  --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 1 \
  --topic kdriver
```
kafkacat is a CLI used for debugging & testing. Documentation
Check a topic's content:

```shell
kafkacat -b localhost:9092 -t <topic_name>
```

Write to the orchestrator topic:

```shell
kafkacat -b localhost:9092 -t orchestrator -P
{"requestUUID": XX, "action": XXX, "yamlSpec": XXXX}
```
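The message written to the topic must be valid JSON for the downstream Faust models to deserialize it. A small helper for producing well-formed values is sketched below; the field names follow the placeholder above, while the authoritative schema lives in models.py:

```python
import json
import uuid


def make_orchestrator_message(action: str, yaml_spec: str) -> str:
    """Serialize one record for the orchestrator topic (assumed field set)."""
    return json.dumps({
        "requestUUID": str(uuid.uuid4()),  # fresh id for this request
        "action": action,                  # e.g. "deploy" or "remove"
        "yamlSpec": yaml_spec,
    })
```

The resulting line can be pasted into (or piped to) `kafkacat -b localhost:9092 -t orchestrator -P`.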
CLI Documentation
Get created stacks:

```shell
docker stack ls
```

Delete a stack:

```shell
docker stack rm <stack name>
```

Check nodes' information (engine version, hostname, status, availability, manager status):

```shell
docker node ls
```

Check the labels of a node (and other info):

```shell
docker node inspect <node hostname> --pretty
```

Add a label to a node:

```shell
docker node update --label-add <label name>=<value> <node hostname>
```

Check the node that a service is running on:

```shell
docker service ps <service name>
```

Create a swarm stack based on a YAML file:

```shell
docker stack deploy --compose-file <yaml file path> <stack name>
```
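When the swarm commands above need to be scripted (e.g. from tests), a thin stdlib wrapper over the docker CLI is enough. `stack_deploy_cmd` and `deploy_stack` are hypothetical helpers for illustration, not part of swarm_driver.py:

```python
import subprocess


def stack_deploy_cmd(compose_file: str, stack: str) -> list:
    """Compose the `docker stack deploy` invocation shown above."""
    return ["docker", "stack", "deploy", "--compose-file", compose_file, stack]


def deploy_stack(compose_file: str, stack: str) -> None:
    """Run the command; must be executed on a manager node with the docker CLI."""
    subprocess.run(stack_deploy_cmd(compose_file, stack), check=True)
```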
Create a cluster:

```shell
kind create cluster
```

List the created clusters (the default name for a created cluster is kind):

```shell
kind get clusters
```
To have terminal interaction with the created Kubernetes objects, mainly for debugging, install kubectl.
Get deployments in a namespace:

```shell
kubectl get deployments --namespace=<name>
```

Check in which nodes a namespace's pods are running:

```shell
kubectl get pod -o wide --namespace=<name>
```

Show the nodes' labels:

```shell
kubectl get nodes --show-labels
```

Add a label to a node:

```shell
kubectl label nodes <node name> <label name>=<value>
```

Create Kubernetes resources based on a YAML file:

```shell
kubectl apply -f <yaml file path>
```
- Kafka CheatSheet
- Getting Started with Apache Kafka in Python
- Basic stream processing using Kafka and Faust
- Kafka: Consumer API vs Streams API
- Kubectl Cheat Sheet
- How to Access a Kubernetes Cluster
- Cluster Access Configuration
- Configure kubectl to Access Remote Kubernetes Cluster
- Kubeconfig tips with Kubectl
- The App - Define your Faust project: see the suggested structure of medium/large projects here
- Example of medium project structure implementation
- Models, Serialization, and Codecs
- Stream processing with Python Faust: Part II – Streaming pipeline
- MongoDB Kafka Connector
- Create a Database in MongoDB Using the CLI with MongoDB Atlas: here & here
- MongoDB Write Model Strategies
- Kafka Connector used strategy for upsert: ReplaceOneBusinessKeyStrategy
- DeleteOne write model strategy