
Diploma thesis conducted at the School of Electrical and Computer Engineering of the National Technical University of Athens

AthinaKyriakou/ntua-hsnl-serrano-thesis-orchestrator


Development of Resource Orchestration Mechanisms for Heterogeneous Computing Infrastructures

Thesis (text mostly in Greek, with an abstract in English): link

Installation

For Local Development in Linux

  1. Go
  2. Docker
  3. Docker Compose
  4. kind - Kubernetes in Docker
  5. Install the project's requirements, preferably in a Python virtual env:
pip install -r requirements.txt

For a Production Environment

  1. Create a swarm cluster and clone the repo to a manager node
  2. Create a Kubernetes cluster and clone the repo to the master node
  3. Install the project's requirements in each cluster, preferably in a Python virtual env:
pip install -r requirements_production.txt

Configuration changes needed to switch between the production and local environments

Folder/File                          Setting to change
/serrano_app/docker-compose.yaml     environment/KAFKA_ADVERTISED_LISTENERS
/serrano_app/flask-start.sh          flask run
/serrano_app/config/                 kafka_cfg/kubeconfig_yaml
/kubernetes_driver_app/config/       kafka_cfg/bootstrap.servers
/kubernetes_driver_app/config/       kafka_cfg/kubeconfig_yaml
/kubernetes_driver_app/config/       kafka_cfg/broker
/swarm_driver_app/config/            kafka_cfg/bootstrap.servers
/swarm_driver_app/config/            kafka_cfg/broker
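Switching environments comes down to pointing each app's Kafka (and kubeconfig) settings at the right endpoints. A minimal Python sketch of that idea, assuming dictionary-shaped settings; the key names mirror the table above, while the values and the function name are illustrative:

```python
# Illustrative sketch: select Kafka settings per environment.
# The key names (kafka_cfg, bootstrap.servers, broker) mirror the
# config table; the repo's actual config files are YAML, and the
# prod host below is a placeholder.

CONFIGS = {
    "local": {
        "kafka_cfg": {
            "bootstrap.servers": "localhost:9092",
            "broker": "kafka://localhost:9092",
        },
    },
    "prod": {
        "kafka_cfg": {
            "bootstrap.servers": "<prod-broker-host>:9092",
            "broker": "kafka://<prod-broker-host>:9092",
        },
    },
}

def kafka_settings(env: str) -> dict:
    """Return the Kafka settings for 'local' or 'prod'."""
    if env not in CONFIGS:
        raise ValueError(f"unknown environment: {env}")
    return CONFIGS[env]["kafka_cfg"]
```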

Quickstart

  1. Run Docker Compose on a terminal:

    cd serrano_app/
    docker-compose up -d
  2. Check that everything is up and running:

    docker-compose ps
  3. Establish the MongoDB Sink connection

  4. To start the serrano_app, from the /serrano_app folder:

    • Start the Faust agents for the Dispatcher, Resource Optimization Toolkit and Orchestrator components:
      <python3_path> ./src/__main__.py worker --loglevel=INFO
    • Start the Flask app to receive requests for an app deployment or termination:
      ./flask-start.sh
  5. Create an application request:

    • Deployment
      <python3_path> ./deploy.py -yamlSpec <absolute path to a YAML file> -env <local or prod>
    • Termination
      <python3_path> ./remove.py -requestUUID <requestUUID> -env <local or prod>
  6. Activate the Faust agent of the swarm or the Kubernetes driver to process the request:

    • For the swarm driver, connect to the manager node and from the swarm_driver_app folder:
      <python3_path> ./src/__main__.py worker --loglevel=INFO
    • For the Kubernetes driver, connect to the manager node and from the kubernetes_driver_app folder:
      <python3_path> ./src/__main__.py worker --loglevel=INFO
  7. To check that data is published to the respective topics:

    • kafkacat -b localhost:9092 -t dispatcher
    • kafkacat -b localhost:9092 -t resource_optimization_toolkit
    • kafkacat -b localhost:9092 -t orchestrator
    • kafkacat -b localhost:9092 -t swarm
    • kafkacat -b localhost:9092 -t kubernetes
  8. Check created namespace, services and deployments in Kubernetes:

    kubectl get namespaces
    kubectl get services
    kubectl get deployments
  9. Check stack creation in the swarm:

    docker stack ls
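The deploy and termination commands in step 5 suggest a small CLI. A hedged sketch of how deploy.py's flags could be parsed with argparse; the flag names come from the quickstart, everything else (description, help strings) is illustrative:

```python
import argparse

def build_deploy_parser() -> argparse.ArgumentParser:
    # Flag names follow quickstart step 5; the rest is illustrative.
    parser = argparse.ArgumentParser(description="Request an app deployment")
    parser.add_argument("-yamlSpec", required=True,
                        help="absolute path to the application's YAML spec")
    parser.add_argument("-env", choices=["local", "prod"], required=True,
                        help="target environment")
    return parser

# Parse a sample invocation instead of sys.argv, for demonstration.
args = build_deploy_parser().parse_args(
    ["-yamlSpec", "/tmp/app.yaml", "-env", "local"]
)
```

remove.py would follow the same pattern with a -requestUUID flag instead of -yamlSpec.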

File & Folder Highlights

Folder/File                Description
/serrano_app               implementation of the Dispatcher, Resource Optimization Toolkit and Orchestrator components
/swarm_driver_app          implementation of the swarm driver component
/kubernetes_driver_app     implementation of the Kubernetes driver component
flask-start.sh             script to run the Flask app (make executable with chmod +x)
src/__main__.py            Faust application entrypoint
src/faust_app.py           creates the Faust app instance for stream processing
src/flask_app.py           creates the Flask app instance
models.py                  Faust models describing the data in the streams
helpers.py                 helper functions
agents.py                  Faust async stream processors of the Kafka topics
swarm_driver.py            SwarmDriver class to connect to and interact with a swarm
kubernetes_driver.py       K8sDriver class to connect to and interact with a Kubernetes cluster
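models.py holds the Faust models that describe the stream data; conceptually these are typed records. A stdlib sketch of the same shape using a dataclass, with field names taken from the message format shown under Useful Commands (the repository's actual Faust models may differ):

```python
from dataclasses import dataclass, asdict

@dataclass
class Request:
    # Field names follow the {"requestUUID", "action", "yamlSpec"}
    # message shape shown in the Kafkacat section; this is an
    # illustrative analogue of a Faust Record, not the repo's model.
    requestUUID: str
    action: str
    yamlSpec: str

r = Request(requestUUID="1234", action="deploy", yamlSpec="app.yaml")
```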

Establish the MongoDB Sink Connection (for upsert)

In this example, the MongoDB instance is hosted on MongoDB Atlas.

curl -X PUT http://localhost:8083/connectors/sink-mongodb-users/config -H "Content-Type: application/json" -d ' {
      "connector.class":"com.mongodb.kafka.connect.MongoSinkConnector",
      "tasks.max":"1",
      "topics":"db_consumer",
      "connection.uri":"mongodb://athina_kyriakou:[email protected]:27017,ntua-thesis-cluster-shard-00-01.xcgej.mongodb.net:27017,ntua-thesis-cluster-shard-00-02.xcgej.mongodb.net:27017/myFirstDatabase?ssl=true&replicaSet=atlas-xirr44-shard-0&authSource=admin&retryWrites=true&w=majority",
      "database":"thesisdb",
      "collection":"db_consumer",
      "key.converter":"org.apache.kafka.connect.json.JsonConverter",
      "key.converter.schemas.enable":false,
      "value.converter":"org.apache.kafka.connect.json.JsonConverter",
      "value.converter.schemas.enable":false,
      "document.id.strategy":"com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
      "document.id.strategy.partial.value.projection.list":"requestUUID",
      "document.id.strategy.partial.value.projection.type":"AllowList",
      "writemodel.strategy":"com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy"
}'

Detailed info here
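The same connector configuration can be assembled programmatically before being PUT to the Connect REST API, which makes the upsert-relevant settings easier to see. A sketch that builds it as JSON; the connection URI is a placeholder:

```python
import json

# Sketch: assemble the MongoDB sink connector config for upsert
# semantics on requestUUID. Connection details are placeholders.
config = {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "tasks.max": "1",
    "topics": "db_consumer",
    "connection.uri": "mongodb://<user>:<password>@<cluster-hosts>/<db>",
    "database": "thesisdb",
    "collection": "db_consumer",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": False,
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": False,
    # Upsert: use requestUUID from the message value as the business
    # key and replace the matching document instead of inserting.
    "document.id.strategy":
        "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
    "document.id.strategy.partial.value.projection.list": "requestUUID",
    "document.id.strategy.partial.value.projection.type": "AllowList",
    "writemodel.strategy":
        "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy",
}
payload = json.dumps(config)
```

The payload would then be PUT to /connectors/sink-mongodb-users/config, as the curl command above does.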

Useful Commands

Kafka

Create a topic:

docker-compose exec broker kafka-topics \
  --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 1 \
  --topic kdriver

Kafkacat

CLI used for debugging & testing. Documentation

Check a topic's content:

kafkacat -b localhost:9092 -t <topic_name>

Write to the orchestrator topic:

kafkacat -b localhost:9092 -t orchestrator -P
{"requestUUID": XX, "action": XXX, "yamlSpec":XXXX}

Swarm

CLI Documentation

Get created stacks:

docker stack ls

Delete a stack:

docker stack rm <stack name>

Check nodes' information (engine version, hostname, status, availability, manager status):

docker node ls

Check the labels of a node (and other info):

docker node inspect <node hostname> --pretty

Add a label to a node:

docker node update --label-add <label name>=<value> <node hostname>

Check the node that a service is running on:

docker service ps <service name>

Create a swarm stack based on a YAML file:

docker stack deploy --compose-file <yaml file path> <stack name>
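A driver that wraps commands like these typically builds the argument list first and hands it to a process runner. A sketch of that pattern for the stack-deploy command; the function name is illustrative and nothing is executed here:

```python
def stack_deploy_cmd(compose_file: str, stack_name: str) -> list[str]:
    # Mirrors `docker stack deploy --compose-file <file> <stack>`.
    # Pass the resulting list to subprocess.run to actually deploy.
    return ["docker", "stack", "deploy",
            "--compose-file", compose_file, stack_name]
```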

Kubernetes with kind

Create a cluster:

kind create cluster
kind get clusters (the default name for created clusters is "kind")

To have terminal interaction with the created Kubernetes objects, mainly for debugging, install kubectl.

Kubernetes with kubectl

Get deployments in a namespace:

kubectl get deployments --namespace=<name>

Check which nodes a namespace's pods are running on:

kubectl get pod -o wide --namespace=<name>

Show the nodes' labels:

kubectl get nodes --show-labels

Add a label to a node:

kubectl label nodes <node name> <label name>=<value>

Create Kubernetes resources based on a YAML file:

kubectl apply -f <yaml file path>

Useful Resources

Kafka

Kubernetes

Kubernetes & Python

Faust

Kafka Source - MongoDB Sink

Avro
