Set up the environment: docker-compose up
- configure the DAG
- create an HTTP connection (Admin -> Connections)
- create an operator that calls the API
- use XCom to extract the response data
- store the data in the database (see the DAG sketch after this list)
- docker exec -it material-2_postgres_1 /bin/bash
- psql -U airflow
- SELECT * FROM users;
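A minimal sketch of how this pipeline could be wired, assuming an HTTP connection named user_api and a Postgres connection named postgres; the endpoint, table, and field names are placeholders rather than the exact ones used above.

```python
from datetime import datetime
import json

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.http.operators.http import SimpleHttpOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def _store_user(ti):
    # Pull the API response that the extract task pushed to XCom
    user = ti.xcom_pull(task_ids="extract_user")
    hook = PostgresHook(postgres_conn_id="postgres")
    hook.run(
        "INSERT INTO users (firstname, lastname) VALUES (%s, %s)",
        parameters=(user["firstname"], user["lastname"]),
    )


with DAG("user_processing", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False):

    # Calls the API through the 'user_api' HTTP connection; the parsed JSON
    # is pushed to XCom automatically (do_xcom_push defaults to True)
    extract_user = SimpleHttpOperator(
        task_id="extract_user",
        http_conn_id="user_api",
        endpoint="api/",  # placeholder endpoint
        method="GET",
        response_filter=lambda response: json.loads(response.text),
    )

    store_user = PythonOperator(task_id="store_user", python_callable=_store_user)

    extract_user >> store_user
```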
- define a Dataset
- define a producer DAG that updates the Dataset
- define a consumer DAG whose schedule parameter is the Dataset from the producer (see the sketch after this list)
- DAGs view -> Trigger column -> Dataset update
- Datasets view
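A minimal sketch of the producer/consumer pattern with Datasets (Airflow 2.4+), assuming a file-backed Dataset; the URI, DAG ids, and task ids are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.python import PythonOperator

# The URI is only an identifier for scheduling; Airflow does not read the file itself
my_file = Dataset("/tmp/my_file.txt")  # placeholder URI


def _update_file():
    with open("/tmp/my_file.txt", "a") as f:
        f.write("producer update\n")


with DAG("producer", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False):
    # Listing the Dataset in outlets marks it as updated when this task succeeds
    PythonOperator(task_id="update_dataset", python_callable=_update_file,
                   outlets=[my_file])

with DAG("consumer", start_date=datetime(2024, 1, 1),
         schedule=[my_file], catchup=False):
    # Triggered whenever the producer updates the Dataset
    PythonOperator(task_id="read_dataset",
                   python_callable=lambda: print(open("/tmp/my_file.txt").read()))
```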
Using CeleryExecutor
- define a DAG
- define tasks that run in parallel
- docker-compose down && docker-compose --profile flower up -d
- check the Flower UI at localhost:5555
- add a new worker service to the docker-compose file
- check the worker is available in the Flower UI: click on the worker -> Queues -> name
- check that tasks are sent to the new queue when the DAG is triggered (see the sketch after this list)
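A minimal sketch of routing a task to a dedicated Celery queue, assuming the new worker was started listening on a queue named high_cpu (e.g. `airflow celery worker -q high_cpu` in its compose service); DAG, task, and queue names are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG("parallel_dag", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False):

    extract_a = BashOperator(task_id="extract_a", bash_command="sleep 10")
    extract_b = BashOperator(task_id="extract_b", bash_command="sleep 10")

    # Sent to the 'high_cpu' queue, so only a worker subscribed to that
    # queue (visible in the Flower UI) will pick it up
    transform = BashOperator(task_id="transform", bash_command="sleep 30",
                             queue="high_cpu")

    [extract_a, extract_b] >> transform
```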
- create a groups folder to define the download and transform groups of operators
- define group_dag.py, configuring tasks by calling the pre-defined functions from the groups folder (see the sketch after this list)
- navigate to the DAG's Graph view -> each group is collapsed into a single node that can be expanded
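A minimal sketch of keeping a group definition in a separate module and assembling it in group_dag.py, assuming TaskGroups; shown as a single file here, and the module, group, and task names are placeholders.

```python
# groups/group_downloads.py (placeholder module in the groups folder)
from airflow.operators.bash import BashOperator
from airflow.utils.task_group import TaskGroup


def download_tasks():
    # Tasks created inside this context are nested under one collapsible node
    with TaskGroup(group_id="downloads") as group:
        for name in ("a", "b", "c"):
            BashOperator(task_id=f"download_{name}", bash_command="sleep 5")
    return group


# group_dag.py
from datetime import datetime
from airflow import DAG
# from groups.group_downloads import download_tasks  # when split across files

with DAG("group_dag", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False):
    downloads = download_tasks()
    check = BashOperator(task_id="check_files", bash_command="sleep 5")
    downloads >> check
```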
- create tasks that push and pull data via XCom
- configure the trigger rule
- write the DAG (see the sketch after this list)
- go to Admin -> XComs -> check the Value column
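A minimal sketch of pushing and pulling XComs plus a trigger rule on the final task; the task ids, XCom key, and threshold are placeholders.

```python
from datetime import datetime
import random

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import BranchPythonOperator, PythonOperator
from airflow.utils.trigger_rule import TriggerRule


def _training(ti):
    accuracy = random.uniform(0.1, 10.0)
    # Explicit push under a custom key (a plain return would push to 'return_value')
    ti.xcom_push(key="model_accuracy", value=accuracy)


def _choose_best(ti):
    accuracies = ti.xcom_pull(key="model_accuracy",
                              task_ids=["training_a", "training_b"])
    return "accurate" if max(accuracies) > 5 else "inaccurate"


with DAG("xcom_dag", start_date=datetime(2024, 1, 1),
         schedule="@daily", catchup=False):

    training_a = PythonOperator(task_id="training_a", python_callable=_training)
    training_b = PythonOperator(task_id="training_b", python_callable=_training)
    choose = BranchPythonOperator(task_id="choose_best", python_callable=_choose_best)
    accurate = BashOperator(task_id="accurate", bash_command="echo accurate")
    inaccurate = BashOperator(task_id="inaccurate", bash_command="echo inaccurate")

    # Runs even though one branch upstream is skipped by the BranchPythonOperator
    storing = BashOperator(task_id="storing", bash_command="echo storing",
                           trigger_rule=TriggerRule.NONE_FAILED_MIN_ONE_SUCCESS)

    [training_a, training_b] >> choose >> [accurate, inaccurate] >> storing
```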
- Configure Elasticsearch locally:
- download docker-compose-es.yaml
- docker-compose -f docker-compose-es.yaml up -d
- docker-compose -f docker-compose-es.yaml ps
- go to localhost:9200 to check that Elasticsearch is available in Docker, or check from the command line:
- docker exec -it material-2_elastic_1 /bin/bash
- curl -X GET 'http://elastic:9200'
- Add elastic_hook to the plugins system
- configure elastic_hook.py in the plugins folder
- from the command line:
- docker exec -it material-2_airflow-scheduler_1 /bin/bash
- airflow plugins
- add the AirflowElasticPlugin class
- restart with docker-compose down && docker-compose up -d -> check airflow plugins again
- use the ElasticHook in elastic_dag (see the sketch after this list)
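A minimal sketch of what elastic_hook.py might contain, assuming the elasticsearch Python client is installed in the Airflow images and an Airflow connection named elastic_default pointing at http://elastic:9200; the class, method, and connection names are placeholders.

```python
# plugins/elastic_hook.py (placeholder path)
from airflow.hooks.base import BaseHook
from airflow.plugins_manager import AirflowPlugin
from elasticsearch import Elasticsearch


class ElasticHook(BaseHook):
    """Thin wrapper around the Elasticsearch client, configured from an Airflow connection."""

    def __init__(self, conn_id="elastic_default", *args, **kwargs):
        super().__init__(*args, **kwargs)
        conn = self.get_connection(conn_id)
        host = conn.host or "elastic"
        port = conn.port or 9200
        self.es = Elasticsearch(f"http://{host}:{port}")

    def info(self):
        # Same response as `curl -X GET 'http://elastic:9200'`
        return self.es.info()

    def add_doc(self, index, doc):
        # elasticsearch-py 8.x uses document=; 7.x used body=
        return self.es.index(index=index, document=doc)


class AirflowElasticPlugin(AirflowPlugin):
    # Listed by `airflow plugins` after the containers are restarted
    name = "elastic"
    hooks = [ElasticHook]


# elastic_dag can then call the hook from a PythonOperator callable,
# e.g. python_callable=lambda: print(ElasticHook().info())
```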