Merge develop into master v0.2 #16

Merged
merged 83 commits into from
Jul 20, 2024
3f0b804
Add gitignore
Bear-Witness-98 Jul 16, 2024
1b3d8c3
Add packaging files
Bear-Witness-98 Jul 16, 2024
0e731ff
Add formatting to python script files
Bear-Witness-98 Jul 16, 2024
1e56b65
Fixes to make notebook work
Bear-Witness-98 Jul 16, 2024
89fadcc
Merge pull request #1 from Bear-Witness-98/feature/repo-setup
Bear-Witness-98 Jul 16, 2024
09fde77
Notebook review, challange updated.
Bear-Witness-98 Jul 16, 2024
9a5d872
Merge pull request #2 from Bear-Witness-98/feature/notebook-revision
Bear-Witness-98 Jul 16, 2024
80719b8
Fix data loading directory for test
Bear-Witness-98 Jul 16, 2024
06b9145
update gitignore
Bear-Witness-98 Jul 16, 2024
0285c28
fix prediction test
Bear-Witness-98 Jul 16, 2024
7540ec3
Add comment to challange.md
Bear-Witness-98 Jul 16, 2024
3e2ab2b
Passed models tests
Bear-Witness-98 Jul 16, 2024
eae1e08
Add poetry lock file
Bear-Witness-98 Jul 17, 2024
bbefabf
Merge pull request #3 from Bear-Witness-98/feature/model-addition
Bear-Witness-98 Jul 17, 2024
e06d3dd
Fix model bugs
Bear-Witness-98 Jul 17, 2024
8e5831e
Additional model improvements
Bear-Witness-98 Jul 17, 2024
f74012a
Remove ipdb from model training.
Bear-Witness-98 Jul 17, 2024
1030988
Merge pull request #4 from Bear-Witness-98/bugfix/model-implementation
Bear-Witness-98 Jul 17, 2024
0eb6226
Update fastapi version
Bear-Witness-98 Jul 17, 2024
d04e150
Add api
Bear-Witness-98 Jul 17, 2024
c5fcede
Update gitignore
Bear-Witness-98 Jul 17, 2024
6ac63e3
Merge pull request #5 from Bear-Witness-98/feature/api
Bear-Witness-98 Jul 17, 2024
e4536d9
Removed unused imports
Bear-Witness-98 Jul 17, 2024
2bb3174
Update model storing directory
Bear-Witness-98 Jul 17, 2024
a90e15b
Chande model loading
Bear-Witness-98 Jul 17, 2024
19bc0a4
Change directory for storing the model
Bear-Witness-98 Jul 17, 2024
d5735db
Merge pull request #6 from Bear-Witness-98/bugfix/api-model-train
Bear-Witness-98 Jul 17, 2024
b4d1ed5
Update poetry with appropriate versions
Bear-Witness-98 Jul 17, 2024
ebaaace
Add dockerfile
Bear-Witness-98 Jul 17, 2024
c83e428
Update dockerfile command
Bear-Witness-98 Jul 17, 2024
ab3fe51
Add deploy script, and docker fix
Bear-Witness-98 Jul 17, 2024
1ceb3c6
update library version for stress test
Bear-Witness-98 Jul 17, 2024
6c922fa
update stress tests url
Bear-Witness-98 Jul 17, 2024
6200cb6
Update doc
Bear-Witness-98 Jul 17, 2024
3952da4
Add copy of model to dockerfile
Bear-Witness-98 Jul 17, 2024
adb8029
Update dockerfile and poetry lock
Bear-Witness-98 Jul 17, 2024
54ef2fc
Add change to dockerfile
Bear-Witness-98 Jul 17, 2024
db63318
Add training to deploy script
Bear-Witness-98 Jul 17, 2024
1816d7c
Merge pull request #7 from Bear-Witness-98/feature/deploy-api
Bear-Witness-98 Jul 17, 2024
c27b1fe
Moved ci/cd files to appropriate directory.
Bear-Witness-98 Jul 19, 2024
0afcff8
First ci update
Bear-Witness-98 Jul 19, 2024
699de0a
Add paths to testing
Bear-Witness-98 Jul 19, 2024
debb054
Add .github folder to trigger
Bear-Witness-98 Jul 19, 2024
c7ca0ff
add '' to paths
Bear-Witness-98 Jul 19, 2024
f181f04
Add commands to ci
Bear-Witness-98 Jul 19, 2024
7d9ea2a
Fix syntax error
Bear-Witness-98 Jul 19, 2024
e1522e6
Add module files to needed paths
Bear-Witness-98 Jul 19, 2024
de5f770
setup python version
Bear-Witness-98 Jul 19, 2024
8d3e9d0
Check directory
Bear-Witness-98 Jul 19, 2024
5ac5991
Checkout code
Bear-Witness-98 Jul 19, 2024
ce920a4
Fix installation
Bear-Witness-98 Jul 19, 2024
652eddb
add virtualenv folder to executing machine
Bear-Witness-98 Jul 19, 2024
b35f987
Add test dependencies to poetry
Bear-Witness-98 Jul 19, 2024
ee5c8fc
Modified dependencies installation
Bear-Witness-98 Jul 19, 2024
a33bdea
Add model retrival
Bear-Witness-98 Jul 19, 2024
cac268e
Add api test
Bear-Witness-98 Jul 19, 2024
a892d27
rename job
Bear-Witness-98 Jul 19, 2024
72b59d9
Add cd workflow
Bear-Witness-98 Jul 19, 2024
4328d1d
Fix syntax error
Bear-Witness-98 Jul 19, 2024
3ce6e1b
Check secret upload
Bear-Witness-98 Jul 19, 2024
c4574f2
Add gcp authentication
Bear-Witness-98 Jul 19, 2024
79df92c
Add github actions workflows to PRs on main.
Bear-Witness-98 Jul 19, 2024
068627d
Merge pull request #8 from Bear-Witness-98/feature/ci-cd
Bear-Witness-98 Jul 19, 2024
cb041ab
Improved CI and add documentation
Bear-Witness-98 Jul 19, 2024
5cb72ac
Improved cd, and reviewd ci
Bear-Witness-98 Jul 19, 2024
d922565
Merge pull request #10 from tryolabs/feature/improve-ci-cd
Bear-Witness-98 Jul 19, 2024
17d7402
Add doc
Bear-Witness-98 Jul 19, 2024
3d5120c
Merge pull request #11 from tryolabs/bugfix/ci-cd-doc
Bear-Witness-98 Jul 19, 2024
e3a37b3
Update deployment documentation.
Bear-Witness-98 Jul 19, 2024
e444cd0
Merge pull request #12 from tryolabs/feature/improve-deployment
Bear-Witness-98 Jul 19, 2024
24fa19c
Add more fastapi-like value checking
Bear-Witness-98 Jul 19, 2024
7198237
Remove debug function.
Bear-Witness-98 Jul 19, 2024
71b13b8
Update api and doc
Bear-Witness-98 Jul 19, 2024
4255a34
typo
Bear-Witness-98 Jul 19, 2024
14d0c01
Merge pull request #13 from tryolabs/feature/improve-api
Bear-Witness-98 Jul 19, 2024
415f861
Update pandas version
Bear-Witness-98 Jul 19, 2024
a68d6d4
Improve model
Bear-Witness-98 Jul 19, 2024
f9ad8cf
Add documentation on the modelling stage, and change model used
Bear-Witness-98 Jul 19, 2024
f435ebe
Merge pull request #14 from tryolabs/feature/improve-model
Bear-Witness-98 Jul 19, 2024
0ba97d6
Merge branch 'main' into develop
Bear-Witness-98 Jul 19, 2024
f76f40c
add xgboost
Bear-Witness-98 Jul 20, 2024
f9173b2
Update poetry lock
Bear-Witness-98 Jul 20, 2024
c9a0a83
Merge pull request #18 from tryolabs/bugfix/add-xgboost-dependency
Bear-Witness-98 Jul 20, 2024
30 changes: 12 additions & 18 deletions .github/workflows/cd.yml
@@ -3,46 +3,40 @@ name: 'Continuous Delivery'
on:
pull_request:
branches:
- develop
- release
- main
paths:
- 'challenge/**'
- 'scripts/**'
- '.github/**'
- 'pyproject.toml'
- 'poetry.lock'

jobs:
run_testing:
deploy_and_test:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v4
- name: checkout code
uses: actions/checkout@v4

- name: Set up Google Cloud SDK
uses: google-github-actions/auth@v1
with:
credentials_json: ${{ secrets.GCP_CREDENTIAL }}
- name: Check directory
run: |
ls
credentials_json: ${{ secrets.GCP_CREDENTIALS }}

- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '3.10.8'
- name: Python version
run: |
python --version

- name: Install poetry and virtualenv
run: |
pip install poetry

- name: Install dependencies
run: |
poetry config virtualenvs.create false
poetry lock --no-update
poetry install
- name: Push to prod

- name: Push to production
run: |
bash scripts/deploy.sh

- name: Run stress test
run: |
make stress-test
28 changes: 12 additions & 16 deletions .github/workflows/ci.yml
@@ -4,43 +4,39 @@ on:
pull_request:
branches:
- develop
- release
- main
paths:
- 'challenge/**'
- 'scripts/**'
- '.github/**'
- 'pyproject.toml'
- 'poetry.lock'

jobs:
run_testing:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v4
- name: Check directory
run: |
ls
- name: Set up Python
- name: checkout code
uses: actions/checkout@v4

- name: set up Python
uses: actions/setup-python@v2
with:
python-version: '3.10.8'
- name: Python version
run: |
python --version
- name: Install poetry and virtualenv

- name: Install poetry
run: |
pip install poetry

- name: Install dependencies
run: |
poetry config virtualenvs.create false
poetry lock --no-update
poetry install

- name: Get model
run:
python challenge/model.py

- name: Run model test
run: |
make model-test

- name: Run api test
run: |
make api-test
119 changes: 65 additions & 54 deletions challenge/api.py
@@ -1,19 +1,13 @@
import sys
from datetime import datetime, timezone

import fastapi
import pandas as pd
from fastapi import HTTPException
from pydantic import BaseModel
from pydantic import BaseModel, validator

from challenge.model import DelayModel


def print_to_file(whatever: any):
with open("file.txt", "a") as sys.stdout:
print(whatever)


valid_opera_values = [
VALID_OPERA_VALUES = [
"american airlines",
"air canada",
"air france",
@@ -39,90 +33,107 @@ def print_to_file(whatever: any):
"lacsa",
]

valid_tipo_vuelo_values = [
VALID_TIPO_VUELO_VALUES = [
"I",
"N",
]

valid_mes_values = range(1, 13)


def valid_tipo_vuelo(tipo_vuelo: str) -> bool:
return tipo_vuelo in valid_tipo_vuelo_values
VALID_MES_VALUES = range(1, 13)


def valid_opera(opera: str) -> bool:
return opera in valid_opera_values


def valid_mes(mes_value: int) -> bool:
return mes_value in valid_mes_values
app = fastapi.FastAPI()
model = DelayModel()
model.load_model("models")


class Flight(BaseModel):
OPERA: str
TIPOVUELO: str
MES: int


class FlightData(BaseModel):
flights: list[Flight]


app = fastapi.FastAPI()
model = DelayModel()
model.load_model("models")


def flight_data_to_pandas(flight_data: FlightData) -> pd.DataFrame:
flight_data_dict = {"OPERA": [], "TIPOVUELO": [], "MES": []}
for elem in flight_data.flights:
if not valid_opera(elem.OPERA.lower()):
@validator("OPERA")
def valid_opera(cls, opera_value: str):
if opera_value.lower() not in VALID_OPERA_VALUES:
raise HTTPException(
status_code=400,
detail=(
f"Value for tipo vuelo not valid. Recieved {elem.OPERA},"
f" expected one from {[v for v in valid_opera_values]}"
f"Value for OPERA not valid. Received {opera_value}, "
f"expected one from {VALID_OPERA_VALUES}"
),
)
if not valid_tipo_vuelo(elem.TIPOVUELO.capitalize()):
return opera_value

@validator("TIPOVUELO")
def valid_tipo_vuelo(cls, tipo_vuelo_value: str):
if tipo_vuelo_value.capitalize() not in VALID_TIPO_VUELO_VALUES:
raise HTTPException(
status_code=400,
detail=(
f"Value for tipo vuelo not valid. Recieved {elem.TIPOVUELO},"
f" expected one from {[v for v in valid_tipo_vuelo_values]}"
f"Value for TIPOVUELO not valid. Received {tipo_vuelo_value}, "
f"expected one from {VALID_TIPO_VUELO_VALUES}"
),
)
if not valid_mes(elem.MES):
return tipo_vuelo_value

@validator("MES")
def valid_mes(cls, mes_value: int):
if mes_value not in VALID_MES_VALUES:
raise HTTPException(
status_code=400,
detail=(
f"Value for tipo vuelo not valid. Recieved {elem.MES},"
f" expected one from {valid_mes_values}"
f"Value for MES not valid. Received {mes_value}, "
f"expected one from {VALID_MES_VALUES}"
),
)
return mes_value


class FlightData(BaseModel):
flights: list[Flight]


def flight_data_to_pandas(flight_data: FlightData) -> pd.DataFrame:
flight_data_dict = {"OPERA": [], "TIPOVUELO": [], "MES": []}
for elem in flight_data.flights:
flight_data_dict["OPERA"].append(elem.OPERA)
flight_data_dict["TIPOVUELO"].append(elem.TIPOVUELO)
flight_data_dict["MES"].append(elem.MES)

return pd.DataFrame(flight_data_dict)
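
Review note: the three field checks in this hunk share one rule, which is to normalise the incoming value and then test membership in a constant. The sketch below restates that rule in plain Python so it can be tried without FastAPI or pydantic. It is an illustration only, not the PR's code: the `FlightRecord` name is hypothetical, the operator list is shortened, and `ValueError` stands in for the `HTTPException` raised in the diff.

```python
from dataclasses import dataclass

# Shortened stand-ins for the constants in the diff (assumption: the real
# operator list is longer; only a few entries are reproduced here).
VALID_OPERA_VALUES = ["american airlines", "air canada", "air france", "lacsa"]
VALID_TIPO_VUELO_VALUES = ["I", "N"]
VALID_MES_VALUES = range(1, 13)


@dataclass
class FlightRecord:  # hypothetical name, not part of the PR
    OPERA: str
    TIPOVUELO: str
    MES: int

    def __post_init__(self) -> None:
        # Compare a normalised copy, but keep the caller's original value,
        # mirroring how the validators in the diff return their input.
        if self.OPERA.lower() not in VALID_OPERA_VALUES:
            raise ValueError(f"Value for OPERA not valid. Received {self.OPERA!r}")
        if self.TIPOVUELO.capitalize() not in VALID_TIPO_VUELO_VALUES:
            raise ValueError(
                f"Value for TIPOVUELO not valid. Received {self.TIPOVUELO!r}"
            )
        if self.MES not in VALID_MES_VALUES:
            raise ValueError(f"Value for MES not valid. Received {self.MES!r}")
```

Constructing `FlightRecord("Air Canada", "n", 12)` succeeds, while an unknown operator or a month outside 1 to 12 raises `ValueError` with a message naming the offending field.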


@app.get("/", status_code=200)
async def root() -> dict:
return {
"message": (
"welcome to the api for predicting flight delay. Use the /health "
"endpoint to get server status, and the /predict endpoint to get your "
"prediction from input data."
)
}


@app.get("/health", status_code=200)
async def get_health() -> dict:
return {"status": "OK"}


@app.post("/predict", status_code=200)
async def post_predict(flight_data: FlightData) -> dict:
# get data and convert to pandas dataframe

flight_data_df = flight_data_to_pandas(flight_data)
preprocessed_data = model.preprocess(flight_data_df)

column_order = model._model.feature_names_in_
preprocessed_data = preprocessed_data[column_order]

pred = model.predict(preprocessed_data)

return {"predict": pred}
try:
# get data and convert to pandas dataframe
flight_data_df = flight_data_to_pandas(flight_data)
preprocessed_data = model.preprocess(flight_data_df)

# sorts column to feed the model
pred = model.predict(preprocessed_data)

return {"predict": pred}
except Exception as e:
# there may be exceptions we don't want to send to the clients, so log them in
# an internal file for debugging. Just as a cheap solution.
with open("error_logs.txt", "a") as f:
f.write(f"{datetime.now(timezone.utc)}: encountered error {e}\n")
raise HTTPException(
status_code=500, detail="Internal server error during prediction."
)
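
Review note: the `except` branch above appends to `error_logs.txt` by hand, which the inline comment itself calls a cheap solution. One possible alternative is the standard `logging` module, which adds timestamps and tracebacks without manual formatting. The sketch below is an assumption-laden illustration rather than the PR's implementation: the logger name, `configure_error_log`, and `safe_predict` are hypothetical, and `RuntimeError` stands in for the `HTTPException(status_code=500)` used in the diff.

```python
import logging

# Hypothetical logger name; the PR does not define one.
logger = logging.getLogger("challenge.api")


def configure_error_log(path: str = "error_logs.txt") -> None:
    """Send ERROR-level records from this logger to a file, with timestamps."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.ERROR)


def safe_predict(predict, data):
    """Call predict(data); log the full traceback and hide details on failure."""
    try:
        return predict(data)
    except Exception:
        # logger.exception records the message plus the active traceback.
        logger.exception("error during prediction")
        # Stand-in for the HTTPException(500) raised in the diff.
        raise RuntimeError("Internal server error during prediction.") from None
```

Calling `configure_error_log()` once at startup and wrapping the model call in `safe_predict` would keep internal error details out of the response while still recording them, which is the intent stated in the diff's comment.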