
Commit 00930c2

refactor(python): Use Poetry for package management
In CI, since the packages are now installed in a virtualenv rather than globally, we have to activate the virtualenv at the start of jobs.

The cattrs package has to be kept back to before version 1.1.0 until we either upgrade to Python 3.7 or AWS CDK moves to a dependency which doesn't transitively require Python 3.7. Issue linked on the relevant pyproject.toml line.

This merges the production dependencies of the backend and infra subdirectories, which is not ideal. Once subproject support <python-poetry/poetry#2270> arrives we should pull this apart, maybe keeping only the development and test dependencies in the root. To mitigate this in the meantime, bundle.bash pulls out only those dependencies which the Lambda function needs.
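For reference, a minimal sketch of what "using the virtualenv" looks like in a CI step. The install and `poetry run` lines mirror the workflow changes in this commit; the explicit `source` activation is only an alternative (an assumption, not what the workflow below does) to prefixing every command:

```bash
# Install Poetry, then install the locked project dependencies into a virtualenv.
python -m pip install --upgrade pip
python -m pip install poetry
python -m poetry install            # the deploy job uses `poetry install --no-dev`

# Either run each tool through the virtualenv ...
poetry run pytest tests/

# ... or (alternative sketch) activate the virtualenv once for the rest of the step.
source "$(poetry env info --path)/bin/activate"
pytest tests/
```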
1 parent 75be132 commit 00930c2

File tree

11 files changed: +2141 −82 lines changed

.github/workflows/ci.yml

+29-27
@@ -22,32 +22,31 @@ jobs:
       - name: Install Python dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install -r requirements-dev.txt
-          pip install -r infra/requirements.txt
-          pip install -r backend/endpoints/datasets/requirements.txt
+          python -m pip install poetry
+          python -m poetry install
 
       - name: Check last commit message
         if: github.event_name == 'push'
         run: |
-          gitlint
+          poetry run gitlint
 
       - name: Check all commit messages in Pull Request
         if: github.event_name == 'pull_request'
         run: >
-          gitlint --commits
+          poetry run gitlint --commits
           origin/${{ github.base_ref }}..${{ github.event.pull_request.head.sha }}
 
       - name: Check Python code formatting
         run: |
-          black . --check --diff
+          poetry run black . --check --diff
 
       - name: Check Python code quality
         run: |
-          pylint backend/ infra/
+          poetry run pylint backend/ infra/
 
       - name: Check Python code import statements
         run: |
-          isort . --check --diff
+          poetry run isort . --check --diff
 
 
   test:
@@ -69,12 +68,12 @@ jobs:
       - name: Install Python dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install -r requirements-dev.txt
-          pip install -r infra/requirements.txt
+          python -m pip install poetry
+          python -m poetry install
 
       - name: Run unit tests
         run: |
-          pytest tests/
+          poetry run pytest tests/
 
 
   test-infra:
@@ -96,9 +95,8 @@ jobs:
       - name: Install Python dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install -r requirements-dev.txt
-          pip install -r infra/requirements.txt
-          pip install -r backend/endpoints/datasets/requirements.txt
+          python -m pip install poetry
+          python -m poetry install
 
       - name: Use Node.js 12.x for CDK deployment
         uses: actions/[email protected]
@@ -110,7 +108,7 @@ jobs:
         run: npm install -g aws-cdk
 
       - name: Print CDK version
-        run: cdk --version
+        run: poetry run cdk --version
 
       - name: Configure AWS credentials
         uses: aws-actions/configure-aws-credentials@v1
@@ -128,21 +126,21 @@ jobs:
 
       - name: Deploy AWS stack for testing
         run: |
-          cdk bootstrap aws://unknown-account/ap-southeast-2
-          cdk deploy --require-approval never geospatial-data-lake
+          poetry run cdk bootstrap aws://unknown-account/ap-southeast-2
+          poetry run cdk deploy --require-approval never geospatial-data-lake
         working-directory: infra
 
       - name: Run AWS infra tests
         run: |
-          pytest infra/tests/
+          poetry run pytest infra/tests/
 
       - name: Run AWS backend tests
         run: |
-          pytest backend/tests/
+          poetry run pytest backend/tests/
 
       - name: Destroy AWS stack used for testing
         run: |
-          cdk destroy --force geospatial-data-lake
+          poetry run cdk destroy --force geospatial-data-lake
         working-directory: infra
 
 
@@ -167,8 +165,8 @@ jobs:
       - name: Install Python dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install -r requirements-dev.txt
-          pip install -r infra/requirements.txt
+          python -m pip install poetry
+          python -m poetry install --no-dev
 
       - name: Use Node.js 12.x for CDK deployment
         uses: actions/[email protected]
@@ -180,7 +178,7 @@ jobs:
         run: npm install -g aws-cdk
 
       - name: Print CDK version
-        run: cdk --version
+        run: poetry run cdk --version
 
       # NONPROD DEPLOYMENT
       - name: (NonProd) Configure AWS credentials
@@ -199,9 +197,11 @@ jobs:
         if: >
           github.ref == 'refs/heads/master'
           && github.repository == 'linz/geospatial-data-lake'
+        env:
+          DEPLOY_ENV: nonprod
         run: |
-          cdk bootstrap aws://unknown-account/ap-southeast-2
-          DEPLOY_ENV=nonprod cdk deploy --require-approval never geospatial-data-lake
+          poetry run cdk bootstrap aws://unknown-account/ap-southeast-2
+          poetry run cdk deploy --require-approval never geospatial-data-lake
         working-directory: infra
 
       # PROD DEPLOYMENT
@@ -221,7 +221,9 @@ jobs:
         if: >
           startsWith(github.ref, 'release')
           && github.repository == 'linz/geospatial-data-lake'
+        env:
+          DEPLOY_ENV: prod
         run: |
-          cdk bootstrap aws://unknown-account/ap-southeast-2
-          DEPLOY_ENV=prod cdk deploy --require-approval never geospatial-data-lake
+          poetry run cdk bootstrap aws://unknown-account/ap-southeast-2
+          poetry run cdk deploy --require-approval never geospatial-data-lake
         working-directory: infra
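The same lint and test commands can be reproduced locally through Poetry; a sketch, assuming the dependencies have already been installed with `poetry install`:

```bash
# Run the CI checks from the repository root, inside the Poetry virtualenv.
poetry run gitlint
poetry run black . --check --diff
poetry run pylint backend/ infra/
poetry run isort . --check --diff
poetry run pytest tests/
```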

README.md

+8-13
@@ -14,12 +14,19 @@ $ python3 -m venv .venv
 $ source .venv/bin/activate
 ```
 
-* Upgrade pip and install the required dependencies
+* Upgrade pip
 
 ```bash
 $ pip install --upgrade pip
 ```
 
+* [Install Poetry](https://python-poetry.org/docs/#installation)
+
+* Install the dependencies:
+
+```bash
+$ poetry install
+```
 
 ## AWS CDK Environment (AWS Infrastructure)
 * Install NVM (use latest version)
@@ -50,12 +57,6 @@ $ npm install -g aws-cdk
 
 
 ## AWS Infrastructure Deployment (CDK Stack)
-* Install Python CDK dependencies
-
-```bash
-$ pip install -r infra/requirements.txt
-```
-
 * Get AWS credentials (see: https://www.npmjs.com/package/aws-azure-login)
 
 ```bash
@@ -72,12 +73,6 @@ $ cdk deploy --profile <geospatial-data-lake-nonprod|geospatial-data-lake-prod>
 
 
 ## Development
-* Install Python development dependencies
-
-```bash
-$ pip install -r requirements-dev.txt
-```
-
 * Install commit-msg git hook
 
 ```bash
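After `poetry install`, the new setup can be sanity-checked with standard Poetry commands; a quick sketch (not part of the README change):

```bash
poetry --version              # confirm Poetry itself is available
poetry show                   # list the locked dependencies that were installed
poetry run python --version   # confirm the interpreter inside the virtualenv
```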

backend/endpoints/datasets/bundle.sh

-8
This file was deleted.

bundle.bash

+14
@@ -0,0 +1,14 @@
+#!/usr/bin/env bash
+
+set -o errexit -o noclobber -o nounset
+
+work_dir="$(mktemp --directory)"
+all_requirements_file="${work_dir}/all-requirements.txt"
+backend_requirements_file="${work_dir}/backend-requirements.txt"
+
+# Get requirements file for entries in requirements.txt
+poetry export --output="$all_requirements_file" --without-hashes
+grep --file=backend/requirements.txt "$all_requirements_file" > "$backend_requirements_file"
+
+pip install --requirement="$backend_requirements_file" --target=/asset-output
+cp --archive --update --verbose backend/endpoints /asset-output/
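To illustrate the filtering step: `poetry export` writes the full locked production requirement set, and `grep --file` keeps only the lines matching the patterns listed in backend/requirements.txt, so the Lambda package gets just the backend's dependencies at their locked versions. A sketch with hypothetical package names (the real contents of backend/requirements.txt are not shown in this commit):

```bash
# Hypothetical backend/requirements.txt: one package name per line.
printf '%s\n' jsonschema pynamodb > backend/requirements.txt

# Export every locked production dependency, then keep only the backend's packages.
poetry export --output=all-requirements.txt --without-hashes
grep --file=backend/requirements.txt all-requirements.txt > backend-requirements.txt

cat backend-requirements.txt   # pinned lines for the listed packages only (illustrative)
```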

infra/data_stores/data_lake_stack.py

+15-18
@@ -50,25 +50,22 @@ def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
         Tags.of(db_datasets_table).add("ApplicationLayer", "application-db")
 
         # Lambda Handler Functions
-        lambda_path = "../backend/endpoints/datasets"
-        dataset_handler_function = aws_lambda.Function(
-            self,
-            "datasets-endpoint-function",
-            function_name="datasets-endpoint",
-            handler="endpoints.datasets.entrypoint.lambda_handler",
-            runtime=aws_lambda.Runtime.PYTHON_3_6,
-            code=aws_lambda.Code.from_asset(
-                path=os.path.dirname(lambda_path),
-                bundling=core.BundlingOptions(
-                    image=aws_lambda.Runtime.PYTHON_3_6.bundling_docker_image,  # pylint:disable=no-member
-                    command=[
-                        "bash",
-                        "-c",
-                        open(f"{lambda_path}/bundle.sh", "r").read(),
-                    ],
+        project_path = ".."
+        with open(os.path.join(project_path, "bundle.bash"), "r") as bundler:
+            dataset_handler_function = aws_lambda.Function(
+                self,
+                "datasets-endpoint-function",
+                function_name="datasets-endpoint",
+                handler="endpoints.datasets.entrypoint.lambda_handler",
+                runtime=aws_lambda.Runtime.PYTHON_3_6,
+                code=aws_lambda.Code.from_asset(
+                    path=project_path,
+                    bundling=core.BundlingOptions(
+                        image=aws_lambda.Runtime.PYTHON_3_6.bundling_docker_image,  # pylint:disable=no-member
+                        command=["bash", "-c", bundler.read()],
+                    ),
                 ),
-            ),
-        )
+            )
         db_datasets_table.add_global_secondary_index(
             index_name="datasets_title",
             partition_key=aws_dynamodb.Attribute(name="sk", type=aws_dynamodb.AttributeType.STRING),

infra/requirements.txt

-6
This file was deleted.
