Commit 6722a77

CI add Dockerfile and CI
1 parent 2cb96ea commit 6722a77

4 files changed: 158 additions & 0 deletions


.github/workflows/test.yml

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+# Test that the docker image builds correctly
+# and that running the ingestion and scoring programs works
+# Also use cache for the data and the docker image.
+name: Test Docker Image
+
+on:
+  push:
+    branches: [ main ]
+  pull_request:
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    env:
+      DOCKER_IMAGE_NAME: docker-codabench-test
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v2
+
+      - name: Prepare buildx cache dir
+        run: mkdir -p /tmp/.buildx-cache
+
+      - name: Cache Docker buildx layers
+        uses: actions/cache@v3
+        with:
+          path: /tmp/.buildx-cache
+          key: ${{ runner.os }}-buildx-${{ hashFiles('**/Dockerfile') }}-${{ github.ref }}
+          restore-keys: |
+            ${{ runner.os }}-buildx-${{ hashFiles('**/Dockerfile') }}-
+
+      - name: Build Docker image
+        run: |
+          docker build --progress=plain \
+            --cache-from=type=local,src=/tmp/.buildx-cache \
+            --cache-to=type=local,dest=/tmp/.buildx-cache,mode=max \
+            -t ${{ env.DOCKER_IMAGE_NAME }} .
+
+      - name: Prepare input/output directories (per README)
+        run: |
+          set -e
+          python setup_data.py
+
+      - name: Test ingestion program
+        run: |
+          # No -it here: CI runners do not attach a TTY to the step
+          docker run --rm -u root \
+            -v "./ingestion_program":"/app/ingestion_program" \
+            -v "./dev_phase/input_data":/app/input_data \
+            -v "./ingestion_res":/app/output \
+            -v "./solution":/app/ingested_program \
+            --name ingestion ${{ env.DOCKER_IMAGE_NAME }} python /app/ingestion_program/ingestion.py
+
+      - name: Test scoring program
+        run: |
+          # Block scalar so that the backslash line continuations reach the shell intact
+          docker run --rm -u root \
+            -v "./scoring_program":"/app/scoring_program" \
+            -v "./dev_phase/reference_data":/app/input/ref \
+            -v "./ingestion_res":/app/input/res \
+            -v "./scoring_res":/app/output \
+            --name scoring ${{ env.DOCKER_IMAGE_NAME }} python /app/scoring_program/scoring.py
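
Note on the two `docker run` test steps: `-it` asks Docker for an interactive terminal, which is convenient locally but is typically unavailable on a CI runner. The following is a minimal Python sketch, not part of this commit, of a helper that assembles the same command and adds `-it` only when a terminal is attached. The function name `docker_run_cmd` and its parameters are hypothetical.

```python
import shlex
import sys


def docker_run_cmd(image, command, mounts, name=None, interactive=None):
    """Build a `docker run` argument list (hypothetical helper).

    mounts is a list of (host_path, container_path) pairs.
    interactive=None means "decide from whether stdin is a terminal".
    """
    if interactive is None:
        interactive = sys.stdin.isatty()

    cmd = ["docker", "run", "--rm"]
    if interactive:
        # Only request a TTY when one is actually available.
        cmd.append("-it")
    cmd += ["-u", "root"]
    for host_path, container_path in mounts:
        cmd += ["-v", f"{host_path}:{container_path}"]
    if name:
        cmd += ["--name", name]
    cmd.append(image)
    cmd += list(command)
    return cmd


if __name__ == "__main__":
    # Mirrors the ingestion step of the workflow; the image name is the one set in `env`.
    cmd = docker_run_cmd(
        "docker-codabench-test",
        ["python", "/app/ingestion_program/ingestion.py"],
        [
            ("./ingestion_program", "/app/ingestion_program"),
            ("./dev_phase/input_data", "/app/input_data"),
            ("./ingestion_res", "/app/output"),
            ("./solution", "/app/ingested_program"),
        ],
        name="ingestion",
    )
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, so the same script could serve both local testing and CI.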

Dockerfile

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Step 1: Start from an official Docker image with desired base environment
+# Good starting points are the official codalab images or
+# pytorch images with CUDA support:
+# - Codalab: codalab/codalab-legacy:py39
+# - Codalab GPU: codalab/codalab-legacy:gpu310
+# - Pytorch: pytorch/pytorch:2.8.0-cuda12.6-cudnn9-runtime
+FROM codalab/codalab-legacy:py39
+
+# Set environment variables to prevent interactive prompts
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Step 2: Install system-level dependencies (if any)
+# e.g., git, wget, or common libraries for OpenCV like libgl1
+RUN pip install -U pip
+
+# Step 3: Copy and pre-install all Python dependencies
+# This 'requirements.txt' file should list pandas, scikit-learn, timm, etc.
+# Place it in the same directory as this Dockerfile.
+COPY requirements.txt /tmp/requirements.txt
+RUN pip install --no-cache-dir -r /tmp/requirements.txt
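
One way to confirm that the built image really contains the packages listed in `requirements.txt` is to run a small check inside the container. The sketch below is an illustration under assumptions and is not part of this commit: the helper names are hypothetical, and the name parsing is deliberately rough (it ignores extras, environment markers, and URL requirements).

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract bare distribution names from requirements-style lines (rough parse)."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options such as -r/-e
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # /tmp/requirements.txt is where the Dockerfile copies the file.
    try:
        with open("/tmp/requirements.txt") as fh:
            wanted = requirement_names(fh)
    except FileNotFoundError:
        wanted = []
    print("missing:", missing_packages(wanted))
```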

README.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+# Template to create a Codabench bundle for an ML competition in Python
+
+The code in this repository is a template to create a Codabench bundle for a machine learning competition in Python.
+It sets up a dummy classification task, evaluated with the accuracy metric on a public and a private test set.
+
+## Structure of the bundle
+
+- `ingestion_program/`: contains the ingestion program that will be run on participants' submissions. It is responsible for loading the code from the submission, passing it the training data to train the model, and generating predictions on the test datasets.
+- `scoring_program/`: contains the scoring program that will be run to evaluate the predictions generated by the ingestion program. It loads the predictions and the ground truth labels, computes the evaluation metric (accuracy in this case), and outputs the score.
+- `solution/`: contains a sample solution submission that participants can use as a reference. Here, this is a simple Random Forest classifier.
+- `*_phase/`: contains the data for a given phase, including input data and reference labels. Running `setup_data.py` will generate dummy data for a development phase.
+- `competition.yaml`: configuration file for the Codabench competition, specifying phases, tasks, and evaluation metrics.
+- `pages/`: contains markdown files that will be rendered as web pages in the Codabench competition.
+
+## Extra scripts in this repository
+
+- `setup_data.py`: script to generate dummy data for the competition. This should be changed to load and preprocess real data for a given competition.
+- `create_bundle.py`: script to create the Codabench bundle archive from the repository structure.
+- `Dockerfile`: Dockerfile to build the Docker image that will be used to run the ingestion and scoring programs.
+
+## Instructions to create the Codabench bundle
+
+Make sure that the `setup_data.py` script has been run to generate the data for the competition.
+
+Then, run the `create_bundle.py` script to create the Codabench bundle archive:
+
+```bash
+python create_bundle.py
+```
+You can then upload the generated `bundle.zip` file to Codabench to create the competition on this [page](https://www.codabench.org/competitions/upload/).
+
+
+## Instructions to test the bundle locally
+
+
+To test the ingestion program, run:
+
+```bash
+python ingestion_program/ingestion.py --data-dir dev_phase/input_data/ --output-dir ingestion_res/ --submission-dir solution/
+```
+
+To test the scoring program, run:
+```bash
+python scoring_program/scoring.py --reference-dir dev_phase/reference_data/ --output-dir scoring_res --prediction-dir ingestion_res/
+```
+
+
+### Setting up and testing the Docker image
+
+You can build the Docker image locally from the `Dockerfile` with:
+
+```bash
+docker build -t docker-image .
+```
+
+To test the Docker image locally, run:
+
+```bash
+docker run --rm -it -u root \
+  -v "./ingestion_program":"/app/ingestion_program" \
+  -v "./dev_phase/input_data":/app/input_data \
+  -v "./ingestion_res":/app/output \
+  -v "./solution":/app/ingested_program \
+  --name ingestion docker-image \
+  python /app/ingestion_program/ingestion.py
+
+docker run --rm -it -u root \
+  -v "./scoring_program":"/app/scoring_program" \
+  -v "./dev_phase/reference_data":/app/input/ref \
+  -v "./ingestion_res":/app/input/res \
+  -v "./scoring_res":/app/output \
+  --name scoring docker-image \
+  python /app/scoring_program/scoring.py
+```
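
The README describes the scoring program as loading predictions and ground-truth labels, computing accuracy, and writing out the score. The sketch below illustrates that flow in plain Python. It is not the `scoring.py` from this repository: the function names, the `scores.json` file name, and the `accuracy` key are assumptions made for illustration.

```python
import json
from pathlib import Path


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty set of labels")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def write_scores(output_dir, scores):
    """Write the scores as JSON into output_dir (file name is an assumption)."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / "scores.json"
    path.write_text(json.dumps(scores))
    return path


if __name__ == "__main__":
    # Toy labels standing in for the reference data and the ingestion output.
    reference = [0, 1, 1, 0]
    predictions = [0, 1, 0, 0]
    print(write_scores("scoring_res", {"accuracy": accuracy(reference, predictions)}))
```

Keeping the metric in a pure function makes it easy to unit-test separately from the file handling around it.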

logo.png

3.71 KB