Anatomy of a ramp kit

Balazs Kegl edited this page Oct 31, 2017 · 3 revisions

The base directory of any ramp-kit should look like this:

<ramp_kit_name>/
├── README.md
├── download_data.py (optional)
├── problem.py
├── requirements.txt
├── <ramp_kit_name>_starting_kit.ipynb
├── data
└── submissions
  • README.md: brief description of the ramp-kit subject with a link to the notebook.

  • download_data.py: Python script to download the data from our server.

    To retrieve the data, open a terminal and run

    $ python download_data.py
    
  • problem.py: main Python script.

    This file uses building blocks from ramp-workflow to parametrize the setup of the given problem. It may use more complex workflows or cross-validation schemes, but that complexity is usually hidden inside the ramp-workflow elements themselves. The goal is to keep problem.py as simple as possible.

  • requirements.txt: text file listing the Python libraries required to run the starting kit locally.

    To install them (preferably in a virtual environment), run in a terminal

    $ pip install -r requirements.txt
    
  • <ramp_kit_name>_starting_kit.ipynb: Jupyter notebook that describes the predictive problem.

    It also presents the data set, the workflow, and usually some exploratory analysis and data visualization.

  • data: directory storing the local train and test files used to run ramp_test_submission.

    When you clone a ramp-kit, the data may not be included in the repository. In that case, use the download script download_data.py to retrieve it.

  • submissions: directory containing the workflow elements the participants are expected to implement.

    Every ramp-kit comes with a starting_kit: a working implementation of the pipeline elements that serves as an example for the participants. These elements are used in the notebook to provide an end-to-end example of a fully working pipeline.

    └── submissions
        ├── starting_kit
        │   ├── <file1>.py
        │   └── <file2>.py
        └── <your_first_submission>
            ├── <file1>.py
            └── <file2>.py
    

    Each submission written by a participant must be a new directory within submissions. The name of this directory serves as the ID of the submission, and the directory must contain the same filenames as starting_kit.
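
The filename rule above can be checked locally before testing a submission. The following is a minimal sketch, not part of ramp-workflow: the helper name missing_submission_files and its arguments are hypothetical, and it assumes only the directory layout described on this page (a submissions directory containing starting_kit and one directory per submission).

```python
from pathlib import Path


def missing_submission_files(kit_dir, submission_name):
    """Return the filenames found in starting_kit but absent from a submission.

    Hypothetical helper for illustration; it is not a ramp-workflow function.
    """
    submissions = Path(kit_dir) / "submissions"
    reference = submissions / "starting_kit"
    candidate = submissions / submission_name

    if not reference.is_dir():
        raise FileNotFoundError(f"no starting_kit directory in {submissions}")
    if not candidate.is_dir():
        raise FileNotFoundError(f"no submission directory {candidate}")

    # Compare plain files only, by name; file contents are not inspected.
    expected = {p.name for p in reference.iterdir() if p.is_file()}
    found = {p.name for p in candidate.iterdir() if p.is_file()}
    return sorted(expected - found)
```

Calling missing_submission_files(".", "<your_first_submission>") from the base directory of the kit returns an empty list when the submission contains every filename present in starting_kit, and otherwise returns the names that are missing.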