update readme and colab link

Summary: as title

Reviewed By: YLouWashU

Differential Revision: D63205323

Privacy Context Container: L1257120

fbshipit-source-id: ffcf2464f7595b3573d49885fcfad5a6a580a4f0
echo-xiao9 authored and facebook-github-bot committed Sep 21, 2024
1 parent af6bb5f commit 3b83ab2
Showing 1 changed file with 114 additions and 24 deletions: readme.md

# Aria Training and Evaluation toolkit (ATEK)

Today, we’re introducing ATEK, an end-to-end framework for training and evaluating deep learning models on [Aria](https://www.projectaria.com/) data, for both egocentric-specific 3D tasks and general machine perception tasks.

For full documentation, please navigate to:

- [Installation](docs/INSTALL.md)
- [Quick start](#quick-start)
- [Machine learning tasks supported by ATEK](docs/ml_tasks.md)
- [ATEK Data Store](docs/atek_data_store.md)
- [Core code snippets](docs/core_code_snippets.md)
- [Technical specifications](docs/technical_specifications.md)

<a id="quick-start"></a>
## Colab notebook
Data preprocessing, inference, and evaluation example:

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/facebookresearch/ATEK/tree/main/examples/ATEK_CoLab_Notebook.ipynb)

## Quick start (TODO: add pypi installation)

1. Install miniconda3

Follow the [instructions](https://docs.anaconda.com/free/miniconda/) to install miniconda3, then open a new terminal for the installation to take effect.

2. Install mamba and initialize.

Mamba is a drop-in replacement for Conda that resolves package dependencies much faster.

```bash
export PATH=/home/$USER/miniconda3/bin:$PATH
conda init bash
conda install mamba -n base -c conda-forge
mamba init
```

3. Clone the ATEK code:

```bash
git clone https://github.com/facebookresearch/ATEK.git
cd ATEK
```

4. Create the ATEK environment and install the ATEK library:

```bash
mamba env create -f env_atek.yml
mamba activate atek
python3 -m pip install -e ./
```

Verify the library is installed correctly:

```bash
mamba list atek
```
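
You can also check that the package imports from your checkout. A minimal smoke test, assuming the editable install exposes the package as `atek`:

```python
# Smoke test: confirm the editable install resolves to your ATEK checkout.
import atek

print("ATEK imported from:", atek.__file__)
```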

The [complete installation guide](docs/INSTALL.md) includes:

1. Python installation from source on a local Fedora or MacBook machine.
2. Python installation from source on AWS.
3. Two installation flavors, `[default]` and `[demo]`, where the latter requires additional dependencies such as SAM2 and Omni3D.

## Download example data

To download data, use the `tools/dataverse_url_parser.py` script with the `--download-wds-to-local` flag. You can choose which preprocessing configuration to download, the train/validation split, and the number of sequences. Here we use the `cubercnn` configuration as an example. Please download the input JSON [here, TODO: get link](link) and pass its path as `atek_json_path`, and replace `output_data_dir` with the directory where you want the data.

```bash
python3 tools/dataverse_url_parser.py \
--config-name cubercnn \
--input-json-path ${atek_json_path} \
--output-folder-path ${output_data_dir}/ \
--max-num-sequences 2 \
--download-wds-to-local
```
To stream directly from the Data Store instead, run the same script without the `--download-wds-to-local` flag. This creates three YAML files, `streamable_all/train/validation_tars.yaml`, each containing the URLs of the WDS shard files; these files can be consumed directly by the ATEK library:

```bash
python3 tools/dataverse_url_parser.py \
--config-name cubercnn \
--input-json-path ${atek_json_path} \
--output-folder-path ${output_data_dir}/ \
--max-num-sequences 2
```
You can then visualize the WDS content using the streamable YAML files:
```python
import os

# Import paths below follow ATEK's examples; adjust if they differ in your version.
from atek.util.file_io_utils import load_yaml_and_extract_tar_list
from atek.data_loaders.atek_wds_dataloader import create_native_atek_dataloader
from atek.viz.atek_visualizer import NativeAtekSampleVisualizer

# `data_dir` and `viz_conf` are assumed to be defined beforehand,
# e.g. viz_conf loaded from a visualization config file.

# Loading local WDS files
tar_file_urls = load_yaml_and_extract_tar_list(
    yaml_path=os.path.join(data_dir, "streamable_yamls", "streamable_validation_tars.yaml")
)

# Batch size is None so that no collation is invoked
atek_dataloader = create_native_atek_dataloader(urls=tar_file_urls, batch_size=None, repeat_flag=False)

# Loop over all samples in the DataLoader and visualize
atek_visualizer = NativeAtekSampleVisualizer(viz_prefix="dataloading_visualizer", conf=viz_conf)
for atek_sample_dict in atek_dataloader:
    # Each sample is already an ATEK sample dict; plot it directly
    atek_visualizer.plot_atek_sample_as_dict(atek_sample_dict)
```
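
For training, the same loader factory can also collate samples into batches. A minimal sketch, assuming that passing a positive `batch_size` invokes ATEK's default collation (the `streamable_train_tars.yaml` path mirrors the validation one above):

```python
# Sketch: batched loading for training. Assumes default collation is invoked
# when batch_size is a positive integer (batch_size=None disables collation).
train_urls = load_yaml_and_extract_tar_list(
    yaml_path=os.path.join(data_dir, "streamable_yamls", "streamable_train_tars.yaml")
)
train_loader = create_native_atek_dataloader(urls=train_urls, batch_size=8, repeat_flag=True)

for batch in train_loader:
    ...  # forward/backward pass goes here
```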


## Benchmarking, inference, and training examples
The benchmarking script takes two CSV files as input, ground truth and predictions, both in the same predefined ATEK format (see the usage example below). It computes and reports a number of detection metrics. See the example training and evaluation [jupyter notebook](/examples/demo_3_training_and_eval.ipynb) for more details.

Currently, benchmarking supports two ML tasks:
1. Static 3D object detection.
2. 3D surface reconstruction.


Usage:
```bash
python tools/benchmarking_static_object_detection.py \
--pred-csv {workdir}/eval_results/prediction_obbs.csv \
--gt-csv {workdir}/eval_results/gt_obbs.csv \
--output-file {workdir}/eval_results/atek_metrics.json
```
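
The resulting metrics file is plain JSON and can be inspected with a few lines of Python (a minimal sketch; the exact metric names depend on the task):

```python
import json

# Load the metrics written by the benchmarking script and print them.
with open("eval_results/atek_metrics.json") as f:
    metrics = json.load(f)

for name, value in metrics.items():
    print(f"{name}: {value}")
```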

## Data-preprocessing, and visualization examples
Handling raw sensor data from Project Aria can be challenging due to the need for detailed knowledge of various Aria specifications, such as camera calibration, sensor behavior, and data synchronization.

ATEK simplifies this process by offering robust processing functionalities for all types of Aria data. This approach replaces complex data processing pipelines with just a few API calls, using simple configuration JSON files, making it more accessible and efficient for developers to get started.
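
As an illustration, a preprocessing run then boils down to a handful of calls along these lines. This is a sketch only: the import path, the factory name `create_general_atek_preprocessor_from_conf`, the config file, and the folder paths are assumptions based on the example notebook and may differ in your version.

```python
from omegaconf import OmegaConf

# Assumed import path and factory name; see the preprocessing notebook for the exact API.
from atek.data_preprocess.general_atek_preprocessor import (
    create_general_atek_preprocessor_from_conf,
)

# Load a preprocessing configuration and run it over one raw Aria sequence,
# writing the results out as WDS shards. All paths below are hypothetical.
conf = OmegaConf.load("path/to/preprocess_config.yaml")  # hypothetical config path
preprocessor = create_general_atek_preprocessor_from_conf(
    conf=conf,
    raw_data_folder="path/to/aria_sequence",  # hypothetical input folder
    output_wds_folder="path/to/output_wds",  # hypothetical output folder
)
preprocessor.process_all_samples(write_to_wds_flag=True)
```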

Refer to the data preprocessing [jupyter notebook](examples/demo_1_data_preprocessing.ipynb) for details.

## Machine learning tasks supported by ATEK

ATEK currently supports static 3D object detection and 3D surface reconstruction tasks. Find out more in the [Machine learning tasks supported by ATEK](docs/ml_tasks.md) section.


## License
ATEK is Apache 2.0 licensed, as found in the LICENSE file.
<img alt="license" src="https://img.shields.io/badge/License-Apache--2.0-blue.svg"/>


## Contributors
