
Commit

Adding video, some final fixes.
Summary: Adding video, and made minor fixes to docs.  allow-large-files

Reviewed By: echo-xiao9

Differential Revision: D63348509

fbshipit-source-id: 7e066911296a4b2518ef314d035433218c433fd2
YLouWashU authored and facebook-github-bot committed Sep 24, 2024
1 parent 70592af commit 7717cf2
Showing 10 changed files with 796 additions and 747 deletions.
9 changes: 5 additions & 4 deletions docs/ATEK_Data_Store.md
@@ -1,6 +1,6 @@
# ATEK Data Store

-ATEK Data Store is a data platform where preprocessed open Aria datasets in WebDataset (WDS) formats, with selected preprocessing configurations, are available for users to directly download and load into PyTorch.
+ATEK Data Store is a data platform where preprocessed open Aria datasets in [WebDataset](https://github.com/webdataset/webdataset) (WDS) format, with selected preprocessing configurations, are available for users to directly download and load into PyTorch.

## ATEK datasets in WDS format

@@ -26,7 +26,7 @@ To access the data:

1. Click the **access link** in the above table; you will find the **Access The Dataset** button at the bottom of the page. Input your email address, and you will be redirected to a page with a button to download **[dataset] in ATEK format (PyTorch ready)**.

-![Download button](./images/atek_data_store_download_button.png)
+<img src="./images/atek_data_store_download_button.png" width="600">

2. This will download a json file, e.g. `[dataset_name]_ATEK_download_urls.json`, that contains the URLs of the actual preprocessed data. Note that for the same dataset, all preprocessing configurations' URLs are contained in the same json file.

@@ -42,14 +42,15 @@ To access the data:

where:

-- `--config-name` specifies which [preprocessing configuration](./preprocessing_configurations.md) you would like to download.
+- `--config-name` specifies which [preprocessing configuration](./preprocessing_configurations.md) you would like to download. You should choose one from [this table](#atek-datasets-in-wds-format).
- `--download-wds-to-local`: users can remove this flag to create **streamable** yaml files instead of downloading the WDS files locally.

Users can also specify other options, including the maximum number of sequences to download, the training-validation split ratio, etc. See the [src code](../tools/atek_wds_data_downloader.py) for details.

4. **Note that these URLs will EXPIRE AFTER 30 DAYS**; users will need to re-download the json file and re-generate the streamable yaml files.
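The exact schema of the downloaded URL file is not documented here, so the sketch below assumes a simple hypothetical layout (a `configs` mapping keyed by preprocessing configuration name) purely to illustrate inspecting which configurations a `[dataset_name]_ATEK_download_urls.json` covers:

```python
import json

# Hypothetical contents of a [dataset_name]_ATEK_download_urls.json file --
# the real schema may differ; this is illustration only.
raw = """
{
  "dataset_name": "ASE",
  "configs": {
    "cubercnn": {"train": ["https://example.com/ase/cubercnn/shard-000.tar"]},
    "efm": {"train": ["https://example.com/ase/efm/shard-000.tar"]}
  }
}
"""

urls = json.loads(raw)
# All preprocessing configurations' URLs live in the same json file,
# so listing the top-level config keys shows what is available.
config_names = sorted(urls["configs"])
print(config_names)  # ['cubercnn', 'efm']
```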

-These steps will download ATEK preprocessed WebDataset files with the following folder structure. Note that if the download breaks in the middle, simply run it again to pick up from the middle.
+## Downloaded WDS files
+Following the above steps will download ATEK preprocessed WebDataset files with the following folder structure. Note that if the download breaks in the middle, simply run it again to resume.

```bash
./downloaded_local_wds
2 changes: 1 addition & 1 deletion docs/Install.md
@@ -2,7 +2,7 @@

We provide 2 ways to install ATEK:

-1. If you just need **the core functionalities of ATEK**, including data pre-processing, data loading, and visualization, you can simply [install ATEK's core lib](#core-lib-installation)
+1. If you just need the core functionalities of ATEK, including data pre-processing, data loading, and visualization, you can simply [install **ATEK core lib**](#core-lib-installation)
2. If you want to run the CubeRCNN demos and all task-specific evaluation benchmarking, you can follow this guide to [install **full dependencies**](#install-all-dependencies-using-mambaconda).

## Core lib installation
34 changes: 24 additions & 10 deletions docs/example_cubercnn_customization.md
@@ -139,22 +139,18 @@ print(f"Loading WDS into CubeRCNN format, each sample contains the following key

## CubeRCNN model training / inference

-With the created Pytorch DataLoader, user will be able to easily run model training / inference for CubeRCNN model:
+With the created PyTorch DataLoader, users can easily run model training or inference for the CubeRCNN model.

+**Training script**
```python
-# Load pre-trained model for training / inference
-model_config, model = create_inference_model(
-    model_config_file, model_ckpt_path, use_cpu_only = use_cpu_only
-)
+# Load pre-trained model for training
+model_config, model = create_training_model(model_config_file, model_ckpt_path)

-# training / inference loop
+# Training loop
for cubercnn_input_data in tqdm(
    cubercnn_dataloader,
-    desc="Training / Inference progress: ",
+    desc="Training progress: ",
):
-    # Inference step
-    cubercnn_model_output = model(cubercnn_input_data)

    # Training step
    loss = model(cubercnn_input_data)
    losses = sum(loss.values())
@@ -163,3 +159,21 @@
    optimizer.step()
    ...
```
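The training loop above sums a dict of named losses into one scalar before backpropagation. As a dependency-free sketch of that accumulation pattern (`DummyModel` and `DummyOptimizer` are stand-ins invented here for illustration, not ATEK or CubeRCNN APIs):

```python
# Stand-ins invented for illustration; a real run uses the CubeRCNN model
# and a torch optimizer, and calls losses.backward() on real tensors.
class DummyModel:
    def __call__(self, batch):
        # In training mode the model returns a dict of named losses.
        return {"loss_cls": 0.5, "loss_box": 0.5}

class DummyOptimizer:
    def zero_grad(self):
        pass

    def step(self):
        pass

model = DummyModel()
optimizer = DummyOptimizer()
dataloader = [{"image": "frame_0"}, {"image": "frame_1"}]  # stand-in batches

for batch in dataloader:
    loss = model(batch)
    losses = sum(loss.values())  # collapse named losses into one scalar
    optimizer.zero_grad()
    # losses.backward() would go here with real tensors
    optimizer.step()

print(losses)  # 1.0
```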


+**Inference script**
+
+```python
+# Load pre-trained model for inference
+model_config, model = create_inference_model(model_config_file, model_ckpt_path)
+
+# Inference loop
+for cubercnn_input_data in tqdm(
+    cubercnn_dataloader,
+    desc="Inference progress: ",
+):
+    # Inference step
+    cubercnn_model_output = model(cubercnn_input_data)
+    ...
+```
2 changes: 1 addition & 1 deletion docs/example_training.md
@@ -88,4 +88,4 @@ Here is the tensorboard results for ASE trained results, where the left 2 figure

Once model training is finished, users can proceed to [example_inference.md](./example_inference.md) to run model inference on the trained weights.

-We also provided 2 sets of CubeRCNN trained weights by us, one on ASE 10K dataset, the other on ADT dataset. The weights can be downloaded [here](https://www.projectaria.com/async/sample/download/?bucket=adt&filename=ATEK_example_model_weights.tar)
+We also provide 2 sets of CubeRCNN weights trained by us: one on the ASE 10K dataset, the other on the ADT dataset. The weights can be downloaded [here](https://www.projectaria.com/async/sample/download/?bucket=atek&filename=ATEK_example_model_weights.tar). By downloading this file, you acknowledge that you have read, understood, and agree to be bound by the terms of the [CC-BY-NC 4.0 license](https://creativecommons.org/licenses/by-nc/4.0/deed.en) software license.
Binary file added docs/images/atek_github_video_small.webm
8 changes: 4 additions & 4 deletions docs/preprocessing.md
@@ -17,9 +17,9 @@ Before ATEK, users will need to hand-craft all these code on their own, which is

## Simple customization through preprocessing config

-ATEK allows users to **customize the preprocessing workflow by simply modifying the preprocessing configuration yaml file** (see [preprocessing_configurations.md](./preprocessing_configurations.md) for details).
+ATEK allows users to **customize the preprocessing workflow by simply modifying the preprocessing configuration yaml file** (see the [Preprocessing configurations page](./preprocessing_configurations.md) for details).
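As an illustration, a minimal config of that shape might look as follows. Only `atek_config_name` is a field confirmed by these docs; treat the rest of the snippet as an invented placeholder, not the actual schema:

```yaml
# Illustrative sketch only -- see preprocessing_configurations.md for the
# actual supported configurations and schema.
atek_config_name: cubercnn   # selects which preprocessor to build
```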

-The following is the core code to load an open Aria data sequence, preprocess according to a given configuration file, and write the preprocessed results to disk as WebDataset ([full example](../examples/Demo_1_data_preprocessing.ipynb)). We also use a visualization library based on `ReRun` to visualize the preprocessed results. The results are stored as `Dict` in memory containing tensors, strings, and sub-dicts, and also saved to local disk in WebDataset (WDS) format for further use.
+The following is the core code to load an open Aria data sequence, preprocess according to a given configuration file, and write the preprocessed results to disk as WebDataset. We also use a visualization library based on `ReRun` to visualize the preprocessed results. The results are stored as `Dict` in memory containing tensors, strings, and sub-dicts, and also saved to local disk in WebDataset (WDS) format for further use.

```python
from omegaconf import OmegaConf
@@ -35,11 +35,11 @@ num_samples = preprocessor.process_all_samples(write_to_wds_flag = True, viz_fla

### `create_general_atek_preprocessor_from_conf`

-This is a factory method that initializes a `GeneralAtekPreprocessor` based on a configuration object. It selects the appropriate preprocessor configuration for ATEK using the `atek_config_name` field in the provided Omega configuration. See [here](./preprocessing_configurations.md) for currently supported configs.
+This is a factory method that initializes a `GeneralAtekPreprocessor` based on a configuration object. It selects the appropriate preprocessor configuration for ATEK using the `atek_config_name` field in the provided Omega configuration.

#### Parameters

-- **conf** (`DictConfig`): Configuration object with preprocessing settings. The `atek_config_name` key specifies the preprocessor type,
+- **conf** (`DictConfig`): Configuration object with preprocessing settings.
- **raw_data_folder** (`str`): Path to the folder with raw data files.
- **sequence_name** (`str`): Name of the data sequence to process.
- **output_wds_folder** (`Optional[str]`): Path for saving preprocessed data in WebDataset (WDS) format. If `None`, data is not saved in WDS format.

