Updated READMEs for examples: LLM Embedding-Based Named Entity Recognition, nuScenes, Objectron, Open Photogrammetry Format, Raw Mesh #5653

Merged · 11 commits · Apr 4, 2024
5 changes: 5 additions & 0 deletions docs/cspell.json
@@ -17,6 +17,7 @@
"andreasnaoum",
"Angjoo",
"Ankush",
"anns",
"arflow",
"arkit",
"arkitscene",
@@ -51,6 +52,7 @@
"booktitle",
"braindump",
"bringup",
"calib",
"callstack",
"callstacks",
"camino",
@@ -344,6 +346,7 @@
"Tete",
"Tewari",
"Texcoord",
"texcoords",
"thiserror",
"Tian",
"timepanel",
@@ -371,6 +374,8 @@
"upcasting",
"upsampling",
"upvote",
"UMAP",
"umap",
"urdf",
"URDF",
"ureq",
91 changes: 84 additions & 7 deletions examples/python/llm_embedding_ner/README.md
@@ -1,10 +1,10 @@
<!--[metadata]
title = "LLM Embedding-Based Named Entity Recognition"
tags = ["LLM", "embeddings", "classification", "huggingface", "text"]
description = "Visualize BERT-based named entity recognition (NER) with UMAP embeddings."
thumbnail = "https://static.rerun.io/llm-embedding/999737b3b78d762e70116bc23929ebfde78e18c6/480w.png"
thumbnail_dimensions = [480, 480]
-->

<picture>
<img src="https://static.rerun.io/llm_embedding_ner/d98c09dd6bfa20ceea3e431c37dc295a4009fa1b/full.png" alt="">
<source media="(max-width: 480px)" srcset="https://static.rerun.io/llm_embedding_ner/d98c09dd6bfa20ceea3e431c37dc295a4009fa1b/480w.png">
@@ -13,15 +13,92 @@ thumbnail_dimensions = [480, 480]
<source media="(max-width: 1200px)" srcset="https://static.rerun.io/llm_embedding_ner/d98c09dd6bfa20ceea3e431c37dc295a4009fa1b/1200w.png">
</picture>

Visualize [BERT-based named entity recognition (NER)](https://huggingface.co/dslim/bert-base-NER) with UMAP embeddings.

# Used Rerun Types
[`TextDocument`](https://www.rerun.io/docs/reference/types/archetypes/text_document), [`AnnotationContext`](https://www.rerun.io/docs/reference/types/archetypes/annotation_context), [`Points3D`](https://www.rerun.io/docs/reference/types/archetypes/points3d)

# Background
This example splits text into tokens and feeds the token sequence into a large language model (BERT), which outputs one embedding per token.
The embeddings are then classified into four types of entities: locations (LOC), organizations (ORG), persons (PER), and miscellaneous (MISC). The embeddings are projected to a 3D space using [UMAP](https://umap-learn.readthedocs.io/en/latest/) and visualized together with all other data in Rerun.
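
As a rough sketch of that pipeline, the per-token embeddings can be pulled from the model's hidden states and reduced to 3D with UMAP. The model loading details and variable names below are assumptions, not the example's exact code:

```python
# Minimal sketch (assumption): token embeddings from BERT, reduced to 3D with UMAP.
import umap
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModel.from_pretrained("dslim/bert-base-NER", output_hidden_states=True)

inputs = tokenizer("My name is Wolfgang and I live in Berlin.", return_tensors="pt")
outputs = model(**inputs)
token_embeddings = outputs.hidden_states[-1].squeeze(0).detach().numpy()  # (num_tokens, 768)

# Project the per-token embeddings down to three dimensions for visualization.
umap_embeddings = umap.UMAP(n_components=3, n_neighbors=4).fit_transform(token_embeddings)
```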

# Logging and Visualizing with Rerun
The visualizations in this example were created with the following Rerun code:

## Text
The logging begins with the original text. Following this, the tokenized version is logged for further analysis, and the named entities identified by the NER model are logged separately.
All texts are logged using [`TextDocument`](https://www.rerun.io/docs/reference/types/archetypes/text_document) as Markdown documents to preserve structure and formatting.
### Original Text
```python
rr.log("text", rr.TextDocument(text, media_type=rr.MediaType.MARKDOWN))
```

### Tokenized Text
```python
rr.log("tokenized_text", rr.TextDocument(markdown, media_type=rr.MediaType.MARKDOWN))
```
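
The `markdown` string above is assembled from the tokenizer output; a minimal sketch of one way to build it (the helper names and formatting here are assumptions, not the example's exact code):

```python
# Sketch (assumption): turn the BERT tokenizer output into a markdown string for logging.
token_words = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
markdown = "**Tokens:** " + " ".join(f"`{token}`" for token in token_words)
```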

### Named Entities
```python
rr.log("named_entities", rr.TextDocument(named_entities_str, media_type=rr.MediaType.MARKDOWN))
```

## UMAP Embeddings


UMAP is used in this example for dimensionality reduction and visualization of the embeddings generated by a Named Entity Recognition (NER) model.
UMAP preserves the essential structure and relationships between data points, helping to identify clusters or patterns within the named entities.

After reducing the embeddings with UMAP, the next step is to define labels for the classes using [`AnnotationContext`](https://www.rerun.io/docs/reference/types/archetypes/annotation_context).
These labels help in interpreting the visualized data.
Subsequently, the UMAP embeddings are logged as [`Points3D`](https://www.rerun.io/docs/reference/types/archetypes/points3d) and visualized in a three-dimensional space.
The visualization can provide insights into how the NER model is performing and how different types of entities are distributed throughout the text.

The following code defines the annotation context and logs the UMAP embeddings with per-point metadata:

```python
# Define labels for the classes and set the color of the "none" class to dark gray
annotation_context = [
    rr.AnnotationInfo(id=0, color=(30, 30, 30)),
    rr.AnnotationInfo(id=1, label="Location"),
    rr.AnnotationInfo(id=2, label="Person"),
    rr.AnnotationInfo(id=3, label="Organization"),
    rr.AnnotationInfo(id=4, label="Miscellaneous"),
]
rr.log("/", rr.AnnotationContext(annotation_context))
```

```python
rr.log(
    "umap_embeddings",
    rr.Points3D(umap_embeddings, class_ids=class_ids),
    rr.AnyValues(**{"Token": token_words, "Named Entity": entity_per_token(token_words, ner_results)}),
)
```


# Run the Code
To run this example, make sure you have the Rerun repository checked out and the latest SDK installed:
```bash
# Setup
pip install --upgrade rerun-sdk # install the latest Rerun SDK
git clone [email protected]:rerun-io/rerun.git # Clone the repository
cd rerun
git checkout latest # Check out the commit matching the latest SDK release
```
Install the necessary libraries specified in the requirements file:
```bash
pip install -r examples/python/llm_embedding_ner/requirements.txt
```

To experiment with the provided example, simply execute the main Python script:
```bash
python examples/python/llm_embedding_ner/main.py # run the example
```
You can specify your own text using:
```bash
python examples/python/llm_embedding_ner/main.py [--text TEXT]
```
If you wish to customize it, explore additional features, or save it, use the CLI with the `--help` option for guidance:
```bash
python examples/python/llm_embedding_ner/main.py --help
```
101 changes: 96 additions & 5 deletions examples/python/nuscenes/README.md
@@ -16,12 +16,103 @@ build_args = ["--seconds=5"]
<source media="(max-width: 1200px)" srcset="https://static.rerun.io/nuscenes/64a50a9d67cbb69ae872551989ee807b195f6b5d/1200w.png">
</picture>

Visualize the [nuScenes dataset](https://www.nuscenes.org/), which contains lidar data, radar data, color images, and labeled bounding boxes.

## Used Rerun Types
[`Transform3D`](https://www.rerun.io/docs/reference/types/archetypes/transform3d), [`Points3D`](https://www.rerun.io/docs/reference/types/archetypes/points3d), [`Boxes3D`](https://www.rerun.io/docs/reference/types/archetypes/boxes3d), [`Pinhole`](https://www.rerun.io/docs/reference/types/archetypes/pinhole), [`Image`](https://ref.rerun.io/docs/python/0.14.1/common/image_helpers/#rerun.ImageEncoded)<sup>*</sup>

# Logging and Visualizing with Rerun

The nuScenes dataset includes data from a full suite of sensors on autonomous vehicles: 6 cameras, 1 LIDAR, 5 RADAR, GPS, and IMU.

The visualizations in this example were created with the following Rerun code:

## Sensor Calibration

First, pinhole cameras and sensor poses are initialized to offer a 3D view and camera perspective. This is achieved using the [`Pinhole`](https://www.rerun.io/docs/reference/types/archetypes/pinhole) and [`Transform3D`](https://www.rerun.io/docs/reference/types/archetypes/transform3d) archetypes.

```python
rr.log(
    f"world/ego_vehicle/{sensor_name}",
    rr.Transform3D(
        translation=calibrated_sensor["translation"],
        rotation=rr.Quaternion(xyzw=rotation_xyzw),
        from_parent=False,
    ),
    timeless=True,
)
```
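
Rerun expects quaternions in xyzw order, while nuScenes stores them as wxyz, so `rotation_xyzw` has to be reordered first. A small sketch of how that might look (an assumption, not necessarily the example's exact code):

```python
# Sketch (assumption): reorder the nuScenes wxyz quaternion into the xyzw order Rerun expects.
w, x, y, z = calibrated_sensor["rotation"]
rotation_xyzw = [x, y, z, w]
```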

```python
rr.log(
    f"world/ego_vehicle/{sensor_name}",
    rr.Pinhole(
        image_from_camera=calibrated_sensor["camera_intrinsic"],
        width=sample_data["width"],
        height=sample_data["height"],
    ),
    timeless=True,
)
```

## Vehicle Pose

As the vehicle is moving, its pose needs to be updated. Consequently, the positions of pinhole cameras and sensors must also be adjusted using [`Transform3D`](https://www.rerun.io/docs/reference/types/archetypes/transform3d).
```python
rr.log(
    "world/ego_vehicle",
    rr.Transform3D(
        translation=ego_pose["translation"],
        rotation=rr.Quaternion(xyzw=rotation_xyzw),
        from_parent=False,
    ),
)
```

## LiDAR Data
LiDAR data is logged as the [`Points3D`](https://www.rerun.io/docs/reference/types/archetypes/points3d) archetype.
```python
rr.log(f"world/ego_vehicle/{sensor_name}", rr.Points3D(points, colors=point_colors))
```
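
The `points` and `point_colors` passed above come from the lidar sweep file; a minimal sketch of loading and coloring it, assuming the `nuscenes-devkit` helpers and a distance-based colormap:

```python
# Sketch (assumption): load the nuScenes lidar sweep and color points by distance from the sensor.
import matplotlib
import numpy as np
from nuscenes.utils.data_classes import LidarPointCloud

pointcloud = LidarPointCloud.from_file(str(data_file_path))
points = pointcloud.points[:3].T  # (num_points, 3) x, y, z
distances = np.linalg.norm(points, axis=1)
point_colors = matplotlib.colormaps["turbo"](distances / distances.max())  # RGBA in [0, 1]
```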

## Camera Data
Camera data is logged as encoded images using [`ImageEncoded`](https://ref.rerun.io/docs/python/0.14.1/common/image_helpers/#rerun.ImageEncoded).
```python
rr.log(f"world/ego_vehicle/{sensor_name}", rr.ImageEncoded(path=data_file_path))
```

## Radar Data
Radar data is logged similarly to LiDAR data, as [`Points3D`](https://www.rerun.io/docs/reference/types/archetypes/points3d).
```python
rr.log(f"world/ego_vehicle/{sensor_name}", rr.Points3D(points, colors=point_colors))
```

## Annotations

Annotations are logged as [`Boxes3D`](https://www.rerun.io/docs/reference/types/archetypes/boxes3d), containing details such as object positions, sizes, and rotations.
```python
rr.log("world/anns", rr.Boxes3D(sizes=sizes, centers=centers, rotations=rotations, class_ids=class_ids))
```
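
A sketch of how the box parameters might be gathered from the sample annotations (the field handling below is an assumption; the actual example also maps category names to `class_ids`):

```python
# Sketch (assumption): collect box sizes, centers, and rotations from nuScenes sample annotations.
sizes, centers, rotations = [], [], []
for ann_token in sample["anns"]:
    ann = nusc.get("sample_annotation", ann_token)
    width, length, height = ann["size"]  # nuScenes stores size as (width, length, height)
    sizes.append((length, width, height))
    centers.append(ann["translation"])
    w, x, y, z = ann["rotation"]  # wxyz -> xyzw for Rerun
    rotations.append(rr.Quaternion(xyzw=[x, y, z, w]))
```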


# Run the Code
To run this example, make sure you have at least Python 3.9, the Rerun repository checked out, and the latest SDK installed:
```bash
# Setup
pip install --upgrade rerun-sdk # install the latest Rerun SDK
git clone [email protected]:rerun-io/rerun.git # Clone the repository
cd rerun
git checkout latest # Check out the commit matching the latest SDK release
```
Install the necessary libraries specified in the requirements file:
```bash
pip install -r examples/python/nuscenes/requirements.txt
```

To experiment with the provided example, simply execute the main Python script:
```bash
python examples/python/nuscenes/main.py # run the example
```
If you wish to customize it, explore additional features, or save it, use the CLI with the `--help` option for guidance:
```bash
python examples/python/nuscenes/main.py --help
```
96 changes: 93 additions & 3 deletions examples/python/objectron/README.md
@@ -16,11 +16,101 @@ build_args = ["--frames=150"]
<img src="https://static.rerun.io/objectron/8ea3a37e6b4af2e06f8e2ea5e70c1951af67fea8/full.png" alt="Objectron example screenshot">
</picture>

Visualize the [Google Research Objectron](https://github.com/google-research-datasets/Objectron) dataset, a collection of short, object-centric video clips accompanied by AR session metadata that includes camera poses, sparse point clouds, and characterizations of the planar surfaces in the surrounding environment.

## Used Rerun Types
[`Points3D`](https://www.rerun.io/docs/reference/types/archetypes/points3d), [`Boxes3D`](https://www.rerun.io/docs/reference/types/archetypes/boxes3d), [`Image`](https://ref.rerun.io/docs/python/0.14.1/common/image_helpers/#rerun.ImageEncoded)<sup>*</sup>, [`Transform3D`](https://www.rerun.io/docs/reference/types/archetypes/transform3d), [`Pinhole`](https://www.rerun.io/docs/reference/types/archetypes/pinhole)

# Logging and Visualizing with Rerun

The visualizations in this example were created with the following Rerun code:

## Timelines

For each processed frame, all data sent to Rerun is associated with the two [`timelines`](https://www.rerun.io/docs/concepts/timelines) `time` and `frame`.

```python
rr.set_time_sequence("frame", sample.index)
rr.set_time_seconds("time", sample.timestamp)
```

## Video

A pinhole camera is used to achieve a 3D view and camera perspective through the [`Pinhole`](https://www.rerun.io/docs/reference/types/archetypes/pinhole) and [`Transform3D`](https://www.rerun.io/docs/reference/types/archetypes/transform3d) archetypes.

```python
rr.log(
    "world/camera",
    rr.Transform3D(translation=translation, rotation=rr.Quaternion(xyzw=rot.as_quat())),
)
```
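
Here, `translation` and `rot` are derived from the ARFrame's 4×4 camera-to-world transform; a minimal sketch of that step (the protobuf field access is an assumption):

```python
# Sketch (assumption): split the ARFrame's 4x4 camera-to-world transform into translation and rotation.
import numpy as np
from scipy.spatial.transform import Rotation as R

transform = np.asarray(frame.camera.transform).reshape(4, 4)
translation = transform[:3, 3]
rot = R.from_matrix(transform[:3, :3])
```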

```python
rr.log(
    "world/camera",
    rr.Pinhole(
        resolution=[w, h],
        image_from_camera=intrinsics,
        camera_xyz=rr.ViewCoordinates.RDF,
    ),
)
```
The input video is logged as a sequence of [`ImageEncoded`](https://ref.rerun.io/docs/python/0.14.1/common/image_helpers/#rerun.ImageEncoded) objects to the `world/camera` entity.
```python
rr.log("world/camera", rr.ImageEncoded(path=sample.image_path))
```

## Sparse Point Clouds

Sparse point clouds from `ARFrame` are logged as the [`Points3D`](https://www.rerun.io/docs/reference/types/archetypes/points3d) archetype to the `world/points` entity.

```python
rr.log("world/points", rr.Points3D(positions, colors=[255, 255, 255, 255]))
```
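
The `positions` array is assembled from the ARFrame's raw feature points; a sketch of one way to do it (the protobuf field names are assumptions):

```python
# Sketch (assumption): collect the ARFrame's sparse feature points into an (N, 3) array.
import numpy as np

positions = np.array([[p.x, p.y, p.z] for p in frame.raw_feature_points.point])
```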

## Annotated Bounding Boxes

Bounding boxes annotated from `ARFrame` are logged as [`Boxes3D`](https://www.rerun.io/docs/reference/types/archetypes/boxes3d), containing details such as object positions, sizes, centers, and rotations.

```python
rr.log(
    f"world/annotations/box-{bbox.id}",
    rr.Boxes3D(
        half_sizes=0.5 * np.array(bbox.scale),
        centers=bbox.translation,
        rotations=rr.Quaternion(xyzw=rot.as_quat()),
        colors=[160, 230, 130, 255],
        labels=bbox.category,
    ),
    timeless=True,
)
```

# Run the Code
To run this example, make sure you have at least Python 3.9, the Rerun repository checked out, and the latest SDK installed:
```bash
# Setup
pip install --upgrade rerun-sdk # install the latest Rerun SDK
git clone [email protected]:rerun-io/rerun.git # Clone the repository
cd rerun
git checkout latest # Check out the commit matching the latest SDK release
```
Install the necessary libraries specified in the requirements file:
```bash
pip install -r examples/python/objectron/requirements.txt
```
To experiment with the provided example, simply execute the main Python script:
```bash
python examples/python/objectron/main.py # run the example
```

You can specify the Objectron recording:
```bash
python examples/python/objectron/main.py --recording {bike,book,bottle,camera,cereal_box,chair,cup,laptop,shoe}
```

If you wish to customize it, explore additional features, or save it, use the CLI with the `--help` option for guidance:
```bash
python examples/python/objectron/main.py --help
```