Closed
19 commits
34f0f33
build: add Pyright configuration with type safety improvements
dsmedia Oct 26, 2025
c4e68e5
feat: add gallery examples registry mapping examples to datasets
dsmedia Oct 26, 2025
bacce84
feat: update gallery examples for Altair v6 canonical names
dsmedia Oct 26, 2025
fed676a
chore: add .worktrees to gitignore
dsmedia Feb 4, 2026
3034d3f
refactor: Data Package v2 compliance for gallery_examples
dsmedia Feb 5, 2026
5968143
fix: formatting
dsmedia Feb 5, 2026
073ee98
fix: format and lint
dsmedia Feb 5, 2026
7f474e9
Remove accidental Mypy type ignores
dsmedia Feb 5, 2026
4b44472
feat(techniques): add stack and timeUnit transform detection
dsmedia Feb 6, 2026
57cad48
feat(techniques): add layout:* category for Vega-only algorithmic tra…
dsmedia Feb 6, 2026
682c44e
feat(techniques): add Vega-only geo transforms (graticule, geopoint, …
dsmedia Feb 6, 2026
b97a418
feat(techniques): add Vega-only hierarchy data transforms (stratify, …
dsmedia Feb 6, 2026
239748a
feat(techniques): add Vega-only data transforms (kde2d, dotbin, count…
dsmedia Feb 6, 2026
2725af6
refactor(techniques): reorganize TECHNIQUE_PATTERNS with clear sectio…
dsmedia Feb 6, 2026
dfbc7b8
feat(techniques): regenerate gallery_examples.json with expanded dete…
dsmedia Feb 6, 2026
e682242
chore: rebuild datapackage with expanded technique vocabulary
dsmedia Feb 6, 2026
eb2cf4e
feat(techniques): add Altair methods-syntax detection patterns
dsmedia Feb 6, 2026
04a3016
chore: rebuild datapackage after Altair pattern improvements
dsmedia Feb 6, 2026
1498b49
style: format test file with ruff
dsmedia Feb 6, 2026
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -4,4 +4,5 @@ build
 node_modules
 */**/__pycache__
 .venv
-*/**/*.ipynb
+*/**/*.ipynb
+.worktrees
17 changes: 17 additions & 0 deletions CONTRIBUTING.md
@@ -50,6 +50,23 @@ For datasets requiring processing:
- Ensure reproducibility with deterministic outputs and fixed random seeds when applicable
- See `scripts/flights.py` as an example

### Gallery Examples Registry

The `gallery_examples.json` file catalogs examples from Vega, Vega-Lite, and Altair galleries, tracking which datasets each example uses.

**When to regenerate:**
- After new releases of Vega, Vega-Lite, or Altair that add/remove examples
- When examples are renamed or reorganized upstream
- Periodically (e.g., quarterly) to pick up new examples

**Commands:**
```bash
npm run update-gallery # Regenerate the file
npm run update-gallery -- --dry-run --verbose # Test without writing
```

Configuration lives in `_data/gallery_examples.toml`. A full run takes about 2-4 minutes because it fetches roughly 470 specs.

## Metadata and Documentation

We follow the [Data Package Standard 2.0](https://datapackage.org/) with:
2 changes: 2 additions & 0 deletions README.md
@@ -100,6 +100,8 @@ Visualizations built with these datasets are showcased in several galleries:
- [Altair Example Gallery](https://altair-viz.github.io/gallery/index.html)
- [Observable Vega Examples](https://observablehq.com/@vega)

The [gallery_examples.json](gallery_examples.json) file provides a cross-reference catalog mapping ~470 gallery examples to the datasets they use, enabling dataset-first exploration across all three ecosystems.

## Data Usage Note

- The datasets are designed for instructional and demonstration purposes.
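
As a rough illustration of the dataset-first exploration that the README addition mentions, the sketch below inverts a list of catalog records into a dataset-to-examples index. It assumes each record carries the `gallery_name`, `example_name`, and `datasets` fields from the resource schema. The sample records, the dataset names `dataset_a` and `dataset_b`, and the helper `index_by_dataset` are hypothetical. How records are nested inside `gallery_examples.json` is not shown in this diff, so loading the file is left out.

```python
from collections import defaultdict


def index_by_dataset(examples):
    """Map each dataset name to the (gallery, example) pairs that use it."""
    index = defaultdict(list)
    for example in examples:
        # Treat a missing or null "datasets" value as "uses no datasets".
        for dataset in example.get("datasets") or []:
            index[dataset].append(
                (example["gallery_name"], example["example_name"])
            )
    return dict(index)


# Hypothetical records shaped like the schema fields; not real catalog entries.
SAMPLE = [
    {
        "id": 1,
        "gallery_name": "vega-lite",
        "example_name": "Example A",
        "datasets": ["dataset_a"],
    },
    {
        "id": 2,
        "gallery_name": "altair",
        "example_name": "Example B",
        "datasets": ["dataset_a", "dataset_b"],
    },
]

if __name__ == "__main__":
    for dataset, uses in sorted(index_by_dataset(SAMPLE).items()):
        print(dataset, uses)
```

Keeping the gallery name next to each example title makes it easy to see which ecosystems cover a given dataset.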
77 changes: 77 additions & 0 deletions _data/datapackage_additions.toml
@@ -1888,3 +1888,80 @@ path = "https://download.geonames.org/export/zip/"
name = "CC-BY-4.0"
title = "Creative Commons Attribution 4.0 International"
path = "https://creativecommons.org/licenses/by/4.0/"

# ==============================================================================
# Gallery Examples Registry (meta-resource, not in /data/)
# ==============================================================================

[[resources]]
path = "gallery_examples.json"
description = """Cross-reference catalog mapping gallery examples to vega-datasets resources.
Tracks which datasets from the vega-datasets collection are used in example
visualizations across Vega, Vega-Lite, and Altair galleries. Enables discovery
of visualization patterns by dataset or technique, supports learning paths,
and provides structured context for AI coding assistants."""

[[resources.sources]]
title = "Vega Gallery"
path = "https://vega.github.io/vega/examples/"

[[resources.sources]]
title = "Vega-Lite Gallery"
path = "https://vega.github.io/vega-lite/examples/"

[[resources.sources]]
title = "Altair Gallery"
path = "https://altair-viz.github.io/gallery/"

[[resources.licenses]]
name = "BSD-3-Clause"
title = "The 3-Clause BSD License"
path = "https://opensource.org/license/bsd-3-clause"

[resources.schema]

[[resources.schema.fields]]
name = "id"
type = "integer"
description = "Unique sequential identifier for the example"

[[resources.schema.fields]]
name = "gallery_name"
type = "string"
description = "Name of the gallery this example belongs to"
constraints = { enum = ["vega", "vega-lite", "altair"] }

[[resources.schema.fields]]
name = "example_name"
type = "string"
description = "Human-readable example title"

[[resources.schema.fields]]
name = "example_url"
type = "string"
description = "URL to rendered example in the gallery"

[[resources.schema.fields]]
name = "spec_url"
type = "string"
description = "URL to source specification or code"

[[resources.schema.fields]]
name = "categories"
type = "array"
description = "Tags or categories for the example (e.g., 'Bar Charts', 'Interactive')"

[[resources.schema.fields]]
name = "description"
type = "string"
description = "Optional description of what the example demonstrates (may be null)"

[[resources.schema.fields]]
name = "datasets"
type = "array"
description = "Dataset names referencing resource.name in this package"

[[resources.schema.fields]]
name = "techniques"
type = "array"
description = "Visualization techniques used (e.g., 'transform:window', 'interaction:selection')"
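
The schema fields above could be checked against individual records with something like the following sketch. The field names, the types, the `gallery_name` enum, and the nullable `description` come from the schema in this diff. Treating a missing field as an error is an assumption, since the schema declares no `required` constraints, and `validate_record` is a hypothetical helper, not part of the PR's script.

```python
# Field names and types mirror [[resources.schema.fields]] in this diff.
FIELD_TYPES = {
    "id": int,
    "gallery_name": str,
    "example_name": str,
    "example_url": str,
    "spec_url": str,
    "categories": list,
    "description": str,
    "datasets": list,
    "techniques": list,
}
ALLOWED_GALLERIES = {"vega", "vega-lite", "altair"}
# The schema text says description "may be null".
NULLABLE = {"description"}


def validate_record(record):
    """Return a list of human-readable problems; empty means the record passes."""
    errors = []
    for name, expected in FIELD_TYPES.items():
        if name not in record:
            # Assumption: every schema field must be present.
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        if value is None:
            if name not in NULLABLE:
                errors.append(f"{name} must not be null")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name} should be {expected.__name__}")
    gallery = record.get("gallery_name")
    if isinstance(gallery, str) and gallery not in ALLOWED_GALLERIES:
        errors.append(f"gallery_name not in enum: {gallery}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in a record at once.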
103 changes: 103 additions & 0 deletions _data/gallery_examples.toml
@@ -0,0 +1,103 @@
# Configuration for gallery_examples collection script
# This file externalizes URLs, mappings, and settings to make maintenance easier.

# ============================================================================
# Altair Dataset Name Mappings
# ============================================================================
#
# LEGACY SUPPORT: This section provides backward compatibility for Altair v5.x.
#
# As of Altair v6 (PR #3859, merged 2025-10-26), Altair uses canonical
# vega-datasets names directly. When tracking Altair main branch (v6+),
# this mapping section should remain empty.
#
# Manual mappings may be added if:
# 1. Tracking older Altair releases (v5.x) that use camelCase API names
# 2. Custom Altair forks with different naming conventions
# 3. Testing against historical Altair versions
#
# Format: altair_api_name = "canonical_datapackage_name"

[altair.name_mapping]
# Empty mapping section - kept for backward compatibility with Altair v5.x
#
# Altair v6+ (PR #3859, merged 2025-10-26) uses canonical vega-datasets names
# directly, so no mappings are needed when tracking Altair main branch.
#
# VERSION TRACKING: This configuration currently tracks Altair main branch (v6+).
# The git reference is hardcoded in generate_gallery_examples.py (line 1135).
# To track a specific Altair version, you must modify the Python script to use
# a different git ref (e.g., "v5.4.1" or "v6.0.0" instead of "main").
#
# TESTING WITH ALTAIR V5: If you need to regenerate examples from Altair v5.x
# (e.g., for comparison or regression testing):
# 1. Modify generate_gallery_examples.py to fetch from Altair v5.x tag
# 2. Uncomment the camelCase mappings below
# 3. Run the script to regenerate gallery_examples.json
# 4. After testing, restore this section and the script to v6+ configuration
#
# Altair v5.x mappings (uncomment if testing with v5.x):
# londonBoroughs = "london_boroughs"
# londonCentroids = "london_centroids"
# londonTubeLines = "london_tube_lines"

# ============================================================================
# Data Source URLs
# ============================================================================
#
# URLs for fetching gallery metadata and dataset catalog.
# All URLs point to each repository's main branch, not to a tagged release.

[sources]
# Vega-datasets canonical dataset catalog
datapackage_url = "https://raw.githubusercontent.com/vega/vega-datasets/main/datapackage.json"

# Vega-Lite gallery examples metadata
vega_lite_examples_url = "https://raw.githubusercontent.com/vega/vega-lite/main/site/_data/examples.json"

# Vega gallery examples metadata
vega_examples_url = "https://raw.githubusercontent.com/vega/vega/main/docs/_data/examples.json"

# Altair example directories
# The script fetches Python files from both syntax styles
#
# STABILITY NOTE: The script currently fetches from Altair's main branch
# (hardcoded in generate_gallery_examples.py). This tracks the latest Altair v6+
# development but creates a moving target dependency. For production stability,
# consider pinning to a specific Altair release tag after v6.0.0 is released.
#
# Current approach assumes Altair main is stable post-v6 merge (PR #3859).
altair_examples_dirs = [
"tests/examples_methods_syntax",
"tests/examples_arguments_syntax",
]

# ============================================================================
# Output Configuration
# ============================================================================
#
# Default output settings for the generated JSON file.

[output]
# Default output file path (relative to repository root)
# Can be overridden with --output CLI argument
default_output_path = "gallery_examples.json"

# Dry run mode (doesn't write output file)
# Can be overridden with --dry-run CLI flag
dry_run = false

# ============================================================================
# Network Settings
# ============================================================================
#
# HTTP request configuration for fetching remote resources.

[network]
# Timeout in seconds for HTTP requests
# Used for fetching metadata files and individual example specifications
timeout = 30

# Maximum number of retries for failed requests (future use)
# Currently not implemented, but reserved for potential retry logic
max_retries = 3
91 changes: 90 additions & 1 deletion datapackage.json
@@ -20,7 +20,7 @@
}
],
"version": "3.2.1",
-  "created": "2026-02-02T13:19:39.437222+00:00",
+  "created": "2026-02-06T03:35:56.119502+00:00",
"resources": [
{
"name": "icon_7zip",
@@ -3993,6 +3993,95 @@
}
]
}
},
{
"name": "gallery_examples",
"type": "json",
"description": "Cross-reference catalog mapping gallery examples to vega-datasets resources.\nTracks which datasets from the vega-datasets collection are used in example\nvisualizations across Vega, Vega-Lite, and Altair galleries. Enables discovery\nof visualization patterns by dataset or technique, supports learning paths,\nand provides structured context for AI coding assistants.",
"licenses": [
{
"name": "BSD-3-Clause",
"title": "The 3-Clause BSD License",
"path": "https://opensource.org/license/bsd-3-clause"
}
],
"sources": [
{
"title": "Vega Gallery",
"path": "https://vega.github.io/vega/examples/"
},
{
"title": "Vega-Lite Gallery",
"path": "https://vega.github.io/vega-lite/examples/"
},
{
"title": "Altair Gallery",
"path": "https://altair-viz.github.io/gallery/"
}
],
"path": "gallery_examples.json",
"scheme": "file",
"format": "json",
"mediatype": "application/json",
"encoding": "utf-8",
"hash": "sha1:35e63b2ef7e5d9d802710e81bb10ab62898563f0",
"bytes": 300960,
"schema": {
"fields": [
{
"name": "id",
"type": "integer",
"description": "Unique sequential identifier for the example"
},
{
"name": "gallery_name",
"type": "string",
"description": "Name of the gallery this example belongs to",
"constraints": {
"enum": [
"vega",
"vega-lite",
"altair"
]
}
},
{
"name": "example_name",
"type": "string",
"description": "Human-readable example title"
},
{
"name": "example_url",
"type": "string",
"description": "URL to rendered example in the gallery"
},
{
"name": "spec_url",
"type": "string",
"description": "URL to source specification or code"
},
{
"name": "categories",
"type": "array",
"description": "Tags or categories for the example (e.g., 'Bar Charts', 'Interactive')"
},
{
"name": "description",
"type": "string",
"description": "Optional description of what the example demonstrates (may be null)"
},
{
"name": "datasets",
"type": "array",
"description": "Dataset names referencing resource.name in this package"
},
{
"name": "techniques",
"type": "array",
"description": "Visualization techniques used (e.g., 'transform:window', 'interaction:selection')"
}
]
}
}
]
}
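
The `hash` and `bytes` entries in the new resource descriptor allow a consumer to check that a copy of `gallery_examples.json` matches the package metadata. The sketch below shows the idea, assuming the hash is computed over the raw file bytes and uses the `algorithm:hexdigest` form seen in this diff. `matches_descriptor` is a hypothetical helper, and the demo payload is made up instead of using the real file.

```python
import hashlib


def matches_descriptor(data: bytes, descriptor: dict) -> bool:
    """Check raw bytes against a descriptor's "bytes" and "hash" entries."""
    # The descriptor stores the hash as "<algorithm>:<hexdigest>", e.g. "sha1:...".
    algorithm, _, expected = descriptor["hash"].partition(":")
    digest = hashlib.new(algorithm, data).hexdigest()
    return len(data) == descriptor["bytes"] and digest == expected


if __name__ == "__main__":
    # Made-up payload; a real check would read gallery_examples.json in binary mode.
    payload = b'{"demo": true}'
    descriptor = {
        "hash": "sha1:" + hashlib.sha1(payload).hexdigest(),
        "bytes": len(payload),
    }
    print(matches_descriptor(payload, descriptor))
    print(matches_descriptor(payload + b" ", descriptor))
```

Comparing the byte count as well as the digest gives a cheap first signal when a file has been truncated or re-encoded.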