docs: new dataset source, from vega_dataset to altair.dataset#3859
- from vega_datasets import data
+ from altair.datasets import data
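For code that has to run both before and after this import change, a guarded import is one option. The sketch below is an assumption and not something proposed in this PR; `load_data_accessor` is a hypothetical helper name, and it only probes which of the two modules can be imported.

```python
import importlib


def load_data_accessor():
    """Return (module_name, data) from the first importable source.

    Tries the new location first, then the legacy package.
    Returns (None, None) when neither is installed.
    """
    for module_name in ("altair.datasets", "vega_datasets"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module (or its parent package) is not available; try the next one.
            continue
        accessor = getattr(module, "data", None)
        if accessor is not None:
            return module_name, accessor
    return None, None


source_name, data = load_data_accessor()
print("dataset accessor loaded from:", source_name)
```

Because both packages expose an object named `data`, the rest of an example can stay unchanged regardless of which import succeeded.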
Is that the end of the error? I was expecting the nature of the header error to follow.
I was able to make a minimal working example reproducing the error with only Vega and VegaFusion (Open the Chart in the Vega Editor):
import json
import vegafusion as vf
vega_spec = json.loads("""
{
"$schema": "https://vega.github.io/schema/vega/v6.json",
"data": [
{
"name": "source_0",
"url": "https://cdn.jsdelivr.net/npm/vega-datasets@v3.2.0/data/co2-concentration.csv",
"format": {
"type": "csv",
"parse": {
"Date": "date"
}
}
}
]
}
""")
# This will raise the HTTP error
datasets, warnings = vf.runtime.pre_transform_datasets(vega_spec, ["source_0"])
Finally all tests are passing! @joelostblom would you be willing to do a review?
@mattijn Looks like in 577ed87 you had to remove If so, would it help you if I added
That dataset was removed from vega/vega-datasets a few years ago, in PR vega/vega-datasets#187.
Thanks for all the work you have done on getting the sample data integrated into altair @mattijn! I will aim to have a look at this in more detail next week. One question I already have is how we are releasing and announcing the changes to the sample data import convention. My understanding is that we are not breaking anything from
Do you think we should include a similar callout somewhere in the docs (maybe in the "Specifying Data" page) to emphasize the move to
On the main branch, we already moved to Vega-Lite version 6, so I think we can bump this as part of a major release. I like the suggestion to have a note in the "Specifying Data" section, "With the release of Altair 6 etc"
Added a note in commit b1c648f at the end of the first paragraph in this section: https://altair-viz.github.io/gallery/index.html, where the source of the datasets in the examples is discussed.
Looks great overall! Thanks for all your work getting this updated! I think it is really convenient that the same import convention is kept while having the data in the altair package directly.
I clicked through the entire user guides and the examples, and noticed a few places where the charts were not rendering, which I have commented on inline. I noticed a couple of things not related to this PR too which I will fix separately.
Looks good!
Co-authored-by: Joel Ostblom <joel.ostblom@gmail.com>
address all feedback from @joelostblom by the following commit: aa5d353
Thanks @joelostblom for the review! Addressed all of them. Had a thorough check as well, and all charts are rendering now, as far as I can see. Merging this PR since all tests are green on CI 🥳
Altair PR #3859 (merged 2025-10-26) migrated from vega_datasets package to altair.datasets module with canonical vega-datasets naming. This updates the gallery examples collection to track Altair v6+ main branch.

Changes:
- Empty [altair.name_mapping] section (was: londonBoroughs → london_boroughs)
- Comments now document legacy v5.x support instead of temporary workaround
- Add pattern for fully qualified altair.datasets.data.X.url syntax
- Refactor extract_altair_api_datasets() with explicit name_mapping parameter
- Regenerate gallery_examples.json (470 examples, all with canonical names)

Type safety improvements:
- extract_altair_api_datasets() now accepts name_mapping as parameter instead of accessing global _config directly
- Explicit None default for Altair v6+ (no mapping needed)
- Better testability and separation of concerns

Backward compatibility:
- Mapping section preserved (empty) with documentation for v5.x users
- Historical camelCase examples commented out for reference
- Function signature supports both v5 (with mapping) and v6 (without)

Configuration notes:
- Currently tracks Altair main branch (v6+ development)
- Git ref hardcoded in Python script (line 1135) - documented in TOML
- Stability note added: consider pinning to release tag when v6.0.0 available
- Testing procedure documented for v5.x regression testing

All three galleries (Vega, Vega-Lite, Altair) now use consistent canonical dataset naming from datapackage.json.

Related: vega/altair#3859

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
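The refactor described above, passing `name_mapping` explicitly with a `None` default, might look roughly like the following. This is a minimal sketch under stated assumptions, not the actual implementation: the regular expression, the return type, and the function body are hypothetical. Only the function name, the `name_mapping` parameter, the `None` default, and the `londonBoroughs` → `london_boroughs` mapping come from the commit message.

```python
import re

# Hypothetical pattern: matches `data.<name>(` and `data.<name>.url`,
# including the fully qualified `altair.datasets.data.<name>.url` form.
_DATASET_PATTERN = re.compile(r"\bdata\.([A-Za-z_]\w*)(?:\.url\b|\s*\()")


def extract_altair_api_datasets(source, name_mapping=None):
    """Return the sorted, de-duplicated dataset names referenced in `source`.

    name_mapping: optional dict from legacy (v5.x) names to canonical names.
    With None (the v6+ case) names are returned unchanged.
    """
    mapping = name_mapping or {}
    found = set()
    for match in _DATASET_PATTERN.finditer(source):
        raw_name = match.group(1)
        found.add(mapping.get(raw_name, raw_name))
    return sorted(found)


v6_example = "source = data.cars()\nurl = altair.datasets.data.london_boroughs.url\n"
print(extract_altair_api_datasets(v6_example))

v5_example = "url = data.londonBoroughs.url\n"
print(extract_altair_api_datasets(v5_example, {"londonBoroughs": "london_boroughs"}))
```

Taking the mapping as an argument means the function can be tested without touching any module-level configuration, which is the testability benefit the commit message mentions.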

Updated test expectations and example files to work with the new vega-datasets source. Fixed field name references from underscores to spaces (e.g., IMDB_Rating → IMDB Rating) and updated row count expectations in transformed data tests.
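As an illustration of the field rename, the sketch below rewrites old field names in a list of records. It is a hypothetical helper, not code from this change: only the `IMDB_Rating` → `IMDB Rating` pair comes from the text above, and the sample record with a `Title` field is made up for the example.

```python
# Only this pair is taken from the commit message; extend as needed.
FIELD_RENAMES = {"IMDB_Rating": "IMDB Rating"}


def rename_fields(records, renames=FIELD_RENAMES):
    """Return new records with old field names replaced by their new names."""
    return [
        {renames.get(key, key): value for key, value in record.items()}
        for record in records
    ]


# Hypothetical sample record for illustration.
old_records = [{"Title": "Example Movie", "IMDB_Rating": 7.1}]
print(rename_fields(old_records))
```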