
feat: Add Species Habitat Dataset for Faceted Map Examples #684

Draft · wants to merge 32 commits into base: main
Conversation

@dsmedia dsmedia commented Feb 11, 2025

will close #683

This PR adds a new dataset to vega-datasets containing county-level species habitat distribution data for the United States. The dataset is designed to support categorical faceted map examples in Vega visualization libraries, addressing needs discussed in vega/altair#1711 and vega/vega-datasets#683.

Implementation Status

To Do

  • Evaluate USGS API integration for more robust data acquisition
  • Generate sample Altair visualizations to validate species selection
  • Address script warnings and improve error handling
  • Replace USGS species codes with human-readable species names
  • Finalize output format (CSV/Parquet/Arrow based on usage requirements)
  • Enhance script functionality:
    • Add flexible output format options (Parquet/CSV/JSON)
    • Enable custom species selection to empower community extensions
  • Reduce precision of pct to three decimal places
  • Validate percentage calculations through county-specific spot checks
  • Ensure script is sufficiently documented to explain to non-geospatial experts
    • precise meaning of pct column
    • technique(s) used to overlay the raster data on counties
  • update datapackage files with new dataset description
  • Highlight this PR in vega/vega-lite#9292 to motivate resolving that issue, so example visualizations can use .facet(...) rather than a list comprehension (as noted by @dangotbanned)

Dataset Details

Current Implementation (species.csv)

Structure

  • Format: CSV
  • Content: County-level habitat percentages for four US species (American Bullfrog, American Robin, White-tailed Deer, Common Gartersnake)
  • Location: data/ directory
  • Rows: 11,651
  • Columns: 6

Fields

| Field | Type | Description |
| --- | --- | --- |
| item_id | string | Unique identifier for the species data item on ScienceBase |
| common_name | string | Common name of the species (e.g., "American Bullfrog") |
| scientific_name | string | Scientific name of the species (e.g., "Lithobates catesbeianus") |
| gap_species_code | string | GAP Species Code, a unique identifier for the species within the GAP dataset (e.g., "aAMBUx") |
| county_id | int | Combined state and county FIPS code identifying the US county; four or five digits (i.e., not zero-padded) |
| habitat_yearround_pct | float | Percentage of the county area classified as year-round suitable habitat for the species (rounded to 4 decimal places) |

Data Generation

The dataset is generated using scripts/species.py, which implements the approach suggested by @mattijn in this comment. The script:

  1. Downloads habitat raster data from USGS ScienceBase
  2. Processes county-level habitat percentages using exactextract (see the sketch after this list)
  3. Generates data/species.csv
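
For non-geospatial readers, here is a minimal sketch of what the exactextract zonal-statistics step can look like. This is not the actual scripts/species.py: the file names, the "id" column, and the habitat encoding below are assumptions.

```python
import geopandas as gpd
from exactextract import exact_extract

# Hypothetical inputs: county polygons carrying an "id" (FIPS) column, and a
# GAP habitat GeoTIFF whose pixel values encode habitat classes.
counties = gpd.read_file("counties.geojson")

# The "frac" operation reports, for each polygon, the fraction of its area
# covered by each distinct raster value; the year-round habitat percentage is
# then the fraction for whichever value encodes year-round habitat.
stats = exact_extract(
    "habitat.tif",
    counties,
    ["frac"],
    include_cols=["id"],
    output="pandas",
)
```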

Known Issues

  • Runtime warning: Spatial reference system of input features does not exactly match raster
  • Species column currently uses USGS codes (e.g., "aAMBUx", "bAMROx")
    • Will be updated to use descriptive species names
  • Percentage precision to be optimized for typical use cases
  • Check for documentation purposes if Alaska data is out of scope

Future work

Consider expanding the dataset (in a backward-compatible way) to include summer and winter habitat data, plus summer/winter/all-season range data.

Validation of Habitat Percentage Calculation

[Image: side-by-side comparison of the USGS habitat raster and the generated county choropleth, described below]

This image compares the USGS potential habitat map (within this zip file) for bullfrogs (zoomed into the southeastern United States, for clarity) with the output of our code, demonstrating the correct implementation of zonal statistics.

Left: USGS potential habitat map for bullfrogs. This is a raster dataset showing predicted suitable habitat (purple areas) at a 30-meter resolution. It is not aggregated by any administrative boundaries.
Right: Generated choropleth map showing the percentage of suitable habitat within each county. Lighter colors indicate a higher percentage of the county is classified as suitable habitat, while darker colors indicate a lower percentage.

The code successfully overlays the USGS raster data with county boundaries (from us-10m.json, a vector dataset) and uses exactextract to calculate the percentage of each county that falls within the USGS-defined habitat. The resulting map visually confirms that the habitat percentages are being calculated correctly, with higher percentages in counties that overlap significantly with the USGS's predicted habitat area.

Species Habitat Visualization Code
import altair as alt
import pyarrow as pa
import pyarrow.compute as pc  # compute is a submodule and needs an explicit import
import requests
import io
from vega_datasets import data

# Download and load the Arrow file
url = "https://github.com/vega/vega-datasets/raw/c4ff52d22bfa2e70a1b40554cc4c56df216838f3/data/species.arrow"
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
table = pa.ipc.RecordBatchFileReader(io.BytesIO(response.content)).read_all()

# Disable row limit for Altair (if using vegafusion, this may not be needed)
alt.data_transformers.disable_max_rows()

# Load US counties topology
counties = alt.topo_feature(data.us_10m.url, 'counties')

# --- Filter for the specific species ---
target_item_id = "58fa3f0be4b0b7ea54524859"

# Use a try-except block for more robust error handling
try:
    # Filter the table based on 'item_id'
    filtered_table = table.filter(pc.equal(table.column('item_id'), target_item_id))

    # Check if the item_id was found
    if filtered_table.num_rows == 0:
        raise ValueError(f"item_id '{target_item_id}' not found in the dataset.")

    # Get the species name (assuming it's unique for the given item_id)
    species_name = filtered_table.column('CommonName')[0].as_py()

except pa.lib.ArrowInvalid as e:
    raise ValueError(f"Invalid item_id format or column not found: {e}") from e
except Exception as e:
    raise RuntimeError(f"An unexpected error occurred: {e}") from e

# --- END MODIFICATION ---

# Create a single chart for the filtered species
chart = alt.Chart(counties, title=species_name).mark_geoshape(tooltip=True).encode(
    color=alt.Color(
        'percent_habitat:Q',
        scale=alt.Scale(
            domain=[0, 1],
            scheme='viridis',
            zero=True,
            nice=False
        ),
        title='Habitat Percentage',
        legend=alt.Legend(format=".0%")
    ),
    tooltip=[
        alt.Tooltip('county_id:N', title='County ID'),
        alt.Tooltip('percent_habitat:Q', title='Habitat %', format='.2%')
    ]
).transform_lookup(
    lookup='id',
    from_=alt.LookupData(
        data=filtered_table,  # Use the filtered table directly
        key='county_id',
        fields=['percent_habitat']
    )
).transform_filter(
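    # Exclude Alaska (FIPS prefix 2) and Hawaii (15), which fall outside the dataset's coverage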
    "indexof(['2', '15'], '' + floor(datum.id / 1000)) == -1"
).transform_calculate(
    percent_habitat="datum.percent_habitat === null ? 0 : datum.percent_habitat"
).project(type='albersUsa').properties(
    width=600, height=400
)

# Display the chart
chart

- Replace direct URL downloads with ScienceBase API client (sciencebasepy)
- Remove niquests dependency in favor of native ScienceBase download handling
- Enhance logging for better debugging of file downloads and extractions
- Add explicit ZIP file cleanup after TIF extraction
- Update dependencies to include sciencebasepy and setuptools

dsmedia commented Feb 13, 2025

Just as a reference for development, this will generate a complete list of species and all species-level metadata.

Generation script
# /// script
# requires-python = ">=3.9"
# dependencies = [
#     "pandas",
#     "sciencebasepy",
#     "tqdm",
#     "requests",
# ]
# ///
"""
This script retrieves and processes species identifier data from ScienceBase.

It fetches all child items under a specified parent item ID, extracts identifiers
like ECOS and ITIS codes, and compiles the data into a CSV file. The script
utilizes parallel processing to efficiently handle a large number of items and
includes error handling and retry mechanisms for robustness.
"""
import pandas as pd
from sciencebasepy import SbSession
from collections import defaultdict
from tqdm import tqdm
import concurrent.futures
import time
import requests  # Needed to catch requests.exceptions.HTTPError below


def get_all_item_ids(parent_id: str = "527d0a83e4b0850ea0518326") -> list[str]:
    """
    Retrieves all child item IDs from a given ScienceBase parent item.

    This function efficiently fetches all item IDs that are children of the
    specified parent item ID. It is optimized for performance by using the
    `get_child_ids` method from the ScienceBase API.

    Args:
        parent_id: The ScienceBase item ID of the parent item.
                   Defaults to "527d0a83e4b0850ea0518326" (Habitat Map parent item).

    Returns:
        A list of strings, where each string is a ScienceBase item ID.
        Returns an empty list if there are errors or no child items found.
    """
    print("Retrieving species item IDs from ScienceBase...")
    sb = SbSession()

    try:
        parent_item = sb.get_item(parent_id)
        print(f"Found parent item: '{parent_item.get('title', 'No Title')}'")

        all_ids = list(sb.get_child_ids(parent_id))  # Efficiently get all child IDs

        print(f"Found {len(all_ids)} species items.")
        return all_ids

    except Exception as e:
        print(f"Error retrieving items from ScienceBase: {e}")
        import traceback
        traceback.print_exc()  # Print full traceback for debugging
        return []


def process_single_item(item_id: str, sb: SbSession) -> dict:
    """
    Processes a single ScienceBase item to extract relevant identifier data.

    This function fetches the JSON representation of a ScienceBase item, extracts
    the title and identifiers (like ECOS, ITIS), and returns the data as a dictionary.
    It includes error handling and retry logic for network issues, particularly
    HTTP errors.

    Args:
        item_id: The ScienceBase item ID to process.
        sb: An authenticated SbSession object for interacting with ScienceBase.

    Returns:
        A dictionary containing the extracted item data, including 'item_id',
        'title', and any identifiers found (e.g., 'ECOS', 'ITIS').
        Returns None if the item could not be processed due to errors,
        after retries in case of transient network issues, or if the item is not found (404).
    """
    try:
        item_json = sb.get_item(item_id)
        title = item_json.get('title', 'Unknown Title')

        item_data = defaultdict(lambda: 'Not Available')  # Default value for missing identifiers
        item_data['item_id'] = item_id
        item_data['title'] = title

        if 'identifiers' in item_json:
            for identifier in item_json['identifiers']:
                scheme = identifier.get('scheme', 'Unknown Scheme')
                key = identifier.get('key', 'No Value')
                clean_scheme = scheme.split('/')[-1] if '/' in scheme else scheme  # Extract last part of scheme
                item_data[clean_scheme] = key

        return dict(item_data)

    except requests.exceptions.HTTPError as e:
        print(f"\nHTTPError processing item {item_id}: {e}")
        if e.response.status_code == 404:
            print(f"Item {item_id} not found (404 error). Skipping.")
            return None  # Indicate item not found
        else:
            retries = 3
            for i in range(retries):
                try:
                    print(f"Retrying item {item_id} (attempt {i+1}/{retries})...")
                    time.sleep(2**i)  # Exponential backoff
                    item_json = sb.get_item(item_id)  # Retry fetching the item
                    title = item_json.get('title', 'Unknown Title')

                    item_data = defaultdict(lambda: 'Not Available')
                    item_data['item_id'] = item_id
                    item_data['title'] = title
                    if 'identifiers' in item_json:
                        for identifier in item_json['identifiers']:
                            scheme = identifier.get('scheme', 'Unknown Scheme')
                            key = identifier.get('key', 'No Value')
                            clean_scheme = scheme.split('/')[-1] if '/' in scheme else scheme
                            item_data[clean_scheme] = key
                    return dict(item_data)
                except requests.exceptions.RequestException:
                    if i == retries - 1:
                        print(f"Failed to retrieve item {item_id} after multiple retries.")
                        return None  # Indicate failure after retries
    except Exception as e:
        print(f"\nError processing item {item_id}: {e}")
        return None  # Indicate processing failure


def explore_species_identifiers_parallel(item_ids: list[str], max_workers: int = 10) -> pd.DataFrame:
    """
    Explores and extracts identifiers from ScienceBase items in parallel.

    This function processes a list of ScienceBase item IDs concurrently using
    a thread pool executor. It fetches data for each item using the
    `process_single_item` function and aggregates the results into a pandas DataFrame.

    Args:
        item_ids: A list of ScienceBase item IDs to process.
        max_workers: The maximum number of threads to use for parallel processing.
                     Adjust this value based on your system and network to optimize
                     performance without overloading resources.

    Returns:
        A pandas DataFrame where each row represents a ScienceBase item,
        and columns include 'item_id', 'title', and identifier schemes
        (e.g., 'ECOS', 'ITIS'). Returns an empty DataFrame if no data is processed.
    """
    sb = SbSession()  # Create a single session for all threads
    all_schemes = set()
    results = []

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(process_single_item, item_id, sb) for item_id in item_ids]

        for future in tqdm(concurrent.futures.as_completed(futures), total=len(item_ids), desc="Processing species items"):
            result = future.result()
            if result:
                results.append(result)
                all_schemes.update(key for key in result if key not in ('item_id', 'title')) # Update schemes efficiently

    if not results:
        return pd.DataFrame()  # Return empty DataFrame if no results

    df = pd.DataFrame(results)

    # Ensure all identifier scheme columns are present and reorder
    for scheme in all_schemes:
        if scheme not in df.columns:
            df[scheme] = 'Not Available'
    cols = ['item_id', 'title'] + sorted(list(all_schemes))
    df = df[cols]

    return df


def main():
    """
    Main function to execute the species identifier exploration script.

    This function orchestrates the process of:
    1. Retrieving all species item IDs from ScienceBase.
    2. Processing these items in parallel to extract identifiers.
    3. Generating summary statistics of the extracted data.
    4. Saving the results to a CSV file named 'all_species_identifiers.csv'.
    5. Displaying the first few rows of the resulting DataFrame.
    """
    all_ids = get_all_item_ids()  # Get all species item IDs

    if not all_ids:
        print("No species IDs found. Exiting.")
        return

    print("\nProcessing species item identifiers...")
    df = explore_species_identifiers_parallel(all_ids, max_workers=10)  # Process in parallel

    if df.empty:
        print("No data was processed successfully. Exiting.")
        return

    print("\nIdentifier Summary:")
    print(f"Total species items processed: {len(df)}")
    print("\nIdentifier columns found and count of non 'Not Available' values:")
    for col in df.columns:
        # Corrected line to exclude NaN values from the count
        non_empty_count = df[col].ne('Not Available').sum() - df[col].isna().sum()
        print(f"- {col}: {non_empty_count} items have this identifier")

    output_file = 'all_species_identifiers.csv'
    df.to_csv(output_file, index=False)
    print(f"\nResults saved to '{output_file}'")

    print("\nFirst 5 rows of the data:")
    pd.set_option('display.max_columns', None)  # Display all columns
    pd.set_option('display.max_colwidth', None) # Display full column width
    print(df.head()) # Replaced display(df.head()) with print()


if __name__ == "__main__":
    main()

USGS report with domain-specific explanations

💡 Note: the initial character of GAP_SpeciesCode indicates the taxonomic group: a = amphibians, b = birds, m = mammals, r = reptiles.

💡 Note: The dataset includes 1,590 species and 129 subspecies. The hierarchical relationship between species and subspecies can be inferred from the final character of GAP_SpeciesCode. For example, mPASHc and mPASHp refer to Sorex pacificus cascadensis and Sorex pacificus pacificus, respectively, both subspecies under mPASHx, the Pacific Shrew. The species-level parent should contain the habitat locations for all of its subspecies.
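
As a small illustration of this naming convention (not part of the PR; the helper below is made up for clarity):

```python
def species_parent(gap_code: str) -> str:
    """Map a GAP species/subspecies code to its species-level parent code."""
    # A trailing 'x' marks a species-level code; any other final letter marks
    # a subspecies sharing the same five-character stem.
    return gap_code[:-1] + "x"

assert species_parent("mPASHc") == "mPASHx"  # Sorex pacificus cascadensis -> Pacific Shrew
assert species_parent("mPASHx") == "mPASHx"  # already species-level
```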

| Column Name | Sample Values |
| --- | --- |
| item_id | 58fa4341e4b0b7ea54524b8f; 58fa6abfe4b0b7ea545258e1; 58fa6b9ce4b0b7ea54525905 |
| title | San Simeon Slender Salamander (Batrachoseps incognitus) aSSSSx_CONUS_2001v1 Habitat Map; Hooded Skunk (Mephitis macroura) mHOSKx_CONUS_2001v1 Habitat Map; Santa Rosa Island Fox (Urocyon littoralis santarosae) mISFOr_CONUS_2001v1 Habitat Map |
| CommonName | San Simeon Slender Salamander; Hooded Skunk; Santa Rosa Island Fox |
| GAP_SpeciesCode | aSSSSx; mHOSKx; mISFOr |
| ScientificName | Batrachoseps incognitus; Mephitis macroura; Urocyon littoralis santarosae |
| doi | doi:10.5066/F7X065BT; doi:10.5066/F7FJ2F60; doi:10.5066/F7WM1BSQ |
| itis_tsn_validMatch | 668244; 180563; 726908 |
| itis_tsn_validSpeciesConceptEncompassingUnrecognizedSpeciesConcept | 174233; 174092; 174233 |
| itis_tsn_validSpeciesForUnrecognizedSubspecies | 180347; 180310; 552479 |
| itis_tsn_validSpeciesFromSubspecies | 997723 |
| itis_tsn_validSpeciesFromUnrecognizedSuppopulation | 552521 |
| itis_tsn_validSpeciesWithFormerName | 174024; 914103; 174238 |
| itis_tsn_validSubspeciesFromSpecies | 208992; 209163; 668673 |
| iucn_id_verified | 59125; 41634; 103798425 |
| nsid_acceptedMatch | 102298; 103207; 102640 |
| nsid_acceptedSpeciesConceptEncompassingUnrecognizedSpeciesConcept | 106286; 106231 |
| nsid_acceptedSpeciesFromRecognizedGenus | 102290; 101405; 103451 |
| nsid_acceptedSpeciesFromSubspecies | 902215 |
| nsid_acceptedSpeciesFromUnrecognizedSubspecies | 102073; 101780; 104212 |
| nsid_acceptedSupspeciesFromSpecies | 102277 |

Restructures the species habitat analysis script:

- Implement modular architecture with ScienceBaseClient, RasterSet, and
  HabitatDataProcessor classes for better maintainability
- Integrate sciencebasepy for ZIP-based habitat map downloads from USGS
  ScienceBase, with automatic cleanup of temporary files
- Add multi-format output support (CSV/Parquet/Arrow) with Arrow as default,
  using dictionary encoding for optimized storage and performance
- Enhance metadata by including species common and scientific names from
  ScienceBase API
- Add comprehensive CLI arguments for configuration and debug logging
- Improve robustness with better error handling and type annotations

dsmedia commented Feb 16, 2025

@dangotbanned would welcome your thoughts on whether I'm on the right track here.

@dangotbanned (Member) commented:

@dangotbanned would welcome your thoughts on whether I'm on the right track here.

Thanks @dsmedia

I've only skimmed through, but could you switch from argparse to configuring via a .toml?

You'd be able to remove the defaults and the parsing logic from the script - or at least just validate the options in one place

@dangotbanned (Member) commented:

@dsmedia everything appears to be working so far 🎉

[Image: rendered example output]

* Switch to using zipfile.Path for more Pythonic ZIP file handling
* Enforce expectation of exactly one TIF file per ZIP
* Add error handling for unexpected file counts
Refactors `species.py` to use TOML configuration (`_data/species.toml`)
instead of `argparse`, improving flexibility and maintainability.
Settings include `item_ids`, `vector_fp`, `output_dir`, `output_format`,
and `debug`. Relative paths (e.g., `../data/us-10m.json`) are resolved
relative to the TOML file, and basic validation is added.

`RasterSet.extract_tifs_from_zips()` now uses `zipfile.Path` and enforces
a single `.tif` file per ZIP, raising a `RuntimeError` otherwise.
Type hinting fixes are also included.
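
For reference, a hypothetical sketch of what `_data/species.toml` might contain based on the settings named above; the actual keys, structure, and values in the PR may differ:

```toml
item_ids = [
    "58fa3f0be4b0b7ea54524859", # American Bullfrog
    "58fa449fe4b0b7ea54524c5e", # American Robin
]
vector_fp = "../data/us-10m.json" # resolved relative to this TOML file
output_dir = "../data"
output_format = "arrow"           # one of: csv, parquet, arrow
debug = false
```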

dsmedia commented Feb 19, 2025

Currently this PR provides county-level percentages of year-round predicted habitat for four species. I'm considering expanding the dataset to include both predicted and known ranges across different seasons (summer-only, winter-only, and year-round) for the four species, matching the scope of the USGS report on this dataset. This would:

  1. Better represent the full richness of the USGS GAP species data
  2. Enable more complex faceted visualization examples
  3. Provide an engaging use case for Vega's interactive selections and cross-filtering across temporal and spatial dimensions

This expanded scope would also make the dataset more versatile for the broader Vega community beyond the original faceted map use case. Before formally proposing this expansion, I'll first generate a test dataset to evaluate size, performance, and data quality.

[Image: preview of a seasonal habitat/range visualization]


mattijn commented Feb 19, 2025

Thanks for being so sophisticated with this PR @dsmedia!

I'm personally not really up to date with the arrow format. If the source can be parsed into a dataframe or referenced by URL in the Vega editor, then it is perfect.

Currently the data is in row-oriented format, as that is the easiest way to facet the chart, but it means we pass the 5K-row threshold. If we store the data column-oriented we stay within the 5K-row limit, but there will be additional melt transforms in the specification (sketched after this comment). I'm leaning towards keeping it as is to simplify the chart specification.

I understand that you want to provide wider access to all options of the data source, but don't lose yourself in all available possibilities. What you currently have is really great already.

Once the referenced feature request vega/vega-lite#9389 is in, we can revisit this dataset to make a map like your screenshot, highlighting pixel-based seasonal differences for a single county or a few counties.

But if you do have great ideas to approach this, go for it. I just want to say that what you have is awesome already!
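
For concreteness, a sketch of the column-oriented alternative described above, with one column per species melted back to long form via a fold transform. The file and column names here are hypothetical:

```python
import altair as alt

# Hypothetical wide-format file with one habitat-percentage column per species.
wide_url = "https://example.com/species_wide.csv"

# transform_fold melts the per-species columns into key/value pairs at render
# time, recovering the long layout the faceted chart needs.
base = alt.Chart(wide_url).transform_fold(
    ["bullfrog_pct", "robin_pct", "deer_pct", "gartersnake_pct"],
    as_=["species", "habitat_pct"],
)
```

The long (row-oriented) layout the PR keeps avoids this extra transform, at the cost of exceeding the 5K-row threshold.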

- Rename species columns for consistency and clarity:
  - GAP_Species -> gap_species_code
  - percent_habitat -> habitat_yearround_pct
  - CommonName -> common_name
  - ScientificName -> scientific_name
- Expand module docstring with detailed information about:
  - Data sources and resolution
  - Projection details
  - Habitat value meanings
  - Output format options
- Improve code comments for future extensibility
- reference habitat data source in species.toml
- list alternative output format options in toml
The changes prepare for potential future addition of seasonal
habitat data (and summer/winter habitat data) while maintaining backward compatibility.

dsmedia commented Feb 20, 2025

I'm personally not really up to date with the arrow format. If the source can be parsed into a dataframe or referenced by URL in the Vega editor, then it is perfect.

Happy to revise to CSV if that is deemed best here. I'm curious, though, whether this could be a nice opportunity to create the first standalone dataset in arrow format in the repo. (flights-200k now exists in this format, but alongside a JSON variant.) The arrow format was suggested by @domoritz (along with CSV) if this was intended for use beyond Altair; the compactness of arrow, with its columnar format, seemed to give it the edge. @dangotbanned could perhaps speak more directly to the usability/suitability of the arrow format? I do see that the Altair docs indicate arrow is supported, though experimental:

as a DataFrame that supports the DataFrame Interchange Protocol (contains a dataframe attribute), e.g. polars and pyarrow. This is experimental.
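
In practice that experimental support means a pyarrow.Table can be handed straight to alt.Chart. A sketch with made-up values, assuming a recent Altair:

```python
import altair as alt
import pyarrow as pa

# pyarrow.Table implements the DataFrame Interchange Protocol, so Altair
# accepts it directly as a data source.
table = pa.table({"county_id": [1001, 1003], "habitat_yearround_pct": [0.12, 0.48]})
chart = alt.Chart(table).mark_bar().encode(x="county_id:N", y="habitat_yearround_pct:Q")
```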

I understand that you want to provide wider access to all options of the data source, but don't lose yourself in all available possibilities. What you currently have is really great already.

Once this referenced feature request is in, vega/vega-lite#9389 we can revisit this dataset to make a map like your screenshot to highlight pixel based seasonal differences for a single or few counties.

But if you do have great ideas to approach this, go for it. I just want to say that what you have is awesome already!

This is great perspective. In aa4516f I've kept it as is (with the all-season habitat data only) but reformulated the column names to permit a backward-compatible update in the future. (The data column is now called habitat_yearround_pct, so it would be just a drop-in to add habitat_summer_pct and/or range_winter_pct, for example.)

Separately, @mattijn, exactextract is generating this warning in the terminal.

(RuntimeWarning: Spatial reference system of input features does not exactly match raster)

If harmless, could you help explain/document why this happens (and what it means practically), or can we revise the code to address the conditions causing the warning?
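
One possible revision (a sketch only, assuming the geopandas/rasterio stack the script already uses; the paths here are hypothetical) would be to reproject the county polygons to the raster's exact CRS before extraction:

```python
import geopandas as gpd
import rasterio

# Read the raster's CRS, then reproject the vector layer to match it exactly,
# so exactextract no longer sees mismatched spatial reference systems.
with rasterio.open("habitat.tif") as src:
    raster_crs = src.crs

counties = gpd.read_file("counties.geojson").to_crs(raster_crs)
```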

@dsmedia dsmedia marked this pull request as ready for review February 20, 2025 13:00
@dsmedia dsmedia requested a review from mattijn February 20, 2025 13:03
@dsmedia dsmedia self-assigned this Feb 20, 2025

dangotbanned commented Feb 21, 2025

Data Format Comparison

File Sizes

* species.csv: 953.5 KB

* species.arrow: 651.6 KB

* species.parquet: 82.7 KB

DataFrame Shapes (rows, columns)

* CSV: (11651, 6)

* Arrow: (11651, 6)

* Parquet: (11651, 6)

Perfect, thanks @dsmedia

In that case, a .csv is still very much within the range (by file size) of some of the larger datasets.
If you were to pick only one format, I would pick that for compatibility.

Providing multiple formats will also be a welcome choice after (vega/altair#3631), since those can be prioritised where a backend supports it 😊

Having all 3 also makes sense, now that (vega/vega#3961) looks achievable
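
For reproducibility, a rough sketch of how a comparison like the above could be generated; the paths are assumptions, and pandas reads the Arrow IPC file via read_feather:

```python
from pathlib import Path
import pandas as pd

readers = {
    "data/species.csv": pd.read_csv,
    "data/species.arrow": pd.read_feather,   # Arrow IPC file format
    "data/species.parquet": pd.read_parquet,
}
for path, read in readers.items():
    df = read(path)
    print(f"{path}: {Path(path).stat().st_size / 1024:.1f} KB, shape={df.shape}")
```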


dsmedia commented Feb 21, 2025

Great. I'll update the PR to switch to CSV. When support for the other formats is enabled, we can consider generating parquet and/or arrow for a wider swath of the datasets.


dsmedia commented Feb 21, 2025

@dangotbanned in the diff, I noticed that the string "git+" seems to be appended below in the datapackage.md and datapackage.json files, in a place unrelated to the species dataset. Would you be able to have a quick look to see what's going on there?

[Screenshot: diff showing "git+" appended in datapackage.md and datapackage.json]

@dangotbanned (Member) commented:

@dangotbanned in the diff, I noticed that the string "git+" seems to be appended below in the datapackage.md and datapackage.json files, in a place unrelated to the species dataset. Would you be able to have a quick look to see what's going on there?

Looks like it's pulling through the new URL since https://github.com/vega/vega-datasets/pull/682/files#diff-7ae45ad102eab3b6d7e7896acd08c427a9b25b346470d7bc6507b6481575d519

@domoritz

@domoritz (Member) commented:

I added it since it seems to be recommended by https://publint.dev/[email protected]

[Screenshot: publint recommendation]


mattijn commented Feb 22, 2025

Nice that you added a section on validation in the OP! Cool to have this dataset to display the suitable habitat fractions by county! I did a quick check of this section. You currently state:

Darker colors indicate a higher percentage of the county is classified as suitable habitat, while lighter colors indicate a lower percentage.

But where you write "Darker" it should be "Lighter", and where you write "lighter" it should be "darker".

In the spec you currently have:

alt.Tooltip('county_id:N', title='County ID'),

which gives null for counties that are not included in filtered_table, the dataset from which you look up the suitable habitat fractions:

[Screenshot: tooltip showing a null County ID]

but it is better to do

alt.Tooltip('id:N', title='County ID'),

Since the id is within the counties dataset that contains all counties:
[Screenshot: tooltip showing the county id]

Apart from that all good!

Comment on lines 616 to 642
# --- Basic Configuration Validation ---
if not isinstance(item_ids, list) or not item_ids:
    logger.error("`item_ids` in the TOML file must be a non-empty list.")
    msg = "`item_ids` must be a non-empty list."
    raise TypeError(msg)

if not isinstance(vector_fp, str | Path):
    logger.error("`vector_fp` must be a string or Path.")
    msg = "`vector_fp` must be a string or Path."
    raise TypeError(msg)
vector_fp = Path(vector_fp)

if not isinstance(output_dir, str | Path):
    logger.error("`output_dir` must be a string or Path.")
    msg = "`output_dir` must be a string or Path."
    raise TypeError(msg)
output_dir = Path(output_dir)

if output_format not in {"csv", "parquet", "arrow"}:
    logger.error("Invalid `output_format`: %s", output_format)
    msg = f"Invalid `output_format`: {output_format}"
    raise ValueError(msg)

if not isinstance(debug, bool):
    logger.error("`debug` must be a boolean.")
    msg = "`debug` must be a boolean."
    raise TypeError(msg)
@dangotbanned (Member) commented:

I've not seen this pattern before of both logging and raising an error.
@dsmedia I think just raising the error would be enough?

@dsmedia (Collaborator Author) replied:

simplified now - thank you

dsmedia added a commit to vega/altair that referenced this pull request Feb 22, 2025
This commit introduces a new example to the Altair documentation showcasing choropleth maps faceted by category.

The example visualizes the distribution of suitable habitat for different species across US counties, using the proposed new Species Habitat dataset from vega-datasets (vega/vega-datasets#684).

Key features of this example:

- Demonstrates the use of `alt.Chart.mark_geoshape()` for geographical visualizations.
- Shows how to create faceted maps for comparing categorical data across geographic regions.
- Utilizes the `transform_lookup` and `transform_calculate` transforms for data manipulation within Altair.
- Uses a CSV data file temporarily hosted in the vega-datasets repository branch (pending dataset merge).

This example addresses issue #1711, which requested a faceted map example for the Altair documentation.

Co-authored-by: Mattijn van Hoek <[email protected]>
@dangotbanned (Member) commented:

@dsmedia sorry I haven't had a chance to get to this properly yet

If everyone else is happy, don't let me be the blocker 👍
Left some quick comments and fixed the docs a bit in (9a5d0c2)

dsmedia and others added 2 commits February 22, 2025 16:57
Co-authored-by: Dan Redding <[email protected]>
- Remove redundant error logging before raising exceptions
Modify species dataset processing to explicitly set Continental US counties
to 0% habitat when unassigned, while keeping Alaska and Hawaii counties as
null values. This simplifies Altair visualization by removing the need for
transform_filter and allows for better projection choices.
@dsmedia dsmedia marked this pull request as draft February 26, 2025 11:14
dsmedia and others added 3 commits February 27, 2025 12:30
This commit enhances the GAP habitat analysis script to focus specifically on coterminous (contiguous) US counties by implementing a geographic filtering system based on FIPS codes.

## Changes:
- Added geographic filtering configuration in TOML to exclude Alaska, Hawaii, Puerto Rico, and Virgin Islands
- Refactored county data loading into smaller, specialized methods for better maintainability
- Modified habitat analysis to explicitly handle coterminous US counties, filling missing values with zeros
- Updated documentation to clarify the geographic scope of the analysis
- Improved error handling and logging, especially for county boundary processing
- Changed default output format from arrow to csv

## Technical Details:
- The `HabitatDataProcessor` class now accepts a full configuration object
- Added `_filter_to_coterminous_us` method to process excluded areas based on FIPS codes
- Enhanced `save_results` to maintain complete county coverage with consistent data formatting
- Configuration now includes a structured `[processing.geographic_filter]` section

This change ensures consistent habitat analysis across the coterminous United States by standardizing which areas are included in the analysis.
Use `uv run --group="geo-species" scripts/species.py`
pyproject.toml Outdated
Comment on lines 15 to 25
[dependency-groups]
dev = ["ipython[kernel]>=8.30.0", "ruff>=0.8.2", "taplo>=0.9.3"]
geo-species = [
    "exactextract",
    "geopandas",
    "pandas[pyarrow]",
    "rasterio",
    "requests",
    "sciencebasepy",
    "setuptools",
]
@dangotbanned (Member) commented:

Tried running the script and was a bit alarmed by the install size.
Definitely a case where I wouldn't want to install that on each run - so following (1415aae) they're now locked.

Use the command below to run the script:

uv run --group="geo-species" scripts/species.py

Comment on lines +109 to +133
for item_id in item_ids:
    try:
        item_json = self.sb.get_item(item_id)
        files_info = self.sb.get_item_file_info(item_json)

        for file_info in files_info:
            if "HabMap" in file_info["name"] and file_info["name"].endswith(".zip"):
                zip_path = temp_dir / file_info["name"]
                self.sb.download_file(file_info["url"], str(zip_path))
                downloaded_zips.append(zip_path)

    except Exception as e:  # Catch generic Exception
        if isinstance(e, requests.exceptions.RequestException):
            logger.error("Error downloading files for item ID %s: %s", item_id, e)
        else:
            logger.error("An unexpected error occurred for item ID %s: %s", item_id, e)
        continue  # Go to the next item_id

return sorted(downloaded_zips)  # ALWAYS return the list
@dangotbanned (Member) commented Feb 27, 2025:

Been stuck here for a while with no progress.

System usage seems way too high to be downloading

[Screenshot: system resource usage]

Logs

PS C:\Users\danie\Documents\GitHub\vega-datasets> uv run --group="geo-species" scripts/species.py
Installed 1 package in 24ms
2025-02-27 16:52:39 [INFO] Initializing GAP habitat analysis pipeline
2025-02-27 16:52:39 [INFO] Loading county boundaries from local file: C:\Users\danie\Documents\GitHub\vega-datasets\data\us-10m.json
2025-02-27 16:52:40 [INFO] Successfully loaded county data with these columns: ['id', 'geometry']
2025-02-27 16:52:40 [INFO] Filtering out 4 areas by FIPS code to focus on conterminous US
2025-02-27 16:52:40 [INFO] Excluding 27 counties with FIPS prefix 02 (Alaska)
2025-02-27 16:52:40 [INFO] Excluded county FIPS codes: 02013, 02016, 02020, 02050, 02060, 02068, 02070, 02090, 02100, 02110, 02122, 02130, 02150, 02164, 02170, 02180, 02185, 02188, 02201, 02220, 02232, 02240, 02261, 02270, 02280, 02282, 02290
2025-02-27 16:52:40 [INFO] Excluding 4 counties with FIPS prefix 15 (Hawaii)
2025-02-27 16:52:40 [INFO] Excluded county FIPS codes: 15001, 15003, 15007, 15009
2025-02-27 16:52:40 [INFO] Excluding 74 counties with FIPS prefix 72 (Puerto Rico)
2025-02-27 16:52:40 [INFO] Excluded county FIPS codes: 72001, 72003, 72005, 72007, 72009, 72011, 72013, 72017, 72019, 72021, 72023, 72025, 72027, 72029, 72031, 72035, 72037, 72039, 72041, 72043, 72045, 72047, 72049, 72051, 72053, 72054, 72055, 72057, 72059, 72061, 72063, 72065, 72069, 72071, 72073, 72075, 72077, 72079, 72081, 72083, 72085, 72087, 72089, 72091, 72093, 72095, 72097, 72099, 72101, 72103, 72105, 72107, 72109, 72111, 72113, 72115, 72119, 72121, 72123, 72125, 72127, 72129, 72131, 72133, 72135, 72137, 72139, 72141, 72143, 72145, 72147, 72149, 72151, 72153
2025-02-27 16:52:40 [INFO] Excluding 3 counties with FIPS prefix 78 (Virgin Islands)
2025-02-27 16:52:40 [INFO] Excluded county FIPS codes: 78010, 78020, 78030
2025-02-27 16:52:40 [INFO] Analyzing 3105 counties in conterminous US
2025-02-27 16:52:40 [WARNING] Removed 4 counties with invalid boundaries
2025-02-27 16:52:40 [INFO] Step 1/3: Downloading and extracting habitat data files
2025-02-27 16:52:40 [INFO] Retrieving species information from ScienceBase
2025-02-27 16:52:41 [INFO] Downloading habitat data files
downloading https://www.sciencebase.gov/catalog/file/get/58fa449fe4b0b7ea54524c5e?f=__disk__dc%2F2f%2Fa7%2Fdc2fa725a0d36b4f9ceffed282019cf27b9a37f3 to C:\Users\danie\AppData\Local\Temp\tmpnfbi0fvb\bAMROx_CONUS_HabMap_2001v1.zip

Feeling like this might point towards the issue https://docs.astral.sh/ruff/rules/blind-except/

@dangotbanned (Member) commented:

Traceback on CTRL+C

Traceback (most recent call last):
  File "C:\Users\danie\Documents\GitHub\vega-datasets\scripts\species.py", line 791, in <module>
    main()
  File "C:\Users\danie\Documents\GitHub\vega-datasets\scripts\species.py", line 787, in main
    processor.run()
  File "C:\Users\danie\Documents\GitHub\vega-datasets\scripts\species.py", line 702, in run
    tif_files, species_info = self.process_habitat_data(temp_dir)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\danie\Documents\GitHub\vega-datasets\scripts\species.py", line 470, in process_habitat_data
    zip_files = self.sciencebase_client.download_zip_files(self.item_ids, temp_dir)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\danie\Documents\GitHub\vega-datasets\scripts\species.py", line 119, in download_zip_files
    self.sb.download_file(file_info["url"], str(zip_path))
  File "C:\Users\danie\Documents\GitHub\vega-datasets\.venv\Lib\site-packages\sciencebasepy\SbSession.py", line 919, in download_file
    for chunk in response.iter_content(chunk_size=1024):
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\danie\Documents\GitHub\vega-datasets\.venv\Lib\site-packages\requests\models.py", line 820, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "C:\Users\danie\Documents\GitHub\vega-datasets\.venv\Lib\site-packages\urllib3\response.py", line 967, in stream
    data = self.read(amt=amt, decode_content=decode_content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\danie\Documents\GitHub\vega-datasets\.venv\Lib\site-packages\urllib3\response.py", line 879, in read
    data = self._raw_read(amt)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\danie\Documents\GitHub\vega-datasets\.venv\Lib\site-packages\urllib3\response.py", line 792, in _raw_read
    with self._error_catcher():
         ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\danie\AppData\Roaming\uv\python\cpython-3.12.8-windows-x86_64-none\Lib\contextlib.py", line 141, in __exit__
    def __exit__(self, typ, value, traceback):

KeyboardInterrupt

@dangotbanned (Member) commented:

@dsmedia I've added the rule, but not fixed it (https://github.com/vega/vega-datasets/actions/runs/13571913413/job/37938835261?pr=684)

Was hoping to review in more detail - but I can't get beyond this step sadly 😞

@mattijn commented:

For me it works fine, downloading data in ~2 minutes and analysis in ~11 minutes:

uv run --group="geo-species" scripts/species.py
Using CPython 3.12.9 interpreter at: /Users/mattijnvanhoek/miniconda3/envs/stable/bin/python
Creating virtual environment at: .venv
   Built stringcase==1.2.0
   Built progress==1.6
   Built petl==1.7.15
Installed 104 packages in 352ms
2025-02-27 21:32:06 [INFO] Initializing GAP habitat analysis pipeline
2025-02-27 21:32:06 [INFO] Loading county boundaries from local file: /Users/mattijnvanhoek/vega/vega-datasets/data/us-10m.json
2025-02-27 21:32:17 [INFO] Successfully loaded county data with these columns: ['id', 'geometry']
2025-02-27 21:32:17 [INFO] Filtering out 4 areas by FIPS code to focus on conterminous US
2025-02-27 21:32:17 [INFO] Excluding 27 counties with FIPS prefix 02 (Alaska)
2025-02-27 21:32:17 [INFO] Excluded county FIPS codes: 02013, 02016, 02020, 02050, 02060, 02068, 02070, 02090, 02100, 02110, 02122, 02130, 02150, 02164, 02170, 02180, 02185, 02188, 02201, 02220, 02232, 02240, 02261, 02270, 02280, 02282, 02290
2025-02-27 21:32:17 [INFO] Excluding 4 counties with FIPS prefix 15 (Hawaii)
2025-02-27 21:32:17 [INFO] Excluded county FIPS codes: 15001, 15003, 15007, 15009
2025-02-27 21:32:17 [INFO] Excluding 74 counties with FIPS prefix 72 (Puerto Rico)
2025-02-27 21:32:17 [INFO] Excluded county FIPS codes: 72001, 72003, 72005, 72007, 72009, 72011, 72013, 72017, 72019, 72021, 72023, 72025, 72027, 72029, 72031, 72035, 72037, 72039, 72041, 72043, 72045, 72047, 72049, 72051, 72053, 72054, 72055, 72057, 72059, 72061, 72063, 72065, 72069, 72071, 72073, 72075, 72077, 72079, 72081, 72083, 72085, 72087, 72089, 72091, 72093, 72095, 72097, 72099, 72101, 72103, 72105, 72107, 72109, 72111, 72113, 72115, 72119, 72121, 72123, 72125, 72127, 72129, 72131, 72133, 72135, 72137, 72139, 72141, 72143, 72145, 72147, 72149, 72151, 72153
2025-02-27 21:32:17 [INFO] Excluding 3 counties with FIPS prefix 78 (Virgin Islands)
2025-02-27 21:32:17 [INFO] Excluded county FIPS codes: 78010, 78020, 78030
2025-02-27 21:32:17 [INFO] Analyzing 3105 counties in conterminous US
2025-02-27 21:32:40 [WARNING] Removed 4 counties with invalid boundaries
2025-02-27 21:32:40 [INFO] Step 1/3: Downloading and extracting habitat data files
2025-02-27 21:32:40 [INFO] Retrieving species information from ScienceBase
2025-02-27 21:32:42 [INFO] Downloading habitat data files
downloading https://www.sciencebase.gov/catalog/file/get/58fa449fe4b0b7ea54524c5e?f=__disk__dc%2F2f%2Fa7%2Fdc2fa725a0d36b4f9ceffed282019cf27b9a37f3 to /var/folders/b3/pctp5dwn6t7fm1t4sc4tknb00000gp/T/tmpdfo6ou2c/bAMROx_CONUS_HabMap_2001v1.zip
downloading https://www.sciencebase.gov/catalog/file/get/58fa817ae4b0b7ea54525c2f?f=__disk__57%2F6f%2F3b%2F576f3bb4fe3167204a20961184da3f2aa9440b52 to /var/folders/b3/pctp5dwn6t7fm1t4sc4tknb00000gp/T/tmpdfo6ou2c/mWTDEx_CONUS_HabMap_2001v1.zip
downloading https://www.sciencebase.gov/catalog/file/get/58fa3f0be4b0b7ea54524859?f=__disk__81%2F26%2F6f%2F81266f6d642fe5a03255f19055afd9d4d169a06d to /var/folders/b3/pctp5dwn6t7fm1t4sc4tknb00000gp/T/tmpdfo6ou2c/aAMBUx_CONUS_HabMap_2001v1.zip
downloading https://www.sciencebase.gov/catalog/file/get/58fe0f4fe4b0074928294636?f=__disk__8d%2F6b%2F91%2F8d6b91c206bafddac994009aa7a2ce5ef2d44066 to /var/folders/b3/pctp5dwn6t7fm1t4sc4tknb00000gp/T/tmpdfo6ou2c/rCOGAx_CONUS_HabMap_2001v1.zip
2025-02-27 21:34:33 [INFO] Extracting raster files
2025-02-27 21:34:34 [INFO] Step 2/3: Beginning habitat analysis for 4 species
/Users/mattijnvanhoek/vega/vega-datasets/.venv/lib/python3.12/site-packages/exactextract/exact_extract.py:330: RuntimeWarning: Spatial reference system of input features does not exactly match raster.
  warnings.warn(
Processing 45019: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100.0/100 [10:56<00:00,  6.57s/it]
2025-02-27 21:45:37 [INFO] Step 3/3: Saving analysis results
2025-02-27 21:45:37 [INFO] Analysis complete. Results saved to: /Users/mattijnvanhoek/vega/vega-datasets/data

And with vega/altair@f1a1b5a I was able to reduce the chart specifications by around 10 lines for the corresponding charts.

@dangotbanned (Member) commented:

For me it works fine, downloading data in ~2 minutes and analysis in ~11 minutes:

Hmm 🤔

@mattijn maybe windows is the issue?

Yours

downloading https://www.sciencebase.gov/catalog/file/get/58fa449fe4b0b7ea54524c5e?f=__disk__dc%2F2f%2Fa7%2Fdc2fa725a0d36b4f9ceffed282019cf27b9a37f3 to /var/folders/b3/pctp5dwn6t7fm1t4sc4tknb00000gp/T/tmpdfo6ou2c/bAMROx_CONUS_HabMap_2001v1.zip

Mine

downloading https://www.sciencebase.gov/catalog/file/get/58fa449fe4b0b7ea54524c5e?f=__disk__dc%2F2f%2Fa7%2Fdc2fa725a0d36b4f9ceffed282019cf27b9a37f3 to C:\Users\danie\AppData\Local\Temp\tmp2jvviidr\bAMROx_CONUS_HabMap_2001v1.zip

@dangotbanned (Member) commented:

It was working for me back in (9a778c6)

#684 (comment)

@dangotbanned (Member) commented Feb 27, 2025:

Changed some things around and it seems to have created the file, but not filled it:

[Screenshot: the created but empty ZIP file]

self.sb.download_file(file_info["url"], str(zip_path))

https://github.com/DOI-USGS/sciencebasepy/blob/329192d6a5d5bb9039dc35441dede4ff084b03a9/sciencebasepy/SbSession.py#L912

@dsmedia (Collaborator Author) commented:

It was working for me back in (9a778c6)

#684 (comment)

The same thing happened to me earlier today, and when checking I could see that the download speed had been throttled significantly. If we can add some kind of counter, it should at least show that progress is being made (a rough sketch follows).
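
Something like this untested sketch could wrap the download with a byte counter instead of SbSession.download_file; the names here are illustrative:

```python
import requests

def download_with_progress(url: str, dest: str, chunk_size: int = 1 << 20) -> None:
    """Stream a file to disk, printing a running byte count."""
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        received = 0
        with open(dest, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                received += len(chunk)
                print(f"\r{dest}: {received / 1e6:.1f} MB", end="")
    print()
```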


dsmedia commented Mar 2, 2025

Issue with exactextract installation

I encountered an installation error when trying to run the script with uv run --group="geo-species" scripts/species.py. The error occurs because:

  1. The current release of exactextract (0.2.0) has its URLs defined in the wrong section of its pyproject.toml file
  2. When installing with newer build tools (like scikit-build-core 0.11.0, released just 2 days ago), this causes a validation error:
    scikit_build_core._vendor.pyproject_metadata.errors.ConfigurationError: Extra keys present in "project": 'url'
    

This has been fixed in a PR that was already merged (isciences/exactextract#155), but the fix is only available in the development version (one possible workaround is sketched below).
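
One untested possibility, pending a fixed release, would be pointing the geo-species dependency group at the development version:

```toml
geo-species = [
    "exactextract @ git+https://github.com/isciences/exactextract",
    # ...remaining geo-species dependencies unchanged
]
```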

Labels: dataset-content (pull requests that add, modify, or update dataset contents)