4 changes: 4 additions & 0 deletions site/Makefile
@@ -21,6 +21,10 @@ help: # Show help for each of the Makefile recipes.
serve: # Clean, build, and run the docs site locally.
dev/serve.sh

.PHONY: serve-dev
serve-dev:
dev/serve-dev.sh

.PHONY: build
build: # Clean and build the docs site locally.
dev/build.sh
20 changes: 20 additions & 0 deletions site/README.md
@@ -75,6 +75,7 @@ The docs are built, run, and released using [make](https://www.gnu.org/software/
> [deploy](dev/deploy.sh): Clean, build, and deploy the Iceberg docs site.
> help: Show help for each of the Makefile recipes.
> [serve](dev/serve.sh): Clean, build, and run the site locally.
> [serve-dev](dev/serve-dev.sh): Fast iterative development mode - only builds nightly and latest.
> [lint](dev/lint.sh): Scan markdown files for style issues.
> [lint-fix](dev/lint.sh): Run linting with auto-fix on the markdown files.

@@ -129,6 +130,25 @@ To clear all build files, run `clean`.
make clean
```

#### Fast iterative development mode

When working on the documentation, building all historical versions significantly slows down the build process. For faster iteration during development, use the `serve-dev` recipe:

```sh
make serve-dev
```

This development mode:
- **Only builds `nightly` and `latest` versions** - Skips all historical versions
- **Significantly reduces build time** - Typically 5-10x faster than building all versions
- **Uses the `--dirty` flag** - Only rebuilds changed files for even faster iteration
- **Perfect for iterative development** - Great for working on documentation content

The development mode sets the `ICEBERG_DEV_MODE=true` environment variable and uses a simplified mkdocs configuration (`mkdocs-dev.yml`) that only includes the most recent versions.

> [!NOTE]
> Development mode is only for local iteration. Always use `make serve` or `make build` before creating a pull request to ensure all versioned docs build correctly.
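
For reference, `dev/common.sh` reads the flag with a default of `false` and compares it to the literal string `true`. A minimal sketch of that check in isolation; the `is_dev_mode` helper and the `mode_*` variables are hypothetical names used only for illustration:

```shell
#!/usr/bin/env bash
# Sketch of the dev-mode toggle. is_dev_mode is a hypothetical helper,
# not a function defined in the PR.
is_dev_mode () {
  # Same test as in pull_versioned_docs: unset or empty falls back to "false".
  [ "${ICEBERG_DEV_MODE:-false}" = "true" ]
}

unset ICEBERG_DEV_MODE
if is_dev_mode; then mode_when_unset="dev"; else mode_when_unset="full"; fi

ICEBERG_DEV_MODE=true
if is_dev_mode; then mode_when_true="dev"; else mode_when_true="full"; fi

ICEBERG_DEV_MODE=1
if is_dev_mode; then mode_when_one="dev"; else mode_when_one="full"; fi

echo "unset: ${mode_when_unset}, true: ${mode_when_true}, 1: ${mode_when_one}"
```

Because the comparison is against the exact string `true`, values such as `1` or `TRUE` leave the full build in place.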

#### Testing local changes on versioned docs

When you build the docs as described above, by default the versioned docs are mounted from the upstream remote repository called `iceberg_docs`. One exception is the `nightly` version that is a soft link to the local `docs/` folder.
49 changes: 31 additions & 18 deletions site/dev/common.sh
@@ -92,14 +92,11 @@ create_nightly () {
cd -
}

# Finds and retrieves the latest version of the documentation based on the directory structure.
# Assumes the documentation versions are numeric folders within 'docs/docs/'.
# Finds and retrieves the latest version of the documentation from mkdocs.yml.
# Reads the icebergVersion from the extra section of mkdocs.yml.
get_latest_version () {
# Find the latest numeric folder within 'docs/docs/' structure
local latest=$(ls -d docs/docs/[0-9]* | sort -V | tail -1)

# Extract the version number from the latest directory path
local latest_version=$(basename "${latest}")
# Extract the icebergVersion from mkdocs.yml in the site directory
local latest_version=$(grep "icebergVersion:" mkdocs.yml | sed -E "s/.*icebergVersion:[[:space:]]*['\"]?([^'\"]+)['\"]?.*/\1/")

# Output the latest version number
echo "${latest_version}"
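
To see what the new `get_latest_version` extracts, the `grep`/`sed` pipeline can be run against a throwaway file. A minimal sketch, assuming a sample YAML fragment; the file contents and the `extract_iceberg_version` helper name are made up for illustration and are not part of the PR:

```shell
#!/usr/bin/env bash
# Sketch: run the icebergVersion extraction against a temporary file.
# extract_iceberg_version is a hypothetical wrapper around the PR's pipeline.
extract_iceberg_version () {
  # $1: path to a mkdocs-style YAML file
  grep "icebergVersion:" "$1" \
    | sed -E "s/.*icebergVersion:[[:space:]]*['\"]?([^'\"]+)['\"]?.*/\1/"
}

# Sample content is an assumption; the real mkdocs.yml has many more keys.
tmp_yaml="$(mktemp)"
cat > "${tmp_yaml}" <<'EOF'
extra:
  icebergVersion: '1.10.0'
EOF

version="$(extract_iceberg_version "${tmp_yaml}")"
echo "${version}"
rm -f "${tmp_yaml}"
```

For an unquoted value the capture group runs to the end of the line, so a trailing comment would be included in the result; keeping the value quoted in `mkdocs.yml` avoids that.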
@@ -168,29 +165,45 @@ update_version () {

# Sets up local worktrees for the documentation and performs operations related to different versions.
pull_versioned_docs () {
# Retrieve the latest version of documentation for processing
local latest_version=$(get_latest_version)

# Output the latest version for debugging purposes
echo "Latest version is: ${latest_version}"

echo " --> pull versioned docs"

# Ensure the remote repository for documentation exists and is up-to-date
create_or_update_docs_remote

# Ensure the remote repository for documentation exists and is up-to-date
create_or_update_docs_remote

# Add local worktrees for documentation and javadoc either from the remote repository
# or from a local branch.
local docs_branch="${ICEBERG_VERSIONED_DOCS_BRANCH:-${REMOTE}/docs}"
local javadoc_branch="${ICEBERG_VERSIONED_JAVADOC_BRANCH:-${REMOTE}/javadoc}"
git worktree add -f docs/docs "${docs_branch}"
git worktree add -f docs/javadoc "${javadoc_branch}"

# Retrieve the latest version of documentation for processing
local latest_version=$(get_latest_version)

# Output the latest version for debugging purposes
echo "Latest version is: ${latest_version}"
# Check if running in dev mode (only build nightly and latest for faster iteration)
if [ "${ICEBERG_DEV_MODE:-false}" = "true" ]; then
echo " --> running in DEV MODE - only building nightly and latest"
echo " --> This significantly reduces build time by skipping historical versions"

# Create docs worktree with sparse checkout for latest version only
git worktree add --no-checkout -f docs/docs "${docs_branch}"
(cd docs/docs && git sparse-checkout init --cone && git sparse-checkout set "${latest_version}" && git checkout)

# Create javadoc worktree with sparse checkout for latest version only
git worktree add --no-checkout -f docs/javadoc "${javadoc_branch}"
(cd docs/javadoc && git sparse-checkout init --cone && git sparse-checkout set "${latest_version}" && git checkout)
else
# Full checkout of all versions
git worktree add -f docs/docs "${docs_branch}"
git worktree add -f docs/javadoc "${javadoc_branch}"
fi

# Create the 'latest' version of documentation
create_latest "${latest_version}"

# Create the 'nightly' version of documentation
create_nightly
create_nightly
}

check_markdown_files () {
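
The dev-mode branch of `pull_versioned_docs` relies on a worktree created with `--no-checkout`, followed by a cone-mode sparse checkout of a single version directory. A self-contained sketch of that pattern on a throwaway repository, assuming `git` 2.25 or newer is on the `PATH`; the directory layout, the `docs-sketch` branch, and the commit identity are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: sparse worktree that materializes only one version directory.
work="$(mktemp -d)"

# Build a tiny repository with two "version" directories.
git init -q "${work}/repo"
mkdir -p "${work}/repo/1.9.0" "${work}/repo/1.10.0"
echo "old" > "${work}/repo/1.9.0/index.md"
echo "new" > "${work}/repo/1.10.0/index.md"
git -C "${work}/repo" add .
git -C "${work}/repo" -c user.name="Example" -c user.email="example@example.invalid" \
  commit -q -m "Add two versions"
git -C "${work}/repo" branch docs-sketch

# Same sequence as the PR: add the worktree without populating it,
# restrict it to one directory, then check out.
git -C "${work}/repo" worktree add --no-checkout -f "${work}/wt" docs-sketch
(cd "${work}/wt" \
  && git sparse-checkout init --cone \
  && git sparse-checkout set "1.10.0" \
  && git checkout)

ls "${work}/wt"
# Clean up afterwards with: rm -rf "${work}"
```

In cone mode, files at the root of the branch are still checked out alongside the selected directory; only sibling directories are skipped.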
37 changes: 37 additions & 0 deletions site/dev/serve-dev.sh
@@ -0,0 +1,37 @@
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Development mode serve script - only builds nightly and latest for fast iteration

source dev/common.sh

set -e

export ICEBERG_DEV_MODE=true

echo "=========================================="
echo "RUNNING IN DEVELOPMENT MODE"
echo "Only building nightly and latest versions"
echo "=========================================="
echo ""

./dev/setup_env.sh

# Using mkdocs serve with --dirty flag for even faster rebuilds
# The --dirty flag means only changed files are rebuilt
"${VENV_DIR}/bin/python3" -m mkdocs serve --dirty --watch . -f mkdocs-dev.yml
107 changes: 107 additions & 0 deletions site/mkdocs-dev.yml
@@ -0,0 +1,107 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# This is a development navigation configuration that only includes
# nightly and latest documentation for faster build times during
# iterative development. Use this with ICEBERG_DEV_MODE=true.

INHERIT: ./mkdocs.yml

nav:
- Home: index.md
- Quickstart:
- Spark: spark-quickstart.md
- Hive: hive-quickstart.md
- Docs:
- Java:
- Nightly: '!include docs/docs/nightly/mkdocs.yml'
- Latest (1.10.0): '!include docs/docs/latest/mkdocs.yml'
- Other Implementations:

Review comment (Member): I'm wondering whether we can avoid copying the other parts of mkdocs.yml. That would make this file easier to maintain.

Reply (Contributor): Agreed, this is more of a short-term solution. Both nightly and latest are just symlinks to the actual "latest version" (1.10). Long term, I want to explore using mike for versioning.

- Python: https://py.iceberg.apache.org/
- Rust: https://rust.iceberg.apache.org/
- Go: https://go.iceberg.apache.org/
- C++: https://github.com/apache/iceberg-cpp/
- Third-party:
- Catalogs:
- Apache Gravitino: https://gravitino.apache.org/
- Apache Polaris: https://polaris.apache.org/
- Boring Catalog: https://github.com/boringdata/boring-catalog
- DataHub: https://docs.datahub.com/docs/iceberg-catalog
- Google BigLake metastore: https://cloud.google.com/bigquery/docs/blms-manage-resources
- Lakekeeper: https://github.com/lakekeeper/lakekeeper
- Integrations:
- Amazon Athena: https://docs.aws.amazon.com/athena/latest/ug/querying-iceberg.html
- Amazon Data Firehose: https://docs.aws.amazon.com/firehose/latest/dev/apache-iceberg-destination.html
- Amazon EMR: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-iceberg-use-cluster.html
- Amazon Redshift: https://docs.aws.amazon.com/redshift/latest/dg/querying-iceberg.html
- Apache Amoro: integrations/amoro.md
- Apache Doris: https://doris.apache.org/docs/dev/lakehouse/catalogs/iceberg-catalog
- Apache Druid: https://druid.apache.org/docs/latest/development/extensions-contrib/iceberg/
- BladePipe: https://www.bladepipe.com/docs/dataMigrationAndSync/datasource_func/Iceberg/props_for_iceberg_ds
- ClickHouse: https://clickhouse.com/docs/en/engines/table-engines/integrations/iceberg
- Daft: integrations/daft.md
- Databend: https://docs.databend.com/guides/access-data-lake/iceberg
- Dremio: https://docs.dremio.com/data-formats/apache-iceberg/
- DuckDB: https://duckdb.org/docs/preview/core_extensions/iceberg/overview
- Estuary: https://docs.estuary.dev/reference/Connectors/materialization-connectors/apache-iceberg/
- Firebolt: https://docs.firebolt.io/reference-sql/functions-reference/table-valued/read_iceberg
- Google BigQuery: https://cloud.google.com/bigquery/docs/iceberg-tables
- Impala: https://impala.apache.org/docs/build/html/topics/impala_iceberg.html
- Memiiso Debezium: https://memiiso.github.io/debezium-server-iceberg/
- Nimtable: https://github.com/nimtable/nimtable
- OLake: https://olake.io/docs
- Presto: https://prestodb.io/docs/current/connector/iceberg.html
- Redpanda: https://docs.redpanda.com/current/manage/iceberg/about-iceberg-topics
- RisingWave: integrations/risingwave.md
- Snowflake: https://docs.snowflake.com/en/user-guide/tables-iceberg
- Starburst: https://docs.starburst.io/latest/connector/iceberg.html
- Starrocks: https://docs.starrocks.io/en-us/latest/data_source/catalog/iceberg_catalog
- Tinybird: https://www.tinybird.co/docs/forward/get-data-in/table-functions/iceberg
- Trino: https://trino.io/docs/current/connector/iceberg.html
- Releases: releases.md
- Project:
- Contributing: contribute.md
- Multi-engine support: multi-engine-support.md
- How to release: how-to-release.md
- ASF:
- Sponsorship: https://www.apache.org/foundation/sponsorship.html
- Events: https://www.apache.org/events/current-event.html
- Privacy: https://privacy.apache.org/policies/privacy-policy-public.html
- License: https://www.apache.org/licenses/
- Security: https://www.apache.org/security/
- Sponsors: https://www.apache.org/foundation/sponsors.html
- Community:
- Community: community.md
- Talks: talks.md
- Vendors: vendors.md
- Specification:
- Terms: terms.md
- REST Catalog Spec: rest-catalog-spec.md
- Table Spec: spec.md
- View spec: view-spec.md
- Puffin spec: puffin-spec.md
- AES GCM Stream spec: gcm-stream-spec.md
- Implementation status: status.md

exclude_docs: |
!.asf.yaml
docs/
javadoc/
!docs/nightly/
!javadoc/nightly/
!docs/latest/
!javadoc/latest/