docs: Restructure docs to target users #4875
Greptile Summary
This PR implements a comprehensive documentation restructuring to shift focus from targeting Daft developers to targeting end users. The changes reorganize the entire documentation hierarchy with several key transformations:
Structural Reorganization:
- Consolidates 'I/O' and 'Catalogs' sections into a unified 'Connectors' section, making it easier for users to find information about data sources
- Introduces a new 'Modalities' section for working with different data types (images, text, JSON, URLs)
- Creates a 'Running Custom Python Code' section for UDFs, GPU usage, and external APIs
- Renames 'Advanced' topics to 'Optimization and Debugging', making the content more approachable
- Moves API reference documentation to a dedicated 'api' directory
User-Focused Improvements:
- Updates titles to be action-oriented (e.g., 'Apache Iceberg' → 'Reading from and Writing to Apache Iceberg')
- Adds the `navigation.expand` feature in MkDocs to automatically expand navigation sections
- Implements comprehensive redirect mappings to maintain backward compatibility
- Creates consolidated overview pages that combine related functionality
Content Integration:
- Merges catalog functionality into the connectors documentation under a 'Daft Catalogs' section
- Creates a new `docs/connectors/index.md` that provides function tables for all major data sources alongside catalog examples
- Establishes placeholder files for new sections like `docs/modalities/index.md` and `docs/optimization/index.md`
The restructuring follows a logical user journey: quickstart → connectors → data modalities → custom code → scaling → optimization. This aligns with how users typically interact with Daft, starting from basic data connections and progressing to advanced optimization techniques.
Confidence score: 2/5
- This PR contains several concerning issues that could break the user experience and leave important functionality undocumented
- Multiple critical documentation files have been completely emptied (GPU, UDFs, Images, Text, JSON, URLs modalities) without replacement content, which will result in broken navigation links and missing guidance for users
- The S3 Tables connector documentation has code examples with missing imports and inconsistent variable names that could confuse users
- Files requiring immediate attention: `docs/custom-code/gpu.md`, `docs/custom-code/udfs.md`, `docs/modalities/images.md`, `docs/modalities/text.md`, `docs/modalities/json.md`, `docs/modalities/urls.md`, `docs/connectors/s3tables.md`
32 files reviewed, 4 comments
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #4875      +/-   ##
==========================================
+ Coverage   78.95%   79.20%   +0.25%
==========================================
  Files         893      893
  Lines      124879   124421     -458
==========================================
- Hits        98599    98553      -46
+ Misses      26280    25868     -412
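
The coverage figures in the report can be reproduced from the hit and line counts. The sketch below is illustrative only: it assumes percentages are truncated (not rounded) to two decimals, which is a guess that happens to match the reported 78.95% and 79.20%. The helper name `coverage_pct` is hypothetical.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated to two decimals (assumed behavior)."""
    return math.floor(hits / lines * 10_000) / 100


# Counts copied from the report: base is main, head is #4875.
base = {"hits": 98599, "misses": 26280, "lines": 124879}
head = {"hits": 98553, "misses": 25868, "lines": 124421}

# Sanity check: every tracked line is either a hit or a miss.
for side in (base, head):
    assert side["hits"] + side["misses"] == side["lines"]

base_pct = coverage_pct(base["hits"], base["lines"])
head_pct = coverage_pct(head["hits"], head["lines"])
delta = round(head_pct - base_pct, 2)

print(f"{base_pct:.2f}% -> {head_pct:.2f}% ({delta:+.2f}%)")
```

Coverage rises even though hits drop by 46, because the line count drops by 458 and most of the removed lines were misses.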
Greptile Summary
This PR implements a comprehensive documentation restructuring to transition from developer-focused to user-focused content. The changes reorganize the entire documentation hierarchy, moving content from nested directories (like `resources/`, `migration/`) to more accessible top-level locations. Key structural changes include:
- New navigation structure: The main navigation now prioritizes user workflows with sections like "Data Connectors", "Running Custom Python Code", and "Modalities" (for handling different data types like text, images, JSON)
- Content consolidation: Benchmark visualizations moved from `docs/resources/benchmarks/` to `docs/benchmarks/`, telemetry docs moved to root level, and Spark Connect API documentation relocated to `docs/api/`
- User-focused content: New comprehensive guides for UDFs, image processing, JSON handling, and URL/file operations with practical examples and real-world use cases
- Marketing integration: Added performance claims and links to blog posts highlighting Daft's competitive advantages
- Placeholder structure: Created stub files with "User guide coming soon!" messages for sections still in development (custom connectors, GPU documentation, text processing)
The restructuring maintains backward compatibility through redirect mappings in `mkdocs.yml` and includes quality improvements like fixing broken code examples, correcting grammar issues, and adding missing import statements. The new structure emphasizes Daft's multimodal data processing capabilities and provides clear pathways for users to understand and implement various data workflows.
Confidence score: 4/5
- This PR is generally safe to merge with mostly structural reorganization and content improvements, though some documentation inconsistencies need attention
- The score reflects minor issues with placeholder content linking to incomplete guides, some inconsistent code examples, and undefined functions in documentation samples
- Files needing attention: `docs/custom-code/udfs.md` (undefined functions and incorrect column references), `docs/modalities/json.md` (incomplete sections and typos), `docs/modalities/images.md` (inconsistent syntax examples)
25 files reviewed, 10 comments
The content is not completely done, but the structure is there and already much better. So just going to blast ahead and merge it.
## Summary

- Fixed broken UDF documentation link in `docs/api/udf.md`
- Changed from `../core_concepts.md#user-defined-functions-udf` to `../custom-code/udfs.md`

## Context

The `core_concepts.md` → `index.md` redirect added in #4875 causes anchor links to be lost. Found 17 other anchor links that need similar fixes:

- `#expressions`
- `#datatypes`
- `#dataframe`
- `#aggregations-and-grouping`
- `#schemas-and-types`
- `#reading-data`
- `#writing-data`
- `#sql`
- `#window-functions`
- `#multimodal-data`
- And others

This PR addresses only the verified UDF link fix. Other fixes to follow.
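
Anchor links like the ones listed above can be located mechanically. What follows is a minimal sketch, not the tooling actually used for this fix: the function names `find_anchor_links` and `scan_docs` are hypothetical, and the regular expression only handles plain inline markdown links of the form `[text](path.md#anchor)`.

```python
import re
from pathlib import Path

# Matches inline markdown links that carry an anchor,
# e.g. [UDFs](../core_concepts.md#user-defined-functions-udf).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+)#([^)\s]+)\)")


def find_anchor_links(markdown: str, redirected_page: str) -> list[tuple[int, str, str]]:
    """Return (line_number, target, anchor) for each anchored link to `redirected_page`."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for target, anchor in LINK_RE.findall(line):
            # Compare by file name so relative prefixes like ../ do not matter.
            if Path(target).name == redirected_page:
                hits.append((lineno, target, anchor))
    return hits


def scan_docs(docs_root: str, redirected_page: str = "core_concepts.md"):
    """Yield (path, line_number, target, anchor) for every match under `docs_root`."""
    for path in sorted(Path(docs_root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for lineno, target, anchor in find_anchor_links(text, redirected_page):
            yield path, lineno, target, anchor


sample = (
    "See [UDFs](../core_concepts.md#user-defined-functions-udf).\n"
    "See [the new guide](../custom-code/udfs.md).\n"
    "Also [SQL](core_concepts.md#sql).\n"
)
found = find_anchor_links(sample, "core_concepts.md")
for lineno, target, anchor in found:
    print(f"line {lineno}: {target}#{anchor}")
```

Running `scan_docs("docs")` from the repository root would list each file and line that still points at an anchor on the redirected page, assuming the docs live under `docs/`.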
## Summary

- Remove obsolete `docs/core_concepts.md` file
- Remove obsolete `docs/migration/dask_migration.md` file
- Remove broken anchor link references to core_concepts.md throughout API documentation
- Update window functions reference to point to working tutorial link

## Context

Fixes broken links caused by #4875. The docs restructuring in #4875 made core_concepts.md and dask_migration.md obsolete, but left many anchor link references that no longer work. This PR removes the source files and cleans up all the broken references.

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Changes Made
Our current docs target Daft developers instead of users. Let's flip the script.
Here's the build: https://docs.daft.ai/en/desmond-restructure-docs/