Conversation

@pauloxnet
Member

@pauloxnet pauloxnet commented Sep 11, 2025

Overview

This pull request addresses issue #1650, which proposed migrating the search vector field for Document models to a Generated Field. The goal is to leverage the database's capabilities to automatically update search-related data, eliminating the need for manual index updates.

Key Changes

  • Replaces the manually updated SearchVectorField with a database-generated vector field (using Django's GeneratedField).
  • The search vector is now always in sync with the Document's data, as the database computes the value automatically whenever the document changes.
  • Removes the need for custom management commands or signals to update the search index after document changes.
  • Refactors any logic that previously depended on manual updates to now use the generated field directly.
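
The idea behind a database-generated column can be sketched without Django or PostgreSQL. What follows is a minimal illustration and not the code in this pull request: it assumes Python's built-in sqlite3 module with SQLite 3.31 or newer (needed for generated columns), a hypothetical document table, and a lowercased concatenation of title and body as a stand-in for a real full-text search vector. The helper names make_db and search are made up for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory table whose `vector` column is computed by the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE document (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            body TEXT NOT NULL,
            -- Stand-in for a search vector: the database recomputes this
            -- value itself whenever title or body changes.
            vector TEXT GENERATED ALWAYS AS (lower(title || ' ' || body)) STORED
        )
        """
    )
    return conn


def search(conn, term):
    """Return titles of documents whose generated column contains the term."""
    rows = conn.execute(
        "SELECT title FROM document WHERE vector LIKE ? ORDER BY id",
        (f"%{term.lower()}%",),
    )
    return [title for (title,) in rows]


conn = make_db()
conn.execute(
    "INSERT INTO document (title, body) VALUES (?, ?)",
    ("Models", "Define fields"),
)
print(search(conn, "fields"))     # ['Models']

# No separate indexing step: the update alone refreshes the generated column.
conn.execute("UPDATE document SET body = ? WHERE id = 1", ("Generated columns",))
print(search(conn, "generated"))  # ['Models']
print(search(conn, "fields"))     # []
```

The application code never writes to the vector column; it only reads it, which is the property the bullets above rely on.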

Why This Matters

Previously, the search index was updated via a management command after document changes. This process introduced a delay between the update of a document and the update of the search index. During this window, users searching for recently updated documentation often encountered blank results, leading to confusion and frequent new issues being opened by contributors and users who could not find expected documentation.

With the Generated Field:

  • Index updates are instant: The search vector is recalculated as soon as the Document changes in the database.
  • No more blank results: Users will get up-to-date search results without waiting for a scheduled command or manual action.
  • Reduces maintenance overhead: Volunteers no longer need to monitor or troubleshoot delayed index updates.
  • Simplifies codebase: Removes the complexity associated with keeping the search index in sync.
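
For contrast, the window of blank results in the previous workflow can be reproduced in a small sketch. This is a hypothetical illustration using Python's built-in sqlite3 module, not the project's actual code: the search column is an ordinary column, and update_index is a made-up stand-in for a separate indexing step that runs after documents change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `search` is a plain column here, so nothing fills it in automatically.
conn.execute(
    "CREATE TABLE document (id INTEGER PRIMARY KEY, body TEXT NOT NULL, search TEXT)"
)


def update_index(conn):
    """Stand-in for a separate indexing step run after documents change."""
    conn.execute("UPDATE document SET search = lower(body)")


def find(conn, term):
    """Return ids of documents whose manually maintained column contains the term."""
    rows = conn.execute(
        "SELECT id FROM document WHERE search LIKE ? ORDER BY id",
        (f"%{term.lower()}%",),
    )
    return [row[0] for row in rows]


conn.execute("INSERT INTO document (body) VALUES ('Generated fields')")
before = find(conn, "generated")  # index not refreshed yet, so nothing matches
update_index(conn)
after = find(conn, "generated")   # the document is found only after the refresh
print(before, after)              # [] [1]
```

Until update_index runs, the new document is invisible to search; a generated column removes that intermediate state.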

Review Notes

  • Please verify that all code paths previously relying on manual index update now use the generated field.
  • Check that tests for document creation and update reflect the new instant searchability.
  • Confirm that documentation has been updated to reflect the new workflow for search indexing.

Impact

This change will significantly improve the reliability and responsiveness of documentation search on djangoproject.com. It removes a longstanding source of user confusion and contributor frustration, and leverages modern Django/database features for maintainability and performance.


Thank you for reviewing! Please raise any questions or concerns about edge cases, performance, or deployment.

@sentry-io

sentry-io bot commented Sep 11, 2025

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: docs/management/commands/update_docs.py

Function: _get_doc_releases
Unhandled Issue: DocumentRelease.DoesNotExist: DocumentRelease matching query does not exist. docs.management.commands.upda...
Event Count: 331

@pauloxnet pauloxnet requested review from a team September 11, 2025 21:17
@pauloxnet pauloxnet self-assigned this Sep 11, 2025
@pauloxnet pauloxnet added the docs, search, python, and DjangoCon 🦄 labels Sep 11, 2025
Member

@tobiasmcnulty tobiasmcnulty left a comment

Thanks for this PR and cleaning out the unused code/docs! I am enjoying learning more about GeneratedFields.

I don't have experience with full-text search or generated fields, so a review from someone else would be great if possible.

If not, or if it's easier to deploy and test this on the preview server, I'm okay with that too.

@pauloxnet pauloxnet requested a review from Copilot September 12, 2025 05:07

Copilot AI left a comment

Pull Request Overview

This pull request replaces the manually updated search field with a database-generated vector field for Document models to address issue #1650. The change eliminates the delay between document updates and search index updates by leveraging Django's GeneratedField to automatically compute search vectors in the database.

  • Replaces SearchVectorField named search with a GeneratedField named vector that automatically updates when documents change
  • Removes manual search index update commands and related management infrastructure
  • Updates search queries to use the new vector field instead of search

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Summary per file:

  • docs/models.py: Replaces manual search field with auto-generated vector field using GeneratedField with language-specific search vector expressions
  • docs/search.py: Refactors search vector definition into a function to support the generated field approach
  • docs/tests/test_models.py: Removes tests for manual search update methods and updates test setup to work with auto-generated vectors
  • docs/tests/test_views.py: Removes manual search update call from test setup
  • docs/management/commands/update_index.py: Deletes the entire management command file as manual indexing is no longer needed
  • docs/management/commands/update_docs.py: Removes search index update logic and command-line options
  • docs/migrations/0007_add_vector_search.py: Database migration to transition from search to vector field
  • docker-entrypoint.dev.sh: Removes commented manual index update command
  • README.rst: Updates documentation to reflect removal of manual indexing commands

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.


Member

@tobiasmcnulty tobiasmcnulty left a comment

Looks good to me! Thanks for the updates.

Note to self (or @bmispelon, or whoever ends up deploying this): We'll need to remove --update-index from the Ansible repo at the same time. I'll make a companion PR.

Member

@adamchainz adamchainz left a comment

I had a quick look through and left a few comments on the models change, but otherwise it seems perfectly cromulent to me!


Copilot AI left a comment

Pull Request Overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

docs/models.py:1

  • The test creates a document but no longer explicitly tests that the search_vector field is populated automatically. Consider adding an assertion to verify the generated field is working correctly after document creation.
import datetime

@adamchainz
Member

Looks like you got unlucky with a Coveralls failure in the test run there:

coveralls.exception.CoverallsException: Could not submit coverage: 504 Server Error: Gateway Timeout for url: https://coveralls.io/api/v1/jobs

@pauloxnet
Member Author

> Looks like you got unlucky with a Coveralls failure in the test run there:

Coveralls has just been removed :)
#2182

Member

@adamchainz adamchainz left a comment

Let's gooooo

@pauloxnet
Member Author

> Note to self (or @bmispelon, or whoever ends up deploying this): We'll need to remove --update-index from the Ansible repo at the same time. I'll make a companion PR.

@tobiasmcnulty Do you think there's something that needs to be done on the Ops side before merging this PR? If so, please proceed when you can. (I've no intention of pushing from my side) :)

In fact, let me know if I can make this easier for you in any way. Thanks again for your help.

@tobiasmcnulty tobiasmcnulty merged commit 52217c8 into django:main Sep 20, 2025
5 checks passed

Labels

DjangoCon 🦄, docs, python, search

3 participants