-
-
Notifications
You must be signed in to change notification settings - Fork 233
Creating chunks for OWASP chapters #1693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary by CodeRabbit
Summary by CodeRabbit
WalkthroughThis change generalizes the Changes
Assessment against linked issues
Assessment against linked issues: Out-of-scope changesNo out-of-scope changes found. Possibly related PRs
Suggested labels
✨ Finishing Touches
🧪 Generate Unit Tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (7)
backend/apps/ai/Makefile
(1 hunks)backend/apps/ai/admin.py
(1 hunks)backend/apps/ai/common/constants.py
(1 hunks)backend/apps/ai/management/commands/ai_create_chapter_chunks.py
(1 hunks)backend/apps/ai/management/commands/ai_create_slack_message_chunks.py
(2 hunks)backend/apps/ai/migrations/0004_alter_chunk_unique_together_chunk_content_type_and_more.py
(1 hunks)backend/apps/ai/models/chunk.py
(3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
backend/apps/ai/management/commands/ai_create_slack_message_chunks.py (1)
backend/apps/slack/models/message.py (1)
text
(83-85)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (1)
backend/apps/ai/models/chunk.py (1)
Chunk
(13-88)
🪛 Pylint (3.3.7)
backend/apps/ai/management/commands/ai_create_slack_message_chunks.py
[error] 10-13: Unable to import 'apps.ai.common.constants'
(E0401)
backend/apps/ai/migrations/0004_alter_chunk_unique_together_chunk_content_type_and_more.py
[convention] 1-1: Missing module docstring
(C0114)
[convention] 1-1: Module name "0004_alter_chunk_unique_together_chunk_content_type_and_more" doesn't conform to snake_case naming style
(C0103)
[error] 3-3: Unable to import 'django.db.models.deletion'
(E0401)
[error] 4-4: Unable to import 'django.db'
(E0401)
[convention] 7-7: Missing class docstring
(C0115)
[refactor] 7-7: Too few public methods (0/2)
(R0903)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 7-7: Unable to import 'openai'
(E0401)
[error] 8-8: Unable to import 'django.core.management.base'
(E0401)
[error] 10-13: Unable to import 'apps.ai.common.constants'
(E0401)
[error] 14-14: Unable to import 'apps.ai.models.chunk'
(E0401)
[error] 15-15: Unable to import 'apps.owasp.models.chapter'
(E0401)
[convention] 18-18: Missing class docstring
(C0115)
[convention] 21-21: Missing function or method docstring
(C0116)
[convention] 35-35: Missing function or method docstring
(C0116)
[warning] 35-35: Unused argument 'args'
(W0613)
[refactor] 130-130: Too many branches (25/12)
(R0912)
[refactor] 130-130: Too many statements (57/50)
(R0915)
[warning] 42-42: Attribute 'openai_client' defined outside init
(W0201)
[warning] 108-108: Attribute 'last_request_time' defined outside init
(W0201)
backend/apps/ai/models/chunk.py
[error] 3-3: Unable to import 'django.contrib.contenttypes.fields'
(E0401)
[error] 4-4: Unable to import 'django.contrib.contenttypes.models'
(E0401)
[error] 5-5: Unable to import 'django.db'
(E0401)
[error] 6-6: Unable to import 'langchain.text_splitter'
(E0401)
[error] 7-7: Unable to import 'pgvector.django'
(E0401)
[convention] 16-16: Missing class docstring
(C0115)
[refactor] 16-16: Too few public methods (0/2)
(R0903)
🪛 Flake8 (7.2.0)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 62-62: whitespace before ':'
(E203)
🔇 Additional comments (8)
backend/apps/ai/Makefile (1)
5-7
: LGTM! Consistent target implementation.The new Makefile target follows the established pattern and naming conventions. The implementation is clean and consistent with the existing
ai-create-slack-message-chunks
target.backend/apps/ai/common/constants.py (1)
1-4
: LGTM! Good centralization of timing constants.The timing constants are well-named and the values (1.2s minimum interval, 2s default offset) are reasonable for API rate limiting. This centralization improves maintainability and eliminates code duplication across management commands.
backend/apps/ai/admin.py (1)
12-14
: LGTM! Admin fields properly updated for generic relations.The changes correctly reflect the model refactoring from message-specific to generic content objects. Using
content_type
in the display andobject_id
in search fields is appropriate for the new generic foreign key structure.backend/apps/ai/management/commands/ai_create_slack_message_chunks.py (2)
10-13
: LGTM! Proper use of centralized constants.The import of timing constants from the shared module eliminates code duplication and improves maintainability. The static analysis import error is likely a false positive since the constants file exists in the codebase.
82-84
: LGTM! Correct adaptation to generic content model.The changes properly update the
Chunk.update_data
call to use the new generic content object pattern:
text
parameter moved to first positionmessage
parameter replaced withcontent_object=message
This aligns with the model refactoring to support any content type via Django's content types framework.
backend/apps/ai/migrations/0004_alter_chunk_unique_together_chunk_content_type_and_more.py (1)
14-42
: object_id default aligns with model; migration approvedVerified that
object_id
is defined withdefault=0
in backend/apps/ai/models/chunk.py (line 22), which matches the migration’sdefault=0
. No changes required unless you intend to allow aNULL
value forobject_id
—the current setup is consistent.backend/apps/ai/models/chunk.py (2)
27-37
: Good implementation of dynamic content representation!The
__str__
method effectively handles different content types by checking for common attributes (name
,key
) before falling back to string representation.
54-88
: Well-implemented generic content type support!The
update_data
method has been successfully refactored to support any content object type while maintaining the same functionality. The updated docstring clearly reflects the change.
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great refactoring, here are a few items to address before merging:
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
Outdated
Show resolved
Hide resolved
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
Outdated
Show resolved
Hide resolved
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (1)
19-210
: Address comprehensive feedback from previous reviews.The previous reviews have identified several important issues that should be addressed:
- Code structure improvements: Missing docstrings, unused parameters, and attribute initialization
- Code complexity: The
extract_chapter_content
method has high cyclomatic complexity and should be refactored into smaller helper methods- Formatting issues: Whitespace before colon in slice operation
- List comprehension complexity: The complex list comprehension with walrus operator should be simplified
These issues collectively impact code maintainability and readability. Please address the previous review feedback to improve the overall code quality.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
backend/apps/ai/Makefile
(1 hunks)backend/apps/ai/common/constants.py
(1 hunks)backend/apps/ai/management/commands/ai_create_chapter_chunks.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (1)
Learnt from: M-ayank2005
PR: OWASP/Nest#1282
File: frontend/src/pages/About.tsx:94-98
Timestamp: 2025-04-04T05:23:43.562Z
Learning: In the OWASP/Nest frontend project, DOMPurify is used to sanitize HTML content before rendering it with dangerouslySetInnerHTML, following security best practices to prevent XSS vulnerabilities.
🧬 Code Graph Analysis (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (4)
backend/apps/ai/models/chunk.py (4)
Chunk
(13-88)bulk_save
(40-42)split_text
(45-52)update_data
(55-88)backend/apps/ai/management/commands/ai_create_slack_message_chunks.py (3)
Command
(18-93)handle
(21-43)create_chunks
(45-93)backend/apps/owasp/graphql/nodes/chapter.py (1)
suggested_location
(54-56)backend/apps/owasp/graphql/nodes/common.py (1)
related_urls
(18-20)
🪛 checkmake (0.2.2)
backend/apps/ai/Makefile
[warning] 1-1: Missing required phony target "all"
(minphony)
[warning] 1-1: Missing required phony target "clean"
(minphony)
[warning] 1-1: Missing required phony target "test"
(minphony)
🪛 Flake8 (7.2.0)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 60-60: whitespace before ':'
(E203)
🪛 Pylint (3.3.7)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 7-7: Unable to import 'openai'
(E0401)
[error] 8-8: Unable to import 'django.core.management.base'
(E0401)
[error] 10-14: Unable to import 'apps.ai.common.constants'
(E0401)
[error] 15-15: Unable to import 'apps.ai.models.chunk'
(E0401)
[error] 16-16: Unable to import 'apps.owasp.models.chapter'
(E0401)
[convention] 19-19: Missing class docstring
(C0115)
[convention] 22-22: Missing function or method docstring
(C0116)
[convention] 34-34: Missing function or method docstring
(C0116)
[warning] 34-34: Unused argument 'args'
(W0613)
[refactor] 127-127: Too many branches (24/12)
(R0912)
[refactor] 127-127: Too many statements (55/50)
(R0915)
[warning] 41-41: Attribute 'openai_client' defined outside init
(W0201)
[warning] 105-105: Attribute 'last_request_time' defined outside init
(W0201)
🔇 Additional comments (2)
backend/apps/ai/common/constants.py (1)
1-5
: LGTM! Well-defined constants for AI app configuration.The constants are appropriately named and have reasonable values for API rate limiting and text processing. The centralization of these values will improve maintainability across the AI app.
backend/apps/ai/Makefile (1)
1-3
: LGTM! Follows established pattern and proper ordering.The new Makefile target is correctly implemented following the same pattern as the existing
ai-create-slack-message-chunks
target and is properly ordered alphabetically.
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
backend/tests/apps/ai/models/chunk_test.py (1)
127-164
: Test correctly validates no-save behavior.The test properly verifies that
save()
is not called whensave=False
, maintaining the expected behavior with generic foreign keys.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
backend/tests/apps/ai/models/chunk_test.py
(2 hunks)
🔇 Additional comments (4)
backend/tests/apps/ai/models/chunk_test.py (4)
3-3
: LGTM!The ContentType import and the addition of
mock.id
are appropriate changes that align with the model's transition to generic foreign keys.Also applies to: 14-14
19-36
: LGTM!The test correctly validates the string representation using the new generic foreign key approach with appropriate mocking.
101-125
: LGTM!The test correctly validates the behavior when a chunk already exists, properly using the generic foreign key approach.
171-177
: LGTM!The test appropriately validates the generic foreign key field types and relationships.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (9)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (8)
19-21
: Add a class docstring for better documentation.While the
help
attribute provides basic information, adding a proper class docstring would improve code documentation and address the pylint warning.
22-38
: Add docstring to document command-line arguments.
40-40
: Add docstring and remove unused parameter.The
args
parameter is not used in the method.
47-47
: Consider initializing instance attributes in init method.The
openai_client
attribute is defined outside__init__
, which could lead to attribute errors if methods are called in unexpected order.
64-64
: Fix whitespace formatting issue.Remove the whitespace before the colon to comply with PEP 8.
95-106
: Consider initializing instance attributes in init method.The
last_request_time
attribute is accessed before being assigned, which could lead to attribute errors.
108-123
: Simplify the complex list comprehension for better readability.The list comprehension with walrus operator and filtering is hard to read and maintain.
128-211
: Refactor this method to reduce complexity.This method has high cyclomatic complexity (25 branches) and too many statements. Consider breaking it down into smaller, focused helper methods.
backend/apps/ai/models/chunk.py (1)
19-22
: Revisit nullable content_type with unique_together constraint.The current model allows
content_type
to be nullable, but it's part of theunique_together
constraint. This could lead to unexpected behavior where multiple chunks withNULL
content_type and the same text would be allowed.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
backend/apps/ai/common/constants.py
(1 hunks)backend/apps/ai/management/commands/ai_create_chapter_chunks.py
(1 hunks)backend/apps/ai/models/chunk.py
(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (1)
Learnt from: M-ayank2005
PR: OWASP/Nest#1282
File: frontend/src/pages/About.tsx:94-98
Timestamp: 2025-04-04T05:23:43.562Z
Learning: In the OWASP/Nest frontend project, DOMPurify is used to sanitize HTML content before rendering it with dangerouslySetInnerHTML, following security best practices to prevent XSS vulnerabilities.
🧬 Code Graph Analysis (1)
backend/apps/ai/models/chunk.py (3)
backend/apps/common/models.py (2)
BulkSaveModel
(8-30)TimestampedModel
(33-40)backend/apps/common/utils.py (1)
truncate
(164-176)backend/apps/slack/models/message.py (2)
Meta
(18-21)text
(83-85)
🪛 Pylint (3.3.7)
backend/apps/ai/models/chunk.py
[error] 3-3: Unable to import 'django.contrib.contenttypes.fields'
(E0401)
[error] 4-4: Unable to import 'django.contrib.contenttypes.models'
(E0401)
[error] 5-5: Unable to import 'django.db'
(E0401)
[error] 6-6: Unable to import 'langchain.text_splitter'
(E0401)
[error] 7-7: Unable to import 'pgvector.django'
(E0401)
[convention] 16-16: Missing class docstring
(C0115)
[refactor] 16-16: Too few public methods (0/2)
(R0903)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 7-7: Unable to import 'openai'
(E0401)
[error] 8-8: Unable to import 'django.core.management.base'
(E0401)
[error] 10-14: Unable to import 'apps.ai.common.constants'
(E0401)
[error] 15-15: Unable to import 'apps.ai.models.chunk'
(E0401)
[error] 16-16: Unable to import 'apps.owasp.models.chapter'
(E0401)
[convention] 19-19: Missing class docstring
(C0115)
[convention] 22-22: Missing function or method docstring
(C0116)
[convention] 40-40: Missing function or method docstring
(C0116)
[warning] 40-40: Unused argument 'args'
(W0613)
[refactor] 128-128: Too many branches (24/12)
(R0912)
[refactor] 128-128: Too many statements (53/50)
(R0915)
[warning] 47-47: Attribute 'openai_client' defined outside init
(W0201)
[warning] 106-106: Attribute 'last_request_time' defined outside init
(W0201)
🪛 Flake8 (7.2.0)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 64-64: whitespace before ':'
(E203)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: Run backend tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (3)
backend/apps/ai/common/constants.py (1)
1-6
: LGTM!The constants are well-named and provide sensible defaults for API rate limiting and text delimiting.
backend/apps/ai/models/chunk.py (2)
27-37
: Well-designed generic string representation.The
__str__
method elegantly handles different content types by checking for common attributes (name
,key
) before falling back to string conversion.
55-88
: Clean refactoring to support generic relations.The
update_data
method has been properly updated to work with any content object while maintaining the existing behavior and API.
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
Outdated
Show resolved
Hide resolved
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (9)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (9)
19-21
: Add a class docstring for better documentation.While the
help
attribute provides basic information, adding a proper class docstring would improve code documentation and address the pylint warning.
22-22
: Add docstring to document command-line arguments.
40-40
: Add docstring and remove unused parameter.The
args
parameter is not used in the method.
47-47
: Consider initializing instance attributes in init method.The
openai_client
andlast_request_time
attributes are defined outside__init__
, which could lead to attribute errors if methods are called in unexpected order.
64-64
: Fix whitespace formatting issue.Remove the whitespace before the colon to comply with PEP 8.
95-106
: Consider initializing instance attributes in init method.The
openai_client
andlast_request_time
attributes are defined outside__init__
, which could lead to attribute errors if methods are called in unexpected order.
108-123
: Simplify the complex list comprehension for better readability.The list comprehension with walrus operator and filtering is hard to read and maintain.
128-211
: Refactor this method to reduce complexity.This method has high cyclomatic complexity (24 branches) and too many statements. Consider breaking it down into smaller, focused helper methods.
195-195
: Fix incorrect variable usage in leaders information.Line 195 incorrectly appends
location_parts
instead ofleaders_info
to the metadata.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (1)
Learnt from: M-ayank2005
PR: OWASP/Nest#1282
File: frontend/src/pages/About.tsx:94-98
Timestamp: 2025-04-04T05:23:43.562Z
Learning: In the OWASP/Nest frontend project, DOMPurify is used to sanitize HTML content before rendering it with dangerouslySetInnerHTML, following security best practices to prevent XSS vulnerabilities.
🧬 Code Graph Analysis (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (4)
backend/apps/ai/models/chunk.py (4)
Chunk
(13-88)bulk_save
(40-42)split_text
(45-52)update_data
(55-88)backend/apps/ai/management/commands/ai_create_slack_message_chunks.py (3)
Command
(18-93)handle
(21-43)create_chunks
(45-93)backend/apps/owasp/graphql/nodes/chapter.py (1)
suggested_location
(54-56)backend/apps/owasp/graphql/nodes/common.py (1)
related_urls
(18-20)
🪛 Flake8 (7.2.0)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 64-64: whitespace before ':'
(E203)
🪛 Pylint (3.3.7)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py
[error] 7-7: Unable to import 'openai'
(E0401)
[error] 8-8: Unable to import 'django.core.management.base'
(E0401)
[error] 10-14: Unable to import 'apps.ai.common.constants'
(E0401)
[error] 15-15: Unable to import 'apps.ai.models.chunk'
(E0401)
[error] 16-16: Unable to import 'apps.owasp.models.chapter'
(E0401)
[convention] 19-19: Missing class docstring
(C0115)
[convention] 22-22: Missing function or method docstring
(C0116)
[convention] 40-40: Missing function or method docstring
(C0116)
[warning] 40-40: Unused argument 'args'
(W0613)
[refactor] 128-128: Too many branches (24/12)
(R0912)
[refactor] 128-128: Too many statements (53/50)
(R0915)
[warning] 47-47: Attribute 'openai_client' defined outside init
(W0201)
[warning] 106-106: Attribute 'last_request_time' defined outside init
(W0201)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: Run backend tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/apps/ai/management/commands/ai_create_chapter_chunks.py (1)
1-17
: Well-organized imports and proper use of constants.Good practice importing timing constants from a centralized location.
"--batch-size", | ||
type=int, | ||
default=50, | ||
help="Number of chapters to process in each batch", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Add validation for batch-size argument.
Consider adding validation to ensure batch-size is positive.
parser.add_argument(
"--batch-size",
type=int,
default=50,
help="Number of chapters to process in each batch",
)
+ parser.add_argument(
+ "--batch-size",
+ type=int,
+ default=50,
+ help="Number of chapters to process in each batch",
+ choices=range(1, 1001),
+ metavar="{1..1000}",
+ )
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/apps/ai/management/commands/ai_create_chapter_chunks.py around lines
34 to 37, the batch-size argument lacks validation to ensure it is positive. Add
a check after parsing the batch-size argument to verify it is greater than zero,
and raise an appropriate error or exit with a message if the value is zero or
negative. This will prevent invalid batch sizes from being used during
processing.
"--all", | ||
action="store_true", | ||
help="Process all the chapters", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Clarify that --all includes inactive chapters.
The help text should explicitly mention that this option processes both active and inactive chapters to avoid confusion.
parser.add_argument(
"--all",
action="store_true",
- help="Process all the chapters",
+ help="Process all chapters (including inactive)",
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
"--all", | |
action="store_true", | |
help="Process all the chapters", | |
parser.add_argument( | |
"--all", | |
action="store_true", | |
help="Process all chapters (including inactive)", | |
) |
🤖 Prompt for AI Agents
In backend/apps/ai/management/commands/ai_create_chapter_chunks.py around lines
29 to 31, update the help text for the "--all" argument to explicitly state that
it processes both active and inactive chapters. Modify the help string to
clarify this behavior so users understand that using "--all" includes inactive
chapters as well.
for offset in range(0, total_chapters, batch_size): | ||
batch_chapters = queryset[offset : offset + batch_size] | ||
|
||
batch_chunks = [] | ||
for chapter in batch_chapters: | ||
batch_chunks.extend(self.create_chunks(chapter)) | ||
|
||
if batch_chunks: | ||
Chunk.bulk_save(batch_chunks) | ||
self.stdout.write(f"Saved {len(batch_chunks)} chunks") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Add progress tracking for better user feedback.
Consider adding progress indicators showing which batch is being processed.
batch_size = options["batch_size"]
+ batch_count = 0
for offset in range(0, total_chapters, batch_size):
+ batch_count += 1
+ self.stdout.write(f"Processing batch {batch_count}/{(total_chapters + batch_size - 1) // batch_size}...")
batch_chapters = queryset[offset : offset + batch_size]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for offset in range(0, total_chapters, batch_size): | |
batch_chapters = queryset[offset : offset + batch_size] | |
batch_chunks = [] | |
for chapter in batch_chapters: | |
batch_chunks.extend(self.create_chunks(chapter)) | |
if batch_chunks: | |
Chunk.bulk_save(batch_chunks) | |
self.stdout.write(f"Saved {len(batch_chunks)} chunks") | |
batch_size = options["batch_size"] | |
+ batch_count = 0 | |
for offset in range(0, total_chapters, batch_size): | |
+ batch_count += 1 | |
+ self.stdout.write( | |
+ f"Processing batch {batch_count}/{(total_chapters + batch_size - 1) // batch_size}..." | |
+ ) | |
batch_chapters = queryset[offset : offset + batch_size] | |
batch_chunks = [] | |
for chapter in batch_chapters: | |
batch_chunks.extend(self.create_chunks(chapter)) | |
if batch_chunks: | |
Chunk.bulk_save(batch_chunks) | |
self.stdout.write(f"Saved {len(batch_chunks)} chunks") |
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 64-64: whitespace before ':'
(E203)
🤖 Prompt for AI Agents
In backend/apps/ai/management/commands/ai_create_chapter_chunks.py around lines
63 to 73, add progress tracking by including a print or log statement before
processing each batch to indicate the current batch number and total batches.
Calculate the total number of batches based on total_chapters and batch_size,
then output progress like "Processing batch X of Y" before processing each batch
to provide better user feedback.
Resolves #1682