Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🍱 Add priority for chunks & update dependencies #435

Merged
merged 4 commits into from
Oct 26, 2024
Merged

Conversation

cristianpb
Copy link
Member

@cristianpb cristianpb commented Oct 23, 2024

  • Propagate the priority of the jobs to the chunks. Chunks coming from small files will be treated on priority.
  • Fix typescript errors and update dependencies
  • The number of concurrent jobs should be also increased for master version, I think it's here 👇

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced error handling logic in the Express application for improved logging and response formatting.
    • Introduced job prioritization in CSV processing, allowing for better management of job execution order.
  • Bug Fixes

    • Updated route handler signatures for better type safety in API documentation routes.
    • Improved job status checks and response structure for job processing results.
  • Chores

    • Updated various dependencies to their latest versions for improved performance and security.

Copy link

coderabbitai bot commented Oct 23, 2024

Walkthrough

The pull request includes updates to the package.json file for the deces-backend project, focusing on dependency version upgrades across multiple libraries, including bullmq, express, and various ESLint-related packages. Additionally, type annotations for the res parameter in Express route handlers have been explicitly defined in documentation.ts, the return type of the error-handling middleware in server.ts has been broadened from ExResponse | void to any, and enhancements have been made to job processing features in processStream.ts.

Changes

File Change Summary
backend/package.json Updated versions for multiple dependencies: bullmq, express, fast-csv, fuzzball, nodemailer, tsoa, winston, and various devDependencies, including ESLint-related packages and TypeScript.
backend/src/controllers/documentation.ts Added type annotations for the res parameter in route handlers.
backend/src/server.ts Changed return type of error-handling middleware from `ExResponse
backend/src/processStream.ts Enhanced CSV processing with job priority handling and added a trailing comma in csvOptions.
backend/src/server.spec.ts Refactored job handling tests for clarity and reduced redundancy in job posting logic.

Possibly related PRs

  • 🗑 Remove eslintrc  #423: The changes in backend/package.json regarding ESLint-related packages are directly related to the removal of the deprecated eslintrc.js file in this PR, which also focuses on enhancing the project's adherence to current ESLint standards.
  • 🤴🏾 Jobs priorisation  #426: The modifications in backend/package.json include updates to the express and nodemailer packages, which are also relevant to the changes in job handling and email notifications introduced in this PR, particularly in the context of enhancing job processing and user notifications.

🐇 In the code, we hop and play,
With versions fresh, we clear the way.
Type annotations now in sight,
Error handling shines so bright!
Dependencies dance, all in line,
A backend strong, oh how divine!
🐇


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1ad22d0 and c4ca4d3.

📒 Files selected for processing (2)
  • backend/src/processStream.ts (2 hunks)
  • backend/src/server.spec.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/src/processStream.ts
  • backend/src/server.spec.ts

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
backend/src/controllers/documentation.ts (1)

Line range hint 10-41: Consider typing the swagger definition properly

The swagger definition template is currently typed as any. Consider creating proper interfaces for your Swagger configuration.

interface SwaggerDefinition {
  info: {
    title: string;
    version: string;
    description: string;
    license: {
      name: string;
      url: string;
    };
    contact: {
      name: string;
      url: string;
      email: string;
    };
  };
  servers: Array<{
    url: string;
    description: string;
  }>;
}

const swaggerDefinitionTemplate: SwaggerDefinition = {
  // ... your existing configuration
};
backend/package.json (1)

Line range hint 1-68: Recommendations for safe dependency updates

  1. Consider splitting this PR into smaller, focused updates:

    • Runtime dependencies (@types/express v5 deserves its own PR)
    • Development tooling (TypeScript, ESLint)
    • Other dependency updates
  2. Add test cases to verify the functionality with updated dependencies

  3. Document any breaking changes or required code modifications in the PR description

backend/src/server.ts (1)

Line range hint 97-134: Security: Avoid exposing stack traces in production.

While including stack traces in error responses is helpful for debugging, it can expose sensitive implementation details to potential attackers.

Consider conditionally including stack traces based on the environment:

    return res.status(500).json({
      message: {
        ...err.message && {error: err.message},
-       ...err.stack && {stacktrace: err.stack}
+       ...(process.env.NODE_ENV !== 'production' && err.stack && {stacktrace: err.stack})
      }
    });
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ecb95c8 and 2b3bfcb.

⛔ Files ignored due to path filters (1)
  • backend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (3)
  • backend/package.json (1 hunks)
  • backend/src/controllers/documentation.ts (1 hunks)
  • backend/src/server.ts (1 hunks)
🧰 Additional context used
🔇 Additional comments (4)
backend/package.json (3)

44-44: Major version bump in @types/express requires attention

The update to @types/express v5 is a major version change that might introduce breaking changes in type definitions. This could affect request/response type annotations throughout the codebase.

Let's check for potential type issues:

#!/bin/bash
# Look for Express Response/Request type usage that might need updates
ast-grep --pattern 'Response<$_>' 
ast-grep --pattern 'Request<$_>'

57-62: Verify TypeScript and ESLint compatibility

The updates include:

  • TypeScript ^5.6.3
  • ESLint ^9.13.0 and related plugins

These updates might introduce new compiler checks or linting rules.

Let's check for potential new TypeScript errors:

#!/bin/bash
# Look for TypeScript configuration
echo "TypeScript Config:"
cat tsconfig.json

# Look for ESLint configuration
echo "ESLint Config:"
cat .eslintrc*

Also applies to: 66-66


22-22: Verify compatibility with Express and BullMQ updates

The Express update (^4.21.1) coupled with @types/express v5 major update might require code changes. BullMQ update (^5.21.2) includes several minor versions which might introduce new features or fixes.

Let's verify the Express routes compatibility:

Also applies to: 24-24

✅ Verification successful

Let me try a different approach to verify the Express and BullMQ usage in the codebase.


Let me try one more time without specifying the file type since TypeScript isn't recognized.


Based on the search results, I can now provide a final response:

BullMQ and Express updates appear safe to proceed

The codebase usage of both Express and BullMQ follows standard patterns that are compatible with the updated versions:

  • Express usage is primarily for basic routing, middleware, and request handling (no deprecated features observed)
  • BullMQ implementation uses standard Queue and Worker patterns that are stable across v5.x versions:
    • Queue creation with Redis connection config
    • Basic job processing with Worker implementation
    • No usage of deprecated or removed features from BullMQ

The type definition update for Express (@types/express v5) should not cause issues as the codebase uses standard Express patterns and basic types (Request, Response, Router).

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for Express route handlers that might need updates
rg -t typescript "app\.(get|post|put|delete|patch)" -A 2

Length of output: 215


Script:

#!/bin/bash
# Search for Express-related imports and usage
rg -t typescript "import.*express" -A 2

# Search for BullMQ-related imports and usage
rg -t typescript "import.*bullmq" -A 2

# Look for any Express router definitions
rg -t typescript "Router\(\)" -A 2

# Find Express middleware usage
rg -t typescript "app\.use\(" -A 2

# Search for BullMQ queue definitions and workers
rg -t typescript "Queue[^a-zA-Z]" -A 2
rg -t typescript "Worker[^a-zA-Z]" -A 2

Length of output: 449


Script:

#!/bin/bash
# Search for Express and BullMQ related code
rg "import.*express|import.*bullmq" -A 2

# Look for Express router and middleware usage
rg "Router\(\)|app\.use\(|app\.(get|post|put|delete|patch)" -A 2

# Search for BullMQ queue definitions and workers
rg "new Queue\(|new Worker\(" -A 2

# Find TypeScript/JavaScript files that might contain Express or BullMQ code
fd "\.(ts|js)$" --exec grep -l "express\|bullmq"

Length of output: 6016

backend/src/server.ts (1)

96-96: Verify consistent error handler return types across the codebase.

Let's ensure this type change is consistent with other error handlers in the codebase.

✅ Verification successful

Let me gather more specific information about the error handler implementation to make a proper assessment.


Based on the error handler implementation shown, I can now provide a final response:

The any return type is appropriate for this error handler middleware

The error handler middleware follows Express.js patterns and has multiple return paths:

  • Returns JSON responses with res.status().json() for different error types
  • Calls next() for unhandled cases

The any return type is actually correct here because:

  1. Express middleware can return void (when calling next())
  2. Or return the result of res.status().json() (which returns the response object)

The implementation properly handles different error types (ValidateError, JsonWebTokenError, Error) with appropriate status codes and response formats, and the return type flexibility is needed for this pattern.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other error handling middleware definitions
# Look for patterns that might need similar type updates

# Find Express error handling middleware (4 parameters where first is error)
ast-grep --pattern 'app.use((err: $_, req: $_, res: $_, next: $_) => { $$$ })'

# Find similar error handling patterns
rg -l "instanceof ValidateError|instanceof JsonWebTokenError" --type ts

Length of output: 176


Script:

#!/bin/bash
# Get the full context of error handler implementation
rg -A 10 -B 5 "app.use\(\(err:" backend/src/server.ts

# Look for any return type annotations in error handlers
ast-grep --pattern '(err: $_, req: $_, res: $_, next: $_): $_ => { $$$ }'

# Check for other error handling functions that might be related
rg -A 5 "function.*Error|catch.*Error" backend/src/server.ts

Length of output: 2003

Comment on lines +43 to +44
router.get(`/bulk.json`, (_, res: any) => res.send(specs));
router.get(`/tsoa.json`, (_, res: any) => res.send(swaggerDocument));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Consider using proper Express types instead of any

Using any type for Express response objects reduces type safety and goes against TypeScript best practices. Since you've updated to @types/express@^5.0.0, you should leverage its built-in types.

Replace any with the proper Express types:

- router.get(`/bulk.json`, (_, res: any) => res.send(specs));
+ router.get(`/bulk.json`, (_, res: express.Response) => res.send(specs));

- router.get(`/tsoa.json`, (_, res: any) => res.send(swaggerDocument));
+ router.get(`/tsoa.json`, (_, res: express.Response) => res.send(swaggerDocument));

- router.get(`/svelte/swagger-example.json`, (_, res: any) => res.send(specs));
+ router.get(`/svelte/swagger-example.json`, (_, res: express.Response) => res.send(specs));

Alternatively, you could use the imported Router type to infer the correct types:

import { Router, Response } from 'express';
// Then use Response type in the handlers

Also applies to: 52-52

Comment on lines 46 to 48
specs.paths = {...specs.paths, ...swaggerDocument.paths}
specs.components = {...specs.components, ...swaggerDocument.components}
// specs.default = swaggerDocument.default;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Based on all the gathered information, I can now provide the final response:

Add OpenAPI type definitions to improve type safety

The current implementation uses any type for both the swagger specs and definition template. To improve type safety, you should:

  • Install and use @types/swagger-jsdoc and @types/swagger-ui-express for proper typing
  • Define an interface for the OpenAPI specification structure that includes paths and components
  • Type the swaggerDefinitionTemplate object according to the OpenAPI 3.0.0 schema

Example locations needing attention:

  • const swaggerDefinitionTemplate: any on line 11
  • const specs: any = swaggerJsdoc(options) on line 39

The verification revealed that:

  1. No proper type definitions are currently being used for the OpenAPI/Swagger objects
  2. The codebase is using OpenAPI 3.0.0 specification but relies on any types
  3. The imported packages' types are not being utilized effectively
  4. There's a risk of runtime errors due to untyped spread operations on specs.paths and specs.components
🔗 Analysis chain

Consider typing the swagger specs object

The specs object is typed as any and used in spread operations. This could lead to runtime errors if the structure of swaggerDocument doesn't match expectations.

Let's verify the structure of the swagger document:


Let me try a different approach to verify the swagger specification structure and typing in the codebase.

This should help us understand:

  1. If there are any existing type definitions for swagger/OpenAPI specs
  2. What types are being imported and used in the documentation controller
  3. If there are any interfaces defining the structure with paths and components

Let me try one more search to find any OpenAPI/Swagger types from the imported packages.

This should help us:

  1. See the full context of how the swagger document is being used
  2. Find any type declarations that might be available from the imported packages

Let's check the content of the found type declaration file and also look for OpenAPI types that might be available from swagger-jsdoc.

This should help us understand:

  1. What types are defined in the found declaration file
  2. If swagger-jsdoc provides any type definitions we can use
  3. If there are any OpenAPI type imports we might have missed
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check the structure of swagger.json to ensure compatibility
if [ -f backend/src/api/swagger.json ]; then
    echo "Analyzing swagger.json structure..."
    jq 'keys' backend/src/api/swagger.json
    echo "Checking paths structure..."
    jq '.paths | keys' backend/src/api/swagger.json
    echo "Checking components structure..."
    jq '.components | keys' backend/src/api/swagger.json
fi

Length of output: 41


Script:

#!/bin/bash
# Search for swagger type definitions and interfaces
echo "Searching for swagger/OpenAPI type definitions..."
rg -t ts "interface.*(?i)swagger|type.*(?i)swagger|(?i)openapi" -g "*.ts"

echo -e "\nChecking the documentation controller imports and type usage..."
rg -t ts "import.*swagger|import.*openapi" backend/src/controllers/documentation.ts

echo -e "\nLooking for swagger document structure..."
ast-grep --pattern 'interface $_ {
  paths?: $_;
  components?: $_;
}'

Length of output: 856


Script:

#!/bin/bash
# Check the full documentation controller file
echo "Checking documentation controller content..."
cat backend/src/controllers/documentation.ts

# Look for type declarations in node_modules types
echo -e "\nSearching for type declarations in project..."
fd -e d.ts -x rg -l "(?i)swagger|openapi"

Length of output: 2220


Script:

#!/bin/bash
# Check the found type declaration file
echo "Checking @jollie type declarations..."
cat "./backend/src/types/@jollie/index.d.ts"

# Look for swagger-jsdoc type definitions
echo -e "\nLooking for swagger-jsdoc types..."
rg "swagger-jsdoc" -A 5 -B 5 backend/src/types/

# Check if there are any OpenAPI type imports
echo -e "\nChecking for OpenAPI type imports..."
rg "from ['|\"]@types/swagger" -g "*.ts"

Length of output: 500

@@ -93,7 +93,7 @@ app.use((
req: ExRequest,
res: ExResponse,
next: NextFunction
): ExResponse | void => {
): any => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Consider using a more specific return type instead of any.

While changing the return type to any resolves the TypeScript error, it bypasses TypeScript's type checking benefits. This could mask potential runtime issues and makes the code less maintainable.

Consider using a union type that explicitly covers all possible return values:

-): any => {
+): ExResponse | void | Promise<ExResponse | void> => {

This maintains type safety while accommodating both synchronous and asynchronous responses from the error handler.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
): any => {
): ExResponse | void | Promise<ExResponse | void> => {

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 2b3bfcb and 1ad22d0.

📒 Files selected for processing (2)
  • backend/src/processStream.ts (2 hunks)
  • backend/src/server.spec.ts (2 hunks)
🔇 Additional comments (2)
backend/src/server.spec.ts (2)

Line range hint 849-870: LGTM: Improved job status tracking logic.

The implementation correctly tracks the status of both jobs separately, which enhances the test's ability to verify job processing behavior.


873-876: Verify the job priority assertion logic.

The test assumes that the smaller job should finish faster, but this might not always be true in a concurrent processing environment. Consider adding a brief sleep between job submissions to ensure consistent test behavior.

Run the following script to check for similar priority tests in the codebase:

✅ Verification successful

Let me gather more context about the job processing implementation to better understand the priority and concurrency behavior.


Let me get more context about the specific test case and its surrounding implementation to understand the job processing behavior better.


Based on the gathered context, I can now provide the final response:

The priority test assertion is correctly implemented

The test is validating the expected behavior of the job queue system where job priority is automatically calculated based on the total number of rows (Math.round(options.totalRows/1000)+1). From the implementation in processStream.ts, we can see that:

  • Job1 has 3000 rows, resulting in priority = 4
  • Job2 has 500 rows, resulting in priority = 1

The test correctly verifies that the larger job (Job1) remains active while the smaller job (Job2) completes faster, which is the intended behavior of the priority-based queue system. No sleep is needed as the priority mechanism ensures consistent behavior.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for other priority-related tests that might need similar improvements

# Test: Search for priority-related test cases
rg -A 5 'priority.*test|test.*priority' --type ts

Length of output: 453


Script:

#!/bin/bash
# Search for job processing and queue related implementations
rg -A 10 'class.*Queue|processJob|job.*priority' --type ts

# Look for the specific test context
rg -B 10 -A 10 'priority.*test' backend/src/server.spec.ts

Length of output: 3368


Script:

#!/bin/bash
# Get the full context of the test case
rg -B 50 -A 50 'smaller job finish faster' backend/src/server.spec.ts

# Look for job queue implementation details
ast-grep --pattern 'class $_ {
  $$$
  process($$$) {
    $$$
  }
  $$$
}'

Length of output: 4187

Comment on lines 174 to 175
skipLines: job.data.skipLines,
priority: job.opts.priority
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove unsupported priority option from csvOptions.

The priority property is not a valid configuration option for the @fast-csv/parse library. Including it in the csvOptions object will have no effect and may lead to confusion. It should be removed to prevent any unintended behavior.

You can remove the unsupported option as follows:

const csvOptions: any = {
  objectMode: true,
  delimiter: job.data.sep,
  headers: true,
  ignoreEmpty: true,
  encoding: 'utf8',
  escape: job.data.escape,
  quote: job.data.quote,
  skipLines: job.data.skipLines,
- priority: job.opts.priority
};
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
skipLines: job.data.skipLines,
priority: job.opts.priority
skipLines: job.data.skipLines

Comment on lines 388 to 389
candidateNumber: this.candidateNumber,
priority: this.job.opts.priority
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Pass priority in job options rather than job data for chunkQueue.add().

To set the job's priority in BullMQ, you need to pass the priority in the job options parameter, not within the job data. Currently, priority is being passed inside the data object, which will not influence the job's execution priority.

Here's how you can modify the code to pass the priority correctly:

const job = await chunkQueue
  .add(jobId, {
    chunk: this.batch.map((r: any) => this.toRequest(r)),
    dateFormatA: this.dateFormatA,
    pruneScore: this.pruneScore,
    candidateNumber: this.candidateNumber
-   priority: this.job.opts.priority
  }, {
    attempts: 2,
    jobId,
+   priority: this.job.opts.priority
  })

Committable suggestion was skipped due to low confidence.

@cristianpb cristianpb changed the title 🍱 Update dependencies 🍱 Add priority for chunks & update dependencies Oct 24, 2024
@rhanka rhanka merged commit 9f3ab31 into dev Oct 26, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants