feat(webui): Support viewing search results in context for JSON logs (clp-json). #596

haiqi96 · 2024-11-19T01:33:14Z

Description

The existing implementation of webui only supports viewing unstructured logs. This PR supports viewing JSON logs in the webui.

With refactor in #584, JSON and IR log viewing share a large portion of common code, except two differences:

The streamId between Json and IR are different. IR's stream id is orig_file_id, whereas JSON's streamId is archive_id.
The job_config and Job type to be stored in the database are different.

This PR replaces the original IR extraction methods with stream extraction methods which can support both JSON and IR viewing. In addition to the original parameters, they take in an extra "JobType" to distinguish if the job is targeting JSON or IR.

Also fixes #608..

Validation performed

Tested package locally and verified that both CLP and CLP-S flow works

Summary by CodeRabbit

Release Notes

New Features
- Updated job submission process to support extraction of both CLP IR and JSON line files.
- Enhanced QueryStatus component with new job type handling and improved error messages.
- Introduced new constants and functions in SearchResultsTable for better handling of storage types and URL generation.
- Added configuration options for stream output size in the log viewer web UI.
- New parameter for stream target uncompressed size added to settings.
Bug Fixes
- Improved error handling for unsupported job types in query routes.
Documentation
- Updated metadata handling and response structures to reflect new extraction types.
- Clarified configuration comments related to stream file sizes.

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)

components/log-viewer-webui/server/src/routes/query.js (3)
5-13: Enhance function documentation with return type

The JSDoc documentation is thorough but missing the @return annotation. Consider adding it to improve completeness.

Add this line to the JSDoc:
 * @param {int} sanitizedLogEventIdx
 * @throws {Error} if the extract stream job submission fails.
+ * @return {Promise<void>}
 */
20-34: Consider separate size configurations for IR and JSON

Currently, both IR and JSON extractions use the same streamTargetUncompressedSize configuration. These might benefit from separate configurations due to their different nature and typical sizes.

Consider using distinct configuration values:
-    const streamTargetUncompressedSize = settings.StreamTargetUncompressedSize;
+    const streamTargetUncompressedSize = extractJobType === QUERY_JOB_TYPE.EXTRACT_IR
+        ? settings.IrStreamTargetUncompressedSize
+        : settings.JsonStreamTargetUncompressedSize;
85-86: Standardize error message format

The error message format differs from other error messages in the codebase. Consider using a consistent template.
-                const err = new Error("Unable to find the metadata of extracted stream with " +
-                    `streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);
+                const err = new Error(
+                    `Failed to find extracted stream metadata (streamId: ${streamId}, logEventIdx: ${sanitizedLogEventIdx})`
+                );

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ce1acdb and b631316.

📒 Files selected for processing (1)

components/log-viewer-webui/server/src/routes/query.js (2 hunks)

🔇 Additional comments (1)

components/log-viewer-webui/server/src/routes/query.js (1)

64-66: ⚠️ Potential issue

Input validation still needed for request parameters

The previous review comment about input validation remains valid. The code should validate all input parameters before processing.

As suggested in the previous review, add comprehensive validation:

 const {extractJobType, logEventIdx, streamId} = req.body;
+if (!streamId || typeof streamId !== 'string') {
+    const err = new Error('Invalid streamId');
+    err.statusCode = 400;
+    throw err;
+}
+
+if (!extractJobType || !Object.values(QUERY_JOB_TYPE).includes(extractJobType)) {
+    const err = new Error(`Invalid extractJobType: ${extractJobType}`);
+    err.statusCode = 400;
+    throw err;
+}
+
 const sanitizedLogEventIdx = Number(logEventIdx);
+if (isNaN(sanitizedLogEventIdx) || sanitizedLogEventIdx < 0) {
+    const err = new Error('Invalid logEventIdx');
+    err.statusCode = 400;
+    throw err;
+}

junhaoliao

looks good: Passing the StreamTargetUncompressedSize from the package config to the server/settings.json with a fallback value

junhaoliao · 2024-11-21T15:14:52Z

components/package-template/src/etc/clp-config.yml

@@ -86,7 +86,7 @@
 #stream_output:
 #  directory: "var/data/streams"


Can we check if issue #608 is valid?

You are right. pushed a fix

Let also mention fixes #608 in the PR description to link the issue (or we can do it in the commit message body if you see fit).

junhaoliao

Sorry for providing a suboptimal refactoring suggestion previously. Let's see if the new proposal makes sense.

junhaoliao · 2024-11-21T15:34:14Z

components/log-viewer-webui/server/src/routes/query.js

+            await submitAndWaitForExtractStreamJob(
+                fastify,
+                extractJobType,
+                streamId,
+                sanitizedLogEventIdx
+            );


so sorry for sharing a not optimal refactoring technique previously. Instead of creating a submitAndWaitForExtractStreamJob helper in this file, can we move the job config construction logics into the DbManager's submitAndWaitForExtractStreamJob method?

Suggested change

await submitAndWaitForExtractStreamJob(

fastify,

extractJobType,

streamId,

sanitizedLogEventIdx

);

const extractResult = await fastify.dbManager.submitAndWaitForExtractStreamJob({

jobType: extractJobType,

logEventIdx: sanitizedLogEventIdx,

streamId: streamId,

targetUncompressedSize: settings.StreamTargetUncompressedSize,

});

if (null === extractResult) {

const err = new Error("Unable to extract stream with " +

`streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);

err.statusCode = 400;

throw err;

}

junhaoliao · 2024-11-21T15:35:23Z

components/log-viewer-webui/server/src/routes/query.js

+
+
+/**
+ * Submits a stream extraction job with the given parameters and waits for it.
+ *
+ * @param {import("fastify").FastifyInstance | {dbManager: DbManager}} fastify
+ * @param {QUERY_JOB_TYPE} extractJobType
+ * @param {string} streamId
+ * @param {int} sanitizedLogEventIdx
+ * @throws {Error} if the extract stream job submission fails.
+ */
+const submitAndWaitForExtractStreamJob = async (
+    fastify,
+    extractJobType,
+    streamId,
+    sanitizedLogEventIdx
+) => {
+    let jobConfig;
+    const streamTargetUncompressedSize = settings.StreamTargetUncompressedSize;
+    if (QUERY_JOB_TYPE.EXTRACT_IR === extractJobType) {
+        jobConfig = {
+            file_split_id: null,
+            msg_ix: sanitizedLogEventIdx,
+            orig_file_id: streamId,
+            target_uncompressed_size: streamTargetUncompressedSize,
+        };
+    } else if (QUERY_JOB_TYPE.EXTRACT_JSON === extractJobType) {
+        jobConfig = {
+            archive_id: streamId,
+            target_chunk_size: streamTargetUncompressedSize,
+        };
+    }
+
+    if (null === jobConfig) {
+        const err = new Error(`Unsupported Job type: ${extractJobType}`);
+        err.statusCode = 400;
+        throw err;
+    }
+
+    const extractResult = await fastify.dbManager.submitAndWaitForExtractStreamJob(
+        jobConfig,
+        extractJobType
+    );
+
+    if (null === extractResult) {
+        const err = new Error("Unable to extract stream with " +
+            `streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);
+
+        err.statusCode = 400;
+        throw err;
+    }
+};


If you agree, the logics in this helper can be moved into dbManager.submitAndWaitForExtractStreamJob. Let's see the suggestion there.

Suggested change

/**

* Submits a stream extraction job with the given parameters and waits for it.

*

* @param {import("fastify").FastifyInstance | {dbManager: DbManager}} fastify

* @param {QUERY_JOB_TYPE} extractJobType

* @param {string} streamId

* @param {int} sanitizedLogEventIdx

* @throws {Error} if the extract stream job submission fails.

*/

const submitAndWaitForExtractStreamJob = async (

fastify,

extractJobType,

streamId,

sanitizedLogEventIdx

) => {

let jobConfig;

const streamTargetUncompressedSize = settings.StreamTargetUncompressedSize;

if (QUERY_JOB_TYPE.EXTRACT_IR === extractJobType) {

jobConfig = {

file_split_id: null,

msg_ix: sanitizedLogEventIdx,

orig_file_id: streamId,

target_uncompressed_size: streamTargetUncompressedSize,

};

} else if (QUERY_JOB_TYPE.EXTRACT_JSON === extractJobType) {

jobConfig = {

archive_id: streamId,

target_chunk_size: streamTargetUncompressedSize,

};

}

if (null === jobConfig) {

const err = new Error(`Unsupported Job type: ${extractJobType}`);

err.statusCode = 400;

throw err;

}

const extractResult = await fastify.dbManager.submitAndWaitForExtractStreamJob(

jobConfig,

extractJobType

);

if (null === extractResult) {

const err = new Error("Unable to extract stream with " +

`streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);

err.statusCode = 400;

throw err;

}

};

junhaoliao · 2024-11-21T15:36:33Z

components/log-viewer-webui/server/src/DbManager.js

     * @return {Promise<number|null>} The ID of the job or null if an error occurred.
     */
-    async submitAndWaitForExtractIrJob (jobConfig) {
+    async submitAndWaitForExtractStreamJob (jobConfig, jobType) {
        let jobId;


Suggested change

let jobId;

let jobConfig;

if (QUERY_JOB_TYPE.EXTRACT_IR === jobType) {

jobConfig = {

file_split_id: null,

msg_ix: logEventIdx,

orig_file_id: streamId,

target_uncompressed_size: targetUncompressedSize,

};

} else if (QUERY_JOB_TYPE.EXTRACT_JSON === jobType) {

jobConfig = {

archive_id: streamId,

target_chunk_size: targetUncompressedSize,

};

}

let jobId;

junhaoliao · 2024-11-21T15:46:36Z

components/log-viewer-webui/server/src/routes/query.js

+    fastify.post("/query/extract-stream", async (req, resp) => {
+        const {extractJobType, logEventIdx, streamId} = req.body;
        const sanitizedLogEventIdx = Number(logEventIdx);


I agree, for extractJobType and streamId.
Let's add this list, to be exported, directly below QUERY_JOB_TYPE

/** * List of valid extract job types. */ const EXTRACT_JOB_TYPES = Object.freeze([ QUERY_JOB_TYPE.EXTRACT_IR, QUERY_JOB_TYPE.EXTRACT_JSON, ]);

and add a check for extractJobType here. e.g.,

if (false === EXTRACT_JOB_TYPES.includes(extractJobType)) { const err = new Error("Invalid extractJobType"); err.statusCode = 400; throw err; }

We also want to check / sanitize the streamId to avoid SQL injection attacks. @kirkrodrigues do you think we should check for the input parameter's length and limit the characters set? Or maybe checking it against a RegExp is efficient.

junhaoliao

I followed the Validation instructions and tested the code, which worked.
There's one suggestion about moving the extract job config construction logics into the db manager, and another one about checking / sanitizing the query server inputs; otherwise, the PR is good to be merged.

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)

components/log-viewer-webui/server/src/DbManager.js (1)

165-176: Consider renaming 'orig_file_id' to 'stream_id' in a future update.

Based on the previous discussion in PR comments, using 'orig_file_id' for both IR and JSON streams is a temporary solution. A future update should rename this field to 'stream_id' for better semantic clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d06be72 and b9b8440.

📒 Files selected for processing (2)

components/log-viewer-webui/server/src/DbManager.js (4 hunks)
components/log-viewer-webui/server/src/routes/query.js (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

components/log-viewer-webui/server/src/routes/query.js

🔇 Additional comments (3)

components/log-viewer-webui/server/src/DbManager.js (3)

65-65: LGTM! Well-structured enum and constant additions.

The new enum value and constant array are properly implemented, maintaining consistency with the existing codebase structure.

Also applies to: 69-75

113-140: LGTM! Well-implemented job type handling.

The method properly differentiates between IR and JSON configurations while maintaining a clean interface.

271-272: LGTM! Appropriate exports added.

The new exports provide necessary access to the job type enums and constants.

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)

components/log-viewer-webui/server/src/routes/query.js (2)

29-57: Consider optimizing metadata retrieval flow

The code makes two separate calls to getExtractedStreamFileMetadata. Consider restructuring to avoid this duplication.

+        const getMetadata = async () => {
+            const metadata = await fastify.dbManager.getExtractedStreamFileMetadata(
+                streamId,
+                sanitizedLogEventIdx
+            );
+            if (metadata === null) {
+                resp.code(StatusCodes.BAD_REQUEST);
+                throw new Error("Unable to find the metadata of extracted stream with " +
+                    `streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);
+            }
+            return metadata;
+        };
+
         let streamMetadata = await fastify.dbManager.getExtractedStreamFileMetadata(
             streamId,
             sanitizedLogEventIdx
         );

         if (null === streamMetadata) {
             const extractResult = await fastify.dbManager.submitAndWaitForExtractStreamJob(
                 extractJobType,
                 sanitizedLogEventIdx,
                 streamId,
                 settings.StreamTargetUncompressedSize,
             );

             if (null === extractResult) {
                 resp.code(StatusCodes.BAD_REQUEST);
                 throw new Error("Unable to extract stream with " +
                     `streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);
             }

-            streamMetadata = await fastify.dbManager.getExtractedStreamFileMetadata(
-                streamId,
-                sanitizedLogEventIdx
-            );
-
-            if (null === streamMetadata) {
-                resp.code(StatusCodes.BAD_REQUEST);
-                throw new Error("Unable to find the metadata of extracted stream with " +
-                    `streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);
-            }
+            streamMetadata = await getMetadata();
         }

60-60: Enhance response handling

Consider setting an explicit success status code and validating the response structure.

-        return streamMetadata;
+        resp.code(StatusCodes.OK);
+        return streamMetadata;

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b9b8440 and 66e3466.

📒 Files selected for processing (1)

components/log-viewer-webui/server/src/routes/query.js (2 hunks)

🔇 Additional comments (1)

components/log-viewer-webui/server/src/routes/query.js (1)

3-3: Consider Node.js version compatibility

The with syntax for JSON imports is only available in newer Node.js versions. Please ensure your deployment environment supports this feature or consider using a more compatible import method.

✅ Verification successful

The with import syntax is supported as Node.js 22.4.0 is used for the log viewer webui

Based on the Taskfile.yml configuration, the log-viewer-webui component specifically uses Node.js v22.4.0 (which supports the with import syntax) for building and running. This is evident from the task configurations where nodejs-22 is used for the log-viewer-webui components.

Key findings:

The project explicitly uses Node.js v22.4.0 for log-viewer-webui components
The with import syntax is supported in Node.js since v20.10.0
The build system is properly configured to use the correct Node.js version

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if package.json specifies the required Node.js version
fd package.json --exec jq -r '.engines.node'

Length of output: 69

Script:

#!/bin/bash
# Check Node.js version requirements in package.json files
fd package.json --exec cat {} \; | grep -A 5 '"engines"'

# Check for any documentation about Node.js version requirements
fd -e md -e txt | xargs grep -l -i "node.*version\|prerequisites\|requirements" 2>/dev/null

# Look for any build or CI configuration files that might specify Node.js version
fd -e yml -e yaml | xargs grep -l "node-version\|nodejs" 2>/dev/null

Length of output: 636

Script:

#!/bin/bash
# Check the content of relevant CI and build files
cat Taskfile.yml lint-tasks.yml

# Check the content of relevant documentation files
cat docs/src/dev-guide/building-package.md

Length of output: 29999

coderabbitai · 2024-11-21T18:21:53Z

components/log-viewer-webui/server/src/routes/query.js

+        const {extractJobType, logEventIdx, streamId} = req.body;
        const sanitizedLogEventIdx = Number(logEventIdx);


🛠️ Refactor suggestion

Add validation for numeric conversion

The Number(logEventIdx) conversion should be validated to ensure it doesn't result in NaN or negative values.

- const sanitizedLogEventIdx = Number(logEventIdx); + const sanitizedLogEventIdx = Number(logEventIdx); + if (Number.isNaN(sanitizedLogEventIdx) || sanitizedLogEventIdx < 0) { + resp.code(StatusCodes.BAD_REQUEST); + throw new Error("logEventIdx must be a non-negative number"); + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const {extractJobType, logEventIdx, streamId} = req.body;

const sanitizedLogEventIdx = Number(logEventIdx);

const {extractJobType, logEventIdx, streamId} = req.body;

const sanitizedLogEventIdx = Number(logEventIdx);

if (Number.isNaN(sanitizedLogEventIdx) || sanitizedLogEventIdx < 0) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error("logEventIdx must be a non-negative number");

}

coderabbitai · 2024-11-21T18:21:53Z

components/log-viewer-webui/server/src/routes/query.js

+        if (null === streamId) {
+            resp.code(StatusCodes.BAD_REQUEST);
+            throw new Error("streamId must not be null");
+        }


🛠️ Refactor suggestion

Enhance streamId validation

The current validation only checks for null. Consider adding checks for undefined, empty string, and string type.

- if (null === streamId) { + if (!streamId || typeof streamId !== 'string' || streamId.trim() === '') { resp.code(StatusCodes.BAD_REQUEST); - throw new Error("streamId must not be null"); + throw new Error("streamId must be a non-empty string"); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (null === streamId) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error("streamId must not be null");

}

if (!streamId || typeof streamId !== 'string' || streamId.trim() === '') {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error("streamId must be a non-empty string");

}

I feel what rabbit makes sense. Shall I just commit suggestions? @junhaoliao

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

I agree. Except that we do not need to check !streamId because typeof streamId !== "string" should have handled that.

Suggested change

if (null === streamId) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error("streamId must not be null");

}

if (typeof streamId !== "string" || 0 === streamId.trim().length) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error(`"streamId" must be a non-empty string.`);

}

junhaoliao

StreamOutput.directory default value change looks good.

junhaoliao

Changes in DbManager looks good.

junhaoliao

All good except one suggestion in a method signature.

junhaoliao · 2024-11-21T19:48:31Z

components/log-viewer-webui/server/src/routes/query.js

+        if (null === streamId) {
+            resp.code(StatusCodes.BAD_REQUEST);
+            throw new Error("streamId must not be null");
+        }


I agree. Except that we do not need to check !streamId because typeof streamId !== "string" should have handled that.

junhaoliao · 2024-11-21T19:54:27Z

components/log-viewer-webui/server/src/DbManager.js

+    async submitAndWaitForExtractStreamJob (
+        jobType,
+        logEventIdx,
+        streamId,
+        targetUncompressedSize
+    ) {


When there are multiple parameters to a single function, it's be better to use the object destructuring syntax:

Suggested change

async submitAndWaitForExtractStreamJob (

jobType,

logEventIdx,

streamId,

targetUncompressedSize

) {

async submitAndWaitForExtractStreamJob ({

jobType,

logEventIdx,

streamId,

targetUncompressedSize

}) {

The JsDoc can be updated as:

* @param {object} props * @param {number} props.jobType * @param {number} props.logEventIdx * @param {string} props.streamId * @param {number} props.targetUncompressedSize

Then the method call can be modified as:

const extractResult = await fastify.dbManager.submitAndWaitForExtractStreamJob({ jobType: extractJobType, logEventIdx: sanitizedLogEventIdx, streamId: streamId, targetUncompressedSize: settings.StreamTargetUncompressedSize, });

which looks more readable. What do you think?

junhaoliao · 2024-11-21T19:55:03Z

components/log-viewer-webui/server/src/DbManager.js

+export {EXTRACT_JOB_TYPES};
+export {QUERY_JOB_TYPE};


Suggested change

export {EXTRACT_JOB_TYPES};

export {QUERY_JOB_TYPE};

export {

EXTRACT_JOB_TYPES,

QUERY_JOB_TYPE,

};

junhaoliao · 2024-11-21T21:05:29Z

components/log-viewer-webui/server/src/routes/query.js

+        if (null === streamId) {
+            resp.code(StatusCodes.BAD_REQUEST);
+            throw new Error("streamId must not be null");
+        }


Suggested change

if (null === streamId) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error("streamId must not be null");

}

if (typeof streamId !== "string" || 0 === streamId.trim().length) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error(`"streamId" must be a non-empty string.`);

}

junhaoliao · 2024-11-21T21:06:17Z

components/log-viewer-webui/server/src/routes/query.js

+        if (false === EXTRACT_JOB_TYPES.includes(extractJobType)) {
+            resp.code(StatusCodes.BAD_REQUEST);
+            throw new Error("Invalid extractJobType");
+        }


Suggested change

if (false === EXTRACT_JOB_TYPES.includes(extractJobType)) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error("Invalid extractJobType");

}

if (false === EXTRACT_JOB_TYPES.includes(extractJobType)) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error(`Invalid extractJobType="${extractJobType}".`);

}

junhaoliao · 2024-11-21T21:06:56Z

components/log-viewer-webui/server/src/routes/query.js

-    fastify.post("/query/extract-ir", async (req, resp) => {
-        const {origFileId, logEventIdx} = req.body;
+    fastify.post("/query/extract-stream", async (req, resp) => {
+        const {extractJobType, logEventIdx, streamId} = req.body;


Let's move the parameter validation checks directly below this line.

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (5)

components/log-viewer-webui/server/src/routes/query.js (3)
Line range hint 6-14: Enhance route documentation

The JSDoc comment should include details about request parameters and possible error responses.
 /**
  * Creates query routes.
  *
+ * @param {Object} req.body
+ * @param {string} req.body.extractJobType - The type of extraction job (must be in EXTRACT_JOB_TYPES)
+ * @param {string} req.body.streamId - The ID of the stream to extract
+ * @param {number} req.body.logEventIdx - The index of the log event
+ * @throws {Error} When parameters are invalid or extraction fails
  * @param {import("fastify").FastifyInstance | {dbManager: DbManager}} fastify
  * @param {import("fastify").FastifyPluginOptions} options
  * @return {Promise<void>}
  */
42-44: Add logging for extraction failures

Consider logging extraction failures to help with debugging and monitoring.
+                fastify.log.error(`Stream extraction failed for streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);
                 resp.code(StatusCodes.BAD_REQUEST);
                 throw new Error("Unable to extract stream with " +
                     `streamId=${streamId} at logEventIdx=${sanitizedLogEventIdx}`);
59-59: Consider using a consistent response structure

For consistency across endpoints, consider wrapping the response in a standard structure.
-        return streamMetadata;
+        return {
+            success: true,
+            data: streamMetadata
+        };
components/log-viewer-webui/server/src/DbManager.js (2)
69-75: Enhance constant documentation

Consider adding more descriptive JSDoc comments to explain the purpose and usage of EXTRACT_JOB_TYPES. For example:
 /**
- * List of valid extract job types.
+ * An immutable array containing valid job types for stream extraction operations.
+ * Used for validation and type checking in stream extraction related operations.
+ * @type {ReadonlyArray<number>}
  */
172-177: Track technical debt: field naming in MongoDB

As discussed in previous reviews, orig_file_id is being used for both IR and JSON streams. Consider creating a technical debt ticket to standardize this field to stream_id in a future release.

Would you like me to create a GitHub issue to track this technical debt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 66e3466 and a897682.

📒 Files selected for processing (2)

components/log-viewer-webui/server/src/DbManager.js (4 hunks)
components/log-viewer-webui/server/src/routes/query.js (2 hunks)

🔇 Additional comments (4)

components/log-viewer-webui/server/src/routes/query.js (2)

15-60: Implementation aligns well with requirements

The route handler successfully implements the stream extraction functionality with proper validation and error handling. The code is well-structured and maintainable.

3-3: Verify JSON import assertion compatibility

The JSON import assertion syntax (with {type: "json"}) is a newer feature. Ensure your deployment environment supports this syntax or consider providing a fallback.

✅ Verification successful

JSON import assertions are properly supported

The project's configuration in log-viewer-webui-server/package.json explicitly supports JSON import assertions through the @babel/plugin-syntax-import-attributes plugin, and the project is configured as an ES module with "type": "module". The syntax is safe to use in this environment.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check package.json for Node.js version requirements
rg -A 5 '"engines"' package.json

Length of output: 87

Script:

#!/bin/bash
# Search for package.json files in the repository
fd -t f "package.json" -x cat {}

Length of output: 5779

components/log-viewer-webui/server/src/DbManager.js (2)

65-65: LGTM: Enum value addition follows established pattern

The addition of EXTRACT_JSON to QUERY_JOB_TYPE maintains consistency with the existing enum structure.

271-274: LGTM: Exports follow best practices

The addition of named exports for EXTRACT_JOB_TYPES and QUERY_JOB_TYPE is appropriate and follows JavaScript best practices.

coderabbitai · 2024-11-21T21:17:52Z

components/log-viewer-webui/server/src/routes/query.js


-        let irMetadata = await fastify.dbManager.getExtractedIrFileMetadata(
-            origFileId,
+        const sanitizedLogEventIdx = Number(logEventIdx);


🛠️ Refactor suggestion

Add validation for logEventIdx parameter

The logEventIdx parameter should be validated before conversion to ensure it's defined.

+ if (undefined === logEventIdx) { + resp.code(StatusCodes.BAD_REQUEST); + throw new Error("\"logEventIdx\" must be provided."); + } const sanitizedLogEventIdx = Number(logEventIdx);

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const sanitizedLogEventIdx = Number(logEventIdx);

if (undefined === logEventIdx) {

resp.code(StatusCodes.BAD_REQUEST);

throw new Error("\"logEventIdx\" must be provided.");

}

const sanitizedLogEventIdx = Number(logEventIdx);

coderabbitai · 2024-11-21T21:17:52Z

components/log-viewer-webui/server/src/DbManager.js

+    async submitAndWaitForExtractStreamJob ({
+        jobType,
+        logEventIdx,
+        streamId,
+        targetUncompressedSize,
+    }) {
+        let jobConfig;
+        if (QUERY_JOB_TYPE.EXTRACT_IR === jobType) {
+            jobConfig = {
+                file_split_id: null,
+                msg_ix: logEventIdx,
+                orig_file_id: streamId,
+                target_uncompressed_size: targetUncompressedSize,
+            };
+        } else if (QUERY_JOB_TYPE.EXTRACT_JSON === jobType) {
+            jobConfig = {
+                archive_id: streamId,
+                target_chunk_size: targetUncompressedSize,
+            };
+        }


🛠️ Refactor suggestion

Add validation for job type parameter

The method should validate that the provided jobType is one of the valid extract job types defined in EXTRACT_JOB_TYPES.

async submitAndWaitForExtractStreamJob ({ jobType, logEventIdx, streamId, targetUncompressedSize, }) { + if (!EXTRACT_JOB_TYPES.includes(jobType)) { + throw new Error(`Invalid job type: ${jobType}. Expected one of: ${EXTRACT_JOB_TYPES}`); + } let jobConfig; if (QUERY_JOB_TYPE.EXTRACT_IR === jobType) {

Committable suggestion skipped: line range outside the PR's diff.

junhaoliao

For the PR tile, how about

feat(webui): Support viewing search results in context for JSON logs (clp-json).

…(clp-json). (y-scope#596) Co-authored-by: Junhao Liao <[email protected]>

haiqi96 added 30 commits November 12, 2024 14:56

Make it work with a lot of duplications

97d3e31

deduplication the code in task launching scripts

1d8e8e5

Remove redundant task for json. Combine them into one

c00e28a

Linter

08acaa7

refactor

65bdd13

Linter

3de1532

refactor

6bd1eef

proposal

fbe5ced

Add actual refactor

8c93ba3

Linter

15c4b1d

deduplication

8245200

Decompressor end renaming

a0d4324

Renaming for task and scheduler

0edfe49

Mass renaming

1f1dbb9

Renaming for webui

e5a691c

missing change

29322c5

polishing

41b1023

Update comments

2bcab50

fixes

b19b9a9

fixes

a7a222d

fixes2

f05d122

backup

8a93597

update webui part

563f11b

Support log event index version

f274da9

Tried my best with linter

ede18a8

Factor variables to be global

d2eae63

Fix linter again

d12a39d

Merge branch 'main' into webuiLogEventIdx

fdffdb9

Update to support latest CLP-S's search results output format

9e41095

Merge branch 'main' into webuiLogEventIdx

6a3d7c1

fix

b631316

coderabbitai bot reviewed Nov 19, 2024

View reviewed changes

Merge branch 'main' into webuiLogEventIdx

1bb72c0

haiqi96 requested a review from junhaoliao November 20, 2024 02:10

junhaoliao reviewed Nov 21, 2024

View reviewed changes

fix missing changes

d06be72

junhaoliao requested changes Nov 21, 2024

View reviewed changes

junhaoliao reviewed Nov 21, 2024

View reviewed changes

address code review

b9b8440

coderabbitai bot reviewed Nov 21, 2024

View reviewed changes

linter

66e3466

coderabbitai bot reviewed Nov 21, 2024

View reviewed changes

haiqi96 requested a review from junhaoliao November 21, 2024 18:57

junhaoliao reviewed Nov 21, 2024

View reviewed changes

junhaoliao requested changes Nov 21, 2024

View reviewed changes

commit code suggestions

99d2240

junhaoliao requested changes Nov 21, 2024

View reviewed changes

Rearrange code and polish error message

a897682

coderabbitai bot reviewed Nov 21, 2024

View reviewed changes

haiqi96 requested a review from junhaoliao November 21, 2024 21:36

junhaoliao approved these changes Nov 21, 2024

View reviewed changes

haiqi96 changed the title ~~feat(clp-package): Support JSON log viewing in webui~~ feat(webui): Support viewing search results in context for JSON logs (clp-json). Nov 21, 2024

haiqi96 merged commit 7d85ad3 into y-scope:main Nov 21, 2024
6 checks passed

coderabbitai bot mentioned this pull request Nov 26, 2024

feat(clp-package): Add support for deleting archives that are exclusively within a time range. #594

Merged

haiqi96 mentioned this pull request Dec 2, 2024

refactor(clp-package): Unify the metadata schema for JSON and IR streams. #620

Merged

jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024

feat(webui): Support viewing search results in context for JSON logs …

84f7366

…(clp-json). (y-scope#596) Co-authored-by: Junhao Liao <[email protected]>

haiqi96 deleted the webuiLogEventIdx branch December 6, 2024 20:29

coderabbitai bot mentioned this pull request Dec 12, 2024

feat(package)!: Add support for writing clp-s single file archives to S3. #634

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(webui): Support viewing search results in context for JSON logs (clp-json). #596

feat(webui): Support viewing search results in context for JSON logs (clp-json). #596

haiqi96 commented Nov 19, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot left a comment

junhaoliao left a comment

junhaoliao Nov 21, 2024

haiqi96 Nov 21, 2024

junhaoliao Nov 21, 2024

haiqi96 Nov 21, 2024

junhaoliao left a comment

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot Nov 21, 2024

coderabbitai bot Nov 21, 2024 •

edited

Loading

haiqi96 Nov 21, 2024

coderabbitai bot Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao left a comment

junhaoliao left a comment

junhaoliao left a comment

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

junhaoliao Nov 21, 2024

coderabbitai bot left a comment

coderabbitai bot Nov 21, 2024

coderabbitai bot Nov 21, 2024

junhaoliao left a comment

		@@ -86,7 +86,7 @@
		#stream_output:
		# directory: "var/data/streams"

		const {extractJobType, logEventIdx, streamId} = req.body;
		const sanitizedLogEventIdx = Number(logEventIdx);

feat(webui): Support viewing search results in context for JSON logs (clp-json). #596

feat(webui): Support viewing search results in context for JSON logs (clp-json). #596

Conversation

haiqi96 commented Nov 19, 2024 • edited by coderabbitai bot Loading

Description

Validation performed

Summary by CodeRabbit

Release Notes

coderabbitai bot left a comment

Choose a reason for hiding this comment

junhaoliao left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

junhaoliao left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

junhaoliao left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 21, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot Nov 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

junhaoliao left a comment

Choose a reason for hiding this comment

junhaoliao left a comment

Choose a reason for hiding this comment

junhaoliao left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 21, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 21, 2024

Choose a reason for hiding this comment

junhaoliao left a comment

Choose a reason for hiding this comment

haiqi96 commented Nov 19, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot Nov 21, 2024 •

edited

Loading