feat(webui): Add dataset and paths to compression job table. #1798

Conversation
Walkthrough

Migrates frontend job fetching from client-side SQL to a server API; adds compression metadata schemas and CLP I/O config types; implements server query, decoding, and mapping; updates the frontend Jobs page, types, and utilities to consume the new API; removes legacy SQL helpers and adjusts related backend job config types.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/webui/server/src/routes/api/archive-metadata/index.ts (1)

26-32: Endpoint executes arbitrary SQL queries without authentication or parameterized queries.

Line 28 executes the user-provided `queryString` directly against the database without parameterization or input sanitization. No authentication middleware is applied to this route, making it publicly accessible. Critical issues:

- SQL injection vulnerability: Direct SQL execution allows arbitrary database commands (DROP, DELETE, UPDATE, etc.)
- Missing authentication: No checks verify user identity before accepting queries
- Missing authorization: No role or permission validation

Require authentication on this endpoint and implement parameterized queries or query-builder patterns (similar to `compress-metadata`, which uses `getCompressionMetadataQuery()`).

components/webui/server/src/routes/api/compress/index.ts (1)

69-86: Add a type guard before casting to `CompressionJobFsInputConfig`.

Line 72 performs an unsafe type cast to `CompressionJobFsInputConfig` without verifying that `jobConfig.input.type` is actually `CompressionJobInputType.FS`. While currently safe (since `DEFAULT_COMPRESSION_JOB_CONFIG` always sets the type to FS), the TODO comment on line 71 indicates S3 support will be added. When that happens, this cast becomes unsafe and could cause runtime errors.

🔎 Recommended fix with type guard

```diff
 const jobConfig: ClpIoConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
 // eslint-disable-next-line no-warning-comments
 // TODO: Add support for S3 input
-(jobConfig.input as CompressionJobFsInputConfig).paths_to_compress = paths.map(
-    (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
-);
+if (CompressionJobInputType.FS === jobConfig.input.type) {
+    jobConfig.input.paths_to_compress = paths.map(
+        (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
+    );
+}
```

Note: This assumes TypeScript's discriminated-union type narrowing works for `ClpIoConfig`. If not, you may need an explicit type assertion after the guard.
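The narrowing behaviour that note depends on can be sketched in isolation. The types below are simplified, hypothetical stand-ins for the real `ClpIoConfig` union, not the project's actual definitions:

```typescript
// Simplified stand-ins for the real config types (hypothetical shapes).
type FsInput = {type: "fs"; paths_to_compress: string[]};
type S3Input = {type: "s3"; bucket: string};
type IoConfig = {input: FsInput | S3Input};

const setPaths = (config: IoConfig, paths: string[]): void => {
    // Checking the discriminant narrows `config.input` to `FsInput`,
    // so no cast is needed inside the branch.
    if ("fs" === config.input.type) {
        config.input.paths_to_compress = paths.map((p) => "/mnt/logs" + p);
    }
};

const cfg: IoConfig = {input: {type: "fs", paths_to_compress: []}};
setPaths(cfg, ["/a.log"]);
```

Because each union member carries a literal `type` discriminant, TypeScript narrows the union inside the `if` branch without any assertion.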
📜 Review details

Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro

📒 Files selected for processing (15)

- components/webui/client/src/api/compress-metadata/index.ts (1 hunks)
- components/webui/client/src/pages/IngestPage/Jobs/index.tsx (3 hunks)
- components/webui/client/src/pages/IngestPage/Jobs/sql.ts (0 hunks)
- components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (3 hunks)
- components/webui/client/src/pages/IngestPage/Jobs/utils.ts (2 hunks)
- components/webui/client/src/pages/IngestPage/sqlConfig.ts (0 hunks)
- components/webui/common/src/schemas/compress-metadata.ts (1 hunks)
- components/webui/common/src/schemas/compression.ts (1 hunks)
- components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (2 hunks)
- components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts (0 hunks)
- components/webui/server/src/routes/api/archive-metadata/index.ts (1 hunks)
- components/webui/server/src/routes/api/compress-metadata/index.ts (1 hunks)
- components/webui/server/src/routes/api/compress-metadata/sql.ts (1 hunks)
- components/webui/server/src/routes/api/compress-metadata/utils.ts (1 hunks)
- components/webui/server/src/routes/api/compress/index.ts (2 hunks)
💤 Files with no reviewable changes (3)
- components/webui/client/src/pages/IngestPage/sqlConfig.ts
- components/webui/client/src/pages/IngestPage/Jobs/sql.ts
- components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer `false == <expression>` rather than `!<expression>`.
Files:
- components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts
- components/webui/client/src/api/compress-metadata/index.ts
- components/webui/server/src/routes/api/compress-metadata/sql.ts
- components/webui/client/src/pages/IngestPage/Jobs/typings.tsx
- components/webui/server/src/routes/api/archive-metadata/index.ts
- components/webui/common/src/schemas/compress-metadata.ts
- components/webui/common/src/schemas/compression.ts
- components/webui/server/src/routes/api/compress-metadata/utils.ts
- components/webui/server/src/routes/api/compress/index.ts
- components/webui/client/src/pages/IngestPage/Jobs/index.tsx
- components/webui/client/src/pages/IngestPage/Jobs/utils.ts
- components/webui/server/src/routes/api/compress-metadata/index.ts
🧠 Learnings (3)
📚 Learning: 2025-09-28T15:00:22.170Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 1340
File: components/job-orchestration/job_orchestration/executor/compress/compression_task.py:528-528
Timestamp: 2025-09-28T15:00:22.170Z
Learning: In components/job-orchestration/job_orchestration/executor/compress/compression_task.py, there is a suggestion to refactor from passing logger as a parameter through multiple functions to creating a ClpCompressor class that takes the logger as a class member, with current helper functions becoming private member functions.
Applied to files:
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.
Applied to files:
- components/webui/server/src/routes/api/compress-metadata/sql.ts
- components/webui/client/src/pages/IngestPage/Jobs/typings.tsx
- components/webui/server/src/routes/api/compress-metadata/utils.ts
- components/webui/server/src/routes/api/compress/index.ts
- components/webui/client/src/pages/IngestPage/Jobs/index.tsx
📚 Learning: 2025-09-15T22:20:40.750Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1125
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:267-291
Timestamp: 2025-09-15T22:20:40.750Z
Learning: For CLP compression jobs, the team has decided to fail the entire job immediately upon encountering any invalid input path, rather than continuing to process valid paths. This decision was made during PR #1125 development.
Applied to files:
- components/webui/server/src/routes/api/compress/index.ts
- components/webui/client/src/pages/IngestPage/Jobs/utils.ts
🧬 Code graph analysis (5)

components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)
components/webui/common/src/schemas/compression.ts (1)
- ClpIoConfig (166-166)
components/webui/client/src/api/compress-metadata/index.ts (1)
components/webui/common/src/schemas/compress-metadata.ts (1)
- CompressionMetadataDecoded (58-58)
components/webui/server/src/routes/api/compress/index.ts (3)
components/webui/common/src/schemas/compression.ts (3)
- ClpIoConfig (166-166)
- CompressionJobInputType (159-159)
- CompressionJobFsInputConfig (169-169)

components/webui/server/src/routes/api/compress/typings.ts (1)

- CONTAINER_INPUT_LOGS_ROOT_DIR (7-7)

components/webui/common/src/config.ts (1)

- CLP_STORAGE_ENGINES (54-54)
components/webui/client/src/pages/IngestPage/Jobs/index.tsx (3)
components/webui/client/src/api/compress-metadata/index.ts (1)
- fetchCompressionJobs (19-19)

components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (1)

- JobData (148-148)

components/webui/client/src/pages/IngestPage/Jobs/utils.ts (1)

- mapCompressionJobResponseToTableData (142-142)
components/webui/client/src/pages/IngestPage/Jobs/utils.ts (5)
components/webui/common/src/schemas/compression.ts (2)
- ClpIoConfig (166-166)
- CompressionJobInputType (159-159)

components/webui/common/src/config.ts (1)

- CLP_STORAGE_ENGINES (54-54)

components/webui/client/src/config/index.ts (1)

- SETTINGS_STORAGE_ENGINE (24-24)

components/webui/common/src/schemas/compress-metadata.ts (1)

- CompressionMetadataDecoded (58-58)

components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (2)

- JobData (148-148)
- CompressionJobStatus (151-151)
🔇 Additional comments (20)
components/webui/server/src/routes/api/archive-metadata/index.ts (1)
28-28: Refactoring to `fastify.mysql.query` is correct and idiomatic.

The change from using a pool variable to calling `fastify.mysql.query` directly is cleaner, and the API is properly supported. This method exists as a utility on the fastify.mysql namespace and returns an array that can be destructured as shown, where the first element contains the query result set.

components/webui/common/src/schemas/compression.ts (3)

75-81: LGTM! Enum matches the Python InputType class.

The enum definition is clear and aligns with the documented Python class structure.

101-110: Schema fields correctly match the Python S3InputConfig definition.

The nullable fields (`dataset`, `keys`, `timestamp_key`) in the TypeScript schema match the Python class definition, where all three are defined as optional with `| None`. The `unstructured` field is correctly non-nullable in both versions.

86-96: Verify nullable field requirements against the Python FsInputConfig.

The schema defines several fields as `Type.Union([Type.String(), Type.Null()])`, making them nullable. Ensure this nullability matches the Python `FsInputConfig` class definition, particularly for `dataset`, `path_prefix_to_remove`, and `timestamp_key`.

```shell
#!/bin/bash
# Description: Verify FsInputConfig field requirements in the Python codebase

# Search for the FsInputConfig class definition
rg -n -A 15 "class FsInputConfig" --type=py
```

components/webui/server/src/routes/api/compress-metadata/utils.ts (1)
42-72: LGTM! Mapping function is well-structured.

The mapping logic correctly transforms database rows to the decoded metadata format. The destructuring approach is clean and maintainable.

components/webui/server/src/routes/api/compress-metadata/index.ts (1)

35-39: LGTM! Query and mapping logic is sound.

The route correctly executes the query and maps the results using the utility function. Error handling will be managed by Fastify's default error handler.
components/webui/client/src/pages/IngestPage/Jobs/index.tsx (2)
20-26: LGTM! Data-fetching migration is clean.

The switch from SQL-based queries to the new `fetchCompressionJobs` API with the `mapCompressionJobResponseToTableData` transformation is well-structured. The query key remains stable for proper cache management.

38-40: Table configuration enhancements are appropriate.

The addition of `rowKey`, horizontal scrolling (`x: true`), and a fixed layout are good improvements to accommodate the new dataset and paths columns mentioned in the PR objectives.

components/webui/common/src/schemas/compress-metadata.ts (1)
40-47: Good design using a partial schema for backward compatibility.

Using `ClpIoPartialConfigSchema` in the decoded schema properly supports the backward-compatibility requirement mentioned in the PR objectives for v0.6.0 data.

components/webui/server/src/routes/api/compress-metadata/sql.ts (1)

6-28: LGTM! Enum and interface definitions are well-structured.

The enum provides type-safe column-name references, and the interface correctly extends `RowDataPacket` with properly typed fields matching the database schema.

components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)

44-44: Parameter type change already verified in all call sites.

The caller in components/webui/server/src/routes/api/compress/index.ts at line 89 correctly passes `jobConfig` of type `ClpIoConfig`, which matches the updated method signature. The old `CompressionJobConfig` type is not found in the codebase, confirming the change has been fully propagated.

components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (3)
1-19: LGTM: Imports and setup are appropriate.

The new imports support the conditional column-visibility logic and the enhanced Paths column rendering with copyable text.

43-44: LGTM: JobData interface properly extended.

The new fields align with the PR objectives and use appropriate nullable types for dataset.

50-130: LGTM: Conditional column logic is well-implemented.

The conditional rendering of the Dataset and Paths columns based on storage-engine configuration is clean, and the Paths column render function provides good UX with copyable text and tooltip ellipsis.

components/webui/server/src/routes/api/compress/index.ts (2)

4-9: LGTM: Import additions support the new CLP IO config types.

20-37: LGTM: DEFAULT_COMPRESSION_JOB_CONFIG properly updated to the ClpIoConfig type.

The new fields have sensible defaults, and the object remains frozen to prevent mutations.

components/webui/client/src/pages/IngestPage/Jobs/utils.ts (4)
24-31: LGTM: stripPrefix follows the coding guidelines.

The implementation correctly uses `false === path.startsWith(prefix)` rather than negation, adhering to the project's coding guidelines.

68-77: LGTM: extractDataFromIoConfig logic is correct.

The conditional dataset extraction based on storage engine is appropriate, and the code follows the coding guidelines.

114-128: LGTM: NaN checks follow the coding guidelines.

The code correctly uses the `false === isNaN(...)` pattern rather than `!isNaN(...)`, adhering to the project's coding guidelines.
92-140: No changes required.

The type cast `clpConfig as ClpIoConfig` on line 101 is legitimate and intentional. The `clp_config` field is correctly defined as `ClpIoPartialConfigSchema` in `CompressionMetadataDecoded` to support backward compatibility with older CLP releases, while the cast safely narrows the type to satisfy the function signature.

Likely an incorrect or invalid review comment.
```ts
const fetchCompressionJobs = async (): Promise<CompressionMetadataDecoded[]> => {
    const {data} = await axios.get<CompressionMetadataDecoded[]>(
        "/api/compress-metadata"
    );

    return data;
};
```
🧹 Nitpick | 🔵 Trivial
Consider adding error handling or documenting error propagation.
The function doesn't handle errors from the axios request. While letting errors propagate to the caller is acceptable, consider:
- Adding JSDoc to document which errors can be thrown (network errors, 4xx/5xx responses)
- Or adding error handling to transform or enrich error messages for better debugging
🔎 Example with documented error propagation

```diff
 /**
  * Retrieves recent compression jobs (last 30 days).
  *
+ * @throws {AxiosError} If the request fails due to network issues or server errors.
  * @return Recent compression jobs metadata.
  */
 const fetchCompressionJobs = async (): Promise<CompressionMetadataDecoded[]> => {
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In components/webui/client/src/api/compress-metadata/index.ts around lines 10 to
16, the axios GET call lacks error handling or documentation of propagated
errors; update the function by adding a JSDoc comment that documents the
possible errors (network errors, AxiosError for non-2xx responses) and either
wrap the axios request in a try/catch to rethrow a richer, contextual Error
(including HTTP status and response data when available) or explicitly document
that the function lets Axios errors propagate so callers must handle them.
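The catch-and-rethrow option described above can be sketched without axios. Everything here is illustrative: `RequestError`, `enrichError`, and `fetchJson` are hypothetical helpers, not part of the PR's API.

```typescript
// Hypothetical error type carrying request context.
class RequestError extends Error {
    constructor(message: string, public readonly status?: number) {
        super(message);
    }
}

// Wrap an unknown failure in a contextual, typed error.
const enrichError = (err: unknown, url: string, status?: number): RequestError => {
    const detail = err instanceof Error ? err.message : String(err);
    const statusPart = undefined !== status ? ` (status ${status})` : "";
    return new RequestError(`GET ${url} failed${statusPart}: ${detail}`, status);
};

// Usage inside a fetcher: catch, enrich, rethrow so callers see the context.
const fetchJson = async <T>(url: string, get: (u: string) => Promise<T>): Promise<T> => {
    try {
        return await get(url);
    } catch (err: unknown) {
        throw enrichError(err, url);
    }
};
```

Callers then get the failing URL and status in the message without having to unwrap the raw transport error themselves.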
```ts
const extractPathsFromInput = (clpConfig: ClpIoConfig): string[] => {
    const {input} = clpConfig;

    // Fallback for legacy/partial payloads that include paths but omit the FS type
    if ("paths_to_compress" in input) {
        input.type = CompressionJobInputType.FS;
    }

    if (CompressionJobInputType.FS === input.type) {
        const prefixToRemove = input.path_prefix_to_remove;
        if (prefixToRemove) {
            return input.paths_to_compress.map((path) => stripPrefix(path, prefixToRemove));
        }

        return input.paths_to_compress;
    }

    // eslint-disable-next-line no-warning-comments
    // TODO: Add support to parse S3 paths; may need a bucket prefix from the CLP config.

    return [];
};
```
Avoid mutating the input config parameter.
Lines 43-45 mutate the input object by setting input.type = CompressionJobInputType.FS. This side effect could cause unexpected behaviour if the config object is used elsewhere after calling this function. Consider either cloning the config before mutation or handling legacy payloads without mutation.
🔎 Suggested approach to avoid mutation

Since this function only reads from the input config, document the mutation clearly or restructure to avoid it:

```diff
 const extractPathsFromInput = (clpConfig: ClpIoConfig): string[] => {
     const {input} = clpConfig;

-    // Fallback for legacy/partial payloads that include paths but omit the FS type
-    if ("paths_to_compress" in input) {
-        input.type = CompressionJobInputType.FS;
-    }
-
-    if (CompressionJobInputType.FS === input.type) {
+    // Fallback for legacy/partial payloads that include paths but omit the FS type
+    if (CompressionJobInputType.FS === input.type || "paths_to_compress" in input) {
         const prefixToRemove = input.path_prefix_to_remove;
         if (prefixToRemove) {
             return input.paths_to_compress.map((path) => stripPrefix(path, prefixToRemove));
         }
         return input.paths_to_compress;
     }
```

This assumes that if `paths_to_compress` exists, it's safe to treat the input as FS type without mutation.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
const extractPathsFromInput = (clpConfig: ClpIoConfig): string[] => {
    const {input} = clpConfig;

    // Fallback for legacy/partial payloads that include paths but omit the FS type
    if (CompressionJobInputType.FS === input.type || "paths_to_compress" in input) {
        const prefixToRemove = input.path_prefix_to_remove;
        if (prefixToRemove) {
            return input.paths_to_compress.map((path) => stripPrefix(path, prefixToRemove));
        }

        return input.paths_to_compress;
    }

    // eslint-disable-next-line no-warning-comments
    // TODO: Add support to parse S3 paths; may need a bucket prefix from the CLP config.

    return [];
};
```
🤖 Prompt for AI Agents
components/webui/client/src/pages/IngestPage/Jobs/utils.ts around lines 39 to
60: the function currently mutates the passed-in clpConfig.input by assigning
input.type = CompressionJobInputType.FS for legacy payloads; instead avoid side
effects by treating the detected legacy case locally (do not write back to
input) — either create a local variable (e.g., const resolvedType =
"paths_to_compress" in input ? CompressionJobInputType.FS : input.type) or
shallow-clone the input before modifying, then use resolvedType when branching
and leave the original clpConfig untouched.
```ts
const CompressionMetadataSchema = Type.Intersect([
    CompressionMetadataBaseSchema,
    Type.Object({
        clp_config: Type.Unsafe<Buffer>({}),
    }),
]);
```
🧹 Nitpick | 🔵 Trivial
Document why Type.Unsafe is necessary for Buffer.
The use of Type.Unsafe<Buffer>({}) is correct since TypeBox doesn't provide a native Buffer type, but consider adding a comment explaining this to help future maintainers.
🔎 Suggested documentation

```diff
 /**
  * Compression metadata as stored, including the encoded IO config blob.
  */
 const CompressionMetadataSchema = Type.Intersect([
     CompressionMetadataBaseSchema,
     Type.Object({
+        // TypeBox doesn't provide a native Buffer type, so Type.Unsafe is used
         clp_config: Type.Unsafe<Buffer>({}),
     }),
 ]);
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
const CompressionMetadataSchema = Type.Intersect([
    CompressionMetadataBaseSchema,
    Type.Object({
        // TypeBox doesn't provide a native Buffer type, so Type.Unsafe is used
        clp_config: Type.Unsafe<Buffer>({}),
    }),
]);
```
🤖 Prompt for AI Agents
In components/webui/common/src/schemas/compress-metadata.ts around lines 28 to
33, add a brief inline comment above the clp_config property explaining why
Type.Unsafe<Buffer> is used (TypeBox lacks a native Buffer type so Unsafe is
required to represent Node Buffer binary data), so future maintainers understand
the rationale; keep the comment concise and mention that the schema expects a
Node Buffer and that Unsafe is intentional.
```ts
/**
 * Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
 * for compression metadata from older CLP releases with missing data.
 */
const ClpIoPartialConfigSchema = Type.Object({
    input: Type.Union([
        Type.Partial(ClpIoFsInputConfigSchema),
        Type.Partial(ClpIoS3InputConfigSchema),
    ]),
    output: Type.Partial(ClpIoOutputConfigSchema),
});
```
🧹 Nitpick | 🔵 Trivial
Document backward compatibility version.
The comment mentions "older CLP releases with missing data" but doesn't specify which version. Consider documenting the specific version (e.g., v0.6.0 mentioned in PR objectives) to clarify the compatibility requirement.
🔎 Suggested documentation improvement

```diff
 /**
- * Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
- * for compression metadata from older CLP releases with missing data.
+ * Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
+ * for compression metadata from CLP releases prior to v0.7.0 with missing data.
  */
 const ClpIoPartialConfigSchema =
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
/**
 * Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
 * for compression metadata from CLP releases prior to v0.7.0 with missing data.
 */
const ClpIoPartialConfigSchema = Type.Object({
    input: Type.Union([
        Type.Partial(ClpIoFsInputConfigSchema),
        Type.Partial(ClpIoS3InputConfigSchema),
    ]),
    output: Type.Partial(ClpIoOutputConfigSchema),
});
```
🤖 Prompt for AI Agents
In components/webui/common/src/schemas/compression.ts around lines 135 to 144,
the comment describing compatibility with "older CLP releases with missing data"
is vague; update the comment above ClpIoPartialConfigSchema to explicitly state
the backward-compatibility target (e.g., "supports CLP releases back to v0.6.0")
and, if relevant, note which fields were missing in those versions so
maintainers know why the schema is relaxed.
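The tolerance a partial schema buys for legacy rows can be illustrated without TypeBox. The hand-rolled checker below is a simplified stand-in for what `Type.Partial` expresses (present fields must type-check, missing fields are allowed), not the library's behaviour verbatim:

```typescript
// Lenient check: every *present* field must be a string, but absent
// fields are tolerated, mirroring the intent of a Partial schema.
const checkPartialStringFields = (
    obj: Record<string, unknown>,
    fields: string[]
): boolean => fields.every((f) => !(f in obj) || "string" === typeof obj[f]);

// A legacy row missing `timestamp_key` still passes...
const legacyInput = {dataset: "default"};
// ...while a row with a wrong type for a present field fails.
const badInput = {dataset: 42};
```

A strict schema would reject the legacy row outright; the partial form only rejects rows whose present fields are malformed.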
```ts
        },
    },
    async () => {
        console.log("Fetching compression metadata...");
```
🛠️ Refactor suggestion | 🟠 Major
Use Fastify's logger instead of console.log.
Replace console.log with fastify.log.info for consistent structured logging and proper log level management.
🔎 Proposed fix

```diff
 async () => {
-    console.log("Fetching compression metadata...");
+    fastify.log.info("Fetching compression metadata");
     const [rows] = await fastify.mysql.query<CompressionMetadataQueryRow[]>(
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```diff
-    console.log("Fetching compression metadata...");
+    fastify.log.info("Fetching compression metadata");
```
🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress-metadata/index.ts around line
34, the code uses console.log("Fetching compression metadata..."); which
bypasses Fastify's structured logger — replace this call with
fastify.log.info("Fetching compression metadata...") (or request.server.log.info
inside route handlers if `fastify` isn't in scope) to use Fastify's logger,
preserve log levels and structure, and remove the console.log call.
```ts
const getCompressionMetadataQuery = () => `
    SELECT
        id as ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS_MSG},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.START_TIME},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.DURATION},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UNCOMPRESSED_SIZE},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.COMPRESSED_SIZE},
        ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.CLP_CONFIG}
    FROM ${settings.SqlDbCompressionJobsTableName}
    WHERE ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME} >= NOW() - INTERVAL 30 DAY
    ORDER BY ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID} DESC;
`;
```
🧹 Nitpick | 🔵 Trivial
Consider making the 30-day interval configurable.
The query hardcodes a 30-day lookback period (Line 47). While this aligns with the documentation, consider making this configurable via settings.json or as a function parameter for flexibility.
🔎 Proposed enhancement

```diff
 /**
  * Builds the SQL query to fetch recent compression metadata.
  *
+ * @param daysBack Number of days to look back (default: 30).
  * @return Compression metadata query string.
  */
-const getCompressionMetadataQuery = () => `
+const getCompressionMetadataQuery = (daysBack: number = 30) => `
     SELECT
         id as ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS_MSG},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.START_TIME},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.DURATION},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UNCOMPRESSED_SIZE},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.COMPRESSED_SIZE},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.CLP_CONFIG}
     FROM ${settings.SqlDbCompressionJobsTableName}
-    WHERE ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME} >= NOW() - INTERVAL 30 DAY
+    WHERE ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME} >= NOW() - INTERVAL ${daysBack} DAY
     ORDER BY ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID} DESC;
 `;
```

🤖 Prompt for AI Agents
components/webui/server/src/routes/api/compress-metadata/sql.ts lines 35-49: the
SQL currently hardcodes a 30-day interval; make the lookback configurable by
either adding a function parameter (e.g., lookbackDays: number =
settings.SqlCompressionMetadataLookbackDays) or reading a new setting from
settings.json (default 30), then interpolate that value into the query instead
of the fixed "30 DAY"; validate the value is a positive integer, update any
callers to pass the parameter if using the function param approach, and add the
new setting with a default to settings.json/schema and relevant docs.
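The validation step the prompt calls for can be sketched in isolation. The table and column names below are illustrative placeholders, and the `?` placeholder style assumes a mysql2-like driver; none of this is taken from the PR:

```typescript
// Build the query with a validated, parameterized lookback window.
// Table/column names here are hypothetical placeholders.
const buildRecentJobsQuery = (lookbackDays: number): {sql: string; params: number[]} => {
    // Reject anything that is not a positive integer before it reaches SQL.
    if (false === Number.isInteger(lookbackDays) || lookbackDays <= 0) {
        throw new Error(`lookbackDays must be a positive integer, got ${lookbackDays}`);
    }
    return {
        sql: "SELECT id, status FROM compression_jobs " +
            "WHERE update_time >= NOW() - INTERVAL ? DAY ORDER BY id DESC;",
        params: [lookbackDays],
    };
};
```

Whether `INTERVAL ? DAY` is accepted as a placeholder position varies by driver; if it is not, validating the value as a positive integer first and then interpolating it is a reasonable fallback, since a checked integer cannot carry injection payloads.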
```ts
export const decodeJobConfig = (
    jobConfig: unknown
): {clp_config: ClpIoConfig} => {
    if (!(jobConfig instanceof Buffer)) {
        throw new Error("Missing clp_config buffer for compression metadata");
    }

    try {
        const decodedClpConfig = decode(
            brotliDecompressSync(jobConfig)
        ) as ClpIoConfig;

        return {clp_config: decodedClpConfig};
    } catch (err: unknown) {
        console.error(err);
        throw new Error("Failed to decode clp_config buffer");
    }
};
```
Add schema validation after decoding clp_config.
The decoded msgpack data is cast to ClpIoConfig without validation. Malformed or corrupted data could cause runtime errors downstream. Consider using TypeBox's Value.Check() or Value.Parse() to validate against ClpIoPartialConfigSchema before returning.
🔎 Proposed fix to add validation
+import {Value} from "@sinclair/typebox/value";
+import {ClpIoPartialConfigSchema} from "@webui/common/schemas/compression";
+
export const decodeJobConfig = (
jobConfig: unknown
): {clp_config: ClpIoConfig} => {
if (!(jobConfig instanceof Buffer)) {
throw new Error("Missing clp_config buffer for compression metadata");
}
try {
const decodedClpConfig = decode(
brotliDecompressSync(jobConfig)
- ) as ClpIoConfig;
+ );
+
+ if (false === Value.Check(ClpIoPartialConfigSchema, decodedClpConfig)) {
+ throw new Error("Decoded clp_config does not match expected schema");
+ }
- return {clp_config: decodedClpConfig};
+ return {clp_config: decodedClpConfig as ClpIoConfig};
} catch (err: unknown) {
console.error(err);
throw new Error("Failed to decode clp_config buffer");
}
};

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress-metadata/utils.ts around
lines 17 to 34, the function decodes a msgpack buffer and casts it to
ClpIoConfig without schema validation; add validation after decoding by using
the TypeBox schema ClpIoPartialConfigSchema and Value.Parse (or Value.Check) to
validate/parse the decoded object, and if validation fails throw a descriptive
error (including validation details) instead of returning an unchecked object so
downstream code receives only a schema-conformant clp_config.
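The decode-then-validate flow described above can be sketched with Node's stdlib alone. JSON stands in for msgpack here so the example is self-contained; the real route uses a msgpack decoder and a TypeBox `Value.Check` against `ClpIoPartialConfigSchema` instead of the hand-written guard below.

```typescript
import {brotliCompressSync, brotliDecompressSync} from "node:zlib";

// Simplified stand-in for the PR's ClpIoConfig type.
interface ClpIoConfigSketch {
    input: {dataset?: string; paths?: string[]};
}

const decodeJobConfigSketch = (jobConfig: unknown): {clp_config: ClpIoConfigSketch} => {
    if (!(jobConfig instanceof Buffer)) {
        throw new Error("Missing clp_config buffer for compression metadata");
    }

    const decoded: unknown = JSON.parse(brotliDecompressSync(jobConfig).toString());

    // Stand-in for Value.Check(ClpIoPartialConfigSchema, decoded).
    if (null === decoded || "object" !== typeof decoded || !("input" in decoded)) {
        throw new Error("Decoded clp_config does not match expected schema");
    }

    return {clp_config: decoded as ClpIoConfigSketch};
};

// Round-trip: compress a config the way the scheduler would store the blob.
const stored = brotliCompressSync(
    JSON.stringify({input: {dataset: "default", paths: ["/var/log/app"]}})
);
```

The validation guard runs after decompression but before the cast, so callers only ever see a shape-checked object.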
    } catch (err: unknown) {
        console.error(err);
        throw new Error("Failed to decode clp_config buffer");
Avoid logging potentially sensitive error details.
Line 31 logs the full error object which may contain buffer contents or stack traces with sensitive path information. Consider using a structured logger with appropriate log levels and sanitization, or only logging error messages.
🔎 Proposed fix to sanitize error logging
} catch (err: unknown) {
- console.error(err);
+ console.error("Failed to decode clp_config:", err instanceof Error ? err.message : "Unknown error");
throw new Error("Failed to decode clp_config buffer");
}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
} catch (err: unknown) {
    console.error("Failed to decode clp_config:", err instanceof Error ? err.message : "Unknown error");
    throw new Error("Failed to decode clp_config buffer");
}
🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress-metadata/utils.ts around
lines 30 to 32, avoid logging the full error object (which may contain sensitive
buffer contents or stack paths). Replace the console.error(err) call with a
sanitized log that only records a safe summary (e.g., err instanceof Error ?
err.message : String(err) with sensitive content truncated) using the project's
structured logger at an appropriate level, and do not include the error.stack or
the raw buffer; keep throwing the existing generic Error("Failed to decode
clp_config buffer") so callers don't receive sensitive details.
Description
This PR adds support for showing the dataset and paths in the web UI ingest table.
Unfortunately, these fields cannot be queried directly, since they are stored as compressed blobs in the database.
At first I tried to decompress client-side, but the code was a mess, so I moved the entire query to the server and decompressed there. Longer term, I think this is probably a better approach for all YScope (non-user) web UI queries. We initially thought sending queries from the client was better for future user-dashboard capability, but for querying our own CLP metadata, server-side handling may be better. One advantage is that Fastify will type the response.
Since Fastify is typing the response, I cleaned up the compression types Junhao added for his compress job form, since they are reused here. Unfortunately, some fields were missing, so I added them, and then allowed partial types so it won't give errors for anyone who compressed with v0.6.0 in the UI.
S3 support is not added here; paths are just hidden. The PR is already large.
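The "partial types" idea can be illustrated with plain TypeScript. The names below are assumptions for illustration, not the PR's actual schemas: a full input-config type plus a `Partial`-based variant lets rows written by older releases (e.g., v0.6.0, which lacked some fields) still pass validation.

```typescript
// Hypothetical shapes, not the PR's real ClpIoConfig definitions.
interface ClpIoInputConfig {
    type: "fs" | "s3";
    dataset: string;
    paths: string[];
}

// Older rows may be missing any of these fields, so accept a partial.
type ClpIoPartialInputConfig = Partial<ClpIoInputConfig>;

const describePaths = (input: ClpIoPartialInputConfig): string => {
    // Hide paths for S3 inputs (or when absent), mirroring the PR's fallback.
    if ("s3" === input.type || "undefined" === typeof input.paths) {
        return "N/A";
    }

    return input.paths.join(", ");
};
```

With TypeBox, the same effect would come from wrapping the schema's fields in `Type.Optional`, so that Fastify's response validation tolerates legacy rows.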
Checklist
breaking change.
Validation performed
Tested with clp-s.
Testing with clp and S3 is still underway. Even though there is no S3 support, the Fastify validation may still cause errors, so we need to check with command-line compression.
Summary by CodeRabbit