10BC0 feat(webui): Add dataset and paths to compression job table. by davemarco · Pull Request #1798 · y-scope/clp · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@davemarco
Copy link
Contributor
@davemarco davemarco commented Dec 18, 2025

Description

PR adds support for showing dataset and paths in webui ingest table.

Unfortunately these fields cannot be queried since they are compressed blobs in the database.

At first i tried to decompress client side, but code was a mess. So i moved the entire query to the server, and decompressed on the server. I think longer term this is probably a better approach for all yscope (non-user) webui queries. Initially we thought it was better for user dashboard capability in the future to send queries from client, but for querying our own clp metadata server handling may be better. One advantage is fastify will type the response.

Since fastify is typing response. I cleaned up the compress types junhao added for his compress job form, since they are reused here. Unfortunately, some fields were missing, so i added them, and then allowed partial types so it wont give errors for anyone who compressed with v0.6.0 in UI.

Did not add support for s3 here. paths are jsut hidden. The PR is already large.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Tested with clp-s
Testing with clp and S3 still underway. Even though there is no s3 support, the fastify validation may still cause errors so need to check with command line compression.

Summary by CodeRabbit

  • New Features
    • Jobs list now fetches recent compression metadata from the backend.
    • Added Dataset and Paths columns to the compression jobs table.
    • Paths display supports copyable text and tooltip ellipses.
    • Table layout improved with horizontal scrolling and fixed layout for better readability.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor
coderabbitai bot commented Dec 18, 2025

Walkthrough

Migrates frontend job fetching from client-side SQL to a server API; adds compression metadata schemas and CLP I/O config types; implements server query, decoding and mapping; updates frontend Jobs page, types and utilities to consume the new API; removes legacy SQL helpers and adjusts related backend job config types.

Changes

Cohort / File(s) Summary
Backend API Endpoint – Compress Metadata
components/webui/server/src/routes/api/compress-metadata/index.ts, components/webui/server/src/routes/api/compress-metadata/sql.ts, components/webui/server/src/routes/api/compress-metadata/utils.ts
New GET /api/compress-metadata route, SQL query builder selecting recent compression rows, row->typed mapping and decoding utilities that brotli-decompress and msgpack-decode clp_config, returning CompressionMetadataDecoded[].
Frontend API Client
components/webui/client/src/api/compress-metadata/index.ts
New exported async fetchCompressionJobs() calling /api/compress-metadata and returning CompressionMetadataDecoded[].
Frontend Jobs Page Refactor
components/webui/client/src/pages/IngestPage/Jobs/index.tsx
Replaced client-side SQL fetch with fetchCompressionJobs() and mapCompressionJobResponseToTableData; added rowKey, horizontal scroll and fixed table layout adjustments.
Frontend Jobs Types
components/webui/client/src/pages/IngestPage/Jobs/typings.tsx
JobData extended with `dataset: string
Frontend Jobs Utilities
components/webui/client/src/pages/IngestPage/Jobs/utils.ts
Renamed convertQueryJobsItemToJobDatamapCompressionJobResponseToTableData, input type now CompressionMetadataDecoded; added helpers to extract dataset and paths from CLP I/O config with path-prefix stripping and engine-aware logic.
Removed Frontend SQL Infrastructure
components/webui/client/src/pages/IngestPage/Jobs/sql.ts, components/webui/ 8000 client/src/pages/IngestPage/sqlConfig.ts
Deleted legacy SQL helper and removed COMPRESSION_JOBS_TABLE_COLUMN_NAMES enum from SQL config.
Shared Schemas
components/webui/common/src/schemas/compress-metadata.ts, components/webui/common/src/schemas/compression.ts
Added TypeBox schemas and types: CompressionMetadataSchema, CompressionMetadataDecodedSchema, ClpIoConfigSchema, ClpIoPartialConfigSchema; added CompressionJobInputType enum and related FS/S3 input/output schemas and types.
Backend Job Configuration Updates
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts, components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts, components/webui/server/src/routes/api/compress/index.ts
Replaced CompressionJobConfig usage with ClpIoConfig (removed old typings); compress route and job manager now use ClpIoConfig with expanded input (dataset, paths_to_compress, timestamp_key, type, unstructured) and output (tags, target_segment_size) fields.
Backend Archive Metadata Route
components/webui/server/src/routes/api/archive-metadata/index.ts
Replaced intermediate mysqlConnectionPool variable; now calls fastify.mysql.query() directly.
Package manifest
package.json
Manifest included in diff (no action summary beyond presence).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Review items needing extra attention:
    • components/webui/server/src/routes/api/compress-metadata/utils.ts — brotli/msgpack decoding and error paths.
    • components/webui/client/src/pages/IngestPage/Jobs/utils.ts — path extraction, prefix stripping and engine-conditional logic.
    • Type migrations from CompressionJobConfigClpIoConfig across server routes and job manager for schema/shape consistency.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding dataset and paths fields to the compression job table in the WebUI.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5845fbb and e1aea36.

📒 Files selected for processing (3)
  • components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (3 hunks)
  • components/webui/common/src/schemas/compression.ts (1 hunks)
  • components/webui/server/src/routes/api/compress/index.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/client/src/pages/IngestPage/Jobs/typings.tsx
  • components/webui/common/src/schemas/compression.ts
  • components/webui/server/src/routes/api/compress/index.ts
🧠 Learnings (4)
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.

Applied to files:

  • components/webui/client/src/pages/IngestPage/Jobs/typings.tsx
📚 Learning: 2025-08-04T18:38:33.130Z
Learnt from: haiqi96
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-08-04T18:38:33.130Z
Learning: User haiqi96 requested creating a GitHub issue to document a documentation discrepancy where Task version requirements in docs weren't updated after yscope-utils upgrade in PR #1158.

Applied to files:

  • components/webui/common/src/schemas/compression.ts
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • components/webui/common/src/schemas/compression.ts
📚 Learning: 2025-09-15T22:20:40.750Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1125
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:267-291
Timestamp: 2025-09-15T22:20:40.750Z
Learning: For CLP compression jobs, the team has decided to fail the entire job immediately upon encountering any invalid input path, rather than continuing to process valid paths. This decision was made during PR #1125 development.

Applied to files:

  • components/webui/server/src/routes/api/compress/index.ts
🧬 Code graph analysis (2)
components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (2)
components/webui/common/src/config.ts (2)
  • CLP_STORAGE_ENGINES (54-54)
  • STORAGE_TYPE (56-56)
components/webui/client/src/config/index.ts (2)
  • SETTINGS_STORAGE_ENGINE (24-24)
  • SETTINGS_LOGS_INPUT_TYPE (22-22)
components/webui/server/src/routes/api/compress/index.ts (4)
components/webui/common/src/schemas/compression.ts (3)
  • ClpIoConfig (163-163)
  • CompressionJobInputType (156-156)
  • ClpIoFsInputConfig (164-164)
components/webui/server/src/routes/api/compress/typings.ts (1)
  • CONTAINER_INPUT_LOGS_ROOT_DIR (7-7)
components/webui/client/src/settings.ts (1)
  • settings (38-38)
components/webui/common/src/config.ts (1)
  • CLP_STORAGE_ENGINES (54-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: check-generated
🔇 Additional comments (13)
components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (5)

1-17: LGTM!

The imports and Text destructuring are appropriate for the new conditional column features and paths rendering.


42-43: LGTM!

The new dataset and paths fields are correctly typed and align with the server-side CLP I/O configuration structure.


49-53: LGTM!

The feature flag logic correctly shows the Dataset column for CLP_S storage and the Paths column for filesystem input. The TODO comment appropriately documents the S3 limitation mentioned in the PR objectives.


102-129: LGTM!

The conditional column rendering is well-implemented:

  • Clean spread operator pattern for optional columns
  • Excellent UX for the Paths column with copyable text and ellipsis tooltip
  • Fixed width prevents layout issues with long path lists

59-143: LGTM!

Adding explicit title properties to all columns improves code clarity and ensures consistent labelling across the job table.

components/webui/common/src/schemas/compression.ts (4)

75-110: LGTM!

The CompressionJobInputType enum and input configuration schemas are well-structured:

  • Type discriminants (type field) enable proper union discrimination in ClpIoConfigSchema
  • Nullable fields correctly use Type.Union([Type.String(), Type.Null()]) pattern
  • Schema structure matches the Python backend as documented

112-133: LGTM!

The output configuration and main CLP I/O config schemas correctly model the compression job structure with proper TypeBox union types for flexible input sources (FS or S3).


135-144: Skip - already reviewed.

A past review comment already addressed the need to document the specific backward compatibility version (v0.6.0) for ClpIoPartialConfigSchema.


146-168: LGTM!

The type aliases and exports appropriately expose the CLP I/O configuration types for use across the server and client components.

components/webui/server/src/routes/api/compress/index.ts (4)

6-9: LGTM!

The import additions correctly bring in the new CLP I/O configuration types from the common schemas module.


22-39: LGTM!

The default compression job configuration correctly implements the ClpIoConfig structure with sensible defaults. The addition of target_segment_size and tags fields aligns with the updated schema, and the filesystem input type default is appropriate.


71-76: LGTM!

The job configuration setup is correct:

  • structuredClone ensures proper deep copying of the default config
  • The cast to ClpIoFsInputConfig is necessary and safe since the default config has type: CompressionJobInputType.FS
  • Path mapping correctly prepends the logs input root directory

The TODO appropriately documents the S3 limitation mentioned in the PR objectives.


78-88: LGTM!

The CLP_S-specific configuration correctly:

  • Sets unstructured to false for structured data processing
  • Validates and assigns the required dataset field with appropriate error logging
  • Conditionally assigns the timestamp_key when provided

The properties accessed (unstructured, dataset, timestamp_key) exist in both FS and S3 input configs, making the assignments type-safe without additional casts.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@davemarco davemarco marked this pull request as ready for review December 19, 2025 04:21
@davemarco davemarco requested a review from a team as a code owner December 19, 2025 04:21
Copy link
Contributor
@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/webui/server/src/routes/api/archive-metadata/index.ts (1)

26-32: Endpoint executes arbitrary SQL queries without authentication or parameterized queries.

Line 28 executes user-provided queryString directly against the database without parameterization or input sanitization. No authentication middleware is applied to this route, making it publicly accessible.

Critical issues:

  • SQL injection vulnerability: Direct SQL execution allows arbitrary database commands (DROP, DELETE, UPDATE, etc.)
  • Missing authentication: No checks verify user identity before accepting queries
  • Missing authorization: No role or permission validation

Require authentication on this endpoint and implement parameterized queries or query builder patterns (similar to compress-metadata which uses getCompressionMetadataQuery()).

components/webui/server/src/routes/api/compress/index.ts (1)

69-86: Add type guard before casting to CompressionJobFsInputConfig.

Line 72 performs an unsafe type cast to CompressionJobFsInputConfig without verifying jobConfig.input.type is actually CompressionJobInputType.FS. While currently safe (since DEFAULT_COMPRESSION_JOB_CONFIG always sets type to FS), the TODO comment on line 71 indicates S3 support will be added. When that happens, this cast becomes unsafe and could cause runtime errors.

🔎 Recommended fix with type guard
 const jobConfig: ClpIoConfig = structuredClone(DEFAULT_COMPRESSION_JOB_CONFIG);
 // eslint-disable-next-line no-warning-comments
 //TODO: Add support for S3 input
-(jobConfig.input as CompressionJobFsInputConfig).paths_to_compress = paths.map(
-    (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
-);
+if (CompressionJobInputType.FS === jobConfig.input.type) {
+    jobConfig.input.paths_to_compress = paths.map(
+        (path) => CONTAINER_INPUT_LOGS_ROOT_DIR + path
+    );
+}

Note: This assumes TypeScript's discriminated union type narrowing works for ClpIoConfig. If not, you may need an explicit type assertion after the guard.

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24a44f0 and 5845fbb.

📒 Files selected for processing (15)
  • components/webui/client/src/api/compress-metadata/index.ts (1 hunks)
  • components/webui/client/src/pages/IngestPage/Jobs/index.tsx (3 hunks)
  • components/webui/client/src/pages/IngestPage/Jobs/sql.ts (0 hunks)
  • components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (3 hunks)
  • components/webui/client/src/pages/IngestPage/Jobs/utils.ts (2 hunks)
  • components/webui/client/src/pages/IngestPage/sqlConfig.ts (0 hunks)
  • components/webui/common/src/schemas/compress-metadata.ts (1 hunks)
  • components/webui/common/src/schemas/compression.ts (1 hunks)
  • components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (2 hunks)
  • components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts (0 hunks)
  • components/webui/server/src/routes/api/archive-metadata/index.ts (1 hunks)
  • components/webui/server/src/routes/api/compress-metadata/index.ts (1 hunks)
  • components/webui/server/src/routes/api/compress-metadata/sql.ts (1 hunks)
  • components/webui/server/src/routes/api/compress-metadata/utils.ts (1 hunks)
  • components/webui/server/src/routes/api/compress/index.ts (2 hunks)
💤 Files with no reviewable changes (3)
  • components/webui/client/src/pages/IngestPage/sqlConfig.ts
  • components/webui/client/src/pages/IngestPage/Jobs/sql.ts
  • components/webui/server/src/plugins/app/CompressionJobDbManager/typings.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts
  • components/webui/client/src/api/compress-metadata/index.ts
  • components/webui/server/src/routes/api/compress-metadata/sql.ts
  • components/webui/client/src/pages/IngestPage/Jobs/typings.tsx
  • components/webui/server/src/routes/api/archive-metadata/index.ts
  • components/webui/common/src/schemas/compress-metadata.ts
  • components/webui/common/src/schemas/compression.ts
  • components/webui/server/src/routes/api/compress-metadata/utils.ts
  • components/webui/server/src/routes/api/compress/index.ts
  • components/webui/client/src/pages/IngestPage/Jobs/index.tsx
  • components/webui/client/src/pages/I 8000 ngestPage/Jobs/utils.ts
  • components/webui/server/src/routes/api/compress-metadata/index.ts
🧠 Learnings (3)
📚 Learning: 2025-09-28T15:00:22.170Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 1340
File: components/job-orchestration/job_orchestration/executor/compress/compression_task.py:528-528
Timestamp: 2025-09-28T15:00:22.170Z
Learning: In components/job-orchestration/job_orchestration/executor/compress/compression_task.py, there is a suggestion to refactor from passing logger as a parameter through multiple functions to creating a ClpCompressor class that takes the logger as a class member, with current helper functions becoming private member functions.

Applied to files:

  • components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts
📚 Learning: 2024-11-21T15:51:33.203Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 596
File: components/log-viewer-webui/client/src/api/query.js:16-23
Timestamp: 2024-11-21T15:51:33.203Z
Learning: In `components/log-viewer-webui/client/src/api/query.js`, the `ExtractJsonResp` type definition is accurate as-is and does not require modification. When suggesting changes to type definitions, ensure they align with the server-side definitions, referencing the source code if necessary.

Applied to files:

  • components/webui/server/src/routes/api/compress-metadata/sql.ts
  • components/webui/client/src/pages/IngestPage/Jobs/typings.tsx
  • components/webui/server/src/routes/api/compress-metadata/utils.ts
  • components/webui/server/src/routes/api/compress/index.ts
  • components/webui/client/src/pages/IngestPage/Jobs/index.tsx
📚 Learning: 2025-09-15T22:20:40.750Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1125
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:267-291
Timestamp: 2025-09-15T22:20:40.750Z
Learning: For CLP compression jobs, the team has decided to fail the entire job immediately upon encountering any invalid input path, rather than continuing to process valid paths. This decision was made during PR #1125 development.

Applied to files:

  • components/webui/server/src/routes/api/compress/index.ts
  • components/webui/client/src/pages/IngestPage/Jobs/utils.ts
🧬 Code graph analysis (5)
components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)
components/webui/common/src/schemas/compression.ts (1)
  • ClpIoConfig (166-166)
components/webui/client/src/api/compress-metadata/index.ts (1)
components/webui/common/src/schemas/compress-metadata.ts (1)
  • CompressionMetadataDecoded (58-58)
components/webui/server/src/routes/api/compress/index.ts (3)
components/webui/common/src/schemas/compression.ts (3)
  • ClpIoConfig (166-166)
  • CompressionJobInputType (159-159)
  • CompressionJobFsInputConfig (169-169)
components/webui/server/src/routes/api/compress/typings.ts (1)
  • CONTAINER_INPUT_LOGS_ROOT_DIR (7-7)
components/webui/common/src/config.ts (1)
  • CLP_STORAGE_ENGINES (54-54)
components/webui/client/src/pages/IngestPage/Jobs/index.tsx (3)
components/webui/client/src/api/compress-metadata/index.ts (1)
  • fetchCompressionJobs (19-19)
components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (1)
  • JobData (148-148)
components/webui/client/src/pages/IngestPage/Jobs/utils.ts (1)
  • mapCompressionJobResponseToTableData (142-142)
components/webui/client/src/pages/IngestPage/Jobs/utils.ts (5)
components/webui/common/src/schemas/compression.ts (2)
  • ClpIoConfig (166-166)
  • CompressionJobInputType (159-159)
components/webui/common/src/config.ts (1)
  • CLP_STORAGE_ENGINES (54-54)
components/webui/client/src/config/index.ts (1)
  • SETTINGS_STORAGE_ENGINE (24-24)
components/webui/common/src/schemas/compress-metadata.ts (1)
  • CompressionMetadataDecoded (58-58)
components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (2)
  • JobData (148-148)
  • CompressionJobStatus (151-151)
🔇 Additional comments (20)
components/webui/server/src/routes/api/archive-metadata/index.ts (1)

28-28: Refactoring to fastify.mysql.query is correct and idiomatic.

The change from using a pool variable to calling fastify.mysql.query directly is cleaner and the API is properly supported. This method exists as a utility on the fastify.mysql namespace and returns an array that can be destructured as shown, where the first element contains the query result set.

components/webui/common/src/schemas/compression.ts (3)

75-81: LGTM! Enum matches Python InputType class.

The enum definition is clear and aligns with the documented Python class structure.


101-110: Schema fields correctly match Python S3InputConfig definition.

The nullable fields (dataset, keys, timestamp_key) in the TypeScript schema match the Python class definition, where all three are defined as optional with | None. The unstructured field is correctly non-nullable in both versions.


86-96: Verify nullable field requirements with Python FsInputConfig.

The schema defines several fields as Type.Union([Type.String(), Type.Null()]), making them nullable. Ensure this nullability matches the Python FsInputConfig class definition, particularly for dataset, path_prefix_to_remove, and timestamp_key.

#!/bin/bash
# Description: Verify FsInputConfig field requirements in Python codebase

# Search for FsInputConfig class definition
rg -n -A 15 "class FsInputConfig" --type=py
components/webui/server/src/routes/api/compress-metadata/utils.ts (1)

42-72: LGTM! Mapping function is well-structured.

The mapping logic correctly transforms database rows to the decoded metadata format. The destructuring approach is clean and maintainable.

components/webui/server/src/routes/api/compress-metadata/index.ts (1)

35-39: LGTM! Query and mapping logic is sound.

The route correctly executes the query and maps the results using the utility function. Error handling will be managed by Fastify's default error handler.

components/webui/client/src/pages/IngestPage/Jobs/index.tsx (2)

20-26: LGTM! Data fetching migration is clean.

The switch from SQL-based queries to the new fetchCompressionJobs API with mapCompressionJobResponseToTableData transformation is well-structured. The query key remains stable for proper cache management.


38-40: Table configuration enhancements are appropriate.

The addition of rowKey, horizontal scrolling (x: true), and fixed layout are good improvements to accommodate the new dataset and paths columns mentioned in the PR objectives.

components/webui/common/src/schemas/compress-metadata.ts (1)

40-47: Good design using partial schema for backward compatibility.

Using ClpIoPartialConfigSchema in the decoded schema properly supports the backward compatibility requirement mentioned in the PR objectives for v0.6.0 data.

components/webui/server/src/routes/api/compress-metadata/sql.ts (1)

6-28: LGTM! Enum and interface definitions are well-structured.

The enum provides type-safe column name references, and the interface correctly extends RowDataPacket with properly typed fields matching the database schema.

components/webui/server/src/plugins/app/CompressionJobDbManager/index.ts (1)

44-44: Parameter type change already verified in all call sites.

The caller in components/webui/server/src/routes/api/compress/index.ts at line 89 correctly passes jobConfig of type ClpIoConfig, which matches the updated method signature. The old CompressionJobConfig type is not found in the codebase, confirming the change has been fully propagated.

components/webui/client/src/pages/IngestPage/Jobs/typings.tsx (3)

1-19: LGTM: Imports and setup are appropriate.

The new imports support the conditional column visibility logic and the enhanced Paths column rendering with copyable text.


43-44: LGTM: JobData interface properly extended.

The new fields align with the PR objectives and use appropriate nullable types for dataset.


50-130: LGTM: Conditional column logic is well-implemented.

The conditional rendering of Dataset and Paths columns based on storage engine configuration is clean and the Paths column render function provides good UX with copyable text and tooltip ellipsis.

components/webui/server/src/routes/api/compress/index.ts (2)

4-9: LGTM: Import additions support the new CLP IO config types.


20-37: LGTM: DEFAULT_COMPRESSION_JOB_CONFIG properly updated to ClpIoConfig type.

The new fields have sensible defaults and the object remains frozen to prevent mutations.

components/webui/client/src/pages/IngestPage/Jobs/utils.ts (4)

24-31: LGTM: stripPrefix follows coding guidelines.

The implementation correctly uses false === path.startsWith(prefix) rather than negation, adhering to the project's coding guidelines.


68-77: LGTM: extractDataFromIoConfig logic is correct.

The conditional dataset extraction based on storage engine is appropriate, and the code follows coding guidelines.


114-128: LGTM: NaN checks follow coding guidelines.

The code correctly uses false === isNaN(...) pattern rather than !isNaN(...), adhering to the project's coding guidelines.


92-140: No changes required. The type cast clpConfig as ClpIoConfig on line 101 is legitimate and intentional. The clp_config field is correctly defined as ClpIoPartialConfigSchema in CompressionMetadataDecoded to support backward compatibility with older CLP releases, while the cast safely narrows the type to satisfy the function signature.

Likely an incorrect or invalid review comment.

Comment on lines +10 to +16
const fetchCompressionJobs = async (): Promise<CompressionMetadataDecoded[]> => {
const {data} = await axios.get<CompressionMetadataDecoded[]>(
"/api/compress-metadata"
);

return data;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider adding error handling or documenting error propagation.

The function doesn't handle errors from the axios request. While letting errors propagate to the caller is acceptable, consider:

  • Adding JSDoc to document which errors can be thrown (network errors, 4xx/5xx responses)
  • Or adding error handling to transform or enrich error messages for better debugging
🔎 Example with documented error propagation
 /**
  * Retrieves recent compression jobs (last 30 days).
  *
+ * @throws {AxiosError} If the request fails due to network issues or server errors.
  * @return Recent compression jobs metadata.
  */
 const fetchCompressionJobs = async (): Promise<CompressionMetadataDecoded[]> => {

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In components/webui/client/src/api/compress-metadata/index.ts around lines 10 to
16, the axios GET call lacks error handling or documentation of propagated
errors; update the function by adding a JSDoc comment that documents the
possible errors (network errors, AxiosError for non-2xx responses) and either
wrap the axios request in a try/catch to rethrow a richer, contextual Error
(including HTTP status and response data when available) or explicitly document
that the function lets Axios errors propagate so callers must handle them.

Comment on lines +39 to +60
const extractPathsFromInput = (clpConfig: ClpIoConfig): string[] => {
const {input} = clpConfig;

// Fallback for legacy/partial payloads that include paths but omit the FS type
if ("paths_to_compress" in input) {
input.type = CompressionJobInputType.FS;
}

if (CompressionJobInputType.FS === input.type) {
const prefixToRemove = input.path_prefix_to_remove;
if (prefixToRemove) {
return input.paths_to_compress.map((path) => stripPrefix(path, prefixToRemove));
}

return input.paths_to_compress;
}

// eslint-disable-next-line no-warning-comments
// TODO: Add support to parse S3 paths; may need a bucket prefix from the CLP config.

return [];
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid mutating the input config parameter.

Lines 43-45 mutate the input object by setting input.type = CompressionJobInputType.FS. This side effect could cause unexpected behaviour if the config object is used elsewhere after calling this function. Consider either cloning the config before mutation or handling legacy payloads without mutation.

🔎 Suggested approach to avoid mutation

Since this function only reads from the input config, document the mutation clearly or restructure to avoid it:

 const extractPathsFromInput = (clpConfig: ClpIoConfig): string[] => {
     const {input} = clpConfig;
 
-    // Fallback for legacy/partial payloads that include paths but omit the FS type
-    if ("paths_to_compress" in input) {
-        input.type = CompressionJobInputType.FS;
-    }
-
-    if (CompressionJobInputType.FS === input.type) {
+    // Fallback for legacy/partial payloads that include paths but omit the FS type
+    if (CompressionJobInputType.FS === input.type || "paths_to_compress" in input) {
         const prefixToRemove = input.path_prefix_to_remove;
         if (prefixToRemove) {
             return input.paths_to_compress.map((path) => stripPrefix(path, prefixToRemove));
         }
 
         return input.paths_to_compress;
     }

This assumes that if paths_to_compress exists, it's safe to treat the input as FS type without mutation.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const extractPathsFromInput = (clpConfig: ClpIoConfig): string[] => {
const {input} = clpConfig;
// Fallback for legacy/partial payloads that include paths but omit the FS type
if ("paths_to_compress" in input) {
input.type = CompressionJobInputType.FS;
}
if (CompressionJobInputType.FS === input.type) {
const prefixToRemove = input.path_prefix_to_remove;
if (prefixToRemove) {
return input.paths_to_compress.map((path) => stripPrefix(path, prefixToRemove));
}
return input.paths_to_compress;
}
// eslint-disable-next-line no-warning-comments
// TODO: Add support to parse S3 paths; may need a bucket prefix from the CLP config.
return [];
};
const extractPathsFromInput = (clpConfig: ClpIoConfig): string[] => {
const {input} = clpConfig;
// Fallback for legacy/partial payloads that include paths but omit the FS type
if (CompressionJobInputType.FS === input.type || "paths_to_compress" in input) {
const prefixToRemove = input.path_prefix_to_remove;
if (prefixToRemove) {
return input.paths_to_compress.map((path) => stripPrefix(path, prefixToRemove));
}
return input.paths_to_compress;
}
// eslint-disable-next-line no-warning-comments
// TODO: Add support to parse S3 paths; may need a bucket prefix from the CLP config.
return [];
};
🤖 Prompt for AI Agents
components/webui/client/src/pages/IngestPage/Jobs/utils.ts around lines 39 to
60: the function currently mutates the passed-in clpConfig.input by assigning
input.type = CompressionJobInputType.FS for legacy payloads; instead avoid side
effects by treating the detected legacy case locally (do not write back to
input) — either create a local variable (e.g., const resolvedType =
"paths_to_compress" in input ? CompressionJobInputType.FS : input.type) or
shallow-clone the input before modifying, then use resolvedType when branching
and leave the original clpConfig untouched.

Comment on lines +28 to +33
const CompressionMetadataSchema = Type.Intersect([
CompressionMetadataBaseSchema,
Type.Object({
clp_config: Type.Unsafe<Buffer>({}),
}),
]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Document why Type.Unsafe is necessary for Buffer.

The use of Type.Unsafe<Buffer>({}) is correct since TypeBox doesn't provide a native Buffer type, but consider adding a comment explaining this to help future maintainers.

🔎 Suggested documentation
 /**
  * Compression metadata as stored, including the encoded IO config blob.
  */
 const CompressionMetadataSchema = Type.Intersect([
     CompressionMetadataBaseSchema,
     Type.Object({
+        // TypeBox doesn't provide a native Buffer type, so Type.Unsafe is used
         clp_config: Type.Unsafe<Buffer>({}),
     }),
 ]);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const CompressionMetadataSchema = Type.Intersect([
CompressionMetadataBaseSchema,
Type.Object({
clp_config: Type.Unsafe<Buffer>({}),
}),
]);
const CompressionMetadataSchema = Type.Intersect([
CompressionMetadataBaseSchema,
Type.Object({
// TypeBox doesn't provide a native Buffer type, so Type.Unsafe is used
clp_config: Type.Unsafe<Buffer>({}),
}),
]);
🤖 Prompt for AI Agents
In components/webui/common/src/schemas/compress-metadata.ts around lines 28 to
33, add a brief inline comment above the clp_config property explaining why
Type.Unsafe<Buffer> is used (TypeBox lacks a native Buffer type so Unsafe is
required to represent Node Buffer binary data), so future maintainers understand
the rationale; keep the comment concise and mention that the schema expects a
Node Buffer and that Unsafe is intentional.

Comment on lines +135 to +144
/**
* Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
* for compression metadata from older CLP releases with missing data.
*/
const ClpIoPartialConfigSchema =
Type.Object({
input: Type.Union([Type.Partial(ClpIoFsInputConfigSchema),
Type.Partial(ClpIoS3InputConfigSchema)]),
output: Type.Partial(ClpIoOutputConfigSchema),
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Document backward compatibility version.

The comment mentions "older CLP releases with missing data" but doesn't specify which version. Consider documenting the specific version (e.g., v0.6.0 mentioned in PR objectives) to clarify the compatibility requirement.

🔎 Suggested documentation improvement
 /**
- * Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
- * for compression metadata from older CLP releases with missing data.
+ * Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
+ * for compression metadata from CLP releases prior to v0.7.0 with missing data.
  */
 const ClpIoPartialConfigSchema =
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
* Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
* for compression metadata from older CLP releases with missing data.
*/
const ClpIoPartialConfigSchema =
Type.Object({
input: Type.Union([Type.Partial(ClpIoFsInputConfigSchema),
Type.Partial(ClpIoS3InputConfigSchema)]),
output: Type.Partial(ClpIoOutputConfigSchema),
});
/**
* Less strict version of `ClpIoConfigSchema` to prevent Fastify validation errors
* for compression metadata from CLP releases prior to v0.7.0 with missing data.
*/
const ClpIoPartialConfigSchema =
Type.Object({
input: Type.Union([Type.Partial(ClpIoFsInputConfigSchema),
Type.Partial(ClpIoS3InputConfigSchema)]),
output: Type.Partial(ClpIoOutputConfigSchema),
});
🤖 Prompt for AI Agents
In components/webui/common/src/schemas/compression.ts around lines 135 to 144,
the comment describing compatibility with "older CLP releases with missing data"
is vague; update the comment above ClpIoPartialConfigSchema to explicitly state
the backward-compatibility target (e.g., "supports CLP releases back to v0.6.0")
and, if relevant, note which fields were missing in those versions so
maintainers know why the schema is relaxed.

},
},
async () => {
console.log("Fetching compression metadata...");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use Fastify's logger instead of console.log.

Replace console.log with fastify.log.info for consistent structured logging and proper log level management.

🔎 Proposed fix
         async () => {
-            console.log("Fetching compression metadata...");
+            fastify.log.info("Fetching compression metadata");
             const [rows] = await fastify.mysql.query<CompressionMetadataQueryRow[]>(
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
console.log("Fetching compression metadata...");
fastify.log.info("Fetching compression metadata");
🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress-metadata/index.ts around line
34, the code uses console.log("Fetching compression metadata..."); which
bypasses Fastify's structured logger — replace this call with
fastify.log.info("Fetching compression metadata...") (or request.server.log.info
inside route handlers if `fastify` isn't in scope) to use Fastify's logger,
preserve log levels and structure, and remove the console.log call.

Comment on lines +35 to +49
const getCompressionMetadataQuery = () => `
SELECT
id as ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS_MSG},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.START_TIME},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.DURATION},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UNCOMPRESSED_SIZE},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.COMPRESSED_SIZE},
${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.CLP_CONFIG}
FROM ${settings.SqlDbCompressionJobsTableName}
WHERE ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME} >= NOW() - INTERVAL 30 DAY
ORDER BY ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID} DESC;
`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider making the 30-day interval configurable.

The query hardcodes a 30-day lookback period (Line 47). While this aligns with the documentation, consider making this configurable via settings.json or as a function parameter for flexibility.

🔎 Proposed enhancement
 /**
  * Builds the SQL query to fetch recent compression metadata.
  *
+ * @param daysBack Number of days to look back (default: 30).
  * @return Compression metadata query string.
  */
-const getCompressionMetadataQuery = () => `
+const getCompressionMetadataQuery = (daysBack: number = 30) => `
     SELECT
         id as ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.STATUS_MSG},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.START_TIME},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.DURATION},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UNCOMPRESSED_SIZE},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.COMPRESSED_SIZE},
         ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.CLP_CONFIG}
     FROM ${settings.SqlDbCompressionJobsTableName}
-    WHERE ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME} >= NOW() - INTERVAL 30 DAY
+    WHERE ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.UPDATE_TIME} >= NOW() - INTERVAL ${daysBack} DAY
     ORDER BY ${COMPRESSION_JOBS_TABLE_COLUMN_NAMES.ID} DESC;
 `;
🤖 Prompt for AI Agents
components/webui/server/src/routes/api/compress-metadata/sql.ts lines 35-49: the
SQL currently hardcodes a 30-day interval; make the lookback configurable by
either adding a function parameter (e.g., lookbackDays: number =
settings.SqlCompressionMetadataLookbackDays) or reading a new setting from
settings.json (default 30), then interpolate that value into the query instead
of the fixed "30 DAY"; validate the value is a positive integer, update any
callers to pass the parameter if using the function param approach, and add the
new setting with a default to settings.json/schema and relevant docs.

Comment on lines +17 to +34
export const decodeJobConfig = (
jobConfig: unknown
): {clp_config: ClpIoConfig} => {
if (!(jobConfig instanceof Buffer)) {
throw new Error("Missing clp_config buffer for compression metadata");
}

try {
const decodedClpConfig = decode(
brotliDecompressSync(jobConfig)
) as ClpIoConfig;

return {clp_config: decodedClpConfig};
} catch (err: unknown) {
console.error(err);
throw new Error("Failed to decode clp_config buffer");
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add schema validation after decoding clp_config.

The decoded msgpack data is cast to ClpIoConfig without validation. Malformed or corrupted data could cause runtime errors downstream. Consider using TypeBox's Value.Check() or Value.Parse() to validate against ClpIoPartialConfigSchema before returning.

🔎 Proposed fix to add validation
+import {Value} from "@sinclair/typebox/value";
+import {ClpIoPartialConfigSchema} from "@webui/common/schemas/compression";
+
 export const decodeJobConfig = (
     jobConfig: unknown
 ): {clp_config: ClpIoConfig} => {
     if (!(jobConfig instanceof Buffer)) {
         throw new Error("Missing clp_config buffer for compression metadata");
     }
 
     try {
         const decodedClpConfig = decode(
             brotliDecompressSync(jobConfig)
-        ) as ClpIoConfig;
+        );
+        
+        if (false === Value.Check(ClpIoPartialConfigSchema, decodedClpConfig)) {
+            throw new Error("Decoded clp_config does not match expected schema");
+        }
 
-        return {clp_config: decodedClpConfig};
+        return {clp_config: decodedClpConfig as ClpIoConfig};
     } catch (err: unknown) {
         console.error(err);
         throw new Error("Failed to decode clp_config buffer");
     }
 };

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress-metadata/utils.ts around
lines 17 to 34, the function decodes a msgpack buffer and casts it to
ClpIoConfig without schema validation; add validation after decoding by using
the TypeBox schema ClpIoPartialConfigSchema and Value.Parse (or Value.Check) to
validate/parse the decoded object, and if validation fails throw a descriptive
error (including validation details) instead of returning an unchecked object so
downstream code receives only a schema-conformant clp_config.

Comment on lines +30 to +32
} catch (err: unknown) {
console.error(err);
throw new Error("Failed to decode clp_config buffer");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid logging potentially sensitive error details.

Line 31 logs the full error object which may contain buffer contents or stack traces with sensitive path information. Consider using a structured logger with appropriate log levels and sanitization, or only logging error messages.

🔎 Proposed fix to sanitize error logging
     } catch (err: unknown) {
-        console.error(err);
+        console.error("Failed to decode clp_config:", err instanceof Error ? err.message : "Unknown error");
         throw new Error("Failed to decode clp_config buffer");
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} catch (err: unknown) {
console.error(err);
throw new Error("Failed to decode clp_config buffer");
} catch (err: unknown) {
console.error("Failed to decode clp_config:", err instanceof Error ? err.message : "Unknown error");
throw new Error("Failed to decode clp_config buffer");
}
🤖 Prompt for AI Agents
In components/webui/server/src/routes/api/compress-metadata/utils.ts around
lines 30 to 32, avoid logging the full error object (which may contain sensitive
buffer contents or stack paths). Replace the console.error(err) call with a
sanitized log that only records a safe summary (e.g., err instanceof Error ?
err.message : String(err) with sensitive content truncated) using the project's
structured logger at an appropriate level, and do not include the error.stack or
the raw buffer; keep throwing the existing generic Error("Failed to decode
clp_config buffer") so callers don't receive sensitive details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

0