E52D feat(cli): Add schema and basic functionality for integrations config file by tkislan · Pull Request #244 · deepnote/deepnote · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@tkislan
Copy link
Contributor
@tkislan tkislan commented Jan 25, 2026

Summary by CodeRabbit

  • New Features

    • Configure database integrations via YAML with strict validation and contextual error reporting
    • Supports many integration types (Postgres, MySQL, Snowflake, BigQuery, Athena, Redshift, Databricks, MongoDB, etc.)
    • Case-insensitive ID lookup for integrations
  • Tests

    • Comprehensive parsing and validation tests covering missing files, YAML errors, mixed valid/invalid entries, and defaults
  • Chores

    • Added YAML runtime dependency and JSON Schema generation tooling

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor
coderabbitai bot commented Jan 25, 2026
📝 Walkthrough

Walkthrough

Adds integration configuration parsing, validation, and utilities: Zod schemas (loose and strict) for integrations, parsing logic that reads YAML, reports structured validation issues, and returns valid integrations, plus helpers to build a case-insensitive map and derive the default file path. Adds a script to emit a JSON Schema from the Zod schema and comprehensive tests. New exports consolidate integrations types and functions under a central module.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI Process
    participant FS as File System
    participant YAML as YAML Parser
    participant Zod as Zod Validator
    participant Result as Parse Result

    CLI->>FS: read integrations file
    alt file missing
        FS-->>Result: return empty IntegrationsParseResult
    else file exists
        FS->>YAML: provide file content
        YAML->>CLI: parsed object or error
        alt YAML parse error
            CLI-->>Result: add yaml_parse_error issue
        else valid YAML
            CLI->>Zod: validate top-level (base) schema
            alt top-level invalid
                Zod-->>Result: add issues, return
            else top-level valid
                loop per integration entry
                    CLI->>Zod: validate entry (strict schema)
                    alt entry invalid
                        Zod-->>Result: add indexed issues
                    else entry valid
                        Zod-->>Result: include integration
                    end
                end
            end
        end
    end
    Result-->>CLI: IntegrationsParseResult (integrations + issues)
Loading
sequenceDiagram
    participant Script as Generate Script
    participant Zod as Zod Schema
    participant ZTS as zodToJsonSchema
    participant FS as File System

    Script->>Zod: import integrationsFileSchema
    Script->>ZTS: convert Zod -> JSON Schema
    ZTS->>Script: return JSON Schema
    Script->>FS: write json-schemas/integrations-file-schema.json
    alt write succeeds
        FS-->>Script: file written
        Script-->>Script: log "Done", exit 0
    else write fails
        FS-->>Script: error
        Script-->>Script: log error, exit 1
    end
Loading
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main objective: adding a schema and basic functionality for integrations config file parsing.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor
@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@json-schemas/integrations-file-schema.json`:
- Around line 1-6: Add a machine-readable generated marker to the JSON Schema by
inserting a top-level "$comment" (or "description") property such as "$comment":
"GENERATED FILE - DO NOT EDIT. Source: <your generator>" at the root of the
schema object (the same level as "type" and "properties") so tooling and humans
know this file is generated; update the schema that contains the "integrations"
array definition accordingly.

In `@packages/cli/scripts/generate-json-schemas.ts`:
- Around line 8-9: The script writes JSON to a relative path which is
non-deterministic and can fail if the directory doesn't exist; make the output
path deterministic by resolving a project-root directory (e.g., const outputDir
= path.resolve(process.cwd(), 'json-schemas')) and ensure the directory exists
before writing (await fs.mkdir(outputDir, { recursive: true })), then write the
schema to path.join(outputDir, 'integrations-file-schema.json') using the
existing jsonSchema variable produced by
zodToJsonSchema(integrationsFileSchema).
- Around line 3-8: Remove the unnecessary double-cast and unused import: stop
casting integrationsFileSchema through "as unknown as ZodSchema" when calling
zodToJsonSchema and pass integrationsFileSchema directly to zodToJsonSchema;
also delete the unused ZodSchema import from the top of the file so the call
becomes zodToJsonSchema(integrationsFileSchema) inside the run function.

In `@packages/cli/src/integrations/parse-integrations.test.ts`:
- Around line 282-286: The test pins a double-slash outcome for
getDefaultIntegrationsFilePath; change the assertion to be robust to
implementation (e.g., use path.join or path.normalize when building the expected
path) so the test doesn't fail if getDefaultIntegrationsFilePath switches to
path.join—update the test in parse-integrations.test.ts to compute the expected
value with Node's path utilities (referencing the getDefaultIntegrationsFilePath
call and its input '/path/to/project/') or assert that the returned path
endsWith('.deepnote.env.yaml') instead of asserting the exact double-slash
string.

In `@packages/cli/src/integrations/parse-integrations.ts`:
- Around line 36-41: In formatZodIssues, avoid producing trailing dots when
issue.path is empty by building the path from non-empty segments: take
pathPrefix and issue.path, filter out empty strings, then join with '.'; if all
segments are empty return an empty string. Update the path computation in
formatZodIssues to use this filtered-join approach (referencing formatZodIssues,
pathPrefix and issue.path).
- Around line 140-141: The getDefaultIntegrationsFilePath function builds a file
path using string concatenation which can produce incorrect separators on some
platforms; replace the concatenation with path.join to construct the path
cross‑platform (import Node's path module if not already present) so
getDefaultIntegrationsFilePath(deepnoteFileDir) returns
path.join(deepnoteFileDir, '.deepnote.env.yaml') instead of using
`${deepnoteFileDir}/.deepnote.env.yaml`.
- Around line 55-76: The fs.access try/catch in parse-integrations.ts currently
treats any access error as "file missing"; update the catch to inspect the
caught error (e.g., the error from fs.access) and only return the
emptyIntegrations silently when the error is an ENOENT "file not found" error;
for all other errors (EACCES, etc.) push an issue onto the issues array with a
descriptive message (similar to the file_read_error pattern) and return {
integrations: emptyIntegrations, issues } so callers know there was an access
problem; locate the fs.access block around filePath and use the existing
emptyIntegrations and issues symbols to implement this behavior.

Comment on lines +1 to +6
{
"type": "object",
"properties": {
"integrations": {
"type": "array",
"items": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider marking this file as generated.
Helps avoid manual edits drifting from the script output.

🤖 Prompt for AI Agents
In `@json-schemas/integrations-file-schema.json` around lines 1 - 6, Add a
machine-readable generated marker to the JSON Schema by inserting a top-level
"$comment" (or "description") property such as "$comment": "GENERATED FILE - DO
NOT EDIT. Source: <your generator>" at the root of the schema object (the same
level as "type" and "properties") so tooling and humans know this file is
generated; update the schema that contains the "integrations" array definition
accordingly.

Comment on lines +3 to +8
import type { ZodSchema } from 'zod/v3'
import { zodToJsonSchema } from 'zod-to-json-schema'
import { integrationsFileSchema } from '../src/integrations/integrations-file-schemas'

async function run() {
const jsonSchema = zodToJsonSchema(integrationsFileSchema as unknown as ZodSchema)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

zod-to-json-schema zodToJsonSchema type signature parameters version 3.25

💡 Result:

In zod-to-json-schema v3.25.x, the exported function has this TypeScript signature: [1]

declare const zodToJsonSchema: <Target extends Targets = "jsonSchema7">(
  schema: ZodSchema<any>,
  options?: string | Partial<Options<Target>> | undefined
) => (
  Target extends "jsonSchema7" ? JsonSchema7Type : object
) & {
  $schema?: string | undefined;
  definitions?: {
    [key: string]:
      Target extends "jsonSchema7"
        ? JsonSchema7Type
        : Target extends "jsonSchema2019-09"
          ? JsonSchema7Type
          : object;
  } | undefined;
};

Parameters

  • schema: ZodSchema<any> (imported from zod/v3) [1]
  • options?: string | Partial<Options<Target>> [1]
    • If string, it’s the schema name (used to place the schema in definitions and $ref it). [2]

Generic

  • <Target extends Targets = "jsonSchema7"> [1]

Return type

  • JSON Schema object typed as JsonSchema7Type when Target is "jsonSchema7", otherwise object, plus optional $schema and definitions. [1]

Sources: [1] [2]


🏁 Script executed:

# Check the actual file to see the context and understand the schema type
find . -type f -name "generate-json-schemas.ts" | head -5

Repository: deepnote/deepnote

Length of output: 108


🏁 Script executed:

cat -n ./packages/cli/scripts/generate-json-schemas.ts

Repository: deepnote/deepnote

Length of output: 814


🏁 Script executed:

# Check the type of integrationsFileSchema
find . -type f -name "integrations-file-schemas.ts" -o -name "integrations-file-schemas.tsx" | head -5

Repository: deepnote/deepnote

Length of output: 121


🏁 Script executed:

cat -n ./packages/cli/src/integrations/integrations-file-schemas.ts | head -50

Repository: deepnote/deepnote

Length of output: 1130


🏁 Script executed:

# Check if ZodSchema is used anywhere else in the file
rg "ZodSchema" ./packages/cli/scripts/generate-json-schemas.ts

Repository: deepnote/deepnote

Length of output: 185


Remove the unnecessary double-cast.

The zodToJsonSchema function accepts ZodSchema<any>, and integrationsFileSchema (a z.object()) is already a Zod schema. The as unknown as ZodSchema cast is unsafe and bypasses type checking. Pass it directly. Also remove the unused ZodSchema import.

♻️ Proposed change
 import fs from 'node:fs/promises'
 import path from 'node:path'
-import type { ZodSchema } from 'zod/v3'
 import { zodToJsonSchema } from 'zod-to-json-schema'
 import { integrationsFileSchema } from '../src/integrations/integrations-file-schemas'

 async function run() {
-  const jsonSchema = zodToJsonSchema(integrationsFileSchema as unknown as ZodSchema)
+  const jsonSchema = zodToJsonSchema(integrationsFileSchema)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import type { ZodSchema } from 'zod/v3'
import { zodToJsonSchema } from 'zod-to-json-schema'
import { integrationsFileSchema } from '../src/integrations/integrations-file-schemas'
async function run() {
const jsonSchema = zodToJsonSchema(integrationsFileSchema as unknown as ZodSchema)
import { zodToJsonSchema } from 'zod-to-json-schema'
import { integrationsFileSchema } from '../src/integrations/integrations-file-schemas'
async function run() {
const jsonSchema = zodToJsonSchema(integrationsFileSchema)
🤖 Prompt for AI Agents
In `@packages/cli/scripts/generate-json-schemas.ts` around lines 3 - 8, Remove the
unnecessary double-cast and unused import: stop casting integrationsFileSchema
through "as unknown as ZodSchema" when calling zodToJsonSchema and pass
integrationsFileSchema directly to zodToJsonSchema; also delete the unused
ZodSchema import from the top of the file so the call becomes
zodToJsonSchema(integrationsFileSchema) inside the run function.

Comment on lines +8 to +9
const jsonSchema = zodToJsonSchema(integrationsFileSchema as unknown as ZodSchema)
await fs.writeFile(path.join('json-schemas', 'integrations-file-schema.json'), JSON.stringify(jsonSchema, null, 2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Make the output path deterministic and ensure the directory exists.
Relative paths depend on cwd and can write to the wrong location or fail if json-schemas/ is missing.

🛠️ Proposed fix
 import fs from 'node:fs/promises'
 import path from 'node:path'
+import { fileURLToPath } from 'node:url'
 import { zodToJsonSchema } from 'zod-to-json-schema'
 import { integrationsFileSchema } from '../src/integrations/integrations-file-schemas'

 async function run() {
   const jsonSchema = zodToJsonSchema(integrationsFileSchema as unknown as ZodSchema)
-  await fs.writeFile(path.join('json-schemas', 'integrations-file-schema.json'), JSON.stringify(jsonSchema, null, 2))
+  const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), '../../..')
+  const outputDir = path.join(repoRoot, 'json-schemas')
+  await fs.mkdir(outputDir, { recursive: true })
+  await fs.writeFile(path.join(outputDir, 'integrations-file-schema.json'), JSON.stringify(jsonSchema, null, 2))
 }
🤖 Prompt for AI Agents
In `@packages/cli/scripts/generate-json-schemas.ts` around lines 8 - 9, The script
writes JSON to a relative path which is non-deterministic and can fail if the
directory doesn't exist; make the output path deterministic by resolving a
project-root directory (e.g., const outputDir = path.resolve(process.cwd(),
'json-schemas')) and ensure the directory exists before writing (await
fs.mkdir(outputDir, { recursive: true })), then write the schema to
path.join(outputDir, 'integrations-file-schema.json') using the existing
jsonSchema variable produced by zodToJsonSchema(integrationsFileSchema).

Comment on lines +282 to +286
it('handles trailing slash in directory', () => {
const result = getDefaultIntegrationsFilePath('/path/to/project/')

expect(result).toBe('/path/to/project//.deepnote.env.yaml')
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Don’t lock in the double‑slash behavior.
Update this expectation if getDefaultIntegrationsFilePath switches to path.join.

🤖 Prompt for AI Agents
In `@packages/cli/src/integrations/parse-integrations.test.ts` around lines 282 -
286, The test pins a double-slash outcome for getDefaultIntegrationsFilePath;
change the assertion to be robust to implementation (e.g., use path.join or
path.normalize when building the expected path) so the test doesn't fail if
getDefaultIntegrationsFilePath switches to path.join—update the test in
parse-integrations.test.ts to compute the expected value with Node's path
utilities (referencing the getDefaultIntegrationsFilePath call and its input
'/path/to/project/') or assert that the returned path
endsWith('.deepnote.env.yaml') instead of asserting the exact double-slash
string.

Comment on lines +36 to +41
function formatZodIssues(issues: ZodIssue[], pathPrefix: string): ValidationIssue[] {
return issues.map(issue => ({
path: pathPrefix ? `${pathPrefix}.${issue.path.join('.')}` : issue.path.join('.'),
message: issue.message,
code: issue.code,
}))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Avoid trailing dots in issue paths.
Edge case when issue.path is empty.

♻️ Suggested tweak
-function formatZodIssues(issues: ZodIssue[], pathPrefix: string): ValidationIssue[] {
-  return issues.map(issue => ({
-    path: pathPrefix ? `${pathPrefix}.${issue.path.join('.')}` : issue.path.join('.'),
-    message: issue.message,
-    code: issue.code,
-  }))
-}
+function formatZodIssues(issues: ZodIssue[], pathPrefix: string): ValidationIssue[] {
+  return issues.map(issue => {
+    const issuePath = issue.path.join('.')
+    return {
+      path: pathPrefix ? (issuePath ? `${pathPrefix}.${issuePath}` : pathPrefix) : issuePath,
+      message: issue.message,
+      code: issue.code,
+    }
+  })
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function formatZodIssues(issues: ZodIssue[], pathPrefix: string): ValidationIssue[] {
return issues.map(issue => ({
path: pathPrefix ? `${pathPrefix}.${issue.path.join('.')}` : issue.path.join('.'),
message: issue.message,
code: issue.code,
}))
function formatZodIssues(issues: ZodIssue[], pathPrefix: string): ValidationIssue[] {
return issues.map(issue => {
const issuePath = issue.path.join('.')
return {
path: pathPrefix ? (issuePath ? `${pathPrefix}.${issuePath}` : pathPrefix) : issuePath,
message: issue.message,
code: issue.code,
}
})
}
🤖 Prompt for AI Agents
In `@packages/cli/src/integrations/parse-integrations.ts` around lines 36 - 41, In
formatZodIssues, avoid producing trailing dots when issue.path is empty by
building the path from non-empty segments: take pathPrefix and issue.path,
filter out empty strings, then join with '.'; if all segments are empty return
an empty string. Update the path computation in formatZodIssues to use this
filtered-join approach (referencing formatZodIssues, pathPrefix and issue.path).

Comment on lines +55 to +76
// Check if file exists
try {
await fs.access(filePath)
} catch {
// File doesn't exist - return empty result (not an error, integrations are optional)
return { integrations: emptyIntegrations, issues }
}

// Read and decode file
let content: string
try {
const rawBytes = await fs.readFile(filePath)
content = decodeUtf8NoBom(rawBytes)
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
issues.push({
path: '',
message: `Failed to read integrations file: ${message}`,
code: 'file_read_error',
})
return { integrations: emptyIntegrations, issues }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t treat non‑ENOENT access errors as “missing file.”
fs.access failures like EACCES are currently swallowed, returning empty results instead of issues.

🐛 Proposed fix
-  // Check if file exists
-  try {
-    await fs.access(filePath)
-  } catch {
-    // File doesn't exist - return empty result (not an error, integrations are optional)
-    return { integrations: emptyIntegrations, issues }
-  }
-
-  // Read and decode file
+  // Read and decode file
   let content: string
   try {
     const rawBytes = await fs.readFile(filePath)
     content = decodeUtf8NoBom(rawBytes)
   } catch (error) {
+    const err = error as NodeJS.ErrnoException
+    if (err?.code === 'ENOENT') {
+      return { integrations: emptyIntegrations, issues }
+    }
     const message = error instanceof Error ? error.message : String(error)
     issues.push({
       path: '',
       message: `Failed to read integrations file: ${message}`,
       code: 'file_read_error',
     })
     return { integrations: emptyIntegrations, issues }
   }
🤖 Prompt for AI Agents
In `@packages/cli/src/integrations/parse-integrations.ts` around lines 55 - 76,
The fs.access try/catch in parse-integrations.ts currently treats any access
error as "file missing"; update the catch to inspect the caught error (e.g., the
error from fs.access) and only return the emptyIntegrations silently when the
error is an ENOENT "file not found" error; for all other errors (EACCES, etc.)
push an issue onto the issues array with a descriptive message (similar to the
file_read_error pattern) and return { integrations: emptyIntegrations, issues }
so callers know there was an access problem; locate the fs.access block around
filePath and use the existing emptyIntegrations and issues symbols to implement
this behavior.

Comment on lines +140 to +141
export function getDefaultIntegrationsFilePath(deepnoteFileDir: string): string {
return `${deepnoteFileDir}/.deepnote.env.yaml`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use path.join for cross‑platform paths.
String concatenation yields double slashes and can be problematic on Windows.

🛠️ Proposed fix
-import fs from 'node:fs/promises'
+import fs from 'node:fs/promises'
+import path from 'node:path'
@@
 export function getDefaultIntegrationsFilePath(deepnoteFileDir: string): string {
-  return `${deepnoteFileDir}/.deepnote.env.yaml`
+  return path.join(deepnoteFileDir, '.deepnote.env.yaml')
 }
🤖 Prompt for AI Agents
In `@packages/cli/src/integrations/parse-integrations.ts` around lines 140 - 141,
The getDefaultIntegrationsFilePath function builds a file path using string
concatenation which can produce incorrect separators on some platforms; replace
the concatenation with path.join to construct the path cross‑platform (import
Node's path module if not already present) so
getDefaultIntegrationsFilePath(deepnoteFileDir) returns
path.join(deepnoteFileDir, '.deepnote.env.yaml') instead of using
`${deepnoteFileDir}/.deepnote.env.yaml`.

@codecov
Copy link
codecov bot commented Jan 25, 2026

Codecov Report

❌ Patch coverage is 94.11765% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.95%. Comparing base (5e05301) to head (1a51771).

Files with missing lines Patch % Lines
...ackages/cli/src/integrations/parse-integrations.ts 93.87% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #244      +/-   ##
==========================================
+ Coverage   92.93%   92.95%   +0.01%     
==========================================
  Files          78       80       +2     
  Lines        4645     4696      +51     
  Branches     1256     1305      +49     
==========================================
+ Hits         4317     4365      +48     
- Misses        328      331       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

0