[WEB-5575] feat: enhance APITokenLogMiddleware to support logging to MongoDB by pablohashescobar · Pull Request #8241 · makeplane/plane · GitHub
[WEB-5575] feat: enhance APITokenLogMiddleware to support logging to MongoDB #8241

Merged

sriramveeraghanta merged 5 commits into preview from fix-api-logger on Dec 9, 2025
Conversation

pablohashescobar (Member) commented Dec 4, 2025

Description

  • Added functionality to log external API requests to MongoDB, with a fallback to PostgreSQL if MongoDB is unavailable.
  • Implemented error handling for MongoDB connection and logging operations.
  • Introduced additional fields for MongoDB logs, including timestamps and user identifiers.
  • Refactored request logging logic to streamline the process and improve maintainability.

Type of Change

  • Improvement (non-breaking change that enhances existing functionality)

Test Scenarios

  • Verify that external API requests are logged correctly to MongoDB, with a fallback to PostgreSQL when the MongoDB instance is unavailable.

References

WEB-5575

Summary by CodeRabbit

  • New Features

    • Async background logging that prefers MongoDB with automatic PostgreSQL fallback for API requests.
    • Captures decoded request/response bodies and user context when available.
  • Improvements

    • Timezone-aware timestamps for log entries.
    • Centralized error handling and safer body decoding for more reliable logging.
    • Skips logging when no API key is present to reduce noise.


makeplane bot commented Dec 4, 2025

Linked to Plane Work Item(s)

This comment was auto-generated by Plane

coderabbitai bot (Contributor) commented Dec 4, 2025


📥 Commits

Reviewing files that changed from the base of the PR and between acc3970 and e83ddd2.

📒 Files selected for processing (1)
  • apps/api/plane/bgtasks/logger_task.py (1 hunks)


Walkthrough

APITokenLogMiddleware now builds timezone-aware log payloads, safely decodes bodies, and enqueues persistence to a background worker (process_logs) which prefers MongoDB and falls back to PostgreSQL; exceptions are routed to a centralized logger and requests without API keys short-circuit without logging.

Changes

Cohort / File(s) Summary
API token middleware changes
apps/api/plane/middleware/logger.py
Replaces inline DB writes with background-task delegation (constructs log_data and mongo_log and calls process_logs.delay(...)), adds timezone-aware timestamps and user context, uses centralized safe_decode_body, short-circuits when no API key, wraps errors with log_exception, and adds a class docstring.
Background logging task module
apps/api/plane/bgtasks/logger_task.py
New module providing get_mongo_collection(), safe_decode_body(), log_to_mongo(), log_to_postgres(), and process_logs(); attempts Mongo insertion first, falls back to Postgres ORM, detects binary/undecodable bodies, and centralizes worker logging and exception capture.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Middleware as APITokenLogMiddleware
    participant Worker as Background Worker
    participant Mongo as MongoDB
    participant Postgres as PostgreSQL

    Client->>Middleware: HTTP request (with API key)
    Middleware->>Middleware: build `log_data` & `mongo_log` (tz timestamps, decoded bodies)
    Middleware->>Worker: enqueue process_logs(mongo_log, log_data)

    rect rgb(220,230,250)
    note over Worker,Mongo: Worker attempts Mongo first
    Worker->>Mongo: get_mongo_collection() / insert document
    alt Mongo available & insert succeeds
        Mongo-->>Worker: insert success
    else Mongo unavailable or insert fails
        Mongo-->>Worker: error
        Worker->>Worker: log_exception(error)
    end
    end

    rect rgb(230,250,220)
    note over Worker,Postgres: Fallback to Postgres
    alt Mongo failed or absent
        Worker->>Postgres: create APIActivityLog(log_data)
        Postgres-->>Worker: success/failure
    end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review process_logs for correct routing, error handling, and idempotence.
  • Validate safe_decode_body for binary detection and edge cases.
  • Inspect get_mongo_collection for robust connection handling and logging.
  • Confirm middleware short-circuit logic (no API key) and that sensitive fields are not logged.
  • Check background task enqueueing semantics and any task retry implications.


Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 71.43%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Title check — ✅ Passed: the title accurately describes the main change (enhancing APITokenLogMiddleware to support MongoDB logging), aligning with the file modifications and PR objectives.
  • Description check — ✅ Passed: the description covers the main changes, includes a type-of-change selection, provides a test scenario, and references the related issue, though it is missing an explicit screenshots/media section.


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe86713 and 889da74.

📒 Files selected for processing (1)
  • apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
apps/api/plane/middleware/logger.py (2)
apps/api/plane/settings/mongo.py (2)
  • MongoConnection (18-122)
  • get_collection (91-109)
apps/api/plane/utils/exception_logger.py (1)
  • log_exception (9-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Analyze (javascript)
🔇 Additional comments (2)
apps/api/plane/middleware/logger.py (2)

147-156: LGTM!

The PostgreSQL fallback logic is straightforward with appropriate exception handling.


193-199: Fallback logic is well-structured.

The approach of attempting MongoDB first and falling back to PostgreSQL on failure ensures logging reliability.

Copilot AI (Contributor) left a comment

Pull request overview

This PR enhances the APITokenLogMiddleware to support logging external API requests to MongoDB with a fallback to PostgreSQL. The implementation adds MongoDB as the primary logging destination for API activity logs, improving scalability and separation of concerns for high-volume logging scenarios.

Key Changes

  • Added MongoDB collection initialization and connection management in middleware __init__
  • Implemented log_to_mongo() and log_to_postgres() methods for dual-database logging strategy
  • Refactored process_request() to build a unified log data structure with MongoDB-specific fields

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

- Enhanced the logic for determining MongoDB availability by checking if the collection is not None.
- Added a check for MongoDB configuration before attempting to retrieve the collection.
- Updated error handling to ensure the middleware correctly reflects the state of MongoDB connectivity.
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
apps/api/plane/middleware/logger.py (1)

194-195: Field naming inconsistent with codebase patterns.

Past reviews noted that MongoDB field names should use created_by_id and updated_by_id instead of created_by and updated_by to match the existing codebase pattern. The cleanup task at apps/api/plane/bgtasks/cleanup_task.py:178 uses created_by_id for MongoDB documents corresponding to PostgreSQL's foreign key fields.

Apply this diff:

             mongo_log = {
                 **log_data,
                 "created_at": timezone.now(),
                 "updated_at": timezone.now(),
-                "created_by": user_id,
-                "updated_by": user_id,
+                "created_by_id": user_id,
+                "updated_by_id": user_id,
             }

Based on learnings from past reviews.

🧹 Nitpick comments (3)
apps/api/plane/middleware/logger.py (3)

192-193: Timestamp storage inconsistent with existing patterns.

Past reviews noted that the middleware stores timezone.now() as datetime objects, but the cleanup task pattern converts timestamps to strings (e.g., apps/api/plane/bgtasks/cleanup_task.py:167 uses str(record["created_at"])). While MongoDB supports native datetime objects, consistency with the existing codebase pattern would improve maintainability.

Consider converting timestamps to strings for consistency:

             mongo_log = {
                 **log_data,
-                "created_at": timezone.now(),
-                "updated_at": timezone.now(),
+                "created_at": str(timezone.now()),
+                "updated_at": str(timezone.now()),
                 "created_by_id": user_id,
                 "updated_by_id": user_id,
             }

Alternatively, if native datetime objects are preferred, ensure the cleanup task is updated to match this pattern.

Based on learnings from past reviews.


198-201: Performance: Synchronous database writes in request cycle.

Past reviews noted that synchronous database writes to both MongoDB and PostgreSQL during the request/response cycle can significantly impact request latency, especially under high traffic. The codebase already uses asynchronous patterns for similar operations (e.g., webhook logs in apps/api/plane/bgtasks/webhook_task.py).

Consider an asynchronous approach:

  1. Queue log data to a message queue (e.g., Celery task) for background processing
  2. Use bulk writes for batching log entries
  3. Add appropriate timeouts for database operations to prevent request hanging

This refactor can be deferred if the current load is manageable, but should be prioritized as traffic grows.

Based on learnings from past reviews.
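
The queued approach in point 1 can be illustrated with a minimal in-process sketch. In the actual PR this role is played by a Celery task invoked via `.delay()`; the `queue.Queue` and worker thread below are stand-ins used only to show how the request path is reduced to an O(1) enqueue with no database I/O.

```python
# Sketch: decouple log persistence from the request/response cycle by
# enqueueing entries and draining them in a background worker.
import queue
import threading

log_queue: "queue.Queue" = queue.Queue()

def enqueue_log(entry: dict) -> None:
    """Called from the middleware: constant-time, no database I/O."""
    log_queue.put(entry)

def worker(persist) -> None:
    """Background worker draining the queue and persisting each entry."""
    while True:
        entry = log_queue.get()
        if entry is None:  # sentinel: stop the worker
            break
        persist(entry)  # e.g. the Mongo-first/Postgres-fallback writer
        log_queue.task_done()
```

With Celery the enqueue/drain plumbing disappears: the middleware calls `task.delay(...)` and the broker plus worker pool take the place of the queue and thread.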


184-188: Remove redundant string conversion of user_id.

The User model's id is a UUIDField, which is serializable to both MongoDB and PostgreSQL natively. The explicit str() conversion is unnecessary. Additionally, this middleware inconsistently converts to string while the RequestLoggerMiddleware in the same file stores user_id directly without conversion. For consistency, remove the conversion:

             user_id = (
-                str(request.user.id)
+                request.user.id
                 if getattr(request, "user") and getattr(request.user, "is_authenticated", False)
                 else None
             )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 889da74 and 415f6e3.

📒 Files selected for processing (1)
  • apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
apps/api/plane/middleware/logger.py (2)
apps/api/plane/settings/mongo.py (3)
  • MongoConnection (18-122)
  • is_configured (112-122)
  • get_collection (91-109)
apps/api/plane/utils/exception_logger.py (1)
  • log_exception (9-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and lint web apps
  • GitHub Check: Analyze (javascript)
🔇 Additional comments (5)
apps/api/plane/middleware/logger.py (5)

7-7: LGTM!

The new imports are appropriate and used correctly throughout the middleware for MongoDB integration and centralized exception logging.

Also applies to: 15-16


80-90: LGTM!

The MongoDB initialization correctly sets mongo_available only when the collection is successfully retrieved, addressing the previous review concern about setting the flag unconditionally.


92-105: LGTM!

The method correctly follows the established codebase pattern by checking MongoConnection.is_configured() before attempting to get the collection, as recommended in previous reviews.


135-150: LGTM!

The recovery logic now correctly refreshes self.mongo_collection after a failure (lines 148-149), addressing the previous review concern. The defensive check on line 140 prevents operations on a None collection.


152-161: LGTM!

The PostgreSQL fallback implementation is straightforward and includes appropriate exception handling.

- Added a new logger_task module to handle logging of API activity to MongoDB and PostgreSQL.
- Introduced functions for safely decoding request/response bodies and processing logs based on MongoDB availability.
- Refactored APITokenLogMiddleware to utilize the new logging functions, improving code organization and maintainability.
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
apps/api/plane/middleware/logger.py (1)

120-145: Sanitize logged data and align Mongo field conventions.

A few concerns in this block:

  1. Sensitive data in logs (security/compliance).

    • token_identifier stores the full X-Api-Key value.
    • headers is str(request.headers), which will include Authorization, cookies, other API keys, etc.
    • body and response_body log full payloads, which may contain secrets or PII.

    This was already highlighted in earlier reviews and remains unaddressed. Consider:

    • Masking API keys (e.g., keep only last 4 chars or use a hash).
    • Redacting sensitive headers (Authorization, Cookie, X-Api-Key, CSRF tokens, etc.) before logging.
    • Optionally truncating or field-filtering bodies, especially for password/PII fields.
  2. Mongo user/timestamp field conventions.

    • You’re using created_by / updated_by, whereas other Mongo flows in this codebase use created_by_id / updated_by_id to mirror the Postgres FK fields.
    • Timestamps are stored as timezone.now() (naive vs string format differences) while some existing tasks stringify datetimes before writing to Mongo.

    For consistency and to simplify querying/cleanup jobs, consider aligning with the existing conventions:

-            mongo_log = {
-                **log_data,
-                "created_at": timezone.now(),
-                "updated_at": timezone.now(),
-                "created_by": user_id,
-                "updated_by": user_id,
-            }
+            mongo_log = {
+                **log_data,
+                "created_at": timezone.now(),  # or str(timezone.now()) if matching existing tasks
+                "updated_at": timezone.now(),
+                "created_by_id": user_id,
+                "updated_by_id": user_id,
+            }

(Adjust the timestamp representation to whichever format is already used in cleanup_task and related Mongo consumers.)

Addressing these points will make the logs safer to store and easier to work with, and it lines up with prior review feedback.

🧹 Nitpick comments (2)
apps/api/plane/bgtasks/logger_task.py (1)

33-55: Unify body-decoding helper and relax the type hint.

This helper already handles None and empty bytes, but the type hint is bytes, and the same decoding logic is duplicated in APITokenLogMiddleware._safe_decode_body.

Consider:

  • Updating the type hint to reflect actual usage:
-def safe_decode_body(content: bytes) -> Optional[str]:
+def safe_decode_body(content: Optional[bytes]) -> Optional[str]:
  • Importing and using safe_decode_body from this module inside APITokenLogMiddleware instead of maintaining a second, almost identical implementation, so request/response decoding behavior stays in sync in one place.
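
A consolidated helper along the lines suggested here might look like the sketch below. The `"[Binary Content]"` and `"[Could not decode content]"` markers follow the review's description of the existing behavior; the NUL-byte binary check is an assumption, since the PR's actual detection heuristic is not shown in this thread.

```python
# Sketch of a single shared body-decoding helper, covering the edge cases
# noted in the review: None/empty input, binary payloads, undecodable bytes.
from typing import Optional

def safe_decode_body(content: Optional[bytes]) -> Optional[str]:
    """Decode a request/response body to UTF-8 text, or return a marker."""
    if not content:
        return None
    # Heuristic binary check (assumed): NUL bytes never occur in UTF-8 text
    if b"\x00" in content:
        return "[Binary Content]"
    try:
        return content.decode("utf-8")
    except UnicodeDecodeError:
        return "[Could not decode content]"
```

Keeping this in one module and importing it from both the middleware and the background task is exactly the deduplication the comment asks for.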
apps/api/plane/middleware/logger.py (1)

90-111: Avoid duplicating body-decoding logic and fix the docstring.

_safe_decode_body here is effectively a copy of safe_decode_body in apps/api/plane/bgtasks/logger_task.py, but:

  • The docstring omits the special cases "[Binary Content]" and "[Could not decode content]".
  • You now have two places to keep in sync if the decoding rules change.

Consider:

  • Replacing this method with an import of the shared helper:
from plane.bgtasks.logger_task import safe_decode_body

and then using safe_decode_body(request_body) / safe_decode_body(response.content) in process_request.

  • Or, if you keep the method, update the docstring so it matches the actual return values (including the binary and decode-failure markers).

This keeps behavior consistent across middleware and background processing.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 415f6e3 and 6a95341.

📒 Files selected for processing (2)
  • apps/api/plane/bgtasks/logger_task.py (1 hunks)
  • apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
apps/api/plane/middleware/logger.py (3)
apps/api/plane/settings/mongo.py (1)
  • MongoConnection (18-122)
apps/api/plane/utils/exception_logger.py (1)
  • log_exception (9-20)
apps/api/plane/bgtasks/logger_task.py (1)
  • process_logs (85-93)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and lint web apps
  • GitHub Check: Analyze (javascript)
🔇 Additional comments (3)
apps/api/plane/bgtasks/logger_task.py (2)

17-31: Mongo collection helper is consistent and robust.

The collection retrieval logic matches the MongoConnection API, handles “not configured” and exceptions cleanly, and returns None in all failure cases. No changes needed here.


57-82: Mongo/Postgres logging helpers look correct.

log_to_mongo and log_to_postgres correctly guard on None, call the respective backends, and surface failures via log_exception while returning a boolean. The encapsulation is clean and keeps persistence details out of the caller.

apps/api/plane/middleware/logger.py (1)

76-80: APITokenLogMiddleware docstring is clear and helpful.

The docstring concisely explains the purpose of the middleware and the Mongo/Postgres logging behavior, improving readability.

- Removed direct dependency on MongoDB collection in log_to_mongo function, now retrieving it internally.
- Updated process_logs to check MongoDB configuration before logging, enhancing error handling.
- Cleaned up logger.py by removing unused imports related to MongoDB.
dheeru0198 previously approved these changes Dec 5, 2025
- Introduced the @shared_task decorator to the process_logs function, enabling asynchronous processing of log data.
- Updated function signature to include a return type of None for clarity.
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
apps/api/plane/middleware/logger.py (2)

145-145: process_logs is not a Celery task; .delay() will raise AttributeError at runtime.

The process_logs function in apps/api/plane/bgtasks/logger_task.py (lines 86-94) is a plain function without a Celery task decorator. Calling .delay() on it will fail.

Either call the function directly or decorate it with @shared_task:

-            process_logs.delay(log_data=log_data, mongo_log=mongo_log)
+            process_logs(log_data=log_data, mongo_log=mongo_log)

Or add the decorator in logger_task.py:

from celery import shared_task

@shared_task
def process_logs(log_data: Dict[str, Any], mongo_log: Dict[str, Any]):
    ...

119-130: Sensitive data logged without sanitization.

Request headers (line 124) and full API key (line 120) are logged without redaction. Headers may contain Authorization, Cookie, and other sensitive tokens. The full API key should be masked (e.g., show only last 4 characters).
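
The masking and redaction suggested above could look roughly like this. The header names and the keep-last-4 masking rule are illustrative choices, not the PR's actual behavior:

```python
# Sketch of log sanitization: mask secrets, redact sensitive headers.
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key", "x-csrftoken"}

def mask_token(token: str) -> str:
    """Keep only the last 4 characters of a secret such as an API key."""
    if len(token) <= 4:
        return "****"
    return "****" + token[-4:]

def redact_headers(headers: dict) -> dict:
    """Replace sensitive header values before they reach the log store."""
    return {
        k: ("[REDACTED]" if k.lower() in SENSITIVE_HEADERS else v)
        for k, v in headers.items()
    }
```

Applying both before building `log_data`/`mongo_log` would keep raw credentials out of both MongoDB and PostgreSQL.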

🧹 Nitpick comments (3)
apps/api/plane/middleware/logger.py (1)

88-108: Code duplication: _safe_decode_body is duplicated in logger_task.py.

This method is nearly identical to safe_decode_body in apps/api/plane/bgtasks/logger_task.py (lines 33-54). Since logging is now delegated to the background task, consider removing this method from the middleware and using the one in logger_task.py, or extract it to a shared utility module.

apps/api/plane/bgtasks/logger_task.py (2)

57-71: Double-checking Mongo configuration is redundant.

log_to_mongo calls get_mongo_collection() internally, which already checks MongoConnection.is_configured(). This means line 63 logs "MongoDB not configured" after get_mongo_collection already logged the same at line 22. Consider passing the collection as a parameter or removing the redundant error log.

-def log_to_mongo(log_document: Dict[str, Any]) -> bool:
+def log_to_mongo(mongo_collection: Collection, log_document: Dict[str, Any]) -> bool:
     """
     Logs the request to MongoDB if available.
     """
-    mongo_collection = get_mongo_collection()
-    if mongo_collection is None:
-        logger.error("MongoDB not configured")
-        return False
-
     try:
         mongo_collection.insert_one(log_document)
         return True
     except Exception as e:
         log_exception(e)
         return False

Then update process_logs:

if MongoConnection.is_configured():
    mongo_collection = get_mongo_collection()
    if mongo_collection and not log_to_mongo(mongo_collection, mongo_log):
        log_to_postgres(log_data)
    elif not mongo_collection:
        log_to_postgres(log_data)
else:
    log_to_postgres(log_data)

33-54: safe_decode_body is unused in this file.

This function is defined but never called within logger_task.py. The middleware prepares the decoded body before calling process_logs. Consider removing this dead code or consolidating it as the single implementation used by the middleware.

#!/bin/bash
# Verify if safe_decode_body is used anywhere in the codebase
rg -n "safe_decode_body" --type py
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6a95341 and acc3970.

📒 Files selected for processing (2)
  • apps/api/plane/bgtasks/logger_task.py (1 hunks)
  • apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
apps/api/plane/bgtasks/logger_task.py (2)
apps/api/plane/settings/mongo.py (2)
  • is_configured (112-122)
  • get_collection (91-109)
apps/api/plane/utils/exception_logger.py (1)
  • log_exception (9-20)
apps/api/plane/middleware/logger.py (2)
apps/api/plane/utils/exception_logger.py (1)
  • log_exception (9-20)
apps/api/plane/bgtasks/logger_task.py (1)
  • process_logs (86-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and lint web apps
  • GitHub Check: Analyze (javascript)
🔇 Additional comments (3)
apps/api/plane/middleware/logger.py (1)

131-148: Exception handling structure looks correct.

The try-except block correctly wraps the log preparation and task invocation, with errors routed to the centralized log_exception helper. The user ID extraction logic properly handles unauthenticated requests.

apps/api/plane/bgtasks/logger_task.py (2)

17-30: get_mongo_collection follows the established pattern correctly.

The function properly checks MongoConnection.is_configured() before attempting to get the collection, matching the pattern used in apps/api/plane/bgtasks/cleanup_task.py.


74-83: PostgreSQL fallback implementation is correct.

The function properly creates the APIActivityLog record and handles exceptions with log_exception. The boolean return value enables proper fallback logic.

sriramveeraghanta merged commit 491b5bc into preview on Dec 9, 2025 (9 checks passed).
sriramveeraghanta deleted the fix-api-logger branch on December 9, 2025 at 19:31.