[WEB-5575]feat: enhance APITokenLogMiddleware to support logging to MongoDB#8241
sriramveeraghanta merged 5 commits into `preview`
Conversation
- Added functionality to log external API requests to MongoDB, with a fallback to PostgreSQL if MongoDB is unavailable.
- Implemented error handling for MongoDB connection and logging operations.
- Introduced additional fields for MongoDB logs, including timestamps and user identifiers.
- Refactored request logging logic to streamline the process and improve maintainability.
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

APITokenLogMiddleware now builds timezone-aware log payloads, safely decodes bodies, and enqueues persistence to a background worker (`process_logs`), which prefers MongoDB and falls back to PostgreSQL; exceptions are routed to a centralized logger, and requests without API keys short-circuit without logging.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Middleware as APITokenLogMiddleware
    participant Worker as Background Worker
    participant Mongo as MongoDB
    participant Postgres as PostgreSQL
    Client->>Middleware: HTTP request (with API key)
    Middleware->>Middleware: build log_data & mongo_log (tz timestamps, decoded bodies)
    Middleware->>Worker: enqueue process_logs(mongo_log, log_data)
    rect rgb(220,230,250)
    note over Worker,Mongo: Worker attempts Mongo first
    Worker->>Mongo: get_mongo_collection() / insert document
    alt Mongo available & insert succeeds
        Mongo-->>Worker: insert success
    else Mongo unavailable or insert fails
        Mongo-->>Worker: error
        Worker->>Worker: log_exception(error)
    end
    end
    rect rgb(230,250,220)
    note over Worker,Postgres: Fallback to Postgres
    alt Mongo failed or absent
        Worker->>Postgres: create APIActivityLog(log_data)
        Postgres-->>Worker: success/failure
    end
    end
```
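The Mongo-first, Postgres-fallback flow in the diagram can be sketched as a small worker function. This is an illustrative stand-in, not the PR's actual code: `insert_mongo` and `insert_postgres` are hypothetical writer callables standing in for the real `insert_one` / `APIActivityLog.objects.create` calls, so the fallback ordering can be shown without a database.

```python
def process_logs(mongo_log, log_data, insert_mongo, insert_postgres,
                 log_exception=lambda e: None):
    """Persist one request log: try MongoDB first, fall back to PostgreSQL.

    `insert_mongo` / `insert_postgres` are stand-ins for the real backends.
    """
    try:
        if insert_mongo is not None:
            insert_mongo(mongo_log)   # preferred destination
            return "mongo"
    except Exception as e:
        log_exception(e)              # routed to the centralized logger
    try:
        insert_postgres(log_data)     # fallback destination
        return "postgres"
    except Exception as e:
        log_exception(e)
        return None

# Example: the Mongo writer raises, so the entry lands in the Postgres stand-in.
saved = []
def broken_mongo(doc):
    raise ConnectionError("mongo down")

result = process_logs({"path": "/api"}, {"path": "/api"}, broken_mongo, saved.append)
```

The key property the review cares about is that a Mongo failure is logged but never lost: the same entry is retried against the fallback store.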
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
apps/api/plane/middleware/logger.py (2)
apps/api/plane/settings/mongo.py (2)
- MongoConnection (18-122)
- get_collection (91-109)

apps/api/plane/utils/exception_logger.py (1)
- log_exception (9-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Analyze (javascript)
🔇 Additional comments (2)
apps/api/plane/middleware/logger.py (2)
147-156: LGTM! The PostgreSQL fallback logic is straightforward with appropriate exception handling.
193-199: Fallback logic is well-structured. The approach of attempting MongoDB first and falling back to PostgreSQL on failure ensures logging reliability.
Pull request overview
This PR enhances the APITokenLogMiddleware to support logging external API requests to MongoDB with a fallback to PostgreSQL. The implementation adds MongoDB as the primary logging destination for API activity logs, improving scalability and separation of concerns for high-volume logging scenarios.
Key Changes
- Added MongoDB collection initialization and connection management in the middleware `__init__`
- Implemented `log_to_mongo()` and `log_to_postgres()` methods for a dual-database logging strategy
- Refactored `process_request()` to build a unified log data structure with MongoDB-specific fields
- Enhanced the logic for determining MongoDB availability by checking if the collection is not None.
- Added a check for MongoDB configuration before attempting to retrieve the collection.
- Updated error handling to ensure the middleware correctly reflects the state of MongoDB connectivity.
Actionable comments posted: 1
♻️ Duplicate comments (1)
apps/api/plane/middleware/logger.py (1)
194-195: Field naming inconsistent with codebase patterns. Past reviews noted that MongoDB field names should use `created_by_id` and `updated_by_id` instead of `created_by` and `updated_by` to match the existing codebase pattern. The cleanup task at `apps/api/plane/bgtasks/cleanup_task.py:178` uses `created_by_id` for MongoDB documents corresponding to PostgreSQL's foreign key fields. Apply this diff:

```diff
 mongo_log = {
     **log_data,
     "created_at": timezone.now(),
     "updated_at": timezone.now(),
-    "created_by": user_id,
-    "updated_by": user_id,
+    "created_by_id": user_id,
+    "updated_by_id": user_id,
 }
```

Based on learnings from past reviews.
🧹 Nitpick comments (3)
apps/api/plane/middleware/logger.py (3)
192-193: Timestamp storage inconsistent with existing patterns. Past reviews noted that the middleware stores `timezone.now()` as datetime objects, but the cleanup task pattern converts timestamps to strings (e.g., `apps/api/plane/bgtasks/cleanup_task.py:167` uses `str(record["created_at"])`). While MongoDB supports native datetime objects, consistency with the existing codebase pattern would improve maintainability. Consider converting timestamps to strings for consistency:

```diff
 mongo_log = {
     **log_data,
-    "created_at": timezone.now(),
-    "updated_at": timezone.now(),
+    "created_at": str(timezone.now()),
+    "updated_at": str(timezone.now()),
     "created_by_id": user_id,
     "updated_by_id": user_id,
 }
```

Alternatively, if native datetime objects are preferred, ensure the cleanup task is updated to match this pattern.
Based on learnings from past reviews.
198-201: Performance: Synchronous database writes in request cycle. Past reviews noted that synchronous database writes to both MongoDB and PostgreSQL during the request/response cycle can significantly impact request latency, especially under high traffic. The codebase already uses asynchronous patterns for similar operations (e.g., webhook logs in `apps/api/plane/bgtasks/webhook_task.py`).

Consider an asynchronous approach:
apps/api/plane/bgtasks/webhook_task.py).Consider an asynchronous approach:
- Queue log data to a message queue (e.g., Celery task) for background processing
- Use bulk writes for batching log entries
- Add appropriate timeouts for database operations to prevent request hanging
This refactor can be deferred if the current load is manageable, but should be prioritized as traffic grows.
Based on learnings from past reviews.
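As a rough illustration of the queued approach suggested above, the sketch below uses an in-process queue and thread as a stand-in for a Celery broker and worker (all names here are illustrative, not from the PR): the request path only enqueues, and a background consumer batches the writes.

```python
import queue
import threading

log_queue: "queue.Queue" = queue.Queue()

def enqueue_log(log_data):
    """Called from the middleware: O(1) and no database I/O on the request path."""
    log_queue.put(log_data)

def log_worker(persist_batch, batch_size=10):
    """Background consumer: drains the queue and writes logs in batches."""
    batch = []
    while True:
        item = log_queue.get()
        if item is None:              # sentinel: flush remaining logs and stop
            if batch:
                persist_batch(batch)
            return
        batch.append(item)
        if len(batch) >= batch_size:  # bulk write to amortize round-trips
            persist_batch(batch)
            batch = []

# Example: three requests are logged without blocking on the "database".
written = []
worker = threading.Thread(target=log_worker, args=(written.extend,))
worker.start()
for i in range(3):
    enqueue_log({"request": i})
log_queue.put(None)
worker.join()
```

With Celery the same shape applies: `enqueue_log` becomes a `.delay()` call and the worker process replaces the thread, but the latency benefit to the request path is identical.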
184-188: Remove redundant string conversion of user_id. The User model's `id` is a UUIDField, which is serializable to both MongoDB and PostgreSQL natively. The explicit `str()` conversion is unnecessary. Additionally, this middleware inconsistently converts to string while the `RequestLoggerMiddleware` in the same file stores user_id directly without conversion. For consistency, remove the conversion:

```diff
 user_id = (
-    str(request.user.id)
+    request.user.id
     if getattr(request, "user")
     and getattr(request.user, "is_authenticated", False)
     else None
 )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
apps/api/plane/middleware/logger.py (2)
apps/api/plane/settings/mongo.py (3)
- MongoConnection (18-122)
- is_configured (112-122)
- get_collection (91-109)

apps/api/plane/utils/exception_logger.py (1)
- log_exception (9-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build and lint web apps
- GitHub Check: Analyze (javascript)
🔇 Additional comments (5)
apps/api/plane/middleware/logger.py (5)
7-7: LGTM! The new imports are appropriate and used correctly throughout the middleware for MongoDB integration and centralized exception logging.
Also applies to: 15-16
80-90: LGTM! The MongoDB initialization correctly sets `mongo_available` only when the collection is successfully retrieved, addressing the previous review concern about setting the flag unconditionally.
92-105: LGTM! The method correctly follows the established codebase pattern by checking `MongoConnection.is_configured()` before attempting to get the collection, as recommended in previous reviews.
135-150: LGTM! The recovery logic now correctly refreshes `self.mongo_collection` after a failure (lines 148-149), addressing the previous review concern. The defensive check on line 140 prevents operations on a `None` collection.
152-161: LGTM! The PostgreSQL fallback implementation is straightforward and includes appropriate exception handling.
- Added a new logger_task module to handle logging of API activity to MongoDB and PostgreSQL.
- Introduced functions for safely decoding request/response bodies and processing logs based on MongoDB availability.
- Refactored APITokenLogMiddleware to utilize the new logging functions, improving code organization and maintainability.
Actionable comments posted: 2
♻️ Duplicate comments (1)
apps/api/plane/middleware/logger.py (1)
120-145: Sanitize logged data and align Mongo field conventions. A few concerns in this block:

Sensitive data in logs (security/compliance).

- `token_identifier` stores the full `X-Api-Key` value.
- `headers` is `str(request.headers)`, which will include `Authorization`, cookies, other API keys, etc.
- `body` and `response_body` log full payloads, which may contain secrets or PII.

This was already highlighted in earlier reviews and remains unaddressed. Consider:

- Masking API keys (e.g., keep only the last 4 chars or use a hash).
- Redacting sensitive headers (`Authorization`, `Cookie`, `X-Api-Key`, CSRF tokens, etc.) before logging.
- Optionally truncating or field-filtering bodies, especially for password/PII fields.

Mongo user/timestamp field conventions.

- You're using `created_by`/`updated_by`, whereas other Mongo flows in this codebase use `created_by_id`/`updated_by_id` to mirror the Postgres FK fields.
- Timestamps are stored as `timezone.now()` (naive vs string format differences) while some existing tasks stringify datetimes before writing to Mongo.

For consistency and to simplify querying/cleanup jobs, consider aligning with the existing conventions:

```diff
- mongo_log = {
-     **log_data,
-     "created_at": timezone.now(),
-     "updated_at": timezone.now(),
-     "created_by": user_id,
-     "updated_by": user_id,
- }
+ mongo_log = {
+     **log_data,
+     "created_at": timezone.now(),  # or str(timezone.now()) if matching existing tasks
+     "updated_at": timezone.now(),
+     "created_by_id": user_id,
+     "updated_by_id": user_id,
+ }
```

(Adjust the timestamp representation to whichever format is already used in `cleanup_task` and related Mongo consumers.) Addressing these points will make the logs safer to store and easier to work with, and it lines up with prior review feedback.
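A small sketch of the masking and redaction the review asks for. The header set and the 4-character mask are illustrative choices, not from the PR:

```python
# Headers whose values should never reach a log document (illustrative set).
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key", "x-csrftoken"}

def mask_token(token, keep=4):
    """Keep only the last `keep` characters of a secret, e.g. '****abcd'."""
    if not token:
        return token
    return "****" + token[-keep:]

def redact_headers(headers):
    """Replace sensitive header values before they reach the log document."""
    return {
        name: ("[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }

clean = redact_headers({"Authorization": "Bearer abc123", "Accept": "application/json"})
```

Masking rather than dropping the token keeps logs correlatable (the last characters still identify which key was used) without storing a usable credential.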
🧹 Nitpick comments (2)
apps/api/plane/bgtasks/logger_task.py (1)
33-55: Unify body-decoding helper and relax the type hint. This helper already handles `None` and empty bytes, but the type hint is `bytes`, and the same decoding logic is duplicated in `APITokenLogMiddleware._safe_decode_body`. Consider:

- Updating the type hint to reflect actual usage:

```diff
-def safe_decode_body(content: bytes) -> Optional[str]:
+def safe_decode_body(content: Optional[bytes]) -> Optional[str]:
```

- Importing and using `safe_decode_body` from this module inside `APITokenLogMiddleware` instead of maintaining a second, almost identical implementation, so request/response decoding behavior stays in sync in one place.

apps/api/plane/middleware/logger.py (1)
90-111: Avoid duplicating body-decoding logic and fix the docstring. `_safe_decode_body` here is effectively a copy of `safe_decode_body` in `apps/api/plane/bgtasks/logger_task.py`, but:

- The docstring omits the special cases `"[Binary Content]"` and `"[Could not decode content]"`.
- You now have two places to keep in sync if the decoding rules change.

Consider:

- Replacing this method with an import of the shared helper:

```python
from plane.bgtasks.logger_task import safe_decode_body
```

and then using `safe_decode_body(request_body)` / `safe_decode_body(response.content)` in `process_request`.

- Or, if you keep the method, update the docstring so it matches the actual return values (including the binary and decode-failure markers).

This keeps behavior consistent across middleware and background processing.
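A shared helper matching the behavior these comments describe (including the binary and decode-failure markers) might look like this; it is a sketch reconstructed from the review text, not the PR's actual implementation:

```python
from typing import Optional

def safe_decode_body(content: Optional[bytes]) -> Optional[str]:
    """Decode a request/response body for logging without ever raising.

    Returns None for empty or missing bodies, "[Binary Content]" for
    non-UTF-8 payloads, and "[Could not decode content]" otherwise.
    """
    if not content:                    # covers both None and b""
        return None
    try:
        return content.decode("utf-8")
    except UnicodeDecodeError:
        return "[Binary Content]"
    except Exception:
        return "[Could not decode content]"
```

Keeping this in one module and importing it from both the middleware and the background task guarantees the two decode paths cannot drift apart.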
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
apps/api/plane/bgtasks/logger_task.py (1 hunks)
apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
apps/api/plane/middleware/logger.py (3)
apps/api/plane/settings/mongo.py (1)
- MongoConnection (18-122)

apps/api/plane/utils/exception_logger.py (1)
- log_exception (9-20)

apps/api/plane/bgtasks/logger_task.py (1)
- process_logs (85-93)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build and lint web apps
- GitHub Check: Analyze (javascript)
🔇 Additional comments (3)
apps/api/plane/bgtasks/logger_task.py (2)
17-31: Mongo collection helper is consistent and robust. The collection retrieval logic matches the `MongoConnection` API, handles "not configured" and exceptions cleanly, and returns `None` in all failure cases. No changes needed here.
57-82: Mongo/Postgres logging helpers look correct.
`log_to_mongo` and `log_to_postgres` correctly guard on `None`, call the respective backends, and surface failures via `log_exception` while returning a boolean. The encapsulation is clean and keeps persistence details out of the caller.

apps/api/plane/middleware/logger.py (1)
76-80: APITokenLogMiddleware docstring is clear and helpful. The docstring concisely explains the purpose of the middleware and the Mongo/Postgres logging behavior, improving readability.
- Removed direct dependency on the MongoDB collection in the log_to_mongo function, now retrieving it internally.
- Updated process_logs to check MongoDB configuration before logging, enhancing error handling.
- Cleaned up logger.py by removing unused imports related to MongoDB.
- Introduced the @shared_task decorator to the process_logs function, enabling asynchronous processing of log data.
- Updated the function signature to include a return type of None for clarity.
Actionable comments posted: 1
♻️ Duplicate comments (2)
apps/api/plane/middleware/logger.py (2)
145-145: `process_logs` is not a Celery task; `.delay()` will raise `AttributeError` at runtime. The `process_logs` function in `apps/api/plane/bgtasks/logger_task.py` (lines 86-94) is a plain function without a Celery task decorator. Calling `.delay()` on it will fail. Either call the function directly or decorate it with `@shared_task`:

```diff
- process_logs.delay(log_data=log_data, mongo_log=mongo_log)
+ process_logs(log_data=log_data, mongo_log=mongo_log)
```

Or add the decorator in `logger_task.py`:

```python
from celery import shared_task

@shared_task
def process_logs(log_data: Dict[str, Any], mongo_log: Dict[str, Any]):
    ...
```
119-130: Sensitive data logged without sanitization. Request headers (line 124) and the full API key (line 120) are logged without redaction. Headers may contain `Authorization`, `Cookie`, and other sensitive tokens. The full API key should be masked (e.g., show only the last 4 characters).
🧹 Nitpick comments (3)
apps/api/plane/middleware/logger.py (1)
88-108: Code duplication: `_safe_decode_body` is duplicated in `logger_task.py`. This method is nearly identical to `safe_decode_body` in `apps/api/plane/bgtasks/logger_task.py` (lines 33-54). Since logging is now delegated to the background task, consider removing this method from the middleware and using the one in `logger_task.py`, or extract it to a shared utility module.

apps/api/plane/bgtasks/logger_task.py (2)
57-71: Double-checking Mongo configuration is redundant. `log_to_mongo` calls `get_mongo_collection()` internally, which already checks `MongoConnection.is_configured()`. This means line 63 logs "MongoDB not configured" after `get_mongo_collection` already logged the same at line 22. Consider passing the collection as a parameter or removing the redundant error log.

```diff
-def log_to_mongo(log_document: Dict[str, Any]) -> bool:
+def log_to_mongo(mongo_collection: Collection, log_document: Dict[str, Any]) -> bool:
     """
     Logs the request to MongoDB if available.
     """
-    mongo_collection = get_mongo_collection()
-    if mongo_collection is None:
-        logger.error("MongoDB not configured")
-        return False
-
     try:
         mongo_collection.insert_one(log_document)
         return True
     except Exception as e:
         log_exception(e)
         return False
```

Then update `process_logs`:

```python
if MongoConnection.is_configured():
    mongo_collection = get_mongo_collection()
    if mongo_collection and not log_to_mongo(mongo_collection, mongo_log):
        log_to_postgres(log_data)
    elif not mongo_collection:
        log_to_postgres(log_data)
else:
    log_to_postgres(log_data)
```
33-54: `safe_decode_body` is unused in this file. This function is defined but never called within `logger_task.py`. The middleware prepares the decoded body before calling `process_logs`. Consider removing this dead code or consolidating it as the single implementation used by the middleware.

```shell
#!/bin/bash
# Verify if safe_decode_body is used anywhere in the codebase
rg -n "safe_decode_body" --type py
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
apps/api/plane/bgtasks/logger_task.py (1 hunks)
apps/api/plane/middleware/logger.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
apps/api/plane/bgtasks/logger_task.py (2)
apps/api/plane/settings/mongo.py (2)
- is_configured (112-122)
- get_collection (91-109)

apps/api/plane/utils/exception_logger.py (1)
- log_exception (9-20)
apps/api/plane/middleware/logger.py (2)
apps/api/plane/utils/exception_logger.py (1)
- log_exception (9-20)

apps/api/plane/bgtasks/logger_task.py (1)
- process_logs (86-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build and lint web apps
- GitHub Check: Analyze (javascript)
🔇 Additional comments (3)
apps/api/plane/middleware/logger.py (1)
131-148: Exception handling structure looks correct. The try-except block correctly wraps the log preparation and task invocation, with errors routed to the centralized `log_exception` helper. The user ID extraction logic properly handles unauthenticated requests.

apps/api/plane/bgtasks/logger_task.py (2)
17-30: `get_mongo_collection` follows the established pattern correctly. The function properly checks `MongoConnection.is_configured()` before attempting to get the collection, matching the pattern used in `apps/api/plane/bgtasks/cleanup_task.py`.
74-83: PostgreSQL fallback implementation is correct. The function properly creates the `APIActivityLog` record and handles exceptions with `log_exception`. The boolean return value enables proper fallback logic.
Description
Type of Change
Test Scenarios
References
WEB-5575
Summary by CodeRabbit
New Features
Improvements