interestng/medkit: Unified Python SDK for OpenFDA, PubMed, and ClinicalTrials.gov with clinical intelligence, interaction detection, and research tools.


🏥 MedKit: A Unified Platform for Medical Data APIs

CI Status | Test Coverage | Strict Mypy | Python 3.9+ | License: MIT | Version

MedKit is a high-performance, unified SDK that transforms fragmented medical APIs into a single, programmable platform. It provides a clean interface for OpenFDA, PubMed, and ClinicalTrials.gov, augmented with a clinical intelligence layer and relationship mapping.

Important

v3.0.0 Release: This major update makes MedKit a production-grade SDK. It introduces connection pooling, dynamic rate limiting, circuit breakers, exponential-backoff retries with jitter, strict MyPy typing, and formalized Pydantic V2 validation, backed by a test suite with 39 of 39 tests passing.

[Figure: MedKit CLI demo]


✨ Async Example (v3.0.0)

import asyncio
from medkit import AsyncMedKit

async def main():
    async with AsyncMedKit() as med:
        # Unified search across all providers in parallel
        results = await med.search("pembrolizumab")
        
        print(f"Drugs found: {len(results.drugs)}")
        print(f"Clinical Trials: {len(results.trials)}")
        
        # Get a synthesized conclusion
        conclusion = await med.ask("What is the clinical status of Pembrolizumab for NSCLC?")
        print(f"Summary: {conclusion.summary}")
        print(f"Confidence: {conclusion.confidence_score}")

asyncio.run(main())

🤔 Why MedKit?

| Feature      | Without MedKit           | With MedKit                    |
|--------------|--------------------------|--------------------------------|
| Integrations | 3 separate APIs / SDKs   | Unified Sync/Async Client      |
| Resilience   | 403 blocks from gov APIs | Auto-Fallback (Curl/v2 API)    |
| Synthesis    | Alphabetical/Noisy lists | Frequency-Ranked Interventions |
| Logic        | Manual data correlation  | Native knowledge graphs        |
| Speed        | Sequential network calls | Parallel Async Orchestration   |
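
To make the "Speed" row concrete, here is a minimal sketch of sequential versus parallel fan-out using only the standard library. It is not MedKit's internal code: `fake_provider` and both `search_*` functions are hypothetical stand-ins that sleep instead of calling a real API.

```python
import asyncio
import time


async def fake_provider(name: str, query: str, delay: float = 0.05) -> dict:
    """Hypothetical stand-in for one provider call (sleeps, no network)."""
    await asyncio.sleep(delay)
    return {"provider": name, "query": query}


async def search_sequential(query: str) -> list:
    # Each call starts only after the previous one has finished.
    return [
        await fake_provider("openfda", query),
        await fake_provider("pubmed", query),
        await fake_provider("clinicaltrials", query),
    ]


async def search_parallel(query: str) -> list:
    # All three calls are in flight at once; results keep argument order.
    return await asyncio.gather(
        fake_provider("openfda", query),
        fake_provider("pubmed", query),
        fake_provider("clinicaltrials", query),
    )


def timed(coro_fn, query: str):
    """Run an async search function and return (results, elapsed_seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(query))
    return results, time.perf_counter() - start
```

With equal per-call latency, the parallel version should take roughly one call's worth of time rather than three, which is the effect the table describes.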

Note: MedKit is still a work in progress, so some functionality may be missing or stubbed with placeholders. If there is something you would like fixed or implemented soon, please open an issue. This SDK is not FDA-approved and carries no official medical licensing. Use at your own discretion!

🏗️ Architecture

MedKit abstracts complexity through a high-performance orchestration layer:

      Developer / User
             │
             ▼
    ┌───────────────────┐
    │  MedKit / Async   │ (Unified Interface)
    └─────────┬─────────┘
              │
    ┌─────────┴─────────────────────┐
    │       Intelligence Layer      │
    │  ├─ Ask Engine (Extraction)   │
    │  ├─ Graph Engine (Context)    │
    │  ├─ Interaction Engine        │
    │  └─ Synthesis Engine (Ranked) │
    └─────────┬─────────────────────┘
              │
    ┌─────────┴─────────────────────┐
    │       Providers Layer         │
    │  ├─ OpenFDA     (Drug Label)  │
    │  ├─ PubMed      (Research)    │
    │  └─ ClinTrials  (v2 + Fallback)│
    └───────────────────────────────┘

🚀 Core Platform Features

  • Robust Connectivity (NEW): Automatic curl fallback for ClinicalTrials.gov bypasses TLS fingerprinting blocks, ensuring 100% data availability.
  • Enterprise Reliability: Built-in exponential backoff retries with full jitter, circuit breakers that prevent upstream cascades, and sliding-window rate limiters.
  • Strictly Typed Ecosystem: Zero Any leakage. The interface is fully typed (medkit/py.typed is shipped) and backed by strict Pydantic V2 models configured with extra="forbid".
  • Async-First Orchestration: Parallel health checks and search execution eliminate latency bottlenecks and perceived "hangs."
  • Precision Evidence Synthesis: Automated clinical verdicts with frequency-ranked interventions and filtered therapeutic agents (Drugs/Biologicals).
  • High-Performance CLI: Interactive, list-based visualization for trials and research papers, optimized for all terminal sizes.
  • Unified Caching: Enhanced Disk and Memory caching for high-performance repeat queries.
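
For readers unfamiliar with the reliability terms in the list above, the following is a minimal, standard-library sketch of exponential backoff with full jitter combined with a simple circuit breaker. It illustrates the general techniques only; the names `full_jitter_delay`, `CircuitBreaker`, `CircuitOpenError`, and `retry`, and all default values, are assumptions for this example and are not taken from MedKit's code.

```python
import random
import time


def full_jitter_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full jitter: pick uniformly in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp of when the circuit opened

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping upstream call")
            # Cool-down elapsed: let one trial call through (half-open).
            self.opened_at = None
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success closes the circuit again
        return result


def retry(fn, attempts: int = 4, base: float = 0.5, cap: float = 30.0, sleep=time.sleep):
    """Call fn, sleeping a full-jitter delay between failed attempts."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not keep hammering an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(full_jitter_delay(attempt, base, cap))
```

A caller could wrap an upstream request as `retry(lambda: breaker.call(fetch))`, where `fetch` is whatever function performs the request. The `sleep` parameter is injectable so tests can skip real waiting.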

🛠️ Testing

MedKit ships with a production-grade, isolated mock testing infrastructure that achieves comprehensive validation without relying on live API stability.

pytest tests/ -v
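
As an illustration of what an isolated mock test can look like, here is a small sketch using `unittest.mock.AsyncMock` from the standard library. The helper `count_trials` and the `search_trials` method are hypothetical names invented for this example; they are not part of MedKit's documented API.

```python
import asyncio
from unittest.mock import AsyncMock


async def count_trials(client, query: str) -> int:
    """Hypothetical helper under test: counts trials returned by a client."""
    response = await client.search_trials(query)
    return len(response["trials"])


def test_count_trials_uses_mocked_client():
    # AsyncMock replaces the network-facing client, so no live API is touched.
    client = AsyncMock()
    client.search_trials.return_value = {"trials": [{"id": "T1"}, {"id": "T2"}]}

    assert asyncio.run(count_trials(client, "melanoma")) == 2
    # Verify the helper awaited the client exactly once with the query.
    client.search_trials.assert_awaited_once_with("melanoma")
```

Because the mocked client returns canned data, a test like this passes or fails on the code's own logic rather than on the availability of an upstream service.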

📦 Installation

pip install medkit-sdk

🖥️ CLI Power Tools

Clinical Ask (Synthesized)

$ medkit ask "pembrolizumab for lung cancer"

 Clinical Conclusion 

Summary: Highly-validated therapeutic landscape with multi-modal evidence.
Evidence Confidence: [████████████████████] 1.00

Top Interventions: Pembrolizumab, Bevacizumab, Carboplatin, Cisplatin
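
The README describes the intervention list as frequency-ranked. One plausible way to produce such a ranking is sketched below with `collections.Counter`; the function name, the `"interventions"` key, and the sample records are assumptions made up for illustration, not MedKit's data model or real trial data.

```python
from collections import Counter


def rank_interventions(trials, top_n=4):
    """Return intervention names ordered by how many trials mention them."""
    counts = Counter()
    for trial in trials:
        # Count each name once per trial, ignoring case and repeats.
        names = {name.strip().title() for name in trial.get("interventions", [])}
        counts.update(names)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


# Made-up sample records, purely for demonstration.
sample_trials = [
    {"interventions": ["Pembrolizumab", "Carboplatin"]},
    {"interventions": ["pembrolizumab", "Bevacizumab"]},
    {"interventions": ["Pembrolizumab", "Bevacizumab", "Cisplatin"]},
]
print(rank_interventions(sample_trials))
# → ['Pembrolizumab', 'Bevacizumab', 'Carboplatin', 'Cisplatin']
```

Ranking by count rather than alphabetically surfaces the agents that appear in the most trials first, which is the contrast drawn in the "Synthesis" row of the comparison table.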

Trials Search

$ medkit trials "melanoma" --limit 5

Clinical Trials for 'melanoma'
- NCT01234567: RECRUITING - Study of Pembrolizumab in Advanced Melanoma
- NCT07654321: COMPLETED - Comparison of B-Raf Inhibitors

Knowledge Graph

$ medkit graph "lung cancer"

Knowledge Graph: lung cancer
Nodes: 26 | Edges: 8

 Lung Cancer 
├── Drugs
│   └── None found
├── Trials
│   ├── A Study of QL1706 Combined Wit...
│   ├── Circulating Tumor DNA Detectio...
│   └── Trial of Single Protein Encaps...
└── Papers
    ├── Phase III placebo-controlled o...
    └── Therapeutic strategies for eld...
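
A tree like the one above can be backed by a very small graph structure. The sketch below is one hypothetical way to model it (a root condition linked to typed neighbours); the class name, node types, and relation labels are illustrative assumptions and do not describe how MedKit's Graph Engine is implemented.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny graph: a root condition linked to typed neighbour nodes."""

    def __init__(self, root: str):
        self.root = root
        self.nodes = {root: "condition"}  # node label -> node type
        self.edges = []                   # (source, target, relation) tuples

    def add(self, label: str, node_type: str, relation: str) -> None:
        """Register a node and connect it to the root."""
        self.nodes[label] = node_type
        self.edges.append((self.root, label, relation))

    def grouped(self) -> dict:
        """Group neighbour labels by their node type, in insertion order."""
        groups = defaultdict(list)
        for _, target, _ in self.edges:
            groups[self.nodes[target]].append(target)
        return dict(groups)

    def render(self) -> str:
        """Render the graph as an indented text outline."""
        lines = [self.root]
        for node_type, labels in self.grouped().items():
            lines.append(f"  {node_type}")
            lines.extend(f"    {label}" for label in labels)
        return "\n".join(lines)
```

Calling `add` for each trial and paper and then `render()` yields an outline grouped by node type, similar in spirit to the CLI tree.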

⚖️ Attributions & Disclaimers

MedKit uses data from public APIs but is not endorsed or certified by any of the governing bodies:

  • ClinicalTrials.gov: MedKit uses data from ClinicalTrials.gov but is not endorsed by ClinicalTrials.gov or the U.S. National Library of Medicine.
  • OpenFDA: MedKit uses data from openFDA but is not endorsed by the U.S. Food and Drug Administration.
  • PubMed: MedKit uses data from PubMed but is not endorsed by the U.S. National Library of Medicine.

🀝 Contributing

We welcome contributions! MedKit is an open-source project, and community feedback and improvements can be its backbone.

  1. Check the Code: Feel free to dive into the codebase and identify any bugs or areas for improvement.
  2. Open an Issue: If you find a fault, no matter how small, please open an issue or start a discussion.
  3. Submit a Pull Request: Direct improvements and new provider integrations are highly encouraged.

I'd much rather have a brutal code review that helps me improve the engine than silence!


🗺️ Roadmap

  • v1.0.0: Foundation medical mesh and provider integration.
  • v2.0.0: Async architecture, v2 API support.
  • v3.0.0: Major revamp: Large-scale readiness (Circuit Breakers, Retries, Coverage, Pydantic V2, CLI UI).
  • v4.0.0: Local GraphQL medical mesh endpoint.

📄 License

MIT License - see LICENSE for details.
