A comprehensive medical database search tool that allows users to search for medications across multiple medical databases.
- Search across 140+ medical databases including PubMed, FDA, EMA, and medical journals
- Real-time search results with streaming updates
- Advanced filtering options by date range
- Support for multiple active ingredients
- Browser automation with human-like behavior to avoid detection
- CAPTCHA solving capabilities
- Clean, modern user interface
Prerequisites:
- Node.js (v16+)
- Python (v3.8+)
- npm or yarn
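The version requirements above can be checked before installing anything. The following is a minimal sketch, not part of the project: the helper names `parse_node_major` and `check_prerequisites` are hypothetical, and it assumes `node --version` prints a string such as `v18.17.0`.

```python
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 8)    # minimum Python version listed above
MIN_NODE_MAJOR = 16    # minimum Node.js major version listed above


def parse_node_major(version_text):
    """Return the major version from text such as 'v18.17.0', or None."""
    text = version_text.strip().lstrip("v")
    major = text.split(".", 1)[0]
    return int(major) if major.isdigit() else None


def check_prerequisites():
    """Return a dict mapping each prerequisite name to True or False."""
    results = {"python": sys.version_info[:2] >= MIN_PYTHON}

    node_ok = False
    node = shutil.which("node")
    if node:
        # Ask the installed Node.js binary for its version string.
        out = subprocess.run([node, "--version"], capture_output=True, text=True)
        major = parse_node_major(out.stdout)
        node_ok = major is not None and major >= MIN_NODE_MAJOR
    results["node"] = node_ok

    # Either package manager is enough.
    results["npm_or_yarn"] = bool(shutil.which("npm") or shutil.which("yarn"))
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

Running the script prints one line per prerequisite, so a missing or outdated tool is visible before the install steps are attempted.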
- Clone the repository:
  git clone https://github.com/yourusername/medsearch.git
  cd medsearch
- Install JavaScript dependencies:
  npm install
- Install Python dependencies:
  pip install -r scraping/requirements.txt
- Install Selenium and related packages:
  pip install selenium webdriver-manager undetected-chromedriver 2captcha-python
- Start the development server:
  npm run dev
- Open your browser and navigate to:
  http://localhost:3000
To run a search from the web interface:
- Enter one or more active ingredients in the search form
- Select the databases you want to search
- Set a date range if needed
- Click "Search" to start the search process
- View real-time updates as results are found
- Browse through the search results
You can also run searches directly from the command line:
python run_search.py --query "methotrexate" --databases pubmed fda-drugs ema-medicines nejm amjmed
Options:
- --query: The search query (required)
- --databases: List of database IDs to search
- --output: Output file for search results (JSON)
- --max-results: Maximum number of results per database (default: 10)
- --min-date: Minimum date (YYYY-MM-DD)
- --max-date: Maximum date (YYYY-MM-DD)
- --no-captcha: Disable CAPTCHA solver
- --no-browser: Disable browser automation
- --no-parallel: Disable parallel searches
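When searches are launched from another Python script, the options above can be assembled into an argument list programmatically. The sketch below is only an illustration: `build_search_command` is a hypothetical helper, not part of the project, and it uses only the flags documented above. It assumes `run_search.py` is in the current working directory.

```python
import sys
from datetime import datetime


def build_search_command(query, databases=None, output=None, max_results=None,
                         min_date=None, max_date=None, captcha=True,
                         browser=True, parallel=True):
    """Build an argv list for run_search.py from the documented options."""
    if not query:
        raise ValueError("query is required")

    cmd = [sys.executable, "run_search.py", "--query", query]
    if databases:
        cmd += ["--databases", *databases]
    if output:
        cmd += ["--output", output]
    if max_results is not None:
        cmd += ["--max-results", str(max_results)]
    for flag, value in (("--min-date", min_date), ("--max-date", max_date)):
        if value:
            # Raises ValueError if the value is not a valid year-month-day date.
            datetime.strptime(value, "%Y-%m-%d")
            cmd += [flag, value]
    if not captcha:
        cmd.append("--no-captcha")
    if not browser:
        cmd.append("--no-browser")
    if not parallel:
        cmd.append("--no-parallel")
    return cmd


if __name__ == "__main__":
    cmd = build_search_command("methotrexate", databases=["pubmed", "fda-drugs"],
                               min_date="2020-01-01", browser=False)
    print(" ".join(cmd))
    # To actually run the search: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` rather than a shell string avoids quoting problems when a query contains spaces.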
You can test the connections to various databases:
python test_databases.py --query "methotrexate" --limit 10
To use CAPTCHA solving services, set up your API key:
python setup_captcha.py --api-key YOUR_API_KEY --service 2captcha
Supported services:
- 2captcha (recommended)
- anticaptcha
- capsolver
- local (uses Tesseract OCR, less reliable)
Project structure:
- /app: Next.js application files
- /components: React components
- /public: Static assets
- /scraping: Python scraping modules
  - smart_access_manager.py: Main search orchestration
  - browser_automation.py: Browser automation with human-like behavior
  - captcha_solver.py: CAPTCHA solving capabilities
  - Database-specific modules for each data source
If you encounter issues with browser automation:
- Make sure Selenium is installed:
  pip install selenium webdriver-manager
- Try running with the --no-browser option to disable browser automation:
  python run_search.py --query "methotrexate" --no-browser
If you encounter issues with CAPTCHA solving:
- Make sure you have set up a CAPTCHA solving service:
  python setup_captcha.py --api-key YOUR_API_KEY --service 2captcha
- Try running with the --no-captcha option to disable CAPTCHA solving:
  python run_search.py --query "methotrexate" --no-captcha
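The two troubleshooting flags can also be tried automatically. The sketch below retries a search with progressively more features disabled. It is an illustration only: `run_with_fallbacks` is a hypothetical helper, and it assumes `run_search.py` signals failure with a non-zero exit code, which this document does not confirm.

```python
import subprocess
import sys

# Flag sets to try in order, from full functionality to most conservative.
FALLBACK_FLAGS = [
    [],
    ["--no-captcha"],
    ["--no-browser"],
    ["--no-captcha", "--no-browser"],
]


def run_with_fallbacks(query, runner=None):
    """Return the first flag set for which the search succeeds, or None.

    `runner` takes an argv list and returns an exit code; it defaults to
    running the command with subprocess (assumption: exit code 0 means success).
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode
    base = [sys.executable, "run_search.py", "--query", query]
    for extra in FALLBACK_FLAGS:
        if runner(base + extra) == 0:
            return extra
    return None


if __name__ == "__main__":
    used = run_with_fallbacks("methotrexate")
    if used is None:
        print("search failed with every flag combination")
    else:
        print("search succeeded with extra flags:", used or "(none)")
```

The `runner` parameter exists so the retry logic can be exercised without launching real searches.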
This project is licensed under the MIT License - see the LICENSE file for details.
- Thanks to all the medical databases that provide access to their data
- The Next.js team for their excellent framework
- The Selenium team for their browser automation tools