firecrawl / firecrawl

🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data

markdown crawler scraper ai html-to-markdown web-crawler scraping web-scraper web-scraping data-extraction webscraping web-data-extraction ai-agents web-search ai-search web-data llm ai-crawler ai-scraping

Updated Jan 28, 2026
TypeScript

scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

python crawler framework scraping crawling web-scraping hacktoberfest web-scraping-python

Updated Jan 23, 2026
Python

feder-cr / Jobs_Applier_AI_Agent_AIHawk

AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it lets users apply for multiple jobs in a tailored way.

Updated Nov 16, 2025
Python

gocolly / colly

Elegant Scraper and Crawler Framework for Golang

go golang crawler scraper framework spider scraping crawling

Updated Jan 5, 2026
Go

ScrapeGraphAI / Scrapegraph-ai

Python scraper based on AI

Updated Jan 20, 2026
Python

apify / crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Jan 28, 2026
TypeScript

soxoj / maigret

🕵️‍♂️ Collect a dossier on a person by username from thousands of sites

Updated Jan 27, 2026
Python

psf / requests-html

Pythonic HTML Parsing for Humans™

python html http scraping requests kennethreitz beautifulsoup lxml css-selectors pyquery

Updated Apr 16, 2024
Python

ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)

testing chrome automation webdriver browser captcha scraping selenium navigator python3 cloudflare chromedriver anti-bot bot-detection cloudflare-bypass distil anti-detection

Updated Jul 5, 2025
Python

code4craft / webmagic

A scalable web crawler framework for Java.

java crawler framework scraping

Updated Dec 20, 2025
Java

D4Vinci / Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

Updated Jan 22, 2026
Python

apify / crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.