@harshichowdary25 harshichowdary25 released this 07 Apr 11:25
· 19 commits to main since this release

I just released the first version of a simple website crawler written in Python. It takes a base URL and a list of keywords as input, then scans the entire site, including all its subpages, to find matches. The crawler uses multi-threading to fetch several pages at once, which is typically much faster than a single-threaded scraper because most of the run time is spent waiting on the network. As it scans, it shows real-time progress in the terminal and automatically saves all matches to a .txt file named after the domain. It's well suited to searching academic or corporate websites for names, emails, or any other specific text.
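
To illustrate the flow described above, here is a minimal sketch using only the standard library. It is not the released code: the names `crawl`, `scan_page`, `save_matches`, `fetch_html`, and `LinkParser`, along with the `max_workers`, `max_pages`, and `out_dir` parameters, are hypothetical and chosen for illustration. It assumes same-domain links only, case-insensitive substring matching on the raw HTML, and a level-by-level traversal where each level is fetched by a thread pool.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page; return an empty string if anything goes wrong."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except Exception:
        return ""


def scan_page(url, keywords, fetch):
    """Return (url, matched keywords, absolute links found on the page)."""
    html = fetch(url)
    parser = LinkParser()
    parser.feed(html)
    lowered = html.lower()
    # Assumption: plain substring match on raw HTML (tags included).
    hits = [kw for kw in keywords if kw.lower() in lowered]
    links = [urldefrag(urljoin(url, href)).url for href in parser.links]
    return url, hits, links


def crawl(base_url, keywords, fetch=fetch_html, max_workers=8, max_pages=200):
    """Crawl same-domain pages level by level, fetching each level in parallel."""
    domain = urlparse(base_url).netloc
    seen = {base_url}
    frontier = [base_url]
    matches = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            results = pool.map(lambda u: scan_page(u, keywords, fetch), frontier)
            frontier = []
            for url, hits, links in results:
                print(f"scanned {url} ({len(hits)} keyword(s) matched)")
                if hits:
                    matches[url] = hits
                for link in links:
                    same_site = urlparse(link).netloc == domain
                    if same_site and link not in seen and len(seen) < max_pages:
                        seen.add(link)
                        frontier.append(link)
    return matches


def save_matches(matches, base_url, out_dir="."):
    """Write matches to <domain>.txt inside out_dir and return the file path."""
    path = os.path.join(out_dir, f"{urlparse(base_url).netloc}.txt")
    with open(path, "w", encoding="utf-8") as f:
        for url, hits in matches.items():
            f.write(f"{url}: {', '.join(hits)}\n")
    return path
```

A call such as `save_matches(crawl("https://example.com/", ["admissions", "contact"]), "https://example.com/")` would then write `example.com.txt` in the current directory. The `fetch` parameter is there so the crawl logic can be exercised without network access, and `max_pages` caps how many URLs are queued so a large site does not run unbounded.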