Download websites with WebMirror
View the source code here: https://github.com/StephenHapp/WebMirror
This tool lets you download a copy of a website, or any portion
of one. You specify which web pages to start from and rules
for which links to follow, and WebMirror will download each
page it discovers, complete with images, videos, audio, and
other media. The downloaded pages are then rewritten to link to
each other locally, so the copy can be browsed offline and
shared easily. A downloaded site can also serve as an archive of
how the website appeared at the time it was mirrored.
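
The core crawl-and-rewrite idea can be sketched roughly as
follows. This is only an illustrative sketch, not WebMirror's
actual code: the function names, the same-host link rule, and
the flat file layout are assumptions chosen for brevity, and
media downloading is omitted.

```python
# Illustrative sketch only; not WebMirror's implementation.
from collections import deque
from pathlib import Path
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def local_name(url: str) -> str:
    """Map a URL to a flat local filename (assumed layout)."""
    path = urlparse(url).path.strip("/") or "index"
    return path.replace("/", "_") + ".html"


def mirror(start_url: str, out_dir: str = "mirror", max_pages: int = 50) -> None:
    """Breadth-first crawl from start_url, saving each page and
    rewriting its links so the copy can be viewed offline."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    allowed_host = urlparse(start_url).netloc  # example rule: stay on one host
    queue, seen = deque([start_url]), {start_url}

    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        resp = requests.get(url, timeout=10)
        soup = BeautifulSoup(resp.text, "html.parser")

        for a in soup.find_all("a", href=True):
            target = urljoin(url, a["href"])
            if urlparse(target).netloc == allowed_host:
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
                a["href"] = local_name(target)  # point link at the local copy

        (out / local_name(url)).write_text(str(soup), encoding="utf-8")


if __name__ == "__main__":
    mirror("https://example.com/")
```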
This project grew out of a narrower goal: I wanted to download
a portion of Wikipedia. Text-only archives of Wikipedia already
exist, but I think the images and other media are important
enough to be included in an archive too. I tried using HTTrack
for this, but ultimately decided to create my own tool that
could give me the precision I wanted.