I’ve built a little command-line app to help me save full-text copies of webpages for future reference and LLM ingestion. It’s called Grabit, and it’s open source.
Here’s what it does: you point it at a URL, and it downloads the page, strips away unnecessary cruft like headers, menus, footers, and whatnot, converts the remaining content to beautiful, sparkling Markdown, and saves the result to a file.
It gets you from this:
[screenshot: the original webpage, cruft and all]
to this:
[screenshot: the same content as clean Markdown]
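Under the hood, that’s a fetch, extract, convert pipeline. Here’s a minimal sketch of the idea in Python; the library choices (httpx, readability-lxml, markdownify) are my assumptions for illustration, not necessarily what Grabit actually uses:

```python
# A minimal sketch of the fetch -> extract -> convert idea.
# Library choices (httpx, readability-lxml, markdownify) are
# illustrative assumptions, not necessarily Grabit's internals.
import httpx
from markdownify import markdownify
from readability import Document  # pip package: readability-lxml


def grab(url: str) -> str:
    """Fetch a page, isolate the main article, return it as Markdown."""
    response = httpx.get(url, follow_redirects=True)
    response.raise_for_status()
    main_content = Document(response.text).summary()  # drops nav, footer, etc.
    return markdownify(main_content)
```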
How & why
Grabit is straightforward to use. Assuming you have uv installed and set up, all you need to do is download a single file somewhere on your computer and then run uv run grabit.py URL to save URL’s content to a file.
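The single-file trick works because uv run reads PEP 723 inline script metadata: a comment block at the top of the script that declares its own dependencies. I’m guessing at the exact contents, but grabit.py’s header would look roughly like this:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "click",   # assumed; click is mentioned below
#     "httpx",   # assumed HTTP client
# ]
# ///
```

uv resolves and installs those into a throwaway environment on first run, so there’s nothing to pip-install manually.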
I use Grabit a lot for saving full-text bookmarks in my Obsidian vault, so you’ll notice plenty of vault-friendly defaults, like adding YAML front matter and saving each page into a per-domain subdirectory. That being said, it’s flexible enough to be used in other scenarios, too.
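To make that concrete: with the defaults, a page from vladiliescu.net would land in a vladiliescu.net/ subdirectory as a Markdown file that starts with YAML front matter. The specific field names below are my guess, not Grabit’s documented schema:

```markdown
---
title: Better Dependency Injection in FastAPI
source: https://vladiliescu.net/better-dependency-injection-in-fastapi/
---

Article content, converted to Markdown...
```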
For example, this is how you’d use it to summarize my post on better dependency injection in FastAPI with Simon Willison’s llm CLI: uv run -q grabit.py -f stdout.md https://vladiliescu.net/better-dependency-injection-in-fastapi/ | llm -s "What's this about, eh?"
Inspiration
Grabit draws inspiration from Brett Terpstra’s gather-cli, a more complete tool, but one with some shortcomings that have gone ignored for almost a year and that were annoying enough for me to write my own 🤷🏻‍♂️.
What triggered me to actually do this was Simon Willison’s article on using Claude to write tiny Python command-line interfaces. I didn’t know about uv run, and click looked pretty cool, so I decided to experiment with both while solving my own problem.
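If you haven’t used click before, its appeal is argument parsing with almost no ceremony. A toy sketch, not Grabit’s actual interface (I’m only guessing at the -f flag from the example above):

```python
import click


@click.command()
@click.argument("url")
@click.option("-f", "--filename", default=None, help="Where to save the output.")
def main(url, filename):
    """Save URL's full-text content as Markdown (stub)."""
    click.echo(f"Would grab {url} into {filename or 'a default location'}")


if __name__ == "__main__":
    main()
```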
Try it out
Grabit is available on GitHub, so go ahead and try it out.