
Modern web scraping for when you need the good parts, not the markup soup. Extracts clean article content, parses feeds (RSS, Atom, JSON), and gathers metadata from any page. Handles broken encodings, malformed feeds, and the chaos of real-world HTML. TypeScript-native, works everywhere. Named after the bird known for collecting valuable things... you get the idea.
Problem Hypothesis
Even with modern AI/LLM tools, the root problem of machine learning is still garbage in, garbage out. And the web is still a messy place: feeding raw pages to a model burns tokens needlessly or blows up context windows.
Solution Attempt
Lean, fast extraction of the relevant data from a website, ready for subsequent processing and AI pipelines.
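To make the idea concrete, here is a minimal, dependency-free TypeScript sketch of what reducing a page to its useful parts can look like. It is not this library's API: `summarizePage`, `PageSummary`, and `decodeEntities` are hypothetical names, and the regex-based approach is an assumption chosen for brevity. Real-world HTML is messy enough that a proper parser is the safer choice.

```typescript
// Hypothetical sketch, NOT the library's actual API.
// Assumes reasonably well-formed HTML; regexes are brittle on real pages.

interface PageSummary {
  title: string | null;
  description: string | null;
  text: string;
}

// Decode only a handful of common entities. "&amp;" goes last so that
// "&amp;lt;" becomes the literal text "&lt;" instead of "<".
function decodeEntities(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&");
}

function summarizePage(html: string): PageSummary {
  const rawTitle = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i)?.[1];

  // Assumes `name` precedes `content` and both values are double-quoted.
  const rawDescription = html.match(
    /<meta\s[^>]*name="description"[^>]*content="([^"]*)"/i
  )?.[1];

  // Drop everything a downstream model does not need, then strip the tags.
  const stripped = html
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, " ")
    .replace(/<!--[\s\S]*?-->/g, " ")
    .replace(/<head\b[^>]*>[\s\S]*?<\/head>/i, " ")
    .replace(/<[^>]+>/g, " ");

  return {
    title: rawTitle !== undefined ? decodeEntities(rawTitle.trim()) : null,
    description:
      rawDescription !== undefined ? decodeEntities(rawDescription) : null,
    // Entities are decoded after tag stripping so decoded "<" is kept as text.
    text: decodeEntities(stripped).replace(/\s+/g, " ").trim(),
  };
}
```

Calling `summarizePage(html)` on a fetched page returns a small object with the title, the meta description, and whitespace-collapsed body text, which is far cheaper to pass to an LLM than the raw markup.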
Current State
Stage: Empathy
Discovering and deeply understanding the pain points of your niche market.
Category: Library
Some functionality implemented as ready-to-use code, available as open source software on GitHub. Ideally in Rust for maximum reuse across tech stacks.
Users
not tracked yet.
Revenue
not tracked yet.