About the DisclosureLens crawler
This page tells you how to identify our crawler, how often it fetches, and how to make it slow down. We respect robots.txt and we respond to abuse@disclosurelens.com within one business day.
Identification
Our bot identifies itself in the HTTP User-Agent header as:
DisclosureLens bot (https://disclosurelens.com/about-the-bot)
SEC EDGAR requests additionally carry a contact email, per the SEC user-agent policy.
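If you want to pick our traffic out of your access logs, a minimal Python sketch follows. It assumes the User-Agent value contains the exact string published above; the function name is hypothetical, and a substring match is used only because this page does not specify where the SEC EDGAR contact email appears in the header.

```python
# User-Agent string as published on this page.
BOT_UA = "DisclosureLens bot (https://disclosurelens.com/about-the-bot)"


def is_disclosurelens_bot(user_agent: str) -> bool:
    """Return True if a User-Agent header value contains the published string.

    Assumption: a substring match is used so that requests carrying extra
    text in the header (such as a contact email) are still matched.
    """
    return BOT_UA in user_agent


# Example header values an operator might see in a log.
print(is_disclosurelens_bot(BOT_UA))
print(is_disclosurelens_bot("Mozilla/5.0 (X11; Linux x86_64)"))
```

A User-Agent header can be spoofed, so treat a match as a hint about who to contact rather than as proof of origin.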
Politeness policy
- SEC EDGAR: max 8 requests/sec (within SEC's 10/sec documented cap)
- HHS OCR: 1 request/sec, sequential
- State AG portals: 1 request per 3 seconds, jittered
- UK ICO: 0.5 requests/sec (1 request every 2 seconds)
- OAIC, CNIL, BfDI: 0.5 requests/sec (1 request every 2 seconds)
We honor robots.txt rules, including the Crawl-delay directive. On HTTP 429 and 5xx responses we back off exponentially.
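The caps above can be read as a minimum interval between requests per source. The Python sketch below illustrates that reading and is not our production scheduler: the source keys, the jitter fraction, and the backoff base and ceiling are all hypothetical values chosen for the example.

```python
import random

# Minimum seconds between requests, derived from the caps listed above.
# The dictionary keys are hypothetical labels, not real configuration names.
MIN_INTERVAL_S = {
    "sec_edgar": 1 / 8,  # max 8 requests/sec
    "hhs_ocr": 1.0,      # 1 request/sec, sequential
    "state_ag": 3.0,     # 1 request per 3 seconds
    "uk_ico": 2.0,       # 0.5 requests/sec
    "oaic": 2.0,
    "cnil": 2.0,
    "bfdi": 2.0,
}


def next_delay(source, crawl_delay=None, jitter=0.0, rng=random):
    """Seconds to wait before the next request to `source`.

    A robots.txt Crawl-delay, if present, can only lengthen the wait.
    Jitter is additive, so the published cap is never exceeded.
    """
    base = MIN_INTERVAL_S[source]
    if crawl_delay is not None:
        base = max(base, float(crawl_delay))
    return base + rng.uniform(0.0, jitter * base)


def should_back_off(status):
    """True for HTTP 429 and any 5xx response."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt, base=1.0, ceiling=300.0):
    """Exponential backoff: base * 2**attempt, clamped to `ceiling` seconds.

    `base` and `ceiling` are assumed values, not published limits.
    """
    return min(ceiling, base * 2 ** attempt)


# State AG portals with up to 50% added jitter: between 3.0 and 4.5 seconds.
print(next_delay("state_ag", jitter=0.5))
```

Making the jitter additive rather than symmetric is a deliberate choice in this sketch: it spreads requests out without ever letting a single interval drop below the published minimum.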
What we collect
Only material that is already public on the source regulator's portal. We archive raw filings to immutable storage for evidentiary purposes (audit replay, schema migration). On disclosure detail pages we render the full SEC and HHS filing inline from the originating regulator's public record under the fair report privilege (see methodology §4.5). State AG, EU DPA, and threat-actor leak-site sources are not redistributed inline; those records carry only a link to the originating regulator URL.
Block requests
If you operate a regulator portal and need our bot to slow down or stop, email abuse@disclosurelens.com. We respond within one business day and apply the requested rate limit immediately.
Why we publish this page
Most bots that hit your AG portal don't identify themselves and don't publish a contact. We'd rather you block us precisely than throttle your whole portal while you work out who is causing the spike. The published User-Agent, the abuse@ contact, and the per-source rate caps above are all part of the same contract: read them as "here's how to make us behave," not as a marketing artifact.