Best Bulk Domain Datasets in 2026: WHOIS, RDAP, Zone Files, and Domain Data Providers Compared

June 17, 2026 · hostingflow

TL;DR: There is no single best bulk domain dataset — the right one depends on whether you need zone files, WHOIS/RDAP records, passive DNS, or newly registered domains. For free gTLD zone files, ICANN CZDS is the canonical source. For the broadest commercial WHOIS data, RDAP API access, and newly-registered-domain feeds, WhoisXML API leads. For investigative passive DNS and historical pivoting, SecurityTrails and DomainTools are strongest. For an affordable, high-volume bulk domain dataset you can download as JSONL — roughly 400M+ domains with RDAP/WHOIS-scanned records — webatla.com is a credible newer option. ViewDNS and DomainsDB.info cover the budget and ad-hoc-lookup tiers.

If you build phishing detection, brand-protection monitoring, attack-surface tooling, or large-scale SEO research, you eventually outgrow per-domain lookups and need data in bulk. A bulk domain dataset is any large, downloadable or API-served collection of domains and their attributes: registration records (WHOIS or RDAP), DNS records, zone-file delegations, or curated lists of newly registered domains. The providers below differ sharply on coverage, freshness, delivery format, and price, so the comparison table comes first and the detail follows. Coverage figures marked with an asterisk are vendor-reported, not independently audited — validate a sample before committing budget.

Bulk domain dataset providers at a glance

This table summarizes the seven domain data providers covered below. "Data type" is the core offering; many providers sell several products with different freshness and pricing.

Provider | Data type | Coverage* | Update frequency | API available | Pricing model
--- | --- | --- | --- | --- | ---
ICANN CZDS | gTLD zone files (DNS delegation only) | gTLDs only; ~1,000+ zones | Daily (1 download / zone / 24h) | REST API + web portal | Free (per-registry approval)
WhoisXML API | WHOIS, RDAP, DNS, IP, NRD feeds, bulk downloads | 374M+ active domains; 7,596+ TLDs | On-demand + daily feeds + quarterly snapshots | REST, bulk, Snowflake | Free tier (500 req) + credits + quote
SecurityTrails | Passive DNS, DNS/WHOIS history, SSL, subdomains | ~200M+ domains; DNS since 2008, WHOIS since 2002 | Daily DB + on-demand history | REST / DSL / SQL + CSV feeds | Subscription + credits; enterprise quote
DomainTools | WHOIS/RDAP, passive DNS (DNSDB), risk feeds | "97% of the internet"; 100B+ passive DNS records | Real-time feeds + near-real-time DNSDB | REST (Iris, DNSDB) | Enterprise contract; no free tier
webatla.com | Bulk domain lists, RDAP/WHOIS, DNS, tech, rankings | ~400M+ domains; 1,400+ TLDs | Daily refresh | REST download API + S3 bulk | One-time per-month (€29–€599); free sample
ViewDNS.info | WHOIS, reverse DNS/IP/MX/NS, IP history, NRD, ccTLD lists | Hundreds of millions; NRD across 1,125 TLDs | Daily NRD feed; on-demand lookups | REST API | Credits/subscription; free web UI
DomainsDB.info | Domain-name lists + DNS metadata (bulk via domains-index.com) | ~255–260M domains; 1,000+ TLDs | Daily (paid tier); free-tier cadence undocumented | Free REST (≤100 rows/query) | Free API; bulk one-time elsewhere

How should you evaluate a domain data provider?

Before comparing names, decide which axis matters most for your workload. Data type: zone files give you which domains exist and where they delegate, but no ownership; WHOIS/RDAP gives ownership and dates; passive DNS gives historical resolution behavior. Freshness: real-time streams, daily snapshots, and quarterly bulk dumps sit at very different price points. Coverage vs. completeness: a provider can cover thousands of TLDs yet still be incomplete within each one, especially for ccTLDs. Compliance: post-GDPR WHOIS is heavily redacted, so judge a provider on the fields you can actually use, not its headline record count.
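One way to judge a provider on usable fields rather than headline counts is to measure fill rates on a sample. The sketch below is only an illustration: it assumes the sample arrives as JSONL, and the field names (`registrar`, `created`, `registrant_email`, `nameservers`) are hypothetical, since each provider names its fields differently.

```python
import json
from collections import Counter
from typing import Iterable

# Hypothetical field names; map these to the provider's actual schema.
FIELDS = ("registrar", "created", "registrant_email", "nameservers")

def fill_rates(lines: Iterable[str], fields=FIELDS) -> dict:
    """Return, per field, the share of records where it is present and non-empty."""
    filled = Counter()
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the sample
        record = json.loads(line)
        total += 1
        for field in fields:
            # Treat null, empty string, empty list, and empty object as "not usable".
            if record.get(field) not in (None, "", [], {}):
                filled[field] += 1
    return {f: (filled[f] / total if total else 0.0) for f in fields}

# Two made-up records standing in for a vendor sample file.
sample = [
    '{"domain": "example.com", "registrar": "Example Registrar", '
    '"created": "1995-08-14", "registrant_email": "", '
    '"nameservers": ["a.iana-servers.net"]}',
    '{"domain": "example.org", "registrar": "Example Registrar", '
    '"created": null, "nameservers": []}',
]
rates = fill_rates(sample)
```

On a real evaluation sample, a low fill rate for a contact field next to a high rate for `registrar` or `created` is the redaction pattern described above, and it tells you more than the vendor's record count does.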

1. ICANN CZDS — the free baseline for gTLD zone files

The Centralized Zone Data Service is the canonical, free source for gTLD zone files. Each zone file is DNS master-file text listing the delegated domains in one gTLD plus their NS, glue, and DNSSEC (DS) records — the largest, like .com, contain 150M+ delegations. There is no registrant or WHOIS data here, only the DNS layer.

Access is via a REST API (authentication plus per-zone download endpoints, with official Python and Java clients) or the web portal. Zones regenerate daily, and ICANN policy caps you at one download per zone per 24 hours. Pricing is free; the real cost is the per-registry approval workflow, which is granted (or denied) zone by zone and can take weeks. The classic use case is building your own newly-registered-domain feed by diffing consecutive daily snapshots. The gotchas: CZDS contains gTLDs only — ccTLDs such as .de, .uk, and .io require separate per-registry programs — and a zone file shows only domains that are actually delegated, so it is not a perfect registration roster.
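The snapshot-diffing approach can be sketched roughly as follows. This is a minimal illustration, not CZDS tooling: it assumes each zone line is whitespace-separated as owner, TTL, class, type, rdata with fully qualified owner names, and the function name `delegated_domains` is made up for this example. Real zone files can deviate from that layout, so check a few lines of the zone you actually receive.

```python
from typing import Iterable, Set

def delegated_domains(zone_lines: Iterable[str], tld: str) -> Set[str]:
    """Collect second-level names that own at least one NS record."""
    suffix = "." + tld.lower().strip(".")
    found = set()
    for line in zone_lines:
        if not line.strip() or line.lstrip().startswith(";"):
            continue  # skip blanks and comments
        parts = line.split()
        # Assumed layout: owner, TTL, class, type, rdata.
        if len(parts) < 5 or parts[3].lower() != "ns":
            continue
        owner = parts[0].lower().rstrip(".")
        # Keep names exactly one label below a single-label TLD;
        # this skips the TLD apex itself and deeper names.
        if owner.endswith(suffix) and owner.count(".") == 1:
            found.add(owner)
    return found

# Two tiny, made-up snapshots standing in for consecutive daily files.
yesterday = [
    "com.\t172800\tin\tns\ta.gtld-servers.net.",
    "example.com.\t172800\tin\tns\tns1.example.net.",
    "old-name.com.\t172800\tin\tns\tns1.example.net.",
]
today = [
    "com.\t172800\tin\tns\ta.gtld-servers.net.",
    "example.com.\t172800\tin\tns\tns1.example.net.",
    "example.com.\t172800\tin\tns\tns2.example.net.",
    "fresh-name.com.\t172800\tin\tns\tns1.example.org.",
    "fresh-name.com.\t86400\tin\tds\t12345 13 2 ABCD",
]

before = delegated_domains(yesterday, "com")
after = delegated_domains(today, "com")
added = sorted(after - before)    # newly delegated since yesterday
removed = sorted(before - after)  # dropped from the zone
```

Holding two in-memory sets works for small zones. For the largest zones, sorting both extracts on disk and merging them line by line is likely the safer route.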

2. WhoisXML API — the broadest commercial WHOIS, RDAP, and NRD coverage

WhoisXML API is the most complete single-vendor option for WHOIS and RDAP data, and it bundles passive DNS, IP netblocks, and newly-registered-domain (NRD) feeds alongside it. Vendor figures cite 374M+ active domains, 7,596+ TLDs, and 28.7B+ historical WHOIS records, with roughly 250,000 newly registered domains surfaced per day.

Delivery is flexible: a REST WHOIS API with an RDAP toggle and automatic fallback, a Bulk WHOIS API that accepts CSV batches up to 500,000 domains (one credit per domain), a real-time domain-registration streaming feed for NRDs, downloadable WHOIS/DNS database feeds, and quarterly gTLD+ccTLD snapshots distributed via Snowflake. The lookup APIs offer a free tier (500 requests); larger usage is credit-based, and bulk feeds are quote-only. It suits threat-intelligence and brand-protection teams that value deep historical WHOIS and breadth of data types over the lowest per-query price. Watch for opaque bulk pricing, a fragmented multi-product credit model, and NRD "Lite" tiers that ship domain names only, without WHOIS records.
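Because bulk lookups are billed per domain and capped per batch, it can pay to clean a list before submitting it. The helper names below (`normalize`, `batches`) are hypothetical and nothing here calls the vendor's API; the sketch only shows deduplication and splitting, using the 500,000-domain ceiling quoted above.

```python
from typing import Iterable, Iterator, List

MAX_BATCH = 500_000  # per-batch ceiling quoted in the text above

def normalize(domains: Iterable[str]) -> List[str]:
    """Lowercase, trim whitespace and trailing dots, drop blanks and duplicates."""
    cleaned = (d.strip().lower().rstrip(".") for d in domains)
    # dict.fromkeys keeps first-seen order while removing duplicates.
    return list(dict.fromkeys(d for d in cleaned if d))

def batches(domains: List[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` domains."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(domains), size):
        yield domains[start:start + size]

raw = ["Example.com", "example.com.", " example.org ", "", "example.net"]
unique = normalize(raw)
chunks = list(batches(unique, size=2))  # tiny size only to show the split
credits = len(unique)                   # one credit per domain, per the text
```

Whether duplicate entries in a batch are actually billed twice is an assumption worth confirming with the vendor; deduplicating first is harmless either way.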

3. SecurityTrails — passive DNS and historical pivoting

SecurityTrails (now a Recorded Future company) is built for investigation rather than bulk registry resale. Its strength is breadth of pivots: current and historical DNS, passive DNS, WHOIS and WHOIS-change history, SSL/Certificate Transparency data, subdomain enumeration, associated domains, and IP intelligence. DNS history reaches back to mid-2008 and WHOIS to roughly 2002, though headline domain counts vary across its own pages (203M vs. 630M), so treat them as marketing.

The product is a read-only REST API with a domain-specific query language (DSL) and a production SQL API, plus customizable bulk CSV feeds (including newly-registered and deleted domain lists back to May 2018). Notably, WHOIS is served through SecurityTrails' own endpoints, not a standards-based RDAP API. Pricing is a subscription plus credit/quota model, with enterprise effectively contact-for-pricing. It fits security and threat-intel engineers who want interactive infrastructure mapping and pre-GDPR registrant history; the main caveat is that heavy enumeration burns quota quickly.

4. DomainTools — enterprise passive DNS and predictive risk

DomainTools sits at the enterprise end of domain intelligence. It combines decades of WHOIS/RDAP history with the Farsight DNSDB passive DNS dataset (widely cited at 100B+ records since 2010, though current docs claim more), predictive risk scoring, and real-time threat feeds such as Newly Observed Domains and Domain Hotlist. The company markets visibility into "97% of the internet."

It is API-first: distinct REST products (Iris Investigate/Enrich/Detect, DNSDB with flexible regex search, and a real-time Threat Feeds API) plus deep SIEM/SOAR/TIP integrations (Splunk, Cortex XSOAR, Maltego, ServiceNow). It is not a bulk-download product: access is query- and feed-based, with enrichment batched at up to 100 domains per request. Pricing is enterprise-contract only, with no public figures or free tier; expect a five-figure entry point. Choose DomainTools when you need the largest passive DNS corpus, predictive scoring, and tight tooling integration; it is overkill for indie developers or small SEO shops that just want cheap bulk lists.

5. webatla.com — high-volume bulk domain dataset downloads

webatla.com is a newer, download-first domain data provider aimed at teams that want a large bulk domain dataset without enterprise contracts. As of 2026 its pages report roughly 404M domains across 1,434 TLDs (for example, ~188M .com), with an RDAP & WHOIS scan dataset of around 388M records, plus DNS records, detected web technologies, and ranking metadata.

The data refreshes daily ("daily fresh export"), and each record carries a last-checked date. Delivery is the differentiator: full datasets download as zstd-compressed JSONL (with S3 delivery), a rate-limited REST download API (~240 requests/minute) filters by TLD, country, or technology, and a free 10,000-row sample is available for evaluation. Pricing is unusually transparent and one-time rather than subscription: published tiers start at €29 for all active domains, rise to €299 for the RDAP & WHOIS database, and top out at €599 for the full bundle, with one month of access per purchase (check the vendor's pricing page for current figures). It suits cost-conscious bulk acquisition for enrichment, attack-surface mapping, SEO analysis, or ML training corpora. The trade-offs are real: it is a periodically refreshed snapshot, not a live per-domain lookup service; GDPR-redacted WHOIS fields are commonly empty; and JSONL is the default format, with CSV/SQL available on request.
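When pulling from a rate-limited download API, pacing requests on the client side avoids tripping the limit. The class below is a generic sketch, not part of any vendor SDK: `Pacer` is a hypothetical name, and the 240-per-minute figure is simply the approximate limit quoted above.

```python
import time

class Pacer:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock          # injectable so the pacing logic can be tested
        self._sleep = sleep
        self._next_allowed = None    # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay

# 240 requests per minute works out to one request every 0.25 seconds.
pacer = Pacer(per_minute=240)
```

In use, you would call `pacer.wait()` immediately before each request. Running slightly under the published limit, and backing off on HTTP 429 responses if the API returns them, leaves some margin.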

6. ViewDNS.info — affordable lookups and a low-cost NRD feed

ViewDNS.info is a long-running, low-cost toolset for OSINT-style pivoting: WHOIS and reverse WHOIS, reverse IP, reverse MX/NS, DNS records, IP history, and a daily newly-registered-domains feed, plus ccTLD zone and domain lists. Coverage is described in vendor terms as "hundreds of millions" of domains, with the NRD feed spanning 1,125 TLDs.

Everything is exposed through a simple REST API (JSON/XML), and the NRD feed is deliverable by direct download, API, or S3/GCS. Pricing combines prepaid query credits with a subscription, shared across all live-lookup endpoints, plus a free web UI and a free, week-delayed, domains-only NRD evaluation set; exact figures sit behind a Cloudflare-gated pricing page, so confirm them directly. ViewDNS is best as an affordable supplementary source — quick reverse-IP and co-hosted-domain pivots, plus a cheap NRD feed for blocklisting and typosquat detection. The live-lookup APIs are metered, not a bulk-export channel (reverse-IP results cap at 10,000 per page), and WHOIS is GDPR-redacted as everywhere.

7. DomainsDB.info — free domain-name search for quick recon

DomainsDB.info (note: there is no "DomainsDB.io") is a free front-end and API over the commercial bulk vendor domains-index.com. The free REST/JSON API lets you search registered domains by keyword/substring, TLD, or registration date and returns inline DNS metadata (A/NS/MX/TXT, created/updated dates, country) for roughly 255–260M domains across 1,000+ TLDs.

It is genuinely useful for lightweight reconnaissance and brand/keyword research, but it is not a bulk channel: the free API caps results at about 100 rows per query, and for real bulk work (full zone snapshots, daily NRD feeds, expired/deleted lists) the intended path is buying datasets one-time from domains-index.com. There is no RDAP or live WHOIS endpoint. Two cautions: the public API's GitHub repository was archived in 2026, so long-term freshness is uncertain, and rate limits are undocumented — plan for unpredictable throttling. Treat it as a free first-pass triage tool that graduates to a paid bulk download.

Freshness vs. cost, coverage vs. completeness: the trade-offs

Freshness drives price more than anything else. Real-time threat feeds (DomainTools) cost the most; daily snapshots (CZDS, webatla, WhoisXML feeds) are mid-tier; quarterly bulk dumps are cheapest per row but stale between releases. Match the cadence to the decision: blocklisting needs hours-fresh NRDs, while market research tolerates a monthly snapshot.

No single source is complete. A zone file lists every delegated domain in a gTLD but tells you nothing about ownership; a WHOIS/RDAP dataset has ownership and dates but is redacted and never covers 100% of ccTLDs. Serious pipelines combine sources — CZDS for the gTLD baseline, a WHOIS/RDAP provider for registration data, and passive DNS for resolution history.

Mind the edge cases. ccTLD zone files are largely unavailable through CZDS and are restricted by many country registries, so ccTLD coverage is where providers diverge most. GDPR redaction empties registrant name, email, and address for most post-2018 domains, shifting WHOIS value toward infrastructure signals (creation date, registrar, name servers) and pre-GDPR historical archives. As a rule of thumb, a downloadable bulk domain dataset (JSONL or Parquet) fits warehouse analytics, while a metered RDAP/WHOIS API fits live enrichment — many teams run both.

Frequently asked questions

What is the difference between WHOIS and RDAP?

WHOIS is the legacy protocol (port 43, plus web forms) that returns registration data as unstructured text whose layout varies by registry and registrar. RDAP (Registration Data Access Protocol) is its standardized successor: it returns structured JSON over HTTPS, supports differentiated access and internationalization, and is defined in RFC 9083 and related RFCs. ICANN has required gTLD registries and registrars to run RDAP since 2019, and legacy WHOIS is being phased out. For bulk pipelines, RDAP's predictable JSON is far easier to parse than free-text WHOIS, but many ccTLDs still expose only WHOIS or nothing at all.
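To show why structured RDAP is easier to handle in bulk, here is a small sketch that flattens one RDAP domain response into a few infrastructure fields. The member names (`ldhName`, `events`, `eventAction`, `nameservers`, `entities`, `roles`, `vcardArray`) follow RFC 9083; the sample document is invented, and `summarize_rdap_domain` is a hypothetical helper name.

```python
import json

def summarize_rdap_domain(doc: dict) -> dict:
    """Pull dates, status, name servers, and registrar name from an RDAP domain object."""
    events = {e.get("eventAction"): e.get("eventDate") for e in doc.get("events", [])}
    registrar = None
    for entity in doc.get("entities", []):
        if "registrar" in entity.get("roles", []):
            # jCard layout: ["vcard", [[name, params, type, value], ...]]
            props = (entity.get("vcardArray") or [None, []])[1]
            for prop in props:
                if len(prop) >= 4 and prop[0] == "fn":
                    registrar = prop[3]
            break
    return {
        "domain": (doc.get("ldhName") or "").lower(),
        "created": events.get("registration"),
        "expires": events.get("expiration"),
        "status": doc.get("status", []),
        "nameservers": sorted(
            ns["ldhName"].lower() for ns in doc.get("nameservers", []) if "ldhName" in ns
        ),
        "registrar": registrar,
    }

# Invented response, trimmed to the members used above.
raw = """
{
  "objectClassName": "domain",
  "ldhName": "EXAMPLE.COM",
  "status": ["client transfer prohibited"],
  "events": [
    {"eventAction": "registration", "eventDate": "1995-08-14T04:00:00Z"},
    {"eventAction": "expiration", "eventDate": "2026-08-13T04:00:00Z"}
  ],
  "nameservers": [
    {"objectClassName": "nameserver", "ldhName": "B.IANA-SERVERS.NET"},
    {"objectClassName": "nameserver", "ldhName": "A.IANA-SERVERS.NET"}
  ],
  "entities": [
    {"objectClassName": "entity", "roles": ["registrar"],
     "vcardArray": ["vcard", [["version", {}, "text", "4.0"],
                              ["fn", {}, "text", "Example Registrar"]]]}
  ]
}
"""
summary = summarize_rdap_domain(json.loads(raw))
```

The same extraction against free-text WHOIS would need a separate parser per registry layout, which is the practical difference for bulk pipelines.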

How often is bulk domain data updated?

It depends on the data type. gTLD zone files via ICANN CZDS regenerate roughly every 24 hours. Commercial newly-registered-domain feeds typically refresh daily, sometimes with intra-day deltas. WHOIS/RDAP snapshots are usually re-scanned on a rolling cycle of days to weeks for the full corpus, because re-querying hundreds of millions of domains is rate-limited at the source. Passive DNS updates continuously as new resolutions are observed. Always confirm whether a provider sells a point-in-time snapshot or a live, incrementally updated feed.
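A quick back-of-the-envelope calculation shows why full re-scans take days. The corpus size and query rates below are illustrative assumptions, not figures from any provider.

```python
SECONDS_PER_DAY = 86_400

def full_cycle_days(domains: int, queries_per_second: float) -> float:
    """Days needed to query every domain once at a sustained aggregate rate."""
    return domains / queries_per_second / SECONDS_PER_DAY

def required_qps(domains: int, target_days: float) -> float:
    """Sustained aggregate rate needed to finish one full pass in target_days."""
    return domains / (target_days * SECONDS_PER_DAY)

corpus = 400_000_000                           # assumed corpus size
days_at_500 = full_cycle_days(corpus, 500)     # assumed 500 queries/second overall
qps_for_weekly = required_qps(corpus, 7)       # rate needed for a 7-day cycle
```

At an assumed 500 queries per second, one pass over 400 million domains takes a little over nine days, and a weekly cycle needs roughly 660 queries per second sustained across all registries. That is why "daily refresh" for a WHOIS/RDAP corpus usually means a daily export of a rolling scan, which is what a per-record last-checked date lets you verify.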

Is zone file access free?

For most gTLDs, yes. ICANN's CZDS gives approved users free access to gTLD zone files: you register, accept each registry's terms, and wait for per-TLD approval. The catch is coverage. CZDS holds gTLDs only, and a zone file lists only delegated domains (those with name servers published in the zone), so registered names that have no name servers or are on hold do not appear. Parked domains are usually delegated, so they do show up. ccTLD zone files (.de, .uk, .cn, and most country codes) are generally not available through CZDS and are often restricted entirely, which is exactly the gap commercial bulk domain datasets exist to fill.

What format do bulk domain datasets come in?

The common formats are line-delimited JSONL (one record per line, ideal for streaming pipelines), CSV for spreadsheet and SQL-import workflows, Parquet for columnar analytics in Spark/DuckDB/BigQuery, and raw MySQL/PostgreSQL dumps for direct database loading. Zone files arrive as compressed DNS master-file text; WHOIS/RDAP snapshots are usually JSON or JSONL. For hundreds of millions of rows, prefer Parquet or compressed JSONL over CSV: Parquet lets query engines read only the columns they need, and JSONL streams record by record while keeping nested fields such as name-server lists intact, which CSV has to flatten.
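Streaming a compressed JSONL file record by record might look like the sketch below. It uses gzip from the standard library purely as a stand-in so the example runs without extra packages; a zstd-compressed file would need a zstd reader instead. The `domain` field name and the `.example` and `.test` names are illustrative.

```python
import gzip
import io
import json
from collections import Counter

def iter_jsonl(path_or_file):
    """Yield one parsed record per non-empty line of a gzip-compressed JSONL file."""
    with gzip.open(path_or_file, mode="rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)

# Build a tiny compressed file in memory to stand in for a real download.
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(b'{"domain": "a.example"}\n')
    gz.write(b'{"domain": "b.example"}\n')
    gz.write(b'{"domain": "c.test"}\n')
buf.seek(0)

# Count records per TLD without ever holding the whole file in memory.
tld_counts = Counter(rec["domain"].rsplit(".", 1)[-1] for rec in iter_jsonl(buf))
```

Because the generator yields one record at a time, memory use stays flat no matter how large the file is; only the aggregate (here, a counter keyed by TLD) grows.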

Why are so many WHOIS fields blank or redacted?

Since GDPR took effect in 2018, registrars and registries redact personal data (registrant name, email, address, phone) for most domains, replacing it with "REDACTED FOR PRIVACY" or a privacy-service proxy. Organization-registered and pre-GDPR domains may still expose contacts, and registrar, dates, name servers, and status codes remain visible. In practice WHOIS is now most reliable for infrastructure signals rather than identity. RDAP supports tiered access that can, in principle, return more fields to authorized requesters, but public RDAP is redacted the same way as public WHOIS.

The bottom line: start from the data type you actually need, then weigh freshness against cost. ICANN CZDS is the free zone-file baseline; WhoisXML API, SecurityTrails, and DomainTools lead on commercial WHOIS/RDAP, passive DNS, and threat feeds; and for downloadable bulk domain datasets at a lower price point, webatla.com, ViewDNS, and DomainsDB.info are worth evaluating. Most mature pipelines end up combining two or three of them.
