DuckDuckGo rarely gets the first slot when teams decide where to track rankings or pull SERP data, since Google and Bing usually take the budget. That’s an oversight worth fixing. A privacy-minded audience, a results page assembled differently from Google’s, and a query stream that doubles as a window into Bing’s index all make DuckDuckGo a useful data source for SEO and market research. The practical obstacle is that collecting that data at any real volume means putting a proxy for DuckDuckGo in front of your requests. This guide explains why proxies are needed, which type fits the job, how to localize results by market, and how to build a collection workflow that holds up past the first few hundred queries.
Why DuckDuckGo data is worth collecting
The scale is smaller than Google’s, but it isn’t a rounding error. Independent trackers and analytics roundups put DuckDuckGo at roughly 100 million searches a day, about 3 billion a month, with a global share in the 0.6 to 0.9 percent range and a stronger 1.8 to 2.1 percent in the United States, where it ranks as the second most-used mobile search engine. DuckDuckGo stopped publishing exact traffic figures in 2022, so treat any precise number as an estimate rather than an audited figure.
Two things make that traffic interesting for data teams. First, the audience skews privacy-conscious and US-heavy: roughly half of all queries originate in the United States, followed by Germany, the UK, and India, and the 25 to 34 age group is the largest segment. If your product or content targets these users, DuckDuckGo visibility matters more than the global share suggests. Second, because DuckDuckGo’s traditional web results come largely from Bing, tracking the DuckDuckGo SERP gives you a cheap, secondary read on how Bing’s index ranks your pages, which is useful when you can’t or don’t want to scrape Bing directly.
Typical jobs that justify a DuckDuckGo data pipeline:
- Rank tracking for a privacy-first or US-heavy audience.
- SERP feature monitoring, including instant answers, the AI summary, and which sources get cited.
- Ad verification, confirming that sponsored placements show up correctly per market. DuckDuckGo’s average cost-per-click is reported near $0.41, far below Google’s, so it is a real channel for some advertisers.
- Market and competitor research, where a non-personalized results set is an advantage: DuckDuckGo doesn’t tailor results to a profile, so two clean requests for the same query and region should return the same ranking.
How DuckDuckGo serves results
You can’t build a stable scraper without understanding what you are scraping. DuckDuckGo is not a single index. It blends results from “over 400 sources,” with the bulk of the standard ten blue links coming from Bing, supplemented by its own crawler (DuckDuckBot), instant-answer providers such as Wikipedia, Wolfram Alpha and Apple Maps, and, since 2023, an AI answer layer that draws on models from OpenAI and Anthropic. The Bing dependency is not theoretical: during a Bing API outage in 2024, DuckDuckGo briefly stopped returning results altogether.
For collection purposes, the most important detail is that DuckDuckGo exposes more than one front door. The main duckduckgo.com interface is JavaScript-heavy and gates pagination behind a dynamic vqd token, which makes it the hardest to script. DuckDuckGo also maintains stripped-down HTML and “lite” versions, originally built for browsers without JavaScript, and those are far friendlier to a server-side parser.
| Endpoint | What it returns | JavaScript required | Best for |
| duckduckgo.com/ | Full SERP: organic results, instant answers, ads, AI summary | Yes (vqd token + JS) | Capturing rich features and ad units |
| html.duckduckgo.com/html/ | Static HTML organic results | No | Standard server-side parsing |
| lite.duckduckgo.com/lite/ | Minimal, fastest-loading results | No | High-volume rank checks |
DuckDuckBot is not the same as scraping DuckDuckGo
This trips people up constantly, so it’s worth stating plainly. DuckDuckBot is DuckDuckGo’s own crawler visiting your site; you control it through robots.txt the same way you would Googlebot, and it’s widely reported as one of the more compliant, low-aggression crawlers on the web. Scraping DuckDuckGo is the opposite: you send automated requests to DuckDuckGo’s SERP to collect data. The two have nothing to do with each other operationally, and only the second needs a proxy. If your goal is simply to make sure DuckDuckGo indexes your pages, the lever that matters most is being indexed in Bing, not anything proxy-related.
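The crawler side of that distinction can be checked offline. The sketch below uses Python's standard `urllib.robotparser` to test whether a robots.txt file lets DuckDuckBot fetch a given URL. The robots.txt content, the `example.com` URLs, and the `duckduckbot_allowed` helper are hypothetical, made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: DuckDuckBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def duckduckbot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the robots.txt text lets DuckDuckBot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("DuckDuckBot", url)


if __name__ == "__main__":
    for url in ("https://example.com/blog/post", "https://example.com/private/report"):
        print(url, duckduckbot_allowed(ROBOTS_TXT, url))
```

This only tells you what your own site permits the crawler to do. It says nothing about how DuckDuckGo treats requests you send to its results pages.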
Why you need proxies for DuckDuckGo
DuckDuckGo’s results are anonymous by design, with no account, no profile, and no personalized ranking. That removes one problem, since you never have to manage logged-in sessions, but it leaves the one that actually stops automated collection: rate limiting tied to your IP address.
DuckDuckGo actively pushes back on automated traffic to its HTML endpoints. In practice you will see 202, 403, and similar responses once a single address sends too many requests too quickly. There is no officially published threshold, and the community consensus, reflected in the popular duckduckgo-search (now ddgs) Python library, is to rotate proxies, stay well under about 30 requests per minute per IP, and space requests with randomized delays. Independent scraping write-ups rate DuckDuckGo as only moderately difficult, with rate limiting as the primary defense and light TLS-fingerprint checks layered on top.
The fix is simple in principle. You spread the workload across many IP addresses so that no single one trips the per-IP limit, and you rotate addresses as you go. This is not about disguising who you are. It’s about staying inside the request budget DuckDuckGo tolerates and not degrading the service for anyone else. A handful of queries from one machine is fine without a proxy; a rank-tracking job covering thousands of keyword-region pairs is not.
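One way to make the per-IP budget concrete is a small sliding-window tracker that hands out the least-used proxy and refuses any that have hit their limit. This is a minimal sketch, not a tuned implementation: the `PerIpBudget` class is hypothetical, and the default of 20 requests per minute is an assumed safety margin under the informal guideline of about 30, not a published DuckDuckGo figure.

```python
import time
from collections import defaultdict, deque


class PerIpBudget:
    """Track recent request times per proxy and skip proxies that are over budget."""

    def __init__(self, max_per_minute=20, window=60.0, clock=time.monotonic):
        self.max_per_minute = max_per_minute  # assumed margin, not an official limit
        self.window = window                  # sliding window length in seconds
        self.clock = clock                    # injectable so the logic can be tested
        self._sent = defaultdict(deque)       # proxy -> timestamps of recent requests

    def _recent(self, proxy, now):
        """Drop timestamps that fell out of the window and return the rest."""
        sent = self._sent[proxy]
        while sent and now - sent[0] >= self.window:
            sent.popleft()
        return sent

    def acquire(self, proxies):
        """Return the least-used proxy with budget left, or None if all are exhausted."""
        now = self.clock()
        counts = [(len(self._recent(p, now)), p) for p in proxies]
        usable = [(n, p) for n, p in counts if n < self.max_per_minute]
        if not usable:
            return None  # caller should sleep instead of pushing harder
        _, proxy = min(usable, key=lambda item: item[0])
        self._sent[proxy].append(now)
        return proxy
```

When `acquire` returns `None`, every address in the pool has used its allowance for the current window, which is the signal to wait or to add more addresses.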
Choosing the right proxy type
Not every proxy behaves the same against DuckDuckGo’s rate limiting. The decision usually comes down to detection risk versus cost and speed, so understanding how to use a proxy with DuckDuckGo can help you choose the right option for your workload.
| Proxy type | Detection risk on DuckDuckGo | Relative cost | Speed | Best fit |
| Datacenter | Higher; flagged faster under sustained load | Lowest | Fastest | Low-volume, non-time-sensitive monitoring |
| Residential (rotating) | Low | Medium | Medium | Sustained, large-scale SERP collection |
| Mobile | Lowest | Highest | Medium | Hardest jobs, or when residential IPs get flagged |
A few protocol notes affect implementation more than people expect. IPv4 is the safe default. IPv6 is cheaper but more likely to be treated as suspect by some endpoints, so test it on a small batch before committing a large job to it. On the transport layer, HTTP(S) proxies cover most scraping stacks, but some toolchains route cleanly only over SOCKS5, so it pays to confirm both are available before you wire everything together. Providers that expose datacenter, residential and mobile IPv4 from a single account, proxys.io among them, make this easier in practice, because you can start a job on inexpensive datacenter addresses for light monitoring and move the same job to residential or mobile IPs the moment DuckDuckGo starts returning 202s under load.
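For a stack built on the Python `requests` library, switching between HTTP and SOCKS5 transport is mostly a matter of the URL scheme in the `proxies` mapping. The sketch below assumes `requests` is installed with its optional SOCKS support (`pip install "requests[socks]"`). The `build_proxies` helper, host name, and credentials are hypothetical placeholders.

```python
from urllib.parse import quote

SUPPORTED_SCHEMES = {"http", "socks5", "socks5h"}


def build_proxies(host, port, user=None, password=None, scheme="http"):
    """Build a requests-style proxies mapping for an HTTP or SOCKS5 proxy.

    "socks5h" asks the proxy to resolve DNS; "socks5" resolves it locally.
    """
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {scheme}")
    auth = ""
    if user and password:
        # Percent-encode credentials so characters like "@" do not break the URL.
        auth = f"{quote(user, safe='')}:{quote(password, safe='')}@"
    url = f"{scheme}://{auth}{host}:{port}"
    return {"http": url, "https": url}


if __name__ == "__main__":
    # Hypothetical endpoint; pass the result as requests.get(..., proxies=...).
    print(build_proxies("proxy.example.com", 8080, "user", "p@ss", scheme="socks5h"))
```

Keeping the scheme as a parameter means the same job definition can move from an HTTP datacenter proxy to a SOCKS5 residential one without touching the request code.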
Localizing results by market
DuckDuckGo lets you pin the SERP to a market with the kl region parameter: us-en for the US, uk-en for the UK, de-de for Germany, fr-fr for France, wt-wt for worldwide, and so on. For rank tracking and ad verification, this is the parameter that lets you see the page roughly as a searcher in that market would.
| Region code (kl) | Market |
| us-en | United States (English) |
| uk-en | United Kingdom |
| de-de | Germany |
| fr-fr | France |
| ru-ru | Russia |
| wt-wt | No region / worldwide |
The subtle part is consistency. The kl value sets the locale of the results, but the IP address you connect from still carries a geographic signal, and a mismatch between the two can introduce noise into your sample, which is exactly what you don’t want when the goal is an accurate, repeatable read on a specific market. The clean approach is to align your exit IP’s country with the region code you send, so the request is internally coherent. Coverage breadth is the practical limit here: a pool that lists exit locations across the US, UK, Germany, France, Spain, Poland, the Netherlands, Canada, Brazil and India (the country list proxys.io publishes is a representative example) is usually enough to pair an IP with each region code an SEO team tracks. Where a country is not offered, the more honest option is to drop that locale from the run rather than send a mismatched IP that quietly skews the data.
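The alignment rule can be enforced before a run starts. The following sketch pairs each kl code with proxies from the matching country and drops locales the pool cannot cover. The `KL_TO_COUNTRY` mapping, the `plan_run` helper, and the proxy URLs are illustrative assumptions; `wt-wt` is left out on purpose because it has no country to match.

```python
# Assumed mapping from kl region codes to ISO country codes of the exit IP.
KL_TO_COUNTRY = {
    "us-en": "US",
    "uk-en": "GB",
    "de-de": "DE",
    "fr-fr": "FR",
    "ru-ru": "RU",
}


def plan_run(kl_codes, pool_by_country):
    """Split kl codes into those with a matching-country proxy and those to drop.

    pool_by_country maps an ISO country code to a list of proxy URLs.
    Returns (planned, dropped): planned maps kl -> proxies, dropped lists kl codes.
    """
    planned, dropped = {}, []
    for kl in kl_codes:
        country = KL_TO_COUNTRY.get(kl)
        proxies = pool_by_country.get(country, []) if country else []
        if proxies:
            planned[kl] = proxies
        else:
            dropped.append(kl)  # no coverage: skip rather than send a mismatched IP
    return planned, dropped


if __name__ == "__main__":
    pool = {"US": ["http://us1.example:8080"], "DE": ["http://de1.example:8080"]}
    print(plan_run(["us-en", "de-de", "fr-fr"], pool))
```

Logging the dropped list alongside the results keeps the gap visible, so a missing market is not mistaken for a ranking change.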
A practical collection workflow
Here is a workflow that holds up well for medium-scale DuckDuckGo collection. It assumes you are using one of the no-JavaScript endpoints, which removes most of the fragility.
1. Pick the endpoint to match the data you need. Use lite.duckduckgo.com for raw rankings at volume, html.duckduckgo.com when you want a bit more structure, and the main site only if you specifically need ads, instant answers, or the AI summary.
2. Set the region with kl, and align the proxy’s exit country to it.
3. Rotate the proxy on every request, or at least on every new job, and cap concurrency so that you stay under the roughly 30-requests-per-minute-per-IP guideline.
4. Add randomized delays between requests rather than a fixed interval, since a constant cadence is itself a detectable pattern.
5. Handle pagination carefully. An initial request returns up to about 35 organic results; using an offset can return up to about 50, but higher offsets also raise the odds of duplicated or variable result counts, so de-duplicate on the way in. On the main site, pagination depends on the vqd token; if it expires, pagination simply fails, so re-fetch the first page to pull a fresh token and retry.
6. Back off on errors. Treat 202 and 403 as a signal to pause, rotate to a new IP, and slow down, not to retry harder.
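Steps 4 to 6 can be sketched with two small helpers: a jittered backoff delay that grows with each consecutive failure, and a merge step that de-duplicates results across pages. The function names, the base and cap values, and the URL normalization rule (ignore scheme, host case, and a trailing slash) are assumptions chosen for illustration, not values taken from DuckDuckGo.

```python
import random
from urllib.parse import urlsplit


def backoff_delay(attempt, base=5.0, cap=120.0, rng=random):
    """Return a randomized delay in seconds that grows with consecutive failures.

    attempt is 0 for the first failure; the delay is drawn from the upper half
    of an exponentially growing ceiling, capped at `cap`.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(ceiling / 2, ceiling)


def normalize_url(url):
    """Reduce a URL to a comparison key: lowercase host, no scheme, no trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    key = f"{parts.netloc.lower()}{path}"
    return f"{key}?{parts.query}" if parts.query else key


def merge_pages(pages):
    """Flatten pages of result dicts, keeping the first occurrence of each URL."""
    seen, merged = set(), []
    for page in pages:
        for result in page:
            key = normalize_url(result["url"])
            if key not in seen:
                seen.add(key)
                merged.append(result)
    return merged
```

Keeping the first occurrence preserves the rank at which a URL first appeared, which is usually the position a rank tracker wants to record.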
A minimal illustration of the request layer, with a rotating proxy and a region code:

```python
import random
import time

import requests

# Placeholder: fill in your rotating pool, e.g. http://user:pass@host:port
PROXIES = [...]


def fetch(query, kl="us-en"):
    proxy = random.choice(PROXIES)  # a different exit IP may be used on each call
    r = requests.get(
        "https://html.duckduckgo.com/html/",
        params={"q": query, "kl": kl},
        proxies={"http": proxy, "https": proxy},
        # Placeholder User-Agent: replace the "..." with a full browser string.
        headers={"User-Agent": "Mozilla/5.0 ... Chrome/120 Safari/537.36"},
        timeout=30,
    )
    if r.status_code in (202, 403):
        time.sleep(random.uniform(5, 12))  # back off, then rotate on the next call
        return None
    return r.text  # hand off to BeautifulSoup for parsing
```
Parsing itself is undramatic. DuckDuckGo’s lite and HTML pages are static, so a standard request plus a BeautifulSoup pass extracts titles, links, and snippets without browser automation. Reserve headless browsers for the cases where you genuinely need JavaScript-rendered features.
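To show what that pass looks like, the sketch below extracts titles, links, and snippets from static HTML. It uses the standard-library `html.parser` instead of BeautifulSoup so that it runs without extra dependencies. The class names `result__a` and `result__snippet` are an assumption about the HTML endpoint's current markup and may change, and the sample HTML is invented.

```python
from html.parser import HTMLParser


class ResultParser(HTMLParser):
    """Collect title, URL, and snippet from anchors with the assumed class names."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._field = None  # "title" or "snippet" while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "result__a" in classes:  # assumed class of the title link
            self.results.append({"title": "", "url": attrs.get("href", ""), "snippet": ""})
            self._field = "title"
        elif "result__snippet" in classes and self.results:  # assumed snippet class
            self._field = "snippet"

    def handle_endtag(self, tag):
        if tag == "a":
            self._field = None

    def handle_data(self, data):
        if self._field and self.results:
            self.results[-1][self._field] += data


def parse_results(html):
    """Return a list of {"title", "url", "snippet"} dicts in page order."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    return [{k: v.strip() for k, v in r.items()} for r in parser.results]


# Invented sample markup, for illustration only.
SAMPLE = (
    '<div class="result">'
    '<a class="result__a" href="https://example.com/">Example Domain</a>'
    '<a class="result__snippet" href="https://example.com/">An <b>example</b> snippet.</a>'
    "</div>"
)
```

Because the selectors are the part most likely to break, it helps to keep them in one place and to alert when a page that returned HTTP 200 yields zero parsed results.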
Limitations, trade-offs, and responsible use
A proxy solves the IP rate-limit problem. It does not make scraping consequence-free, and an honest guide has to say so.
It sits in a legal and policy gray area. Collecting publicly accessible data is lawful in most jurisdictions, but DuckDuckGo’s terms prohibit automated, non-personal use of the site, and the company states plainly that it works to block scrapers. Respect robots.txt, keep request rates modest, collect only public data, and get legal advice if your use case is commercial or large-scale. Restraint here is not just compliance theater. Overloading the endpoints degrades the service for real users, which is the behavior the rate limiting exists to prevent.
Custom scrapers are brittle. When DuckDuckGo changes its HTML structure, a hand-rolled parser breaks, and the vqd token mechanics on the main site change periodically. The upside of owning the scraper is that you can patch it on your own schedule. The upside of a library like ddgs is that someone else maintains it, but then you wait for their fix when it breaks. There is no free lunch, only a choice about who carries the maintenance burden.
There is no official SERP API. DuckDuckGo offers an Instant Answer API, but it returns instant answers, not the full web results most teams want, and commercial use requires identifying your app, attributing results, and getting email approval first. (DuckDuckGo is, to its credit, known to actually answer those emails.) For full SERP data you are either scraping the HTML endpoints yourself or paying a managed SERP-API vendor that does the scraping and proxy rotation for you. That build-versus-buy call hinges on volume, in-house engineering capacity, and how much you value control over reliability.
The Bing dependency is a standing risk. Because much of DuckDuckGo’s result set comes from Bing, upstream changes ripple downward. Microsoft’s 2025 moves around its search APIs are a reminder that the data source you are tracking is not fully in DuckDuckGo’s hands, and therefore not fully in yours.
A few lessons that tend to repeat across projects: start slower than you think you need to and ramp up; rotate IPs and vary timing, because rate limiting keys on patterns and not just volume; align proxy geography with the kl region or accept noisier data; and de-duplicate aggressively once you page past the first set of results.
FAQ
Do I need a proxy to scrape DuckDuckGo?
For a few one-off queries, no. For anything consistent or at scale, yes: rate limiting is tied to your IP, and rotating proxies keep you inside the request budget DuckDuckGo tolerates.
How many requests per minute per IP is safe?
DuckDuckGo publishes no official limit. A common rule of thumb is under roughly 30 per minute per IP, with randomized delays, backing off immediately on 202 or 403 responses.
Which proxy type works best for DuckDuckGo?
Rotating residential proxies are the reliable default for sustained collection. Mobile IPs are the most resilient but the priciest. Datacenter addresses suit low-volume monitoring.
Does DuckDuckGo have an official API for search results?
Not for full web results. The Instant Answer API covers instant answers only. Full SERP data comes from scraping the HTML or lite endpoints, or from a third-party SERP API.
Can I get country-specific DuckDuckGo results?
Yes. The kl parameter sets the market (for example us-en, uk-en, de-de). For accurate results, align the exit country of your proxy with the region code you send.
