This repository offers two reliable solutions for extracting data from Yandex Search Engine Results Pages (SERPs):
- Free Yandex Scraper: A basic tool for scraping Yandex Search Results at small scale
- Enterprise-grade Yandex SERP API: A scalable, production-ready solution for high-volume, real-time data extraction (part of Bright Data's SERP Scraper API)
- Free Yandex SERP Scraper
- Yandex SERP Scraper API
- Implementation Methods
- Yandex Search Query Parameters
- Practical Example
- Support & Resources
The free scraper provides a straightforward way to collect Yandex SERP data at a small scale. It's perfect for developers needing limited data for personal projects, research, or testing purposes.
- Python 3.9+
- Required packages:
  - playwright for browser automation
  - BeautifulSoup for HTML parsing
pip install playwright beautifulsoup4
playwright install
New to web scraping? Explore our Beginner's Guide to Web Scraping with Python
- Open yandex-search-results-scraper.py
- Customize the search terms and page count variables:
PAGES_PER_TERM = {
    "ergonomic office chair": 2,
}
- Run the script
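How the script consumes PAGES_PER_TERM is not shown in this README. As a rough sketch, assuming each term expands to one search URL per page through the zero-based p parameter described later, the mapping could be unrolled like this (the helper name expand_search_urls is hypothetical and not part of the repository script):

```python
from urllib.parse import urlencode

# Same base URL as the examples elsewhere in this README.
BASE_URL = "https://www.yandex.com/search/"

PAGES_PER_TERM = {
    "ergonomic office chair": 2,
}


def expand_search_urls(pages_per_term):
    """Return one search URL per (term, page) pair, using the zero-based p parameter."""
    urls = []
    for term, page_count in pages_per_term.items():
        for page_index in range(page_count):
            query = urlencode({"text": term, "p": page_index})
            urls.append(f"{BASE_URL}?{query}")
    return urls


for url in expand_search_urls(PAGES_PER_TERM):
    print(url)
```

Each printed URL is a page the browser automation would then need to visit and parse.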
One of the biggest challenges when scraping Yandex is its aggressive CAPTCHA protection. Yandex uses a strict and constantly evolving anti-bot system to prevent automated data extraction. Frequent CAPTCHA triggers can quickly lead to IP blocks, which makes it hard to keep long-running scrapers stable.
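A scraper can at least notice when it has been challenged and slow down. The sketch below is a heuristic, not a bypass. It assumes that a challenged request ends on a URL containing "showcaptcha"; that marker is an assumption you should verify against what you actually observe. Both helper names are hypothetical.

```python
# Assumption: challenge pages are served from a URL containing this marker.
CAPTCHA_URL_MARKERS = ("showcaptcha",)


def is_captcha_url(final_url):
    """Return True if the URL a request ended on looks like a CAPTCHA challenge."""
    lowered = final_url.lower()
    return any(marker in lowered for marker in CAPTCHA_URL_MARKERS)


def backoff_delays(retries, base_seconds=2.0):
    """Return exponentially growing wait times: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** attempt) for attempt in range(retries)]
```

With Playwright, the final URL is available as page.url after navigation. Waiting for the delays from backoff_delays between retries reduces request pressure but does not guarantee that the challenge goes away.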
While the free scraper handles basic tasks, it has several important limitations:
- High risk of IP blocking
- Limited request volume
- Constant CAPTCHA interruptions
- Not suitable for production environments
For a scalable and stable solution, consider Bright Data's dedicated API detailed below. 👇
The Yandex Search API is part of Bright Data’s SERP Scraping API suite. It leverages our industry-leading proxy infrastructure to deliver real-time Yandex search results with a single API call.
- Global Accuracy: Get tailored results for specific locations worldwide
- Pay-Per-Success: Only pay for successful requests
- Real-Time Data: Access up-to-date search results in seconds
- Unlimited Scalability: Handle high-volume scraping effortlessly
- Cost-Efficient: Eliminates the need for costly infrastructure
- Reliable Performance: Built-in anti-blocking technology
- 24/7 Expert Support: Access to technical assistance whenever needed
📌 Try Before You Buy: Test it for free in our SERP API Live Demo
- Create a Bright Data account (new users receive a $5 credit)
- Generate your API key
- Follow our step-by-step guide to configure the SERP API
The simplest way to use the API is by making a direct request to Bright Data's API endpoint.
cURL Example:
curl https://api.brightdata.com/request \
-H "Content-Type: application/json" \
-H "Authorization: Bearer API_TOKEN" \
-d '{
"zone": "ZONE_NAME",
"url": "https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en",
"format": "raw"
}'
Python Example:
import requests

url = "https://api.brightdata.com/request"
headers = {"Content-Type": "application/json", "Authorization": "Bearer API_TOKEN"}
payload = {
    "zone": "ZONE_NAME",
    "url": "https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en",
    "format": "raw",
}

# Send the target Yandex URL to the API endpoint
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()  # fail on an HTTP error instead of saving an error page

# Save the raw HTML returned by the API
with open("yandex-scraper-api-result.html", "w", encoding="utf-8") as file:
    file.write(response.text)

print("Response saved!")
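To reuse the same request for other searches, the payload can be generated from a query string rather than hard-coded. This is a minimal sketch: build_request_payload is a hypothetical helper, and it only assembles the same three fields (zone, url, format) used in the example above.

```python
from urllib.parse import urlencode


def build_request_payload(zone, query, lr=None, lang=None):
    """Build the JSON payload for the API request from a plain-text query."""
    params = {"text": query}
    if lr is not None:
        params["lr"] = lr
    if lang is not None:
        params["lang"] = lang
    return {
        "zone": zone,
        "url": "https://www.yandex.com/search/?" + urlencode(params),
        "format": "raw",
    }


payload = build_request_payload("ZONE_NAME", "apple watch series 10 review", lr=95, lang="en")
print(payload["url"])
```

The returned dictionary can be passed as the json argument of requests.post, exactly like the hard-coded payload.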
This alternative method uses proxy routing for direct access to search results.
cURL Example:
curl -i \
--proxy brd.superproxy.io:33335 \
--proxy-user brd-customer-<CUSTOMER_ID>-zone-<ZONE_NAME>:<ZONE_PASSWORD> \
-k \
"https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en"
Python Example:
import requests
import urllib3

# verify=False below triggers InsecureRequestWarning; silence it for this demo
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

host = "brd.superproxy.io"
port = 33335
username = "brd-customer-<customer_id>-zone-<zone_name>"
password = "<zone_password>"

# Route both HTTP and HTTPS traffic through the same proxy
proxy_url = f"http://{username}:{password}@{host}:{port}"
proxies = {"http": proxy_url, "https": proxy_url}

url = "https://www.yandex.com/search/?text=apple+watch+series+10+review&lr=95&lang=en"
response = requests.get(url, proxies=proxies, verify=False)
response.raise_for_status()  # fail on an HTTP error instead of saving an error page

with open("yandex-scraper-api-result.html", "w", encoding="utf-8") as file:
    file.write(response.text)

print("Response saved!")
Note: When using the native proxy approach, it's recommended to install Bright Data's SSL certificate for production use. Learn more in the SSL Certificate Guide.
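One way to follow that recommendation with requests is to pass the certificate path through the verify argument instead of disabling verification. The sketch below only assembles the keyword arguments. The helper name and the certificate path are hypothetical, and the host and port are the ones used in the examples above.

```python
from urllib.parse import quote


def build_proxy_settings(customer_id, zone, password, ca_cert_path=None):
    """Return keyword arguments for requests.get: proxies plus a verify setting."""
    username = f"brd-customer-{customer_id}-zone-{zone}"
    # Percent-encode the password so special characters do not break the proxy URL.
    proxy_url = f"http://{username}:{quote(password, safe='')}@brd.superproxy.io:33335"
    return {
        "proxies": {"http": proxy_url, "https": proxy_url},
        # With a certificate path, TLS verification stays on; without one it is
        # disabled, which matches the demo above and is not meant for production.
        "verify": ca_cert_path if ca_cert_path else False,
    }


settings = build_proxy_settings("<customer_id>", "<zone_name>", "<zone_password>",
                                ca_cert_path="/path/to/brightdata-ca.crt")  # hypothetical path
```

The result can be unpacked into the call, for example requests.get(url, **settings).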
👉 See the full HTML output
Query parameters such as lr and lang are explained in the next section.
The lr parameter defines which geographic region or country to target for search results.
Region | Code |
---|---|
Moscow | 1 |
Saint-Petersburg | 2 |
USA | 84 |
Canada | 95 |
China | 134 |
Example - Check how "best wireless earbuds" ranks in the USA:
curl --proxy brd.superproxy.io:33335 \
--proxy-user brd-customer-<id>-zone-<zone>:<password> \
"https://www.yandex.com/search/?text=best+wireless+earbuds&lr=84"
The lang parameter sets the language preference using two-letter language codes:
- lang=en - English
- lang=es - Spanish
- lang=fr - French
Example - Get sports news in Spanish:
https://www.yandex.com/search/?text=local+sports+news&lang=es
The p parameter controls which page of results to display. It is zero-based:
- p=0 - First page (default)
- p=1 - Second page
- p=4 - Fifth page
Each Yandex SERP page typically returns 10 results.
Example - Scrape page 3 (results 21-30) for "nike running shoes":
https://www.yandex.com/search/?text=nike+running+shoes&p=2
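Because p is zero-based, off-by-one mistakes are easy to make. The following sketch converts a human page number to p and computes the result positions a page covers. It assumes exactly 10 results per page, which the text above describes only as typical, and both function names are hypothetical.

```python
RESULTS_PER_PAGE = 10  # assumption: "typically" 10, per the text above


def page_to_p(page_number):
    """Convert a 1-based page number to the zero-based p value."""
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    return page_number - 1


def result_range(p):
    """Return the (first, last) result positions shown for a given p value."""
    first = p * RESULTS_PER_PAGE + 1
    return first, first + RESULTS_PER_PAGE - 1


print(page_to_p(3))     # 2
print(result_range(2))  # (21, 30)
```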
The within parameter limits results to a specific time period:
- within=77 - Results from the past 24 hours
- within=1 - Results from the past 2 weeks
- within=[%pm] - Results from the past month
Example - Get "iPhone 15 review" results from the past 24 hours:
https://www.yandex.com/search/?text=iphone+15+review&within=77
The brd_mobile parameter specifies which device type to simulate:
- brd_mobile=0 or omitted - Random desktop user-agent
- brd_mobile=1 - Random mobile user-agent
- brd_mobile=ios or brd_mobile=iphone - iPhone user-agent
- brd_mobile=ipad or brd_mobile=ios_tablet - iPad user-agent
- brd_mobile=android - Android phone user-agent
- brd_mobile=android_tablet - Android tablet user-agent
Example - Simulate an iPhone searching for responsive website testing:
https://www.yandex.com/search/?text=responsive+website+testing&brd_mobile=ios
The brd_browser parameter defines which browser to simulate:
- Default (omitted) - Random browser
- brd_browser=chrome - Google Chrome
- brd_browser=safari - Safari
- brd_browser=firefox - Mozilla Firefox
Example - Simulate Safari browser searching for Python tutorials:
https://www.yandex.com/search/?text=how+to+learn+python&brd_browser=safari
Note: Don't combine brd_browser=firefox with brd_mobile=1, as they're incompatible.
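A small client-side check can catch that combination before a request is sent. This sketch validates against the values listed in this README only; the allowed sets and the helper name device_params are assumptions for illustration, not an official list from the API.

```python
# Values taken from the lists in this README.
MOBILE_VALUES = {"0", "1", "ios", "iphone", "ipad", "ios_tablet", "android", "android_tablet"}
BROWSER_VALUES = {"chrome", "safari", "firefox"}


def device_params(brd_mobile=None, brd_browser=None):
    """Return validated brd_mobile / brd_browser query parameters as a dict."""
    params = {}
    if brd_mobile is not None:
        mobile = str(brd_mobile)
        if mobile not in MOBILE_VALUES:
            raise ValueError(f"unsupported brd_mobile value: {mobile}")
        params["brd_mobile"] = mobile
    if brd_browser is not None:
        if brd_browser not in BROWSER_VALUES:
            raise ValueError(f"unsupported brd_browser value: {brd_browser}")
        params["brd_browser"] = brd_browser
    # The one combination this README calls out as incompatible.
    if params.get("brd_mobile") == "1" and params.get("brd_browser") == "firefox":
        raise ValueError("brd_browser=firefox cannot be combined with brd_mobile=1")
    return params
```

For example, device_params("ios", "safari") returns both parameters, while device_params(1, "firefox") raises ValueError.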
For comprehensive targeting, you can combine multiple parameters:
https://www.yandex.com/search/?text=organic+skincare+products
&lr=95
&lang=en
&p=2
&within=1
&brd_mobile=ios
&brd_browser=safari
This search:
- Targets Canadian users (lr=95)
- Shows English results (lang=en)
- Displays the third page (p=2, since p is zero-based)
- Limits to the past 2 weeks (within=1)
- Simulates an iPhone user (brd_mobile=ios)
- Uses Safari browser (brd_browser=safari)
Perfect for a skincare company researching recent organic product trends in the Canadian market as viewed by iOS mobile users.
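The combined URL above is split across lines for readability, so it has to be joined before use. A minimal sketch of joining the fragments and decoding the parameters with the standard library, which makes it easy to double-check what a URL actually requests (parse_search_url is a hypothetical helper name):

```python
from urllib.parse import parse_qsl, urlsplit

url_lines = [
    "https://www.yandex.com/search/?text=organic+skincare+products",
    "&lr=95",
    "&lang=en",
    "&p=2",
    "&within=1",
    "&brd_mobile=ios",
    "&brd_browser=safari",
]


def parse_search_url(lines):
    """Join URL fragments and return (full_url, decoded query parameters)."""
    url = "".join(line.strip() for line in lines)
    return url, dict(parse_qsl(urlsplit(url).query))


url, params = parse_search_url(url_lines)
for name, value in params.items():
    print(f"{name} = {value}")
```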
- Documentation: SERP API Documentation
- Additional Reading: Best SERP APIs
- Contact Support: [email protected]