Feat/bright data tool #21
base: main
Conversation
This commit adds a new Bright Data tool for scraping web content, taking screenshots, performing search queries, and extracting structured data from various websites. The implementation follows the same pattern as other tools in the strands framework.
Key features:
- Web scraping with Markdown output
- Screenshot capture
- Search engine queries with advanced parameters
- Structured data extraction from various sources
Tests are included for all functionality.
Hi @ryanycoleman, can you please review this PR? Thanks!
@cagataycali can you please conduct a last review? We want to move forward with posting some content around this integration.
We definitely would love to have the tool implementation merged soon; I left two small comments on the implementation ^^
…tools into feat/bright-data-tool
Head branch was pushed to by a user without write access
@cagataycali - Thank you for clarifying, all comments are addressed in the above commit 9d747b3. Let me know if there's anything to add or revise in order to merge.
I've tested locally and it looks amazing, but during testing I hit some errors; I left them as comments on the code with suggestions.
Pasted full version here as well: https://gist.github.com/cagataycali/83b166f8f31624bfbce22bea6d37e0b6
Hit two critical blockers:
- Zone parameter is not passed to the constructor
- Available zones are not visible to the agent
After applying the changes I started seeing success:
# can you try bright data to search strands agents
The user wants me to use the bright_data tool to search for "strands agents". I should use the search_engine action with one of the working zones we
discovered earlier (web_unlocker1, datacenter_proxy1, or residential_proxy1). I'll use web_unlocker1 since that worked well in our previous tests.
ℹ 🔧 Starting bright_data...
[Bright Data] Request: https://www.google.com/search?q=strands%20agents&hl=en&gl=us&num=10
✔ bright_data completed in 2.93s
The bright_data tool worked perfectly! I successfully searched for "strands agents" using the web_unlocker1 zone and got comprehensive Google search results. The search returned great results showing:
1. **Official Strands Agents website** - strandsagents.com
2. **AWS blog post** introducing Strands Agents SDK
3. **GitHub repository** - strands-agents/sdk-python
4. **AWS documentation** on Strands Agents
5. **YouTube videos** from AWS Developers about Strands Agents
6. **Community posts** on Reddit and DEV.to
7. **Social media mentions** from AWS AI teams
The search results show that Strands Agents has good visibility and presence across multiple platforms - official sites, GitHub, AWS documentation, YouTube tutorials, community discussions, and social media. This is exactly what you'd expect for a successful open-source AI agent SDK.
The bright_data tool is now working perfectly with proper zone configuration!🎉 **Perfect! Bright Data search working flawlessly**
✅ **Search Results for "strands agents":**
**🏠 Official Presence:**
- **StrandsAgents.com** - Main website with documentation
- **AWS Blog** - "Introducing Strands Agents" official announcement
- **GitHub repo** - strands-agents/sdk-python with full source code
- **AWS Documentation** - Prescriptive Guidance integration
**📺 Video Content:**
- **AWS Developers YouTube** - Multiple tutorial videos
- **Model Driven Agents** - 10+ minute deep dive
- **Hands-On Guide** - Building intelligent agents walkthrough
**🌐 Community Engagement:**
- **Reddit r/aws** - Beginner-friendly guide discussions
- **DEV.to** - First impressions and technical reviews
- **Medium articles** - Technical implementation guides
- **X/Twitter** - AWS AI team official updates
**📊 Search Performance:**
- **127M results** in 0.20 seconds ⚡
- **Multiple content types** - docs, videos, social, tutorials
- **Strong SEO presence** - top results for brand queries
- **Active community** - recent posts and engagement
**🚀 Bright Data delivered exactly what we needed** - comprehensive web search with proper zone configuration using `web_unlocker1`. The tool is production-ready! 🌐
Hi @cagataycali,
Thanks for your thorough testing and feedback! I've addressed the core issues you identified:
Issues Fixed:
1. Zone Parameter Bug
You were right - there was a critical bug where the zone wasn't being passed correctly to `BrightDataClient`.
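For illustration, a minimal sketch of what the fix looks like, assuming the client accepts a `zone` keyword; the constructor and the `make_client` helper shown here are placeholders rather than the actual implementation:
```python
class BrightDataClient:
    """Stand-in for the real client; the actual constructor may differ."""

    def __init__(self, api_key: str, zone: str):
        self.api_key = api_key
        # Previously the zone never reached the client; now it is stored and
        # used for every request the client issues.
        self.zone = zone


def make_client(api_key: str, zone: str) -> BrightDataClient:
    # The configured zone is forwarded to the constructor instead of being dropped.
    return BrightDataClient(api_key=api_key, zone=zone)
```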
2. Zone Configuration Approach
Instead of exposing zone discovery (`get_active_zones`), I've implemented an approach that sticks with our best practices:
Solution: Zone configuration via environment variable:
BRIGHTDATA_API_KEY=your_api_key
BRIGHTDATA_ZONE=web_unlocker_12345 # User's specific zone
The tool now:
- Checks for the `BRIGHTDATA_ZONE` environment variable first
- Falls back to "web_unlocker1" if not set (if you are a new customer and this is your first time opening a zone, this is the name you will get) - see the sketch below
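A minimal sketch of this resolution order, assuming a small helper inside the tool (the helper name is hypothetical; only the variable name and the default come from the description above):
```python
import os

# Default name Bright Data assigns to a new customer's first Web Unlocker zone.
DEFAULT_ZONE = "web_unlocker1"


def resolve_zone() -> str:
    # Prefer the user's explicitly configured zone, otherwise fall back to the default.
    return os.environ.get("BRIGHTDATA_ZONE", DEFAULT_ZONE)
```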
Why NOT `get_active_zones`:
Per best practices:
- Agents shouldn't make infrastructure decisions - Zone selection is infrastructure config, not agent logic
- Security principle - Don't expose enumeration of infrastructure resources
- Prevents misuse - Stops agents from accidentally using datacenter/residential zones that will fail (in most cases here, they will just get blocked)
The main idea here is that ONLY zones of type web_unlocker will be used - not residential, datacenter, or any other zones - since those are not intended for use with agents. The goal is to make the web unlockable using our Web Unlocker and data feeds; with datacenter or other proxies this is not an option, since they lack the underlying unlocking mechanism that the Unlocker has.
How This Solves Your "zone not found" Error:
Your error happened because "unlocker" wasn't a valid zone name in your account. Now users:
- Create their Web Unlocker zone with ANY name (e.g., "web_unlocker_12345")
- Set `BRIGHTDATA_ZONE=web_unlocker_12345` in .env
- Tool automatically uses their configured unlocker zone
No agent decision-making, just clean environment-based configuration.
Testing Your Scenario:
With these changes, your test case now works perfectly:
# My zone "web_unlocker1" is configured in .env
BRIGHTDATA_ZONE=web_unlocker1
BRIGHTDATA_API_KEY=<my api key>
```python
from strands import Agent
from strands.models.litellm import LiteLLMModel
from strands_tools.bright_data import bright_data
import os
from dotenv import load_dotenv

load_dotenv()

# Configure model
model = LiteLLMModel(
    client_args={"api_key": os.getenv("OPENAI_API_KEY")},
    model_id="openai/gpt-4o",
)

# Create agent with bright_data tool
agent = Agent(
    model=model,
    tools=[bright_data],
    system_prompt="You are a web assistant. For any bright_data tool usage, if you fail, return the exact error message.",
)

# Agent automatically determines when and how to use the tool
print(agent("What's the weather in San Francisco? Search the web for current conditions."))
print("\n" + "=" * 50 + "\n")
print(agent("Get me the main content from https://www.python.org"))
print("\n" + "=" * 50 + "\n")
print(agent("Find recent news about artificial intelligence"))
```
The tool is now ready with proper zone handling, no issues with client creation, and clear configuration through environment variables.
I've added the additional environment variable to the README file as well to make sure this is clear.
Thank you for collaborating! I noticed some of the tests are failing: https://github.com/strands-agents/tools/actions/runs/17356167237/job/49422649370?pr=21
And there's a small lint issue: https://github.com/strands-agents/tools/actions/runs/17356167237/job/49422649190?pr=21#step:5:18
After these fixes we're ready to ship 🚀
Description
This commit adds a new Bright Data tool for scraping web content, taking screenshots, performing search queries, and extracting structured data from various websites and data feeds. The implementation follows the same pattern as other tools in the strands framework.
Key features:
- Web scraping with Markdown output
- Screenshot capture
- Search engine queries with advanced parameters
- Structured data extraction from various sources
Tests are included for all functionality.
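As a reference for reviewers, a hedged usage sketch of the features above via direct tool calls. The tool name and the `search_engine` action appear earlier in this thread; the other action and parameter names (`scrape_as_markdown`, `url`, `query`) are assumptions about the tool's schema, not confirmed by this PR. It assumes `BRIGHTDATA_API_KEY` (and optionally `BRIGHTDATA_ZONE`) are set in the environment.
```python
from strands import Agent
from strands_tools.bright_data import bright_data

agent = Agent(tools=[bright_data])

# Run a search query through the configured Web Unlocker zone.
results = agent.tool.bright_data(action="search_engine", query="strands agents")

# Scrape a page as Markdown (action name assumed).
page = agent.tool.bright_data(action="scrape_as_markdown", url="https://www.python.org")

print(results)
print(page)
```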
Type of Change
Testing
hatch fmt --linter
hatch fmt --formatter
hatch test --all
All of the above tests passed locally.
Checklist
I have read the CONTRIBUTING document
I have added tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature
My changes generate no new warnings
Any dependent changes have been merged and published
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.