Early Alpha Release - Browser Agent

Pre-release
@rkvalandas rkvalandas released this 28 Apr 14:36
· 4 commits to main since this release
0f109c8

🚀 Browser Agent v1.0.0-alpha: Early Alpha Release

Release Date: 2025-04-28


✨ Overview

This is an early alpha version of Browser Agent intended for initial testing and feedback.
⚡ Expect bugs, incomplete features, and frequent changes in upcoming versions.


📋 What's Added

  • ✨ Initial implementation of Natural Language Browser Control using Azure OpenAI.
  • ✨ Basic element detection and interaction through Page Analyzer technology.
  • ✨ Comprehensive DOM analysis for mapping interactive page elements.
  • ✨ Support for form filling, element clicking, and webpage navigation.
  • ✨ Preliminary handling of dynamic content, including scrolling and basic AJAX support.
  • ✨ Flexible command-line interface with run, launch, debug, and version commands.
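
To make "mapping interactive page elements" concrete, here is a minimal sketch that uses only the Python standard library. It is not the project's Page Analyzer: the tag list, the class name InteractiveElementMapper, and the function map_interactive_elements are all hypothetical, chosen for illustration. It works on static HTML, whereas the real tool presumably inspects a live page in the browser.

```python
from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not the project's list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementMapper(HTMLParser):
    """Collects interactive elements and gives each one a numeric index."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        # Hidden inputs cannot be clicked or filled, so skip them.
        if tag == "input" and attr_map.get("type") == "hidden":
            return
        self.elements.append({
            "index": len(self.elements),
            "tag": tag,
            "id": attr_map.get("id"),
            "name": attr_map.get("name"),
            "type": attr_map.get("type"),
        })


def map_interactive_elements(html):
    """Return a list of dicts describing the interactive elements in html."""
    mapper = InteractiveElementMapper()
    mapper.feed(html)
    mapper.close()
    return mapper.elements


if __name__ == "__main__":
    sample = """
    <form>
      <input type="hidden" name="csrf" value="x">
      <input type="text" id="q" name="query">
      <button id="go" type="submit">Search</button>
    </form>
    <a href="/about">About</a>
    """
    for element in map_interactive_elements(sample):
        print(element)
```

An indexed list like this gives a language model a compact way to refer to elements ("click element 1") without seeing the full DOM.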

🐞 Known Issues

  • ❗ Fill Input functionality fails on certain types of input fields.
  • ❗ Cannot reliably handle CAPTCHA challenges or complex authentication flows.
  • ❗ Struggles with highly dynamic interfaces that use advanced JavaScript frameworks.
  • ❗ Processing time can be slow for complex instructions.
  • ❗ Limited recovery options for certain error edge cases.
  • ❗ No long-term memory of previous browsing sessions.

🛠️ What's Coming Next

  • 🔥 Improved error handling and recovery strategies.
  • 🔥 Better support for dynamic web content and complex UIs.
  • 🔥 Enhanced form filling capabilities to address current input field issues.
  • 🔥 Performance optimizations for faster response times.
  • 🔥 Session persistence and browsing history.
  • 🔥 Visual element recognition and screenshot capabilities.

⚠️ Notes for Testers

  • This version is NOT production-ready.
  • Not recommended for use with sensitive financial or personal information.
  • Please report any bugs, crashes, or strange behavior.
  • Feedback on usability, functionality, and performance is highly appreciated.

📩 How to Report Issues

Please open a GitHub Issue with:

  • Steps to reproduce the problem
  • Expected vs actual behavior
  • Screenshots (if possible)
  • Environment details (browser, OS, device)
  • Sample commands that failed
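
One way to gather part of the environment details is a small helper script. This is a sketch, not something shipped with Browser Agent: the function names collect_environment_details and format_for_issue are hypothetical, and the script covers only the OS and Python version, so the browser name and version still need to be added by hand.

```python
import platform


def collect_environment_details():
    """Gather basic OS and Python details using the standard library."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
    }


def format_for_issue(details):
    """Render the details as a bullet list that can be pasted into an issue."""
    return "\n".join(f"- {key}: {value}" for key, value in details.items())


if __name__ == "__main__":
    print(format_for_issue(collect_environment_details()))
```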

🧹 Installation/Usage Instructions

# Clone the repository
git clone https://github.com/yourusername/browser-agent.git
cd browser-agent

# Install dependencies
pip install -r requirements.txt
playwright install

# Set up environment variables
export OPENAI_API_KEY=your_api_key_here
export AZURE_ENDPOINT=your_azure_endpoint

# Run the Browser Agent
python main.py run
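
Before the first run, it can help to confirm that both environment variables from the setup step are actually set. The following is a minimal sketch; the helper name missing_env_vars is hypothetical and the script is not part of the repository. It checks only the two variables named above and does not validate their values.

```python
import os

# The two variables named in the setup step above.
REQUIRED_VARS = ("OPENAI_API_KEY", "AZURE_ENDPOINT")


def missing_env_vars(environ=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("Environment looks ready; run: python main.py run")
```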

🔖 Tagging

  • Version: v1.0.0-alpha
  • Status: Pre-release (Early Testing)
  • Stability: Unstable, changes expected

💡 Example Commands

# Basic information retrieval
Enter your instruction: Go to Wikipedia, search for "artificial intelligence", and summarize the introduction

# Simple online shopping
Enter your instruction: Find a mid-range laptop on Amazon with at least 16GB RAM and tell me the top three options

# Email management
Enter your instruction: Go to Gmail, compose an email to my team about the project update, and draft it for my review

⚠️ Security Note

Browser Agent can access and interact with any website you visit. As with any automation tool:

  • Do not use for sensitive activities (banking, confidential work)
  • Be cautious with personal accounts
  • Review all actions before executing
  • This early alpha does not encrypt or securely store any data

Thank you for trying Browser Agent! Your feedback will help shape the future of this project.