Releases: CursorTouch/Web-Navigator

v0.2

07 Jul 15:32

Feature

  • Improved the grounding to handle more corner cases.

Fix

  • Fixed a bug that caused the agent to get stuck on PDF pages or blank pages.
  • Removed redundant parts of the agent implementation.

v0.1

17 Jun 16:46

Key Features & Updates

  • Dual Agent Modes: Supports both non-vision and vision-based agent operation, so the agent works with both LLMs and VLMs.
  • Scrollable vs. Interactive Elements: A clear separation improves DOM recognition and interaction.
  • Scrolling Logic: Enables scrolling through distinct webpage sections, including nested containers.
  • HTML → Markdown: Upgraded to markdownify in the Scrape Tool for better content conversion.
  • Tab Management: Tracks the number of open tabs and the active tab, and supports basic tab control.
  • Extensible Tools: Add custom tools to the agent via the additional_tools parameter.
  • Iframe & Shadow DOM Access: Enhanced ability to interact with embedded or encapsulated elements.
  • Structured Output: Returns well-defined BaseModel outputs using the structured_output parameter.
  • Human-in-the-Loop: Add manual checkpoints in the workflow via the include_human_in_loop parameter (thanks @tanmaysk001!).
  • Inference Wrapper: Fixed a bug in the OpenRouter implementation (thanks @thecoderwithHat).
  • Navigation Fixes: Improved handling of edge-case navigations across complex sites.
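
To make the Tab Management item above more concrete, here is a minimal sketch of the kind of bookkeeping it describes: counting open tabs, remembering which one is active, and offering basic open/switch/close control. The names Tab, TabTracker, and their methods are hypothetical and chosen only for illustration; they are not taken from the Web-Navigator codebase, and the real implementation may work differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tab:
    """One open browser tab (hypothetical model, for illustration only)."""
    url: str
    title: str = ""


@dataclass
class TabTracker:
    """Hypothetical tab bookkeeping; not the project's actual class."""
    tabs: list[Tab] = field(default_factory=list)
    active_index: int = -1  # -1 means no tab is open

    def open(self, url: str, title: str = "") -> int:
        """Open a new tab, make it active, and return its index."""
        self.tabs.append(Tab(url, title))
        self.active_index = len(self.tabs) - 1
        return self.active_index

    def switch(self, index: int) -> None:
        """Make the tab at `index` the active one."""
        self._check(index)
        self.active_index = index

    def close(self, index: int) -> None:
        """Close the tab at `index` and keep `active_index` valid."""
        self._check(index)
        del self.tabs[index]
        if not self.tabs:
            self.active_index = -1
        elif index < self.active_index:
            # A tab before the active one was removed, so shift left.
            self.active_index -= 1
        elif index == self.active_index:
            # The active tab was closed; fall back to the nearest remaining tab.
            self.active_index = min(index, len(self.tabs) - 1)

    def summary(self) -> str:
        """Describe the tab state in a form an agent prompt could include."""
        if not self.tabs:
            return "0 tabs open"
        return f"{len(self.tabs)} tabs open, active: {self.tabs[self.active_index].url}"

    def _check(self, index: int) -> None:
        if not 0 <= index < len(self.tabs):
            raise IndexError(f"no tab at index {index}")
```

A tracker like this could be updated whenever the browser reports a tab event, with summary() giving the agent a short text view of the current tab state.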