Commit fccf850

some changes in the system prompt for checking on the scrollable elements
1 parent a7294d0 commit fccf850

3 files changed, with 13 additions and 3 deletions

src/agent/web/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -130,6 +130,7 @@ async def action(self,state:AgentState):
             'informative_elements':browser_state.dom_state.informative_elements_to_string(),
             'scrollable_elements':browser_state.dom_state.scrollable_elements_to_string()
         })
+        print(browser_state.dom_state.scrollable_nodes)
         messages=[AIMessage(action_prompt),ImageMessage(text=observation_prompt,image_obj=image_obj) if self.use_vision and image_obj is not None else HumanMessage(observation_prompt)]
         return {**state,'messages':messages,'prev_observation':observation}
 
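
The added `print` call looks like a debugging aid for inspecting the scrollable nodes. If that output should stay available without writing to stdout on every step, one option is the standard `logging` module. The following is only a sketch: the logger name `agent.web` and the helper `log_scrollable_nodes` are hypothetical and do not appear in the commit.

```python
import logging

# Hypothetical logger name; the project may configure logging differently.
logger = logging.getLogger("agent.web")


def log_scrollable_nodes(scrollable_nodes: list) -> None:
    """Emit the scrollable nodes at DEBUG level instead of printing them."""
    # Lazy %-style formatting: the message is only built if DEBUG is enabled.
    logger.debug("scrollable nodes: %s", scrollable_nodes)
```

With this approach the output can be switched on with `logging.basicConfig(level=logging.DEBUG)` and stays silent otherwise.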

src/agent/web/dom/views.py

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ def interactive_elements_to_string(self)->str:
         return '\n'.join([f'{index} - Tag: {node.tag} Role: {node.role} Name: {node.name} Attributes: {node.attributes} Cordinates: {node.center.to_string()}' for index,(node) in enumerate(self.interactive_nodes)])
 
     def informative_elements_to_string(self)->str:
-        return '\n'.join([f'Tag: {node.tag} Role: {node.role} Content: {node.content} Cordinates: {node.center.to_string()}' for node in self.informative_nodes])
+        return '\n'.join([f'Tag: {node.tag} Role: {node.role} Content: {node.content}' for node in self.informative_nodes])
 
     def scrollable_elements_to_string(self)->str:
         n=len(self.interactive_nodes)
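
To make the effect of this change concrete, here is a minimal standalone sketch of the updated formatting, which no longer emits coordinates for informative elements. `InformativeNode` is a hypothetical stand-in for the project's node type, and the sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class InformativeNode:
    """Hypothetical stand-in for the project's informative node type."""
    tag: str
    role: str
    content: str


def informative_elements_to_string(nodes: list) -> str:
    """Render one line per node, mirroring the updated f-string (no coordinates)."""
    return '\n'.join(
        f'Tag: {node.tag} Role: {node.role} Content: {node.content}'
        for node in nodes
    )


nodes = [
    InformativeNode('h1', 'heading', 'Welcome'),
    InformativeNode('p', 'paragraph', 'Sign in to continue'),
]
rendered = informative_elements_to_string(nodes)
print(rendered)
```

Each rendered line now matches the `Tag: ... Role: ... Content: ...` shape that the system prompt documents for informative elements.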

src/agent/web/prompt/system.md

Lines changed: 11 additions & 2 deletions
@@ -55,6 +55,14 @@ You are a highly advanced, expert-level Web Agent with human-like browsing capab
 ```
 <element_index> - Tag: <element_tag> Role: <element_role> Name: <element_name> Attributes: <element_attributes> Coordinates: <element_coordinate>
 ```
+6. **Scrollable Elements:** Available webpage elements in format:
+```
+<element_index> - Tag: <element_tag> Role: <element_role> Name: <element_name> Attributes: <element_attributes>
+```
+7. **Informative Elements:** Available webpage elements in format:
+```
+Tag: <element_tag> Role: <element_role> Content: <element_content>
+```
 
 ## Execution Framework:
 
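
The scrollable-element format documented in this hunk can be illustrated with a small sketch that renders lines in that shape. `ScrollableNode` and the `start` offset are assumptions: the diff only shows that `scrollable_elements_to_string` reads `len(self.interactive_nodes)`, which suggests, but does not confirm, that scrollable indices continue after the interactive ones.

```python
from dataclasses import dataclass, field


@dataclass
class ScrollableNode:
    """Hypothetical stand-in for the project's scrollable node type."""
    tag: str
    role: str
    name: str
    attributes: dict = field(default_factory=dict)


def scrollable_elements_to_string(nodes: list, start: int = 0) -> str:
    """Render nodes as '<index> - Tag: ... Role: ... Name: ... Attributes: ...'."""
    # `start` is an assumed offset (e.g. the number of interactive elements).
    return '\n'.join(
        f'{index} - Tag: {node.tag} Role: {node.role} '
        f'Name: {node.name} Attributes: {node.attributes}'
        for index, node in enumerate(nodes, start=start)
    )


nodes = [ScrollableNode('div', 'region', 'Results', {'id': 'results'})]
rendered = scrollable_elements_to_string(nodes, start=3)
print(rendered)
```
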

@@ -63,6 +71,7 @@ You are a highly advanced, expert-level Web Agent with human-like browsing capab
 - Thoroughly analyze element properties before interaction
 - Reference elements exclusively by their numeric index
 - Consider element position and visibility when planning interactions
+- Use scrollable elements to scroll through specific sections of the webpage
 
 ### Visual Analysis Protocol:
 

@@ -85,7 +94,7 @@ You are a highly advanced, expert-level Web Agent with human-like browsing capab
 - Wait for complete page loading before proceeding with interactions
 
 ### Human in Loop:
-- Human in the loop enabled: {human_in_loop}
+- Human in the loop is enabled: {human_in_loop}
 - Talk to user when needed using `Human Tool` if its available
 - Use it for clarification, credentials, for performing a new task within the main task and more
 - When unsolvable security challenges comes in the way during navigation
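
The `{human_in_loop}` placeholder, together with the doubled braces around the `Action-Input` example elsewhere in this file, suggests the prompt is filled in with Python's `str.format`. That is an assumption, since the rendering code is not part of this commit. A minimal sketch of how such a template behaves:

```python
# Two lines excerpted from the prompt; the rest of system.md is omitted here.
template = (
    "- Human in the loop is enabled: {human_in_loop}\n"
    "<Action-Input>{{'param1':'value1','param2':'value2'}}</Action-Input>\n"
)

# Single braces are substituted; doubled braces collapse to literal braces.
prompt = template.format(human_in_loop=True)
print(prompt)
```

Under that assumption, any literal brace added to the prompt needs to be doubled, or `format` will raise a `KeyError` or `ValueError`.
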
@@ -133,4 +142,4 @@ Respond exclusively in this XML format:
 <Action-Name>[Selected tool name]</Action-Name>
 <Action-Input>{{'param1':'value1','param2':'value2'}}</Action-Input>
 </Option>
-```
+```
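
For illustration, a reply in the format requested above could be parsed roughly as follows. This is a sketch, not the project's parser: `parse_action`, the tool name `Scroll Tool`, and its parameters are hypothetical. It assumes the action input is a Python-style dict literal, as the single-quoted example in the prompt implies.

```python
import ast
import re

# Hypothetical model reply; tool name and parameters are invented.
reply = """<Option>
<Action-Name>Scroll Tool</Action-Name>
<Action-Input>{'index': 12, 'direction': 'down'}</Action-Input>
</Option>"""


def parse_action(text: str) -> tuple:
    """Extract the tool name and its parameters from an <Option> reply."""
    name = re.search(r'<Action-Name>(.*?)</Action-Name>', text, re.DOTALL)
    raw = re.search(r'<Action-Input>(.*?)</Action-Input>', text, re.DOTALL)
    if name is None or raw is None:
        raise ValueError('reply is missing Action-Name or Action-Input')
    # literal_eval accepts only Python literals, so no code is executed.
    params = ast.literal_eval(raw.group(1).strip())
    return name.group(1).strip(), params


action_name, action_input = parse_action(reply)
print(action_name, action_input)
```

`ast.literal_eval` is used instead of `json.loads` because the prompt's example uses single quotes, which are not valid JSON.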
