@JagjeevanAK JagjeevanAK commented Sep 22, 2025

Signed-off-by: Jagjeevan Kashid [email protected]

What

Fixes issue #344.

Approach

GPT-4o models now send an additional text parameter with screenshot actions to provide descriptive context, but the screenshot methods of CUA's computer handlers accepted no such parameter. Because the agent execution logic forwards all action parameters to the handler via **kwargs, the unexpected keyword argument raised a TypeError.
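The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not CUA's actual code: `OldHandler`, `dispatch`, and the shape of the action dictionary are hypothetical stand-ins, assumed only to illustrate how forwarding every action field as a keyword argument breaks a method that does not declare it.

```python
import asyncio


class OldHandler:
    """Hypothetical stand-in for a handler whose screenshot() takes no extra parameters."""

    async def screenshot(self) -> str:
        return "<base64 image>"


async def dispatch(handler, action: dict):
    """Hypothetical dispatcher: forwards every field except 'type' as a keyword argument."""
    kwargs = {key: value for key, value in action.items() if key != "type"}
    method = getattr(handler, action["type"])
    # Argument binding happens at call time, so an undeclared keyword fails here.
    return await method(**kwargs)


def reproduce() -> str:
    """Return the TypeError message raised by the extra 'text' field, or 'no error'."""
    # Assumed action shape: a screenshot action carrying an extra 'text' field.
    action = {"type": "screenshot", "text": "Checking the current page"}
    try:
        asyncio.run(dispatch(OldHandler(), action))
    except TypeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(reproduce())
```

The exact wording of the message varies between Python versions, but it names the unexpected keyword argument 'text'.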

To fix this, I added an optional text parameter to all screenshot method implementations:

  • cuaComputerHandler.screenshot(text: Optional[str] = None)
  • AsyncComputerHandler protocol updated
  • CustomComputerHandler.screenshot() updated
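The signature change listed above can be sketched as follows. This is an illustration under stated assumptions, not the merged code: `AsyncScreenshotHandler` is a hypothetical, trimmed-down protocol (the real AsyncComputerHandler is assumed to declare more methods), `PatchedHandler` and `take` are hypothetical names, and the `str` return type is assumed.

```python
import asyncio
from typing import Optional, Protocol


class AsyncScreenshotHandler(Protocol):
    """Hypothetical protocol reduced to the one method relevant here."""

    async def screenshot(self, text: Optional[str] = None) -> str: ...


class PatchedHandler:
    """Hypothetical handler whose screenshot() tolerates the optional text argument."""

    async def screenshot(self, text: Optional[str] = None) -> str:
        # The descriptive text is accepted so the call binds; this sketch ignores it.
        return "<base64 image>"


async def take(handler: AsyncScreenshotHandler, **kwargs) -> str:
    """Forward keyword arguments the same way a **kwargs-based dispatcher would."""
    return await handler.screenshot(**kwargs)


if __name__ == "__main__":
    handler = PatchedHandler()
    # Both call shapes succeed: without text and with text.
    print(asyncio.run(take(handler)))
    print(asyncio.run(take(handler, text="Checking the current page")))
```

Defaulting the parameter to None keeps existing callers that pass no arguments working, while callers that forward a text field no longer fail.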

Validation: Checked against the OpenAI docs. The docs show screenshot actions without a text parameter, but in practice GPT-4o models include descriptive reasoning text with these actions.

@JagjeevanAK JagjeevanAK changed the title from "Fix: Added GPT-4o compatibility for screenshot actions with text parameter" to "fix: Added GPT-4o compatibility for screenshot actions with text parameter" on Sep 22, 2025