bayshanntech/experiment_adk_agentcore_jvm

Claude ADK Agent - Kotlin/JVM Edition 🏴‍☠️

A multi-agent browser automation system using the Google Agent Development Kit (ADK) with Claude API and Playwright for JVM, deployable to AWS Bedrock AgentCore.

This is the Kotlin/JVM version of the original Python implementation, providing the same functionality with the benefits of the JVM ecosystem.

🚀 Quick Start

Prerequisites

  • Java 17+ (OpenJDK recommended)
  • Docker (for deployment)
  • AWS CLI configured (for cloud deployment)
  • Anthropic API key

Local Development

  1. Setup everything:

    make setup
  2. Configure your API key:

    # Edit .env file
    vim .env
    # Add your Anthropic API key:
    # ANTHROPIC_API_KEY=your-api-key-here
  3. Run the agent:

    make run PROMPT="summarise the headlines on https://www.abc.net.au/news"

πŸ› οΈ Technology Stack

  • Language: Kotlin 1.9.20
  • Runtime: JVM (Java 17+)
  • Build Tool: Gradle with Kotlin DSL
  • Browser Automation: Playwright for JVM
  • AI API: Claude API via OkHttp
  • AWS Integration: AWS SDK for Java v2
  • JSON Processing: Jackson with Kotlin module
  • Async Processing: Kotlin Coroutines
  • Testing: JUnit 5 + Kotlin Test

πŸ“ Project Structure

src/main/kotlin/com/example/agent/
├── Main.kt                   # Entry point for local development
├── BrowserAgent.kt           # Main agent with multi-step workflow
├── ApiKeyRetriever.kt        # Secure credential retrieval
├── PlaywrightAgent.kt        # Browser automation agent
├── AgentCoreHandler.kt       # AgentCore integration handler
└── Config.kt                 # Configuration management

src/main/resources/
└── application.conf          # Configuration defaults

build.gradle.kts              # Gradle build configuration
Makefile                      # Build and deployment automation

🔧 Architecture

The system uses a multi-agent delegation pattern:

  1. Main Agent (ClaudeApiAgent) - Orchestrates the workflow:

    • Plans browser automation tasks using Claude API
    • Delegates execution to Playwright Agent
    • Analyzes results using Claude API
  2. Playwright Agent - Handles browser automation:

    • Executes planned browser actions
    • Provides intelligent fallbacks for serverless environments
    • Supports general web automation tasks
  3. API Key Retriever - Manages secure credentials:

    • Priority fallback: AgentCore → AWS Secrets Manager → Environment
    • Handles multiple credential formats
  4. AgentCore Handler - Bridges cloud deployment:

    • Adapts between AgentCore and local interfaces
    • Handles payload processing
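
The handler role in item 4 can be illustrated with a minimal sketch. It assumes the incoming payload is a map carrying the user request under a "prompt" key and that the response is a small status map; both shapes, and the names PromptProcessor and AgentCoreHandlerSketch, are hypothetical and not taken from the repository. The real agent entry point is a suspend function; a plain function is used here so the sketch needs only the standard library.

```kotlin
// Hypothetical sketch: adapts an AgentCore-style payload to a local agent call.
// The "prompt" key and the response map shape are assumptions for illustration.

/** Local interface the handler delegates to (a stand-in for the main agent). */
fun interface PromptProcessor {
    fun process(prompt: String): String
}

class AgentCoreHandlerSketch(private val agent: PromptProcessor) {

    /** Validates the payload, delegates to the agent, and wraps the outcome. */
    fun handle(payload: Map<String, Any?>): Map<String, Any?> {
        // Payload processing: accept only a non-empty string prompt.
        val prompt = (payload["prompt"] as? String)?.trim().orEmpty()
        if (prompt.isEmpty()) {
            return mapOf(
                "status" to "error",
                "message" to "payload is missing a non-empty 'prompt'",
            )
        }
        // Delegate, converting failures into a structured error response.
        return try {
            mapOf("status" to "success", "result" to agent.process(prompt))
        } catch (e: Exception) {
            mapOf("status" to "error", "message" to (e.message ?: e.javaClass.simpleName))
        }
    }
}
```

Because the handler depends only on the small PromptProcessor interface, the same agent logic could be driven from Main.kt locally and from the cloud entry point without either knowing about the other.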

🌊 Multi-Step Workflow

suspend fun processRequest(userPrompt: String): String {
    // Step 1: Plan browser automation task
    val browserPlan = planBrowserAutomationTask(userPrompt)
    
    // Step 2: Execute browser automation plan
    val extractedData = executeBrowserAutomation(browserPlan)
    
    // Step 3: Analyze and summarize results
    val claudeAnalysis = analyzeBrowserResults(userPrompt, extractedData)
    
    // Return consolidated response
    return buildFinalResponse(browserPlan, extractedData, claudeAnalysis)
}

πŸ” Secure Configuration

The agent supports multiple credential sources with automatic fallback:

  1. AgentCore Outbound Identity (Production)
  2. AWS Secrets Manager (Cloud environments)
  3. Environment Variables (Local development)
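
The fallback order above can be sketched as a chain of sources tried in priority order. This is an illustration under assumptions, not the repository's ApiKeyRetriever: CredentialSource, FallbackKeyRetriever and defaultRetriever are hypothetical names, and the AgentCore and Secrets Manager sources are placeholders that always return null, since their real lookups need the AWS SDK.

```kotlin
// Hypothetical sketch of a priority fallback chain for API key retrieval.

/** A credential source returns the key, or null when it has nothing to offer. */
fun interface CredentialSource {
    fun fetch(): String?
}

class FallbackKeyRetriever(private val sources: List<Pair<String, CredentialSource>>) {

    /** Returns (sourceName, key) from the first source that yields a non-blank key. */
    fun retrieve(): Pair<String, String> {
        for ((name, source) in sources) {
            // A failing source is treated like an empty one, so the chain continues.
            val key = try { source.fetch() } catch (e: Exception) { null }
            if (!key.isNullOrBlank()) return name to key.trim()
        }
        throw IllegalStateException("No API key found in: " + sources.joinToString { it.first })
    }
}

/** Builds the chain in the documented order; the first two sources are placeholders. */
fun defaultRetriever(env: Map<String, String> = System.getenv()): FallbackKeyRetriever =
    FallbackKeyRetriever(
        listOf(
            "AgentCore" to CredentialSource { null },       // placeholder: no real lookup
            "SecretsManager" to CredentialSource { null },  // placeholder: no real lookup
            "Environment" to CredentialSource { env["ANTHROPIC_API_KEY"] },
        ),
    )
```

Returning the source name alongside the key makes it easy to log which source supplied the credential without logging the credential itself.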

Configuration is managed through:

  • .env files for local development
  • application.conf for defaults
  • Environment variables for production
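
A minimal sketch of how these three layers could be merged follows. The precedence (defaults, then .env, then process environment variables, with later layers winning) is an assumption, as the README does not state it; parseDotEnv and resolveConfig are hypothetical helpers, and the parser covers only simple KEY=VALUE lines.

```kotlin
// Hypothetical sketch: layered configuration with a minimal .env parser.

/** Parses KEY=VALUE lines; skips blanks and # comments; strips optional surrounding quotes. */
fun parseDotEnv(text: String): Map<String, String> =
    text.lineSequence()
        .map { it.trim() }
        .filter { it.isNotEmpty() && !it.startsWith("#") && '=' in it }
        .associate { line ->
            val key = line.substringBefore('=').trim()
            val value = line.substringAfter('=').trim()
                .removeSurrounding("\"")
                .removeSurrounding("'")
            key to value
        }

/** Later maps win: defaults < .env < process environment (an assumed precedence). */
fun resolveConfig(
    defaults: Map<String, String>,
    dotEnv: Map<String, String>,
    env: Map<String, String>,
): Map<String, String> = defaults + dotEnv + env
```

In local development this could be called as resolveConfig(defaults, parseDotEnv(java.io.File(".env").readText()), System.getenv()), where defaults would hold the values read from application.conf.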

📋 Available Commands

Development

make setup          # Complete environment setup
make build          # Build the Kotlin project
make run            # Run agent locally
make test           # Run tests
make clean          # Clean build artifacts

AWS Deployment

make create-iam-role    # Create IAM execution role
make configure          # Configure AgentCore
make launch-local       # Test locally with AgentCore
make launch             # Deploy to AWS
make invoke             # Test deployed agent

🧪 Testing

Run the test suite:

make test

Quick local test:

make test-local

📦 Deployment

Local Packaging

make package

Creates dist/claude-adk-agent-kotlin.jar

Docker Build

make docker-build  

AWS AgentCore

# 1. Create IAM role
make create-iam-role

# 2. Configure deployment
make configure

# 3. Deploy to AWS
make launch

# 4. Test deployment
make invoke

🌟 Key Features

  • Intelligent Browser Automation - Uses Claude AI to plan and analyze web tasks
  • Multi-Agent Architecture - Clean separation of concerns
  • Secure Credential Management - Multiple secure sources with fallback
  • Serverless Ready - Intelligent fallbacks for AWS Lambda environments
  • Type Safe - Full Kotlin type safety with coroutines
  • Production Ready - Comprehensive error handling and logging

🔄 Migration from Python

This Kotlin version provides equivalent functionality to the Python implementation:

Python                  Kotlin
main.py                 Main.kt + BrowserAgent.kt
api_key_retriever.py    ApiKeyRetriever.kt
playwright_agent.py     PlaywrightAgent.kt
agentcore_handler.py    AgentCoreHandler.kt
config.py               Config.kt + application.conf

🆘 Troubleshooting

Common Issues

  1. Java Version

    java -version  # Should be 17+
  2. Gradle Build Issues

    ./gradlew --version
    ./gradlew clean build
  3. Playwright Browser Installation

    make install-playwright
  4. API Key Issues

    • Check .env file exists and has valid API key
    • Verify AWS credentials for Secrets Manager
    • Check IAM permissions for AgentCore

🔧 Technical Details

Google ADK Function Tool Mechanism

The system uses Google ADK's reflection-based function calling to enable true agent-to-agent communication:

How FunctionTool.create() Works

// Register a static method as a tool
val browserTool = FunctionTool.create(
    AdkMultiAgentSystem::class.java,
    "executeBrowserAutomation"
)

Once a tool is registered, the ADK framework handles it in four steps:

  1. Method Discovery: Uses reflection to locate the static method executeBrowserAutomation on the AdkMultiAgentSystem class
  2. Schema Generation: Reads @Schema annotations to create JSON schema for the LLM:
    @JvmStatic
    fun executeBrowserAutomation(
        @Schema(description = "The target URL to navigate to") url: String,
        @Schema(description = "JSON string containing the browser automation plan") actionPlanJson: String
    ): Map<String, Any>
  3. Runtime Invocation: When the BrowserExecutor agent decides to use this tool, ADK invokes it via reflection
  4. Response Handling: The structured return type (Map<String, Any>) is automatically serialized to JSON

Requirements for ADK Function Tools

  • @JvmStatic: Method must be static (accessible without class instance)
  • @Schema annotations: Each parameter needs description for LLM understanding
  • Structured return type: Must return serializable types (Map, data classes, primitives)
  • Exception handling: Wrap operations in try-catch for graceful error responses

This mechanism allows the BrowserExecutor agent to autonomously decide when and how to call browser automation functions, while ADK handles all the reflection and serialization mechanics behind the scenes.
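
To make the four steps concrete, here is a standard-library-only sketch of the same idea using plain Java reflection. It is not ADK's implementation: ParamDoc is a hypothetical stand-in for ADK's @Schema annotation, and DemoTools and ReflectiveTool are invented for illustration. Parameter names are carried in the annotation because Java reflection does not expose source parameter names unless the code is compiled with the corresponding compiler flag.

```kotlin
import java.lang.reflect.Modifier

// Hypothetical sketch of reflection-based tool discovery and invocation.

/** Stand-in for a schema annotation: carries the parameter name and description. */
@Target(AnnotationTarget.VALUE_PARAMETER)
@Retention(AnnotationRetention.RUNTIME)
annotation class ParamDoc(val name: String, val description: String)

object DemoTools {
    /** A static tool method that reports errors as data instead of throwing. */
    @JvmStatic
    fun executeBrowserAutomation(
        @ParamDoc("url", "The target URL to navigate to") url: String,
        @ParamDoc("actionPlanJson", "JSON string containing the browser automation plan")
        actionPlanJson: String,
    ): Map<String, Any> = try {
        require(url.startsWith("http")) { "unsupported URL: $url" }
        mapOf("status" to "success", "url" to url, "planLength" to actionPlanJson.length)
    } catch (e: Exception) {
        mapOf("status" to "error", "message" to (e.message ?: "unknown error"))
    }
}

class ReflectiveTool(cls: Class<*>, methodName: String) {
    // Step 1, method discovery: find the single static method with this name.
    private val method = cls.declaredMethods.single {
        it.name == methodName && Modifier.isStatic(it.modifiers)
    }

    // Step 2, schema material: parameter name -> description, in declaration order.
    val parameters: Map<String, String> = method.parameters.associate { p ->
        val doc = p.getAnnotation(ParamDoc::class.java)
            ?: error("missing @ParamDoc on a parameter of ${method.name}")
        doc.name to doc.description
    }

    // Step 3, runtime invocation: order the named arguments and call the static method.
    fun call(args: Map<String, Any?>): Any? =
        method.invoke(null, *parameters.keys.map { args[it] }.toTypedArray())
}
```

A caller would build the tool with ReflectiveTool(DemoTools::class.java, "executeBrowserAutomation") and pass the arguments chosen by the model as a map. Serializing the returned map to JSON (step 4) is left out, as it would need a JSON library such as Jackson.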

📚 Dependencies

Key dependencies managed by Gradle:

  • Google ADK Core - Agent Development Kit framework
  • Google GenAI - Content and model integration
  • Kotlin Standard Library - Core language support
  • Kotlin Coroutines - Async programming
  • Anthropic SDK - Claude API integration
  • Playwright for JVM - Browser automation
  • RxJava - Reactive streams for ADK events
  • SLF4J + Logback - Logging framework

πŸ΄β€β˜ οΈ Captain's Notes

This Kotlin implementation provides:

  • Better Performance - JVM optimizations and compilation
  • Type Safety - Compile-time error detection
  • Rich Ecosystem - Access to Java libraries
  • Enterprise Ready - Mature tooling and monitoring
  • Async First - Native coroutine support

The architecture remains identical to the Python version, ensuring feature parity while leveraging JVM strengths!

Current State

The multi-agent wiring works and calls to the Claude API succeed. The system may still fail, depending on the prompt and the web browsing problem it is given. One reason is that the setup is generic and cannot suit every possible request. Another is that the solution is very prescriptive about what the browsing agent may do: it offers a fixed set of BrowserActionType values, which may restrict browsing too much for the agent to achieve its goals.
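
The limitation of a fixed action set can be illustrated with a small sketch. The enum values below are invented for the example and may not match the repository's actual BrowserActionType; PlannedStep and validatePlan are likewise hypothetical.

```kotlin
// Hypothetical sketch: a closed action set rejects any step the planner invents.

/** Assumed action set; the repository's real BrowserActionType values may differ. */
enum class BrowserActionType { NAVIGATE, CLICK, EXTRACT_TEXT }

data class PlannedStep(val type: BrowserActionType, val target: String)

/** Splits planner output into executable steps and the action names that were rejected. */
fun validatePlan(rawSteps: List<Pair<String, String>>): Pair<List<PlannedStep>, List<String>> {
    val accepted = mutableListOf<PlannedStep>()
    val rejected = mutableListOf<String>()
    for ((action, target) in rawSteps) {
        // Only action names that match an enum constant can be executed.
        val type = BrowserActionType.values()
            .firstOrNull { it.name.equals(action, ignoreCase = true) }
        if (type != null) accepted += PlannedStep(type, target) else rejected += action
    }
    return accepted to rejected
}
```

With this action set, a plan containing a "scroll" step would lose that step, so a task that depends on scrolling could not be completed even though the planner asked for it.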
