How do I use AWS Bedrock with the SDK?

Yes, the OpenHands SDK supports AWS Bedrock through LiteLLM. Since LiteLLM requires boto3 for Bedrock requests, you need to install it alongside the SDK.

Step 1: Install boto3

Install the SDK with boto3:
# Using pip
pip install openhands-sdk boto3

# Using uv
uv pip install openhands-sdk boto3

# Or when installing as a CLI tool
uv tool install openhands --with boto3

Step 2: Configure Authentication

You have two authentication options.

Option A: API Key Authentication (Recommended)

Use the AWS_BEARER_TOKEN_BEDROCK environment variable:
export AWS_BEARER_TOKEN_BEDROCK="your-bedrock-api-key"

Option B: AWS Credentials

Use traditional AWS credentials:
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION_NAME="us-west-2"
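If you configure credentials programmatically instead (for example in a test harness), the equivalent setup from Python looks like this; the values are placeholders:
import os

# Option A: Bedrock API key (placeholder value)
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "your-bedrock-api-key"

# Option B: traditional AWS credentials (placeholder values)
# os.environ["AWS_ACCESS_KEY_ID"] = "your-access-key"
# os.environ["AWS_SECRET_ACCESS_KEY"] = "your-secret-key"
# os.environ["AWS_REGION_NAME"] = "us-west-2"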

Step 3: Configure the Model

Use the bedrock/ prefix for your model name:
from openhands.sdk import LLM, Agent

llm = LLM(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    # api_key is read from AWS_BEARER_TOKEN_BEDROCK automatically
)
For cross-region inference profiles, include the region prefix:
llm = LLM(
    model="bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0",  # US regions
)
# or, for APAC regions:
llm = LLM(
    model="bedrock/apac.anthropic.claude-sonnet-4-20250514-v1:0",
)
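Putting the steps together, a minimal sketch that wires up an agent against Bedrock (the model ID is an example, and it assumes the credentials from Step 2 are already set):
import os

from openhands.sdk import LLM, Agent
from openhands.tools.preset.default import get_default_tools

# Fail fast if neither authentication option from Step 2 is configured
assert os.environ.get("AWS_BEARER_TOKEN_BEDROCK") or os.environ.get(
    "AWS_ACCESS_KEY_ID"
), "Configure Bedrock credentials first (see Step 2)"

llm = LLM(model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0")
agent = Agent(llm=llm, tools=get_default_tools())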
For more details on Bedrock configuration options, see the LiteLLM Bedrock documentation.

Does the agent SDK support parallel tool calling?

Yes, the OpenHands SDK supports parallel tool calling by default. The SDK automatically handles parallel tool calls when the underlying LLM (like Claude or GPT-4) returns multiple tool calls in a single response. This allows agents to execute multiple independent actions before the next LLM call.
When the LLM generates multiple tool calls in parallel, the SDK groups them using a shared llm_response_id:
ActionEvent(llm_response_id="abc123", thought="Let me check...", tool_call=tool1)
ActionEvent(llm_response_id="abc123", thought=[], tool_call=tool2)
# Combined into: Message(role="assistant", content="Let me check...", tool_calls=[tool1, tool2])
Multiple ActionEvents with the same llm_response_id are grouped together and combined into a single LLM message with multiple tool_calls. Only the first event’s thought/reasoning is included. The relevant pieces of the implementation are:
  • the Events Architecture, which explains in detail how parallel function calling works
  • prepare_llm_messages in utils.py, which groups ActionEvents by llm_response_id when converting events to LLM messages
  • the agent step method, where actions are created with a shared llm_response_id
  • the ActionEvent class, which includes the llm_response_id field
For more details, see the Events Architecture for a deep dive into the event system and parallel function calling, the Tool System for understanding how tools work with the agent, and the Agent Architecture for how agents process and execute actions.
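To illustrate the grouping behavior described above, here is a simplified sketch (not the SDK’s actual prepare_llm_messages; the stub class and field types are hypothetical simplifications):
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ActionEventStub:
    """Hypothetical stand-in for the SDK's ActionEvent."""
    llm_response_id: str
    thought: str
    tool_call: dict

def group_actions(events: list[ActionEventStub]) -> list[dict]:
    """Combine ActionEvents sharing an llm_response_id into one assistant
    message with multiple tool_calls, keeping only the first thought."""
    groups: dict[str, list[ActionEventStub]] = defaultdict(list)
    for event in events:
        groups[event.llm_response_id].append(event)
    return [
        {
            "role": "assistant",
            "content": actions[0].thought,  # only the first event's thought
            "tool_calls": [a.tool_call for a in actions],
        }
        for actions in groups.values()
    ]

events = [
    ActionEventStub("abc123", "Let me check...", {"name": "tool1"}),
    ActionEventStub("abc123", "", {"name": "tool2"}),
]
print(group_actions(events))
# [{'role': 'assistant', 'content': 'Let me check...', 'tool_calls': [...]}]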

Does the agent SDK support image content?

Yes, the OpenHands SDK fully supports image content for vision-capable LLMs. The SDK supports both HTTP/HTTPS URLs and base64-encoded images through the ImageContent class.

Check Vision Support

Before sending images, verify your LLM supports vision:
from openhands.sdk import LLM
from pydantic import SecretStr

llm = LLM(
    model="anthropic/claude-sonnet-4-5-20250929",
    api_key=SecretStr("your-api-key"),
    usage_id="my-agent"
)

# Check if vision is active
assert llm.vision_is_active(), "Model does not support vision"

Using HTTP URLs

from openhands.sdk import ImageContent, Message, TextContent

message = Message(
    role="user",
    content=[
        TextContent(text="What do you see in this image?"),
        ImageContent(image_urls=["https://example.com/image.png"]),
    ],
)

Using Base64 Images

Base64 images are supported using data URLs:
import base64
from openhands.sdk import ImageContent, Message, TextContent

# Read and encode an image file
with open("my_image.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

# Create message with base64 image
message = Message(
    role="user",
    content=[
        TextContent(text="Describe this image"),
        ImageContent(image_urls=[f"data:image/png;base64,{image_base64}"]),
    ],
)

Supported Image Formats

The data URL format is: data:<mime_type>;base64,<base64_encoded_data>

Supported MIME types (see the helper sketch after this list):
  • image/png
  • image/jpeg
  • image/gif
  • image/webp
  • image/bmp
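For example, a small helper that builds a data URL from a local file, guessing the MIME type from the extension (a hypothetical convenience function, not part of the SDK):
import base64
import mimetypes

SUPPORTED_MIME_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp", "image/bmp"}

def image_file_to_data_url(path: str) -> str:
    """Encode a local image file as a data URL suitable for ImageContent."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"Unsupported image type: {mime_type}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"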

Built-in Image Support

Several SDK tools automatically handle images:
  • FileEditorTool: When viewing image files (.png, .jpg, .jpeg, .gif, .webp, .bmp), they’re automatically converted to base64 and sent to the LLM
  • BrowserUseTool: Screenshots are captured and sent as base64 images
  • MCP Tools: Image content from MCP tool results is automatically converted to base64 data URLs

Disabling Vision

To disable vision for cost reduction (even on vision-capable models):
llm = LLM(
    model="anthropic/claude-sonnet-4-5-20250929",
    api_key=SecretStr("your-api-key"),
    usage_id="my-agent",
    disable_vision=True,  # Images will be filtered out
)
For a complete example, see the image input example in the SDK repository.

How do I handle MessageEvent in one-off tasks?

The SDK provides utilities to automatically respond to agent messages when running tasks end-to-end. When running one-off tasks, some models may send a MessageEvent (proposing an action or asking for confirmation) instead of directly using tools. This causes conversation.run() to return, even though the agent hasn’t finished the task.
When an agent sends a message (via MessageEvent) instead of using the finish tool, the conversation ends because it’s waiting for user input. In automated pipelines, there’s no human to respond, so the task appears incomplete.

Key event types:
  • ActionEvent: Agent uses a tool (terminal, file editor, etc.)
  • MessageEvent: Agent sends a text message (waiting for user response)
  • FinishAction: Agent explicitly signals task completion
The solution is to automatically send a “fake user response” when the agent sends a message, prompting it to continue.
The run_conversation_with_fake_user_response function wraps your conversation and automatically handles agent messages:
from openhands.sdk.conversation.state import ConversationExecutionStatus
from openhands.sdk.event import ActionEvent, MessageEvent
from openhands.sdk.tool.builtins.finish import FinishAction

def run_conversation_with_fake_user_response(conversation, max_responses: int = 10):
    """Run a conversation, auto-responding to agent messages until the agent
    uses the finish tool or the response limit is reached."""
    for _ in range(max_responses):
        conversation.run()
        # Stop if the run ended for some other reason (error, pause, etc.)
        if conversation.state.execution_status != ConversationExecutionStatus.FINISHED:
            break
        events = list(conversation.state.events)
        # Stop once the agent has used the finish tool: the task is complete
        if any(
            isinstance(e, ActionEvent) and isinstance(e.action, FinishAction)
            for e in events
        ):
            break
        # Stop if the agent never sent a message: there is nothing to respond to
        if not any(
            isinstance(e, MessageEvent) and e.source == "agent" for e in events
        ):
            break
        # Otherwise the agent is waiting for user input: prompt it to continue
        conversation.send_message(
            "Please continue. Use the finish tool when done. DO NOT ask for human help."
        )
Example usage:
from openhands.sdk import Agent, Conversation, LLM
from openhands.workspace import DockerWorkspace
from openhands.tools.preset.default import get_default_tools

llm = LLM(model="anthropic/claude-sonnet-4-20250514", api_key="...")
agent = Agent(llm=llm, tools=get_default_tools())
workspace = DockerWorkspace()
conversation = Conversation(agent=agent, workspace=workspace, max_iteration_per_run=100)

conversation.send_message("Fix the bug in src/utils.py")
run_conversation_with_fake_user_response(conversation, max_responses=10)
# Results available in conversation.state.events
Pro tip: Add a hint to your task prompt:
“If you’re 100% done with the task, use the finish action. Otherwise, keep going until you’re finished.”
This encourages the agent to use the finish tool rather than asking for confirmation.
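For example (the task text is illustrative):
task = "Fix the bug in src/utils.py"
hint = (
    "If you're 100% done with the task, use the finish action. "
    "Otherwise, keep going until you're finished."
)
conversation.send_message(f"{task}\n\n{hint}")
run_conversation_with_fake_user_response(conversation)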
For the full implementation used in OpenHands benchmarks, see the fake_user_response.py module.

More questions?

If you have additional questions: