Image Agent
Code
    from pathlib import Path

    from kern.agent import Agent
    from kern.media import Image
    from kern.models.ibm import WatsonX
    from kern.tools.hackernews import HackerNewsTools

    agent = Agent(
        model=WatsonX(id="meta-llama/llama-3-2-11b-vision-instruct"),
        tools=[HackerNewsTools()],
        markdown=True,
    )

    image_path = Path(__file__).parent.joinpath("sample.jpg")

    # Read the image file content as bytes
    with open(image_path, "rb") as img_file:
        image_bytes = img_file.read()

    agent.print_response(
        "Tell me about this image and give me the latest news about it.",
        images=[
            Image(content=image_bytes),
        ],
        stream=True,
    )

Usage
Set up your virtual environment
Mac/Linux:

    uv venv --python 3.12
    source .venv/bin/activate

Windows:

    uv venv --python 3.12
    .venv\Scripts\activate

Set your API key
    export IBM_WATSONX_API_KEY=xxx
    export IBM_WATSONX_PROJECT_ID=xxx

Install dependencies
    uv pip install -U ibm-watsonx-ai kern-ai

Add sample image
Place a sample image named "sample.jpg" in the same directory as the script.
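If the file is missing or misnamed, the script fails only when it tries to open the image. A minimal sketch of a pre-flight check follows. The helper name `load_image_bytes` is hypothetical (it is not part of kern), and the check assumes the sample is a JPEG, identified by the standard `FF D8 FF` signature at the start of the file.

```python
from pathlib import Path

# JPEG files begin with the start-of-image marker FF D8, followed by FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def load_image_bytes(path: Path) -> bytes:
    """Hypothetical helper: read an image file and check that it looks like a JPEG."""
    if not path.is_file():
        raise FileNotFoundError(f"Expected a sample image at {path}")
    data = path.read_bytes()
    if not data.startswith(JPEG_MAGIC):
        raise ValueError(f"{path} does not start with the JPEG signature")
    return data
```

The returned bytes could be passed to `Image(content=...)` in place of the manual `open(...)` call in the script.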
Run Agent
    python cookbook/11_models/ibm/watsonx/image_agent_bytes.py

This example shows how to use IBM WatsonX with a vision-capable model. The script reads an image file as raw bytes and passes it to the agent as Image(content=image_bytes) together with a text prompt. The model can then describe the image, and the agent has HackerNewsTools available to look up related news.
Note: This example uses a vision-capable model (meta-llama/llama-3-2-11b-vision-instruct) and requires a sample image file.
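Before running the agent, it may be worth confirming that both environment variables from the setup steps are present. The sketch below is only an illustration: `missing_env_vars` is a hypothetical helper, not part of kern or the IBM SDK, and it only checks that the variables are non-empty, not that the credentials are valid.

```python
import os

# Variable names taken from the "Set your API key" step.
REQUIRED_VARS = ("IBM_WATSONX_API_KEY", "IBM_WATSONX_PROJECT_ID")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Hypothetical helper: return the names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("WatsonX environment variables are set.")
```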