Image Bytes Input Agent

Code

import requests
from kern.agent import Agent
from kern.media import Image
from kern.models.mistral.mistral import MistralChat

# Pixtral is a multimodal Mistral model that accepts image input.
agent = Agent(
    model=MistralChat(id="pixtral-12b-2409"),
    markdown=True,
)

image_url = (
    "https://tripfixers.com/wp-content/uploads/2019/11/eiffel-tower-with-snow.jpeg"
)


def fetch_image_bytes(url: str) -> bytes:
    """Download the image at `url` and return its raw bytes."""
    # Browser-like headers; some hosts reject the default requests User-Agent.
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
        "Accept": "image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
    }
    # Without a timeout, requests can wait on an unresponsive host indefinitely.
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()  # raise on 4xx/5xx responses
    return response.content


image_bytes_from_url = fetch_image_bytes(image_url)

# Pass the raw bytes to the agent via Image(content=...).
agent.print_response(
    "Tell me about this image.",
    images=[
        Image(content=image_bytes_from_url),
    ],
)
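
A host can answer with status 200 and still return something other than an image (an HTML block page, for example), which `raise_for_status()` will not catch. The sketch below shows one way to sanity-check the downloaded bytes before handing them to the agent. `sniff_image_format` is a hypothetical helper written for this example, not part of kern or requests, and it only recognizes a few common formats by their leading magic bytes.

```python
from typing import Optional


def sniff_image_format(data: bytes) -> Optional[str]:
    """Best-guess image format from the leading magic bytes, or None.

    Hypothetical helper for illustration; covers JPEG, PNG, GIF and WebP only.
    """
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

In the script above, this could be called on `image_bytes_from_url` right after the download, raising an error when it returns `None` instead of sending unrecognized bytes to the model.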

Usage

Set up your virtual environment

Mac/Linux:

uv venv --python 3.12
source .venv/bin/activate

Windows:

uv venv --python 3.12
.venv\Scripts\activate

Set your API key

export MISTRAL_API_KEY=xxx

Install dependencies

uv pip install -U mistralai kern-ai

Run Agent

python cookbook/11_models/mistral/image_bytes_input_agent.py