Image Agent with File Upload

Code

from pathlib import Path

from kern.agent import Agent
from kern.media import Image
from kern.models.google import Gemini
from kern.tools.hackernews import HackerNewsTools
from google.generativeai import upload_file
from google.generativeai.types import file_types

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-exp"),
    tools=[HackerNewsTools()],
    markdown=True,
)
# Download the image into the same directory as this script first:
# wget https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg
image_path = Path(__file__).parent.joinpath("Krakow_-_Kosciol_Mariacki.jpg")
# Upload the local file to Google and keep the returned file handle
image_file: file_types.File = upload_file(image_path)
print(f"Uploaded image: {image_file}")

# Pass the uploaded file handle to the agent as image content
agent.print_response(
    "Tell me about this image and give me the latest news about it.",
    images=[Image(content=image_file)],
    stream=True,
)
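
Since `upload_file` is given a path on disk, it can help to confirm the file exists and looks like an image before uploading. The following is a minimal sketch using only the standard library; `check_image_path` is a hypothetical helper introduced here for illustration and is not part of kern or the Google SDK.

```python
import mimetypes
from pathlib import Path


def check_image_path(path: Path) -> str:
    """Return the guessed MIME type, or raise if the file is missing or not an image.

    Hypothetical helper for illustration; not part of kern or the Google SDK.
    """
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}. Download it first.")
    # The guess is based on the file extension only, not on the file contents.
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{path.name} does not look like an image file")
    return mime_type
```

One way to use it would be to call `check_image_path(image_path)` just before `upload_file(image_path)`, so that a missing download fails with a clear message rather than an error from the upload call.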

Usage

Set up your virtual environment

macOS/Linux:
uv venv --python 3.12
source .venv/bin/activate

Windows:
uv venv --python 3.12
.venv\Scripts\activate

Download the image

wget https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg
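
Where `wget` is not available (for example on a default Windows install), the same download could be done from Python. This is a rough sketch using only the standard library; `filename_from_url` and `download_image` are hypothetical helper names, and the User-Agent string is an arbitrary placeholder added on the assumption that the server may reject requests without one.

```python
import shutil
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

IMAGE_URL = "https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg"


def filename_from_url(url: str) -> str:
    """Return the last path segment of the URL (hypothetical helper)."""
    return PurePosixPath(urlparse(url).path).name


def download_image(url: str, dest_dir: Path) -> Path:
    """Download `url` into `dest_dir` and return the saved path (hypothetical helper)."""
    dest = dest_dir / filename_from_url(url)
    # Assumption: some servers reject requests without a User-Agent header.
    request = urllib.request.Request(
        url, headers={"User-Agent": "image-download-example/0.1"}
    )
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

The script looks for the image next to itself (`Path(__file__).parent`), so the destination directory should be the folder that contains the script, for example `download_image(IMAGE_URL, Path("cookbook/11_models/google/gemini"))` when run from the repository root.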

Set your API key

export GOOGLE_API_KEY=xxx
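
A missing key usually surfaces only when the first request is made. As a sketch, a small check at the top of the script could fail earlier with a clearer message; `require_api_key` is a hypothetical helper, not something provided by kern or the Google SDK.

```python
import os


def require_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the key from the environment, or raise with a clear message.

    Hypothetical helper for illustration only.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the agent")
    return key
```

Calling `require_api_key()` before constructing the agent would stop the run immediately if the variable was never exported in the current shell.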

Install dependencies

uv pip install -U google-genai google-generativeai kern-ai

Run Agent

python cookbook/11_models/google/gemini/image_input_file_upload.py