Image Transcribe Document Agent

Code

from kern.agent import Agent
from kern.media import Image
from kern.models.mistral.mistral import MistralChat

# Pixtral is Mistral's vision-capable model, so it can read the page image.
agent = Agent(
    model=MistralChat(id="pixtral-12b-2409"),
    markdown=True,
)

# Send the prompt together with the document image to transcribe.
agent.print_response(
    "Transcribe this document.",
    images=[
        Image(url="https://ciir.cs.umass.edu/irdemo/hw-demo/page_example.jpg"),
    ],
)
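
Because the agent receives the document as a URL, it can help to sanity-check that URL before making a model call. The following is a minimal sketch using only the Python standard library; `looks_like_image_url` and the `IMAGE_SUFFIXES` list are hypothetical names introduced here for illustration and are not part of kern or the Mistral API. It checks only the URL's shape, not whether the image is reachable or accepted by the model.

```python
from urllib.parse import urlparse

# Assumed set of common image extensions; adjust to what your model accepts.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def looks_like_image_url(url: str) -> bool:
    """Return True if `url` is an http(s) URL whose path ends in an image suffix."""
    parsed = urlparse(url)
    # Require a web scheme and a host; reject relative paths and other schemes.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Compare case-insensitively so ".JPG" is treated like ".jpg".
    return parsed.path.lower().endswith(IMAGE_SUFFIXES)
```

A check like this could run just before the `Image(url=...)` value is built, so that a mistyped URL fails early with a local error instead of a failed model request.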

Usage

Set up your virtual environment

Mac/Linux:
uv venv --python 3.12
source .venv/bin/activate

Windows:
uv venv --python 3.12
.venv\Scripts\activate

Set your API key

export MISTRAL_API_KEY=xxx
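
If the key is missing, the failure tends to surface only when the model is first called. As a minimal sketch, a script could check for the variable up front; `require_api_key` is a hypothetical helper introduced here, not a kern function, and the optional `env` parameter exists only so the check can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    name: str = "MISTRAL_API_KEY",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the value of `name`, or raise a clear error if it is unset or blank."""
    # Fall back to the process environment when no mapping is supplied.
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the agent.")
    return value
```

Calling `require_api_key()` at the top of the script, before the agent is created, would stop early with a readable message when the variable has not been exported.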

Install dependencies

uv pip install -U mistralai kern-ai

Run Agent

python cookbook/11_models/mistral/image_transcribe_document_agent.py