GCS for Workflows

Kern supports Google Cloud Storage (GCS) as a storage backend for Workflows through the GcsJsonDb class, which stores session data as JSON blobs in a GCS bucket.

Usage

Configure your workflow with GCS storage to enable cloud-based session persistence.

import uuid
import google.auth
from kern.agent import Agent
from kern.db.gcs_json import GcsJsonDb
from kern.models.openai import OpenAIResponses
from kern.team import Team
from kern.tools.hackernews import HackerNewsTools
from kern.workflow.step import Step
from kern.workflow.workflow import Workflow

# Obtain the default credentials and project id from your gcloud CLI session.
credentials, project_id = google.auth.default()

# Generate a unique bucket name using a base name and a UUID4 suffix.
base_bucket_name = "example-gcs-bucket"
unique_bucket_name = f"{base_bucket_name}-{uuid.uuid4().hex[:12]}"
print(f"Using bucket: {unique_bucket_name}")

# Set up the JSON database
db = GcsJsonDb(
    bucket_name=unique_bucket_name,
    prefix="workflow/",
    project=project_id,
    credentials=credentials,
)

# Define agents
hackernews_agent = Agent(
    name="Hackernews Agent",
    model=OpenAIResponses(id="gpt-5.2"),
    tools=[HackerNewsTools()],
    role="Extract key insights and content from Hackernews posts",
)
web_agent = Agent(
    name="Web Agent",
    model=OpenAIResponses(id="gpt-5.2"),
    tools=[HackerNewsTools()],
    role="Search the web for the latest news and trends",
)

# Define research team for complex analysis
research_team = Team(
    name="Research Team",
    members=[hackernews_agent, web_agent],
    instructions="Research tech topics from Hackernews and the web",
)

content_planner = Agent(
    name="Content Planner",
    model=OpenAIResponses(id="gpt-5.2"),
    instructions=[
        "Plan a content schedule over 4 weeks for the provided topic and research content",
        "Ensure that I have 3 posts per week",
    ],
)

# Define steps
research_step = Step(
    name="Research Step",
    team=research_team,
)

content_planning_step = Step(
    name="Content Planning Step",
    agent=content_planner,
)

# Create and use workflow
if __name__ == "__main__":
    content_creation_workflow = Workflow(
        name="Content Creation Workflow",
        description="Automated content creation from blog posts to social media",
        db=db,
        steps=[research_step, content_planning_step],
    )
    content_creation_workflow.print_response(
        input="AI trends in 2024",
        markdown=True,
    )
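
The example above derives a unique bucket name from a base name and a UUID suffix. GCS bucket names are constrained (3 to 63 characters; lowercase letters, digits, hyphens, underscores, and dots; starting and ending with a letter or digit), so it can help to validate the name before passing it to GcsJsonDb. The following is a minimal sketch: the helpers `make_bucket_name` and `is_valid_bucket_name` are hypothetical and not part of Kern, and the check covers only these basic rules, not every GCS restriction.

```python
import re
import uuid

# Basic GCS naming rules only: 3-63 characters, lowercase letters, digits,
# '-', '_' and '.', starting and ending with a letter or digit.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if the name satisfies the basic rules above."""
    return _BUCKET_NAME_RE.fullmatch(name) is not None


def make_bucket_name(base: str, suffix_len: int = 12) -> str:
    """Append a random hex suffix to base and validate the result."""
    suffix = uuid.uuid4().hex[:suffix_len]
    name = f"{base}-{suffix}"
    if not is_valid_bucket_name(name):
        raise ValueError(f"Invalid GCS bucket name: {name!r}")
    return name
```

Calling `make_bucket_name("example-gcs-bucket")` yields a name of the same shape as `unique_bucket_name` in the example, and raises `ValueError` early if the base name contains, say, uppercase letters.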

Prerequisites

Google Cloud SDK Setup

Install the Google Cloud SDK
Run gcloud init to configure your account and project

GCS Permissions

Ensure your account has sufficient permissions (e.g., Storage Admin) to create and manage GCS buckets:

gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="user:YOUR_EMAIL@example.com" \
    --role="roles/storage.admin"

Authentication

Use default credentials from your gcloud CLI session:

gcloud auth application-default login

Alternatively, if using a service account, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of your service account JSON file.
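
When GOOGLE_APPLICATION_CREDENTIALS points at a missing file, the failure tends to surface late and with a less obvious message. The sketch below runs a small preflight check before calling `google.auth.default()`, which is the same call the usage example relies on. The function names `check_service_account_file` and `load_credentials` are hypothetical helpers, not part of Kern or google-auth.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def check_service_account_file(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the key-file path from GOOGLE_APPLICATION_CREDENTIALS, or None if unset."""
    env = os.environ if env is None else env
    raw = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        # Nothing set: google.auth falls back to other default credential sources.
        return None
    path = Path(raw).expanduser()
    if not path.is_file():
        raise FileNotFoundError(
            f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {path}"
        )
    return path


def load_credentials():
    """Validate the environment, then return (credentials, project_id)."""
    check_service_account_file()
    # Imported lazily so the preflight check works even without google-auth installed.
    import google.auth

    return google.auth.default()
```

The returned pair can be passed to GcsJsonDb as `credentials` and `project`, as in the usage example.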

Python Dependencies

Install the required Python packages:

pip install google-auth google-cloud-storage openai ddgs
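
To confirm that the packages are importable in the active environment, a quick check along the following lines may help. This is a sketch: `missing_modules` is a hypothetical helper, and the module names listed are the import names usually associated with the pip packages above.

```python
from importlib.util import find_spec

# Import names corresponding to the pip packages listed above.
REQUIRED_MODULES = ["google.auth", "google.cloud.storage", "openai", "ddgs"]


def missing_modules(names):
    """Return the subset of module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is not installed.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Running `missing_modules(REQUIRED_MODULES)` returns an empty list when everything is installed, and the names of the unresolved modules otherwise.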

Setup with Docker

For local testing without using real GCS, you can use fake-gcs-server.

Create a docker-compose.yml file:

version: '3.8'
services:
  fake-gcs-server:
    image: fsouza/fake-gcs-server:latest
    ports:
      - "4443:4443"
    command: ["-scheme", "http", "-port", "4443", "-public-host", "localhost"]
    volumes:
      - ./fake-gcs-data:/data

Start the fake GCS server:

docker-compose up -d

Using Fake GCS with Docker

Set the environment variable to direct API calls to the emulator:

export STORAGE_EMULATOR_HOST="http://localhost:4443"
python gcs_for_agent.py

When using fake-gcs-server, authentication is not enforced, and the client detects the emulator endpoint automatically from the STORAGE_EMULATOR_HOST environment variable.
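
Before running a workflow against the emulator, it can be useful to confirm that the server is reachable and to see which buckets it holds. The sketch below uses only the standard library and assumes the emulator serves the JSON API bucket listing at `/storage/v1/b`, as the fake-gcs-server README demonstrates. The function names are hypothetical and not part of Kern.

```python
import json
import os
from urllib.request import urlopen


def bucket_list_url(host: str) -> str:
    """Build the JSON API bucket-listing URL for an emulator host."""
    return f"{host.rstrip('/')}/storage/v1/b"


def parse_bucket_names(payload: bytes) -> list:
    """Extract bucket names from a bucket-listing response body."""
    data = json.loads(payload)
    # "items" may be absent when the emulator holds no buckets yet.
    return [item["name"] for item in data.get("items", [])]


def list_emulator_buckets(timeout: float = 5.0) -> list:
    """Query the emulator named by STORAGE_EMULATOR_HOST for its buckets."""
    host = os.environ.get("STORAGE_EMULATOR_HOST")
    if not host:
        raise RuntimeError("STORAGE_EMULATOR_HOST is not set")
    with urlopen(bucket_list_url(host), timeout=timeout) as response:
        return parse_bucket_names(response.read())
```

With the Docker container running and STORAGE_EMULATOR_HOST exported, `list_emulator_buckets()` returns the bucket names known to the emulator; a connection error indicates the server is not reachable at that address.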

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `Optional[str]` | - | The ID of the database instance. UUID by default. |
| `bucket_name` | `str` | - | Name of the GCS bucket where JSON files will be stored. |
| `prefix` | `Optional[str]` | - | Path prefix for organizing files in the bucket. Defaults to "kern/". |
| `session_table` | `Optional[str]` | - | Name of the JSON file to store sessions (without .json extension). |
| `memory_table` | `Optional[str]` | - | Name of the JSON file to store user memories. |
| `metrics_table` | `Optional[str]` | - | Name of the JSON file to store metrics. |
| `eval_table` | `Optional[str]` | - | Name of the JSON file to store evaluation runs. |
| `knowledge_table` | `Optional[str]` | - | Name of the JSON file to store knowledge content. |
| `traces_table` | `Optional[str]` | - | Name of the JSON file to store traces. |
| `spans_table` | `Optional[str]` | - | Name of the JSON file to store spans. |
| `project` | `Optional[str]` | - | GCP project ID. If None, uses default project. |
| `credentials` | `Optional[Any]` | - | GCP credentials. If None, uses default credentials. |
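
The `prefix` and `*_table` parameters together determine where each JSON file lives in the bucket. As a rough illustration, and assuming (this is not confirmed by the source) that object names are composed as prefix, then table name, then `.json`, the layout could be sketched as follows. The helper `blob_name` and the table name `sessions` are hypothetical.

```python
def blob_name(prefix: str, table: str) -> str:
    """Compose an object name as '<prefix>/<table>.json' (assumed layout)."""
    # Normalise the prefix so it contributes exactly one trailing slash, or nothing.
    cleaned = prefix.strip("/")
    base = f"{cleaned}/" if cleaned else ""
    return f"{base}{table}.json"
```

Under this assumption, `blob_name("workflow/", "sessions")` gives `workflow/sessions.json`, which is where the usage example would place its session data.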