YouTube Agent

Build an AI agent that analyzes YouTube videos and creates structured summaries with accurate timestamps. This agent extracts key insights from video content, making it easy to navigate educational videos, tutorials, and presentations without watching them in full.

What You'll Learn

By building this agent, you'll understand:

  • How to integrate YouTube transcript extraction into agents
  • How to structure prompts for consistent timestamp generation
  • How to organize video content into logical sections
  • How to create agents that transform unstructured media into searchable content

Use Cases

Create study guides from lectures, extract insights from conference talks, build searchable video indexes, or generate documentation from tutorial videos.

How It Works

The agent uses YouTubeTools to fetch the video transcript and metadata, then works through four steps:

  1. Extract: Retrieves video metadata (title, duration) and the full transcript
  2. Analyze: Identifies the video type and content structure
  3. Organize: Creates timestamps for major topic transitions
  4. Summarize: Generates section-based summaries with key points

The structured output makes long-form video content quickly scannable and searchable.
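To make the Extract and Organize steps more concrete, here is a minimal sketch in plain Python. It assumes transcript segments arrive as dictionaries with `text` and `start` (seconds) keys, which is a common shape for transcript data but is not confirmed by this page. The helper names `extract_video_id`, `format_timestamp`, and `render_transcript` are hypothetical and are not part of YouTubeTools.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch or youtu.be URL, or None if absent."""
    parsed = urlparse(url)
    if parsed.hostname in ("youtu.be", "www.youtu.be"):
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if parsed.hostname in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        # Watch links carry the ID in the "v" query parameter.
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


def format_timestamp(seconds: float) -> str:
    """Format a second offset as M:SS, or H:MM:SS for offsets of an hour or more."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def render_transcript(segments) -> str:
    """Render assumed {'text', 'start'} segments as timestamped lines."""
    return "\n".join(
        f"[{format_timestamp(segment['start'])}] {segment['text']}"
        for segment in segments
    )
```

Giving the model text with explicit time markers like these, rather than a bare transcript, is one way to ground the timestamps it produces in the source material.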

Code

from textwrap import dedent

from kern.agent import Agent
from kern.models.openai import OpenAIResponses
from kern.tools.youtube import YouTubeTools

youtube_agent = Agent(
    name="YouTube Agent",
    model=OpenAIResponses(id="gpt-5.2"),
    tools=[YouTubeTools()],
    instructions=dedent("""\
        You are an expert YouTube content analyst with a keen eye for detail!
        Follow these steps for comprehensive video analysis:
        1. Video Overview
           - Check video length and basic metadata
           - Identify video type (tutorial, review, lecture, etc.)
           - Note the content structure
        2. Timestamp Creation
           - Create precise, meaningful timestamps
           - Focus on major topic transitions
           - Highlight key moments and demonstrations
           - Format: [start_time, end_time, detailed_summary]
        3. Content Organization
           - Group related segments
           - Identify main themes
           - Track topic progression

        Your analysis style:
        - Begin with a video overview
        - Use clear, descriptive segment titles
        - Include relevant emojis for content types:
          Educational
          Technical
          Gaming
          Tech Review
          Creative
        - Highlight key learning points
        - Note practical demonstrations
        - Mark important references

        Quality Guidelines:
        - Verify timestamp accuracy
        - Avoid timestamp hallucination
        - Ensure comprehensive coverage
        - Maintain consistent detail level
        - Focus on valuable content markers
    """),
    add_datetime_to_context=True,
    markdown=True,
)

# Example usage with different types of videos
youtube_agent.print_response(
    "Analyze this video: https://www.youtube.com/watch?v=zjkBMFhNj_g",
    stream=True,
)

# More example prompts to explore:
"""
Tutorial Analysis:
1. "Break down this Python tutorial with focus on code examples"
2. "Create a learning path from this web development course"
3. "Extract all practical exercises from this programming guide"
4. "Identify key concepts and implementation examples"

Educational Content:
1. "Create a study guide with timestamps for this math lecture"
2. "Extract main theories and examples from this science video"
3. "Break down this historical documentary into key events"
4. "Summarize the main arguments in this academic presentation"

Tech Reviews:
1. "List all product features mentioned with timestamps"
2. "Compare pros and cons discussed in this review"
3. "Extract technical specifications and benchmarks"
4. "Identify key comparison points and conclusions"

Creative Content:
1. "Break down the techniques shown in this art tutorial"
2. "Create a timeline of project steps in this DIY video"
3. "List all tools and materials mentioned with timestamps"
4. "Extract tips and tricks with their demonstrations"
"""
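The quality guidelines in the instructions ask the model to verify timestamps and avoid hallucinating them, but a prompt alone cannot guarantee that. One option is a post-check on the output. The sketch below assumes you have already parsed the agent's `[start_time, end_time, detailed_summary]` entries into tuples of seconds plus text, and that you know the video duration in seconds. `validate_segments` is a hypothetical helper, not something the library provides.

```python
def validate_segments(segments, duration):
    """Return a list of problems found in (start, end, summary) tuples.

    An empty list means every segment lies inside the video, runs forward
    in time, does not overlap an earlier segment, and has a summary.
    """
    problems = []
    previous_end = 0.0
    for index, (start, end, summary) in enumerate(segments):
        if start < 0 or end > duration:
            problems.append(f"segment {index}: outside video bounds")
        if start >= end:
            problems.append(f"segment {index}: start is not before end")
        if start < previous_end:
            problems.append(f"segment {index}: overlaps previous segment")
        if not summary.strip():
            problems.append(f"segment {index}: empty summary")
        # Track the furthest end seen so far to detect later overlaps.
        previous_end = max(previous_end, end)
    return problems
```

A non-empty result could be used to flag the analysis for review or to re-prompt the agent. It catches structurally impossible timestamps only, and does not confirm that a summary matches what is said at that point in the video.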

What to Expect

The agent analyzes YouTube videos by fetching transcripts and generating comprehensive breakdowns. For a typical video, you'll receive:

  • Video metadata (title, duration, type, and audience)
  • High-level structure overview
  • Timestamped breakdown of major topics with key examples
  • Content organization showing recurring themes
  • Practical highlights and actionable takeaways

Analysis typically takes 30-60 seconds depending on video length and complexity.

Usage

Set up your virtual environment

macOS/Linux:

uv venv --python 3.12
source .venv/bin/activate

Windows:

uv venv --python 3.12
.venv\Scripts\activate

Set your API key

export OPENAI_API_KEY=xxx
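If you want the script to fail early with a clear message when the key is missing, a small guard like the following could run before the agent is created. This is a sketch only: `require_api_key` is a hypothetical helper, and it assumes the key is read from the `OPENAI_API_KEY` environment variable set above.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the agent")
    return key
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.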

Install dependencies

uv pip install -U kern-ai openai youtube_transcript_api

Run Agent

python youtube_agent.py

Next Steps

  • Try analyzing different video types (tutorials, lectures, reviews)
  • Modify instructions to focus on specific content types
  • Combine with other tools for enhanced analysis
  • Explore Tools for additional capabilities
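As one way to approach modifying the instructions for a specific content type, the sketch below builds a shorter instruction string from a per-type hint. The `FOCUS_HINTS` table and `build_instructions` function are hypothetical names, and the hint wording is illustrative, loosely based on the example prompts in the Code section.

```python
from textwrap import dedent

# Hypothetical per-type focus hints; adjust the wording to your own needs.
FOCUS_HINTS = {
    "tutorial": "Prioritize code examples and step-by-step demonstrations.",
    "lecture": "Prioritize definitions, theories, and worked examples.",
    "review": "Prioritize product features, pros and cons, and benchmarks.",
}


def build_instructions(video_type: str) -> str:
    """Return an instruction string tailored to the given video type."""
    hint = FOCUS_HINTS.get(video_type, "Cover all major topics evenly.")
    return dedent(f"""\
        You are an expert YouTube content analyst.
        {hint}
        Format each timestamp as [start_time, end_time, detailed_summary].
        Do not invent timestamps that are not supported by the transcript.
    """)
```

The returned string can be passed as the `instructions` argument when constructing the agent, in place of the longer general-purpose prompt.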