YouTube Agent
Build an AI agent that analyzes YouTube videos and creates structured summaries with accurate timestamps. This agent extracts key insights from video content, making it easy to navigate educational videos, tutorials, and presentations without watching them in full.
What You'll Learn
By building this agent, you'll understand:
- How to integrate YouTube transcript extraction into agents
- How to structure prompts for consistent timestamp generation
- How to organize video content into logical sections
- How to create agents that transform unstructured media into searchable content
Use Cases
Create study guides from lectures, extract insights from conference talks, build searchable video indexes, or generate documentation from tutorial videos.
How It Works
The agent uses YouTubeTools to fetch video transcripts and metadata, then works through four steps:
- Extract: Gets video metadata (title, duration) and full transcript
- Analyze: Identifies video type and content structure
- Organize: Creates timestamps for major topic transitions
- Summarize: Generates section-based summaries with key points
The structured output makes long-form video content quickly scannable and searchable.
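To make the Extract and Organize steps more concrete, here is a minimal sketch of how timestamped transcript snippets could be grouped into sections. It assumes each snippet is a dict with `text`, `start`, and `duration` keys (times in seconds). The sample data and the helpers `format_timestamp` and `group_into_sections` are hypothetical and not part of YouTubeTools. The sketch also swaps in fixed time windows for simplicity, whereas the agent relies on the model to find real topic transitions.

```python
from dataclasses import dataclass


@dataclass
class Section:
    start: float  # seconds from the beginning of the video
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS for times past one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def group_into_sections(snippets: list[dict], window_seconds: float = 120.0) -> list[Section]:
    """Merge consecutive snippets into sections spanning roughly window_seconds."""
    sections: list[Section] = []
    for snippet in snippets:
        start = snippet["start"]
        end = start + snippet["duration"]
        if sections and start - sections[-1].start < window_seconds:
            # Still inside the current window: extend the open section.
            current = sections[-1]
            current.end = max(current.end, end)
            current.text = f"{current.text} {snippet['text']}"
        else:
            # Window exceeded (or first snippet): open a new section.
            sections.append(Section(start=start, end=end, text=snippet["text"]))
    return sections


# Hypothetical transcript snippets, made up for illustration.
snippets = [
    {"text": "Welcome to the course.", "start": 0.0, "duration": 4.0},
    {"text": "Today we cover lists.", "start": 4.0, "duration": 5.0},
    {"text": "Now on to dictionaries.", "start": 130.0, "duration": 6.0},
]

for section in group_into_sections(snippets):
    print(f"[{format_timestamp(section.start)} - {format_timestamp(section.end)}] {section.text}")
```

With the sample data, the first two snippets fall into one section and the third opens a new one. In the agent itself, the section boundaries come from the model's reading of the transcript rather than from a fixed window.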
Code
```python
from textwrap import dedent

from kern.agent import Agent
from kern.models.openai import OpenAIResponses
from kern.tools.youtube import YouTubeTools

youtube_agent = Agent(
    name="YouTube Agent",
    model=OpenAIResponses(id="gpt-5.2"),
    tools=[YouTubeTools()],
    instructions=dedent("""\
        You are an expert YouTube content analyst with a keen eye for detail! 📺
        Follow these steps for comprehensive video analysis:
        1. Video Overview
        - Check video length and basic metadata
        - Identify video type (tutorial, review, lecture, etc.)
        - Note the content structure
        2. Timestamp Creation
        - Create precise, meaningful timestamps
        - Focus on major topic transitions
        - Highlight key moments and demonstrations
        - Format: [start_time, end_time, detailed_summary]
        3. Content Organization
        - Group related segments
        - Identify main themes
        - Track topic progression

        Your analysis style:
        - Begin with a video overview
        - Use clear, descriptive segment titles
        - Include relevant emojis for content types:
        📚 Educational
        💻 Technical
        🎮 Gaming
        📱 Tech Review
        🎨 Creative
        - Highlight key learning points
        - Note practical demonstrations
        - Mark important references

        Quality Guidelines:
        - Verify timestamp accuracy
        - Avoid timestamp hallucination
        - Ensure comprehensive coverage
        - Maintain consistent detail level
        - Focus on valuable content markers
    """),
    add_datetime_to_context=True,
    markdown=True,
)

# Example usage with different types of videos
youtube_agent.print_response(
    "Analyze this video: https://www.youtube.com/watch?v=zjkBMFhNj_g",
    stream=True,
)

# More example prompts to explore:
"""
Tutorial Analysis:
1. "Break down this Python tutorial with focus on code examples"
2. "Create a learning path from this web development course"
3. "Extract all practical exercises from this programming guide"
4. "Identify key concepts and implementation examples"

Educational Content:
1. "Create a study guide with timestamps for this math lecture"
2. "Extract main theories and examples from this science video"
3. "Break down this historical documentary into key events"
4. "Summarize the main arguments in this academic presentation"

Tech Reviews:
1. "List all product features mentioned with timestamps"
2. "Compare pros and cons discussed in this review"
3. "Extract technical specifications and benchmarks"
4. "Identify key comparison points and conclusions"

Creative Content:
1. "Break down the techniques shown in this art tutorial"
2. "Create a timeline of project steps in this DIY video"
3. "List all tools and materials mentioned with timestamps"
4. "Extract tips and tricks with their demonstrations"
"""
```
What to Expect
The agent analyzes YouTube videos by fetching transcripts and generating comprehensive breakdowns. For a typical video, you'll receive:
- Video metadata (title, duration, type, and audience)
- High-level structure overview
- Timestamped breakdown of major topics with key examples
- Content organization showing recurring themes
- Practical highlights and actionable takeaways
Analysis typically takes 30-60 seconds depending on video length and complexity.
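Because the instructions ask the model to verify timestamps and avoid hallucinating them, it can be useful to sanity-check the output yourself. The sketch below is one possible check, not something the agent does for you: it assumes entries come back as lines shaped like `[start_time, end_time, detailed_summary]` with `M:SS` or `H:MM:SS` times, and that you know the video duration in seconds. The helper names `to_seconds` and `find_invalid_entries` and the sample entries are hypothetical.

```python
import re

# One time value: "SS", "M:SS", or "H:MM:SS".
TIME = r"\d+(?::\d{2}){0,2}"

# Matches lines shaped like "[1:30, 4:05, Summary text]". This exact shape is an
# assumption based on the format named in the agent's instructions.
ENTRY_PATTERN = re.compile(rf"^\[\s*({TIME})\s*,\s*({TIME})\s*,\s*(.+?)\s*\]$")


def to_seconds(timestamp: str) -> int:
    """Convert "SS", "M:SS", or "H:MM:SS" into a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def find_invalid_entries(lines: list[str], video_duration: int) -> list[str]:
    """Return entries that are malformed, reversed, or run past the end of the video."""
    invalid = []
    for line in lines:
        match = ENTRY_PATTERN.match(line.strip())
        if match is None:
            invalid.append(line)
            continue
        start = to_seconds(match.group(1))
        end = to_seconds(match.group(2))
        if not (start < end <= video_duration):
            invalid.append(line)
    return invalid


# Hypothetical agent output for a 10-minute (600-second) video.
entries = [
    "[0:00, 2:15, Introduction and setup]",
    "[2:15, 9:40, Live coding demo]",
    "[9:40, 14:00, Q&A]",
]

print(find_invalid_entries(entries, video_duration=600))
```

Here the last entry is flagged because it ends at 14:00, past the end of the video. A check like this only catches structural problems; it cannot tell whether a summary matches what is said at that time.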
Usage
Set up your virtual environment
Mac/Linux:
```shell
uv venv --python 3.12
source .venv/bin/activate
```
Windows:
```shell
uv venv --python 3.12
.venv\Scripts\activate
```
Set your API key
```shell
export OPENAI_API_KEY=xxx
```
Install dependencies
```shell
uv pip install -U kern-ai openai youtube_transcript_api
```
Run Agent
```shell
python youtube_agent.py
```
Next Steps
- Try analyzing different video types (tutorials, lectures, reviews)
- Modify `instructions` to focus on specific content types
- Combine with other tools for enhanced analysis
- Explore Tools for additional capabilities
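As a starting point for tailoring `instructions` to a content type, the sketch below builds a shorter instruction string from a lookup table. The `FOCUS_BY_TYPE` table, its wording, and the `build_instructions` helper are hypothetical; only the `instructions` parameter and the timestamp format come from the agent code above.

```python
from textwrap import dedent

# Hypothetical focus areas; adjust the wording to the videos you analyze.
FOCUS_BY_TYPE = {
    "tutorial": "Focus on code examples and practical exercises.",
    "lecture": "Focus on main theories, definitions, and worked examples.",
    "review": "Focus on product features, pros and cons, and benchmarks.",
}


def build_instructions(content_type: str) -> str:
    """Return an instruction string specialized for one kind of video."""
    focus = FOCUS_BY_TYPE.get(content_type, "Focus on major topic transitions.")
    return dedent(f"""\
        You are an expert YouTube content analyst.
        {focus}
        Format each timestamp as [start_time, end_time, detailed_summary].
        Avoid timestamp hallucination.
        """)


print(build_instructions("lecture"))
```

The returned string can then be passed as the `instructions` argument when constructing the agent, in place of the longer general-purpose prompt.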