YouTube Reader
The YouTube Reader extracts transcripts from YouTube videos synchronously and converts them into vector embeddings for your knowledge base.
Code
```python
from kern.agent import Agent
from kern.knowledge.knowledge import Knowledge
from kern.knowledge.reader.youtube_reader import YouTubeReader
from kern.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

# Create Knowledge Instance
knowledge = Knowledge(
    name="YouTube Knowledge Base",
    description="Knowledge base from YouTube video transcripts",
    vector_db=PgVector(
        table_name="youtube_vectors",
        db_url=db_url,
    ),
)

# Add YouTube video content synchronously
knowledge.insert(
    metadata={"source": "youtube", "type": "educational"},
    urls=[
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",  # Replace with actual educational video
        "https://www.youtube.com/watch?v=example123",  # Replace with actual video URL
    ],
    reader=YouTubeReader(),
)

# Create an agent with the knowledge
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

# Query the knowledge base
agent.print_response(
    "What are the main topics discussed in the videos?",
    markdown=True,
)
```
Usage
Set up your virtual environment
macOS/Linux:
```bash
uv venv --python 3.12
source .venv/bin/activate
```
Windows:
```powershell
uv venv --python 3.12
.venv\Scripts\activate
```
Install dependencies
```bash
uv pip install -U youtube-transcript-api pytube sqlalchemy psycopg pgvector kern-ai openai
```
Set environment variables
```bash
export OPENAI_API_KEY=xxx
```
Run PgVector
```bash
docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  kern/pgvector:16
```
Run Agent
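Before running the script, it can help to confirm that something is listening on host port 5532, the port mapped by the `docker run` command. The following is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper, not part of kern, and a successful TCP connection only suggests the container is up, not that Postgres has finished initializing.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; not part of kern.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 5532 is the host port mapped in the docker run command.
    print(port_is_open("localhost", 5532))
```

If this prints `False`, `docker ps` should show whether the `pgvector` container is running.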
```bash
python examples/basics/knowledge/concepts/readers/overview/youtube_reader_sync.py
```
Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| video_url | str | Required | URL of the YouTube video to extract transcript from |
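Because the reader takes a video URL, and the example above contains placeholder URLs, it may be useful to filter out strings that are not recognizable YouTube links before inserting them. The sketch below is an assumption-laden illustration using only the standard library: `extract_video_id` is a hypothetical helper, not part of `YouTubeReader`, and it handles only the common `watch?v=` and `youtu.be` forms.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found.

    Hypothetical helper for illustration; not part of YouTubeReader.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        # Standard watch links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


urls = [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://example.com/not-a-video",
]
# Keep only the URLs from which a video ID could be extracted.
valid_urls = [u for u in urls if extract_video_id(u) is not None]
print(valid_urls)
```

The filtered list could then be passed as `urls` to `knowledge.insert`. This check is purely syntactic: it does not confirm that the video exists or that a transcript is available.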