Apify
This guide demonstrates how to integrate and use Apify Actors within the Kern framework to enhance your AI agents with web scraping, crawling, data extraction, and web automation capabilities.
What is Apify?
Apify is a platform that provides:
- Data collection services for AI Agents, specializing in extracting data from social media, search engines, online maps, e-commerce sites, travel portals, or general websites
- A marketplace of ready-to-use Actors (specialized tools) for various data tasks
- Infrastructure to run and monetize your own AI Agents
Prerequisites
- Sign up for an Apify account
- Obtain your Apify API token from your Apify account
- Install the required packages:
```bash
uv pip install kern-ai apify-client
```
Basic Usage
The Kern framework makes it easy to integrate Apify Actors into your agents. Here's a simple example:
```python
from kern.agent import Agent
from kern.tools.apify import ApifyTools

# Create an agent with ApifyTools
agent = Agent(
    tools=[
        ApifyTools(
            actors=["apify/rag-web-browser"],  # Specify which Apify Actors to use, use multiple ones if needed
            apify_api_token="your_apify_api_key"  # Or set the APIFY_API_TOKEN environment variable
        )
    ],
    markdown=True
)

# Use the agent to get website content
agent.print_response("What information can you find on https://kern.ndx.rocks/introduction ?", markdown=True)
```
Available Apify Tools
You can easily integrate any Apify Actor as a tool. Here are some examples:
1. RAG Web Browser
The RAG Web Browser Actor is specifically designed for AI and LLM applications. It searches the web for a query or processes a URL, then cleans and formats the content for your agent. This tool is enabled by default.
```python
from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=["apify/rag-web-browser"])
    ],
    markdown=True
)

# Search for information and process the results
agent.print_response("What are the latest developments in large language models?", markdown=True)
```
2. Website Content Crawler
This tool uses Apify's Website Content Crawler Actor to extract text content from websites, making it perfect for RAG applications.
```python
from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=["apify/website-content-crawler"])
    ],
    markdown=True
)

# Ask the agent to process web content
agent.print_response("Summarize the content from https://kern.ndx.rocks/introduction", markdown=True)
```
3. Google Places Crawler
The Google Places Crawler extracts data about businesses from Google Maps and Google Places.
```python
from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=["compass/crawler-google-places"])
    ]
)

# Find business information in a specific location
agent.print_response("What are the top-rated restaurants in San Francisco?", markdown=True)
agent.print_response("Find coffee shops in Prague", markdown=True)
```
Example Scenarios
RAG Web Browser + Google Places Crawler
This example combines web search with local business data to provide comprehensive information about a topic:
```python
from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=[
            "apify/rag-web-browser",
            "compass/crawler-google-places"
        ])
    ]
)

# Get general information and local businesses
agent.print_response(
    """
    I'm traveling to Tokyo next month.
    1. Research the best time to visit and major attractions
    2. Find one well-rated sushi restaurant near Shinjuku
    Compile a comprehensive travel guide with this information.
    """,
    markdown=True
)
```
Toolkit Params
| Parameter | Type | Default | Description |
|---|---|---|---|
| apify_api_token | str | None | Apify API token (or set via APIFY_API_TOKEN environment variable) |
| actors | str or List[str] | None | Single Actor ID or list of Actor IDs to register |
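The two parameters above can be illustrated with a small standard-library sketch. It shows one plausible way a token could be resolved (explicit argument first, then the `APIFY_API_TOKEN` environment variable) and how a single Actor ID or a list of IDs could be normalized to a list. The helper names `resolve_apify_token` and `normalize_actors` are hypothetical and not part of Kern, and the precedence order is an assumption; Kern's actual implementation may differ.

```python
import os
from typing import List, Optional, Union


def resolve_apify_token(explicit_token: Optional[str] = None) -> str:
    """Return the explicit token if given, else APIFY_API_TOKEN from the environment.

    Hypothetical helper: the "argument first, environment second" order is an
    assumption made for illustration.
    """
    token = explicit_token or os.environ.get("APIFY_API_TOKEN")
    if not token:
        raise ValueError(
            "No Apify API token found: pass apify_api_token or set APIFY_API_TOKEN."
        )
    return token


def normalize_actors(actors: Union[str, List[str], None]) -> List[str]:
    """Accept a single Actor ID, a list of IDs, or None, and return a list."""
    if actors is None:
        return []
    if isinstance(actors, str):
        return [actors]
    return list(actors)
```

Checking for the token up front like this makes a missing credential fail with a clear message before any Actor is run.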
Developer Resources
- View Tools
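When debugging an agent, it can help to run an Actor directly with apify-client, outside of Kern, and inspect the raw results. The following is a minimal sketch under a few assumptions: the helper names `build_rag_browser_input` and `run_actor` are hypothetical, the input fields `query` and `maxResults` are assumed from the RAG Web Browser Actor's input schema and should be checked against the Actor's page, and the token is read from `APIFY_API_TOKEN`.

```python
import os
from typing import Any, Dict, List


def build_rag_browser_input(query: str, max_results: int = 3) -> Dict[str, Any]:
    """Build a run input for apify/rag-web-browser.

    The field names `query` and `maxResults` are assumptions based on the
    Actor's input schema; verify them before relying on this.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    return {"query": query, "maxResults": max_results}


def run_actor(actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run an Actor and return its dataset items.

    Needs network access, the apify-client package, and APIFY_API_TOKEN set.
    """
    # Imported lazily so the input builder can be used without apify-client.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_API_TOKEN"])
    # Start the Actor and wait for the run to finish.
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError(f"Actor run for {actor_id} returned no run details")
    # Results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With a valid token set, calling `run_actor("apify/rag-web-browser", build_rag_browser_input("Apify Actors"))` should return the items the Actor wrote to its dataset, which is useful for comparing against what the agent reports.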