Apify

This guide demonstrates how to integrate and use Apify Actors within the Kern framework to enhance your AI agents with web scraping, crawling, data extraction, and web automation capabilities.

What is Apify?

Apify is a platform that provides:

Data collection services for AI Agents, specializing in extracting data from social media, search engines, online maps, e-commerce sites, travel portals, or general websites
A marketplace of ready-to-use Actors (specialized tools) for various data tasks
Infrastructure to run and monetize your own AI Agents

Prerequisites

Sign up for an Apify account
Obtain your Apify API token from your Apify account settings
Install the required packages:

uv pip install kern-ai apify-client
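
Since the toolkit can read the token from the APIFY_API_TOKEN environment variable, it can help to confirm that the variable is set before building an agent. The following is a minimal sketch; `require_apify_token` is a hypothetical helper written for illustration and is not part of Kern or apify-client:

```python
import os


def require_apify_token(env=None):
    """Return the Apify API token from the environment, or raise a clear error.

    Hypothetical helper for illustration only. `env` defaults to os.environ
    and can be replaced with a plain dict when testing.
    """
    env = os.environ if env is None else env
    token = env.get("APIFY_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "APIFY_API_TOKEN is not set; export it before creating the agent."
        )
    return token
```

Calling `require_apify_token()` at startup surfaces a missing token immediately instead of at the first Actor run.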

Basic Usage

The Kern framework makes it easy to integrate Apify Actors into your agents. Here's a simple example:

from kern.agent import Agent
from kern.tools.apify import ApifyTools

# Create an agent with ApifyTools
agent = Agent(
    tools=[
        ApifyTools(
            actors=["apify/rag-web-browser"],  # One or more Apify Actor IDs to expose as tools
            apify_api_token="your_apify_api_token",  # Or set the APIFY_API_TOKEN environment variable
        )
    ],
    markdown=True,
)

# Use the agent to get website content
agent.print_response("What information can you find on https://kern.ndx.rocks/introduction ?", markdown=True)

Available Apify Tools

You can easily integrate any Apify Actor as a tool. Here are some examples:

1. RAG Web Browser

The RAG Web Browser Actor is specifically designed for AI and LLM applications. It searches the web for a query or processes a URL, then cleans and formats the content for your agent. This tool is enabled by default.

from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=["apify/rag-web-browser"])
    ],
    markdown=True
)

# Search for information and process the results
agent.print_response("What are the latest developments in large language models?", markdown=True)
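
To see roughly what such a tool call involves, the Actor can also be run directly with the apify-client package. This is a sketch under assumptions, not a description of how ApifyTools is implemented: the input fields `query` and `maxResults` are taken from the Actor's published input schema as best understood here, and both function names are hypothetical.

```python
import os


def build_rag_browser_input(query, max_results=3):
    """Build a run input for apify/rag-web-browser.

    `query` may be a search phrase or a URL. The field names "query" and
    "maxResults" are assumptions based on the Actor's input schema.
    """
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    return {"query": query.strip(), "maxResults": int(max_results)}


def run_rag_browser(query, max_results=3):
    """Run the Actor and return its dataset items (needs network access and a token)."""
    # Imported lazily so build_rag_browser_input works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_API_TOKEN"])
    run = client.actor("apify/rag-web-browser").call(
        run_input=build_rag_browser_input(query, max_results)
    )
    # Each Actor run stores its results in a default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With a valid token set, `run_rag_browser("latest developments in large language models")` would return the Actor's result items as a list of dictionaries.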

2. Website Content Crawler

This tool uses Apify's Website Content Crawler Actor to extract text content from websites, making it perfect for RAG applications.

from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=["apify/website-content-crawler"])
    ],
    markdown=True
)

# Ask the agent to process web content
agent.print_response("Summarize the content from https://kern.ndx.rocks/introduction", markdown=True)
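
In a RAG pipeline, crawled page text is usually split into smaller chunks before indexing. The sketch below shows one simple way to do that; it assumes each crawled item is a dictionary with `url` and `text` keys, and `chunk_crawled_items` is a hypothetical helper, not part of Kern or Apify.

```python
def chunk_crawled_items(items, max_chars=500):
    """Split each page's text into whitespace-delimited chunks of about max_chars.

    Assumes each item is a dict with "url" and "text" keys. Words are never
    cut in half, so a single word longer than max_chars becomes its own
    oversized chunk.
    """
    chunks = []
    for item in items:
        current, length = [], 0
        for word in item.get("text", "").split():
            # Flush the current chunk when this word (plus a space) would overflow it.
            if current and length + 1 + len(word) > max_chars:
                chunks.append({"url": item.get("url"), "text": " ".join(current)})
                current, length = [], 0
            length += len(word) + (1 if current else 0)
            current.append(word)
        if current:
            chunks.append({"url": item.get("url"), "text": " ".join(current)})
    return chunks
```

Keeping the source `url` on every chunk lets an agent cite the page each passage came from.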

3. Google Places Crawler

The Google Places Crawler extracts data about businesses from Google Maps and Google Places.

from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=["compass/crawler-google-places"])
    ]
)

# Find business information in a specific location
agent.print_response("What are the top-rated restaurants in San Francisco?", markdown=True)
agent.print_response("Find coffee shops in Prague", markdown=True)
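
A question such as "top-rated restaurants" ultimately comes down to ranking the returned place records. The following sketch illustrates that ranking step on its own; the field names `totalScore` and `reviewsCount` are assumptions about the Actor's output format, and `top_rated` is a hypothetical helper.

```python
def top_rated(places, min_reviews=50, limit=5):
    """Return up to `limit` places sorted by rating, highest first.

    Places without a rating, or with fewer than `min_reviews` reviews, are
    skipped. The keys "totalScore" and "reviewsCount" are assumed field names.
    """
    rated = [
        p for p in places
        if p.get("totalScore") is not None
        and (p.get("reviewsCount") or 0) >= min_reviews
    ]
    # Sort by rating first, then by review count, both descending.
    rated.sort(
        key=lambda p: (p["totalScore"], p.get("reviewsCount") or 0),
        reverse=True,
    )
    return rated[:limit]
```

The review-count threshold keeps a place with a handful of perfect ratings from outranking well-established ones.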

Example Scenarios

RAG Web Browser + Google Places Crawler

This example combines web search with local business data to provide comprehensive information about a topic:

from kern.agent import Agent
from kern.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(actors=[
            "apify/rag-web-browser",
            "compass/crawler-google-places"
        ])
    ]
)

# Get general information and local businesses
agent.print_response(
    """
    I'm traveling to Tokyo next month.
    1. Research the best time to visit and major attractions
    2. Find one well-rated sushi restaurant near Shinjuku
    Compile a comprehensive travel guide with this information.
    """,
    markdown=True
)

Toolkit Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `apify_api_token` | `str` | `None` | Apify API token (or set via the `APIFY_API_TOKEN` environment variable) |
| `actors` | `str` or `List[str]` | `None` | Single Actor ID or list of Actor IDs to register |
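
Because `actors` accepts either a single string or a list, code that wraps the toolkit may want to normalize the value first. This is a minimal sketch of one way to do so; `normalize_actors` is a hypothetical helper and does not describe how Kern handles the parameter internally.

```python
from typing import List, Optional, Union


def normalize_actors(actors: Optional[Union[str, List[str]]]) -> List[str]:
    """Return a de-duplicated list of Actor IDs from a string, a list, or None.

    Hypothetical helper for illustration; order of first appearance is kept.
    """
    if actors is None:
        return []
    if isinstance(actors, str):
        actors = [actors]
    seen, result = set(), []
    for actor_id in actors:
        actor_id = actor_id.strip()
        # Skip empty entries and IDs that were already added.
        if actor_id and actor_id not in seen:
            seen.add(actor_id)
            result.append(actor_id)
    return result
```

The resulting list can be passed straight to `ApifyTools(actors=...)`.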
