# Apify MCP

<div class="grid grid-cols-5 gap-4 items-center">
 <div class="col-span-4">
  Connect to Apify MCP to run web scraping, browser automation, and data extraction Actors directly from your AI workflows.
 </div>
 <div class="flex justify-center">
  <img src="https://cdn.scalekit.cloud/sk-connect/assets/provider-icons/apify.svg" width="64" height="64" alt="Apify MCP logo" />
 </div>
</div>

Supported authentication: Bearer Token

## Set up the agent connector

<SetupApifymcpSection />

## Usage

<UsageApifymcpSection />

## Tool list

## `apifymcp_call_actor`

Call any Actor from the Apify Store. By default, the tool waits for the run to complete and returns the results with a dataset preview. Use async mode to start a run in the background and get a `runId` immediately.

Workflow:

1. Use apifymcp_fetch_actor_details with output: `{"inputSchema": true}` to get the Actor's exact input schema
2. Call this tool with the actor name and input matching that schema exactly
3. Use apifymcp_get_actor_output with the returned datasetId to fetch full results if needed

For MCP server Actors, use format 'actorName:toolName' (e.g. 'apify/actors-mcp-server:fetch-apify-docs').
Use dedicated Actor tools (e.g. apifymcp_rag_web_browser) when available instead of this tool.

When NOT to use:

- You don't know the Actor's input schema — use apifymcp_fetch_actor_details first

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `actor` | string | Yes | Actor ID or full name in 'username/name' format (e.g. 'apify/rag-web-browser'). For MCP server Actors use 'actorName:toolName' format. |
| `async` | boolean | No | If true, starts the run and returns immediately with a runId. Use only when the user explicitly asks to run in the background or does not need immediate results. |
| `callOptions` | `object` | No | Optional run configuration options |
| `input` | `object` | Yes | Input JSON to pass to the Actor. Must match the Actor's input schema exactly — use apifymcp_fetch_actor_details with output: `{"inputSchema": true}` first to get the required fields and types. |
| `previewOutput` | boolean | No | When true (default), includes preview items in the response. Set to false when you plan to fetch full results separately via apifymcp_get_actor_output — avoids duplicate data and saves tokens. |
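
The workflow above can be sketched as a small helper. This is a minimal sketch, not the connector's own client: `call_tool` is a hypothetical callable standing in for however your agent executes a tool, and the response shape (a JSON-Schema-like `inputSchema` with a `required` list) is an assumption.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that takes (tool_name, arguments)
# and returns the tool's parsed JSON response as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_actor(call_tool: ToolCaller, actor: str, actor_input: Dict[str, Any]) -> Dict[str, Any]:
    """Fetch the Actor's input schema, check the input, then run the Actor."""
    # Step 1: request only the input schema to keep the response small.
    details = call_tool(
        "apifymcp_fetch_actor_details",
        {"actor": actor, "output": {"inputSchema": True}},
    )

    # Assumption: the schema is JSON-Schema-like with a top-level "required" list.
    required = details.get("inputSchema", {}).get("required", [])
    missing = [name for name in required if name not in actor_input]
    if missing:
        raise ValueError(f"input is missing required fields: {missing}")

    # Step 2: run the Actor. previewOutput is False because the full results
    # would be fetched separately with apifymcp_get_actor_output.
    return call_tool(
        "apifymcp_call_actor",
        {"actor": actor, "input": actor_input, "previewOutput": False},
    )
```

Checking required fields locally is only a cheap guard before spending a run; the Actor's own schema validation remains the source of truth.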

## `apifymcp_fetch_actor_details`

Get detailed information about an Actor by its ID or full name (format: 'username/name', e.g. 'apify/rag-web-browser').

WARNING: Omitting the 'output' parameter returns ALL fields including the full README, which can be extremely token-heavy. Always pass 'output' with only the fields you need. To get the input schema before calling an Actor, use: `{"inputSchema": true}`.

When to use:

- You need an Actor's input schema before calling it — use output: `{"inputSchema": true}`
- User wants details about a specific Actor (pricing, description, README)
- You need to list MCP tools provided by an MCP server Actor — use output: `{"mcpTools": true}`

When NOT to use:

- You already have the input schema and are ready to run — use apifymcp_call_actor directly

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `actor` | string | Yes | Actor ID or full name in 'username/name' format (e.g. 'apify/rag-web-browser') |
| `output` | `object` | No | JSON object with boolean flags to control which fields are returned. Always specify this to avoid a large token-heavy response. Set only the fields you need to true. Available fields: description, inputSchema, mcpTools, metadata, outputSchema, pricing, rating, readme, stats. All default to true if omitted (very large response) except mcpTools. Example: `{"inputSchema": true}` |
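
Because omitting `output` returns nearly every field, it can help to build the flag object explicitly. The sketch below uses the field names listed in the table; the helper name `build_output_flags` is hypothetical and only illustrates the shape of the arguments.

```python
from typing import Dict

# Field names taken from the `output` parameter description above.
AVAILABLE_FIELDS = frozenset({
    "description", "inputSchema", "mcpTools", "metadata",
    "outputSchema", "pricing", "rating", "readme", "stats",
})


def build_output_flags(*fields: str) -> Dict[str, bool]:
    """Return an `output` object that enables only the named fields."""
    if not fields:
        raise ValueError("name at least one field to avoid a token-heavy response")
    unknown = sorted(set(fields) - AVAILABLE_FIELDS)
    if unknown:
        raise ValueError(f"unknown output fields: {unknown}")
    return {name: True for name in fields}


# Arguments for apifymcp_fetch_actor_details requesting schema and pricing only.
arguments = {
    "actor": "apify/rag-web-browser",
    "output": build_output_flags("inputSchema", "pricing"),
}
```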

## `apifymcp_fetch_apify_docs`

Fetch the full content of an Apify or Crawlee documentation page by its URL. Use this after finding a relevant page with apifymcp_search_apify_docs.

When to use:

- You have a documentation URL and need the complete page content
- User asks for detailed documentation on a specific Apify or Crawlee page

When NOT to use:

- You don't have a URL yet — use apifymcp_search_apify_docs first

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `url` | string | Yes | Full URL of the Apify or Crawlee documentation page (e.g. 'https://docs.apify.com/platform/actors') |

## `apifymcp_get_actor_output`

Retrieve output dataset items from a specific Actor run using its datasetId. Supports field selection (including dot notation) and pagination.

When to use:

- You have a datasetId from an Actor run and need the full results
- The preview from apifymcp_call_actor didn't include all needed fields
- You need to paginate through large datasets

When NOT to use:

- You don't have a datasetId yet — run an Actor with apifymcp_call_actor first

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `datasetId` | string | Yes | Actor output dataset ID to retrieve from |
| `fields` | string | No | Comma-separated list of fields to include. Supports dot notation for nested fields (e.g. 'crawl.httpStatusCode,metadata.url'). Note: dot-notation fields are returned as flat keys in the output — e.g. requesting 'crawl.httpStatusCode' returns `{"crawl.httpStatusCode": 200}`, not a nested object. |
| `limit` | number | No | Maximum number of items to return (default: 100) |
| `offset` | number | No | Number of items to skip for pagination (default: 0) |
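
Pagination with `limit` and `offset`, plus the flat dot-notation keys, might be handled as follows. This is a sketch under assumptions: `call_tool` is a hypothetical callable that executes a tool and returns a dict, and the response is assumed to carry the rows under an `items` key.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Sequence

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_dataset_items(
    call_tool: ToolCaller,
    dataset_id: str,
    fields: Optional[Sequence[str]] = None,
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every item in the dataset, one page at a time."""
    offset = 0
    while True:
        arguments: Dict[str, Any] = {
            "datasetId": dataset_id,
            "limit": page_size,
            "offset": offset,
        }
        if fields:
            arguments["fields"] = ",".join(fields)  # comma-separated, per the table
        response = call_tool("apifymcp_get_actor_output", arguments)
        items = response.get("items", [])  # assumption: rows live under "items"
        yield from items
        if len(items) < page_size:  # a short page means there is nothing left
            break
        offset += page_size


def unflatten(item: Dict[str, Any]) -> Dict[str, Any]:
    """Turn flat keys such as 'crawl.httpStatusCode' back into nested dicts."""
    nested: Dict[str, Any] = {}
    for key, value in item.items():
        *parents, leaf = key.split(".")
        target = nested
        for part in parents:
            target = target.setdefault(part, {})
        target[leaf] = value
    return nested
```

`unflatten` is optional; it only matters if downstream code expects the nested shape rather than the flat keys the tool returns.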

## `apifymcp_get_actor_run`

Get detailed information about a specific Actor run by runId. Returns run metadata (status, timestamps), performance stats, and resource IDs (datasetId, keyValueStoreId, requestQueueId).

When to use:

- You have a runId from apifymcp_call_actor (async mode) and want to check its status
- User asks about details of a specific run started outside the current conversation

When NOT to use:

- The run was just started via apifymcp_call_actor in sync mode — results are already in the response
- You want the output data — use apifymcp_get_actor_output with the datasetId

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `runId` | string | Yes | The ID of the Actor run |
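
For a run started in async mode, a simple polling loop is one way to wait for completion. The sketch below assumes a hypothetical `call_tool` callable and a top-level `status` key in the response; the terminal status names follow Apify's documented run statuses, but the exact response shape from this tool is an assumption.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Run statuses after which the run will not change any further.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"}


def wait_for_run(
    call_tool: ToolCaller,
    run_id: str,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll apifymcp_get_actor_run until the run reaches a terminal status."""
    for attempt in range(max_polls):
        run = call_tool("apifymcp_get_actor_run", {"runId": run_id})
        if run.get("status") in TERMINAL_STATUSES:  # assumption: top-level "status"
            return run
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")
```

Once the run has succeeded, the `datasetId` from the run details can be passed to `apifymcp_get_actor_output`.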

## `apifymcp_rag_web_browser`

Web browser for AI agents and RAG pipelines. Queries Google Search, scrapes the top N pages, and returns content as Markdown. Can also scrape a specific URL directly.

When to use:

- User wants current/immediate data (e.g. 'Get flight prices for tomorrow', 'What's the weather today?')
- User needs to fetch specific content now (e.g. 'Fetch news from CNN', 'Get product info from Amazon')
- The request includes time indicators such as 'today', 'current', 'latest', 'recent', or 'now'

When NOT to use:

- User needs repeated/scheduled scraping of a specific platform — search for a dedicated Actor using apifymcp_search_actors instead

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `maxResults` | integer | No | Maximum number of top Google Search results to scrape and return. Ignored when query is a direct URL. Higher values increase response time and compute cost significantly — keep low (1-3) for latency-sensitive use cases. Default: 3. |
| `outputFormats` | `array<string>` | No | Output formats for the scraped page content. Options: 'markdown', 'text', 'html' (default: ['markdown']) |
| `query` | string | Yes | Google Search keywords or a specific URL to scrape. Supports advanced search operators. |
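
Since `maxResults` is ignored when `query` is a direct URL, the arguments can be built differently for the two cases. This is an illustrative sketch: `build_rag_arguments` and `is_url` are hypothetical helpers, and the URL check here is a local heuristic, not necessarily how the tool itself detects URLs.

```python
from typing import Any, Dict, Sequence
from urllib.parse import urlparse

ALLOWED_FORMATS = ("markdown", "text", "html")


def is_url(query: str) -> bool:
    """Local heuristic: treat http(s) strings with a host as direct URLs."""
    parsed = urlparse(query.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_rag_arguments(
    query: str,
    max_results: int = 3,
    output_formats: Sequence[str] = ("markdown",),
) -> Dict[str, Any]:
    """Build arguments for apifymcp_rag_web_browser."""
    invalid = [fmt for fmt in output_formats if fmt not in ALLOWED_FORMATS]
    if invalid:
        raise ValueError(f"unsupported output formats: {invalid}")
    arguments: Dict[str, Any] = {"query": query, "outputFormats": list(output_formats)}
    if not is_url(query):
        # maxResults only applies to search queries; keep it low for latency.
        if max_results < 1:
            raise ValueError("max_results must be at least 1")
        arguments["maxResults"] = max_results
    return arguments
```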

## `apifymcp_search_actors`

Search the Apify Store to FIND and DISCOVER what scraping tools/Actors exist for specific platforms or use cases. This tool provides INFORMATION about available Actors — it does NOT retrieve actual data or run any scraping tasks.

When to use:

- Find what scraping tools exist for a platform (e.g. 'What tools can scrape Instagram?')
- Discover available Actors for a use case (e.g. 'Find an Actor for Amazon products')
- Browse existing solutions before calling an Actor

When NOT to use:

- User wants immediate data retrieval — use apifymcp_rag_web_browser instead
- You already know the Actor ID — use apifymcp_fetch_actor_details or apifymcp_call_actor directly

Run at least two searches: start with broad keywords, then follow up with more specific terms to narrow the results.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `keywords` | string | No | Space-separated keywords to search Actors in the Apify Store. Use 1-3 simple terms (e.g. 'Instagram posts', 'Amazon products'). Avoid generic terms like 'scraper' or 'crawler'. Omitting keywords or passing an empty string returns popular/general Actors — always provide keywords for relevant results. |
| `limit` | integer | No | Maximum number of Actors to return (1-100, default: 5) |
| `offset` | integer | No | Number of results to skip for pagination (default: 0) |
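
The broad-then-specific search pattern could look like the sketch below. It assumes a hypothetical `call_tool` callable, and it assumes that results arrive under an `actors` key with a `fullName` per entry; neither key name is confirmed by this page.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def search_two_pass(
    call_tool: ToolCaller,
    broad: str,
    specific: str,
    limit: int = 5,
) -> List[Dict[str, Any]]:
    """Search with broad keywords, then specific ones, and merge the results."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    merged: Dict[str, Dict[str, Any]] = {}
    for keywords in (broad, specific):
        response = call_tool(
            "apifymcp_search_actors",
            {"keywords": keywords, "limit": limit, "offset": 0},
        )
        # Assumption: results are listed under "actors", each with "fullName".
        for actor in response.get("actors", []):
            merged.setdefault(actor["fullName"], actor)  # keep first occurrence
    return list(merged.values())
```

For example, `search_two_pass(call_tool, "Instagram", "Instagram posts")` would return each Actor once even if both searches find it.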

## `apifymcp_search_apify_docs`

Search Apify and Crawlee documentation using full-text search. Use keywords only, not full sentences. Select the documentation source explicitly via docSource.

Sources:

- 'apify': Platform docs, SDKs (JS, Python), CLI, REST API, Academy, Actor development
- 'crawlee-js': Crawlee JavaScript web scraping library
- 'crawlee-py': Crawlee Python web scraping library

When to use:

- User asks how to use Apify APIs, SDK, or platform features
- You need to look up Apify or Crawlee documentation

When NOT to use:

- You already have a documentation URL — use apifymcp_fetch_apify_docs directly

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `docSource` | string | No | Documentation source to search. Options: 'apify' (default), 'crawlee-js', 'crawlee-py' |
| `limit` | number | No | Maximum number of results to return (1-20, default: 5) |
| `offset` | number | No | Offset for pagination (default: 0) |
| `query` | string | Yes | Algolia full-text search query using keywords only (e.g. 'standby actor', 'proxy configuration'). Do not use full sentences. |
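
Searching the docs and then fetching the top hit with `apifymcp_fetch_apify_docs` can be chained as sketched here. The `call_tool` callable is hypothetical, and the response shape (a `results` list whose entries have a `url`) is an assumption made for illustration.

```python
from typing import Any, Callable, Dict, Optional

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

DOC_SOURCES = ("apify", "crawlee-js", "crawlee-py")


def build_docs_search(
    query: str, doc_source: str = "apify", limit: int = 5, offset: int = 0
) -> Dict[str, Any]:
    """Build arguments for apifymcp_search_apify_docs."""
    if doc_source not in DOC_SOURCES:
        raise ValueError(f"docSource must be one of {DOC_SOURCES}")
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    return {"query": query, "docSource": doc_source, "limit": limit, "offset": offset}


def fetch_first_doc_page(
    call_tool: ToolCaller, query: str, doc_source: str = "apify"
) -> Optional[Dict[str, Any]]:
    """Search with keywords, then fetch the full content of the top hit."""
    found = call_tool(
        "apifymcp_search_apify_docs", build_docs_search(query, doc_source, limit=1)
    )
    hits = found.get("results", [])  # assumption: hits live under "results"
    if not hits:
        return None
    return call_tool("apifymcp_fetch_apify_docs", {"url": hits[0]["url"]})
```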
