Ollama Integration Example

📹 Watch the demo video to see this example in action!

Note: This video was recorded before the project was renamed from automcp to gen-mcp. The functionality remains the same—just replace automcp with genmcp in commands.

Overview

This example demonstrates how to expose Ollama, a popular local language model runtime, as MCP tools using gen-mcp. By wrapping Ollama’s functionality, you can enable AI assistants to interact with your local language models seamlessly—no custom server code required.

What You’ll Learn

- How to expose Ollama’s HTTP API and CLI as MCP tools using gen-mcp configuration files
- How tool definitions, input schemas, and invocations fit together
- How to run the MCP server and test it from an MCP client

Prerequisites

- Ollama installed locally
- The genmcp CLI installed
- At least one model pulled (for example, ollama pull llama2)

Two Integration Approaches

gen-mcp supports two methods for integrating with Ollama, HTTP-based and CLI-based, each with its own advantages.

HTTP-Based Integration

The HTTP approach calls Ollama’s REST API directly:

Advantages:

- Full access to Ollama’s REST API, including chat and model management endpoints
- Structured JSON requests and responses
- No shell quoting or process management to worry about

Use when: You want production-grade integration with complete feature access
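
For example, listing downloaded models is a single GET request against Ollama’s REST API, the same endpoint the tags tool below wraps:

curl http://localhost:11434/api/tags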

CLI-Based Integration

The CLI approach executes ollama commands directly:

Advantages:

- Minimal setup: any command that works in your terminal can become a tool
- No need to know Ollama’s HTTP API
- Output is exactly what the command prints, which is convenient for quick experiments

Use when: You need quick prototyping or prefer command-line interaction
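
For example, the generate_completion tool defined later in this guide simply shells out to a command you could run yourself:

ollama run llama2 "Explain quantum computing in simple terms"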

HTTP-Based Integration Tutorial

Let’s walk through creating an HTTP-based Ollama integration step-by-step.

Step 1: Start Ollama

Ensure Ollama is running locally:

ollama serve

Ollama will start on http://localhost:11434 by default. You can verify it’s running:

curl http://localhost:11434
# Should return: "Ollama is running"

Step 2: Understanding the Configuration

Here’s the complete ollama-http.yaml configuration file that defines the MCP tools:

mcpFileVersion: "0.1.0"
name: ollama
version: "1.0.0"
runtime:
  transportProtocol: streamablehttp
  streamableHttpConfig:
    port: 8009
tools:
- name: generate
  title: "Generate a response"
  description: "Generates a response for a given prompt."
  inputSchema:
    type: object
    properties:
      model:
        type: string
        description: "The name of the model to use."
      prompt:
        type: string
        description: "The prompt to generate a response for."
      system:
        type: string
        description: "A system message to override the model's default behavior."
      stream:
        type: boolean
        description: "Whether to stream the response. Must be false."
    required:
    - model
    - prompt
    - stream
  invocation:
    http:
      method: POST
      url: http://localhost:11434/api/generate

- name: chat
  title: "Generate a chat response"
  description: "Generates a response for a chat-based conversation."
  inputSchema:
    type: object
    properties:
      model:
        type: string
        description: "The name of the model to use."
      messages:
        type: array
        items:
          type: object
          properties:
            role:
              type: string
              description: "The role: 'user' or 'assistant'."
            content:
              type: string
              description: "The message content."
          required:
          - role
          - content
      stream:
        type: boolean
        description: "Whether to stream. Must be false."
    required:
    - model
    - messages
    - stream
  invocation:
    http:
      method: POST
      url: http://localhost:11434/api/chat

- name: tags
  title: "List downloaded models"
  description: "Lists all downloaded models."
  inputSchema:
    type: object
    properties: {}
  invocation:
    http:
      method: GET
      url: http://localhost:11434/api/tags

- name: pull_model
  title: "Pull model"
  description: "Download a model from the ollama library."
  inputSchema:
    type: object
    properties:
      model:
        type: string
        description: "The name of the model to pull."
      stream:
        type: boolean
        description: "Must be false."
    required:
    - model
    - stream
  invocation:
    http:
      method: POST
      url: http://localhost:11434/api/pull

- name: running_models
  title: "Get running models"
  description: "List models currently loaded into memory."
  inputSchema:
    type: object
  invocation:
    http:
      method: GET
      url: http://localhost:11434/api/ps

Step 3: Configuration Breakdown

Let’s understand each section:

Runtime Configuration

runtime:
  transportProtocol: streamablehttp
  streamableHttpConfig:
    port: 8009
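
This block configures the MCP server itself, not Ollama: gen-mcp serves the tools over the MCP streamable HTTP transport on port 8009. MCP clients connect to this port, while each tool’s invocation URL points at Ollama’s own API on port 11434.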

Tool Definition Structure

Each tool follows this pattern:

- name: generate              # Unique tool identifier
  title: "Generate a response" # Human-readable name
  description: "..."           # What the tool does (for LLM understanding)
  inputSchema:                 # JSON Schema for input validation
    type: object
    properties:
      model:
        type: string
        description: "..."
    required:
    - model
  invocation:                  # How to execute the tool
    http:
      method: POST
      url: http://localhost:11434/api/generate

Key concepts:

- name is the unique identifier an MCP client uses to call the tool.
- description is written for the LLM, which decides when and how to use the tool based on it.
- inputSchema is standard JSON Schema; arguments are validated against it before the tool runs.
- invocation maps a validated call onto an HTTP request (or, in the CLI variant below, a shell command).
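
The tool’s arguments are forwarded as the JSON body of the POST, which is why each inputSchema mirrors the corresponding Ollama endpoint. A generate call is therefore roughly equivalent to:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Explain quantum computing in simple terms",
  "stream": false
}'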

Step 4: Run the MCP Server

Start the gen-mcp server with your configuration:

genmcp run -f ollama-http.yaml

You should see:

INFO    Starting MCP server on port 8009
INFO    Loaded 5 tools from ollama-http.yaml

Step 5: Test Your Integration

You can now connect an MCP client (like Claude Desktop or any MCP-compatible tool) to http://localhost:8009 and use the Ollama tools.
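
If you don’t have a client handy, the MCP Inspector (a separate tool; this assumes Node.js and npx are available) provides a quick web UI for connecting to the server and invoking tools:

npx @modelcontextprotocol/inspector

Then point it at http://localhost:8009 using the streamable HTTP transport.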

Example tool calls:

List available models:

{
  "tool": "tags"
}

Generate a completion:

{
  "tool": "generate",
  "arguments": {
    "model": "llama2",
    "prompt": "Explain quantum computing in simple terms",
    "stream": false
  }
}
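
Hold a multi-turn conversation with the chat tool, following the messages schema defined above:

{
  "tool": "chat",
  "arguments": {
    "model": "llama2",
    "messages": [
      {"role": "user", "content": "What is MCP?"},
      {"role": "assistant", "content": "MCP is the Model Context Protocol."},
      {"role": "user", "content": "How does it relate to tools?"}
    ],
    "stream": false
  }
}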

CLI-Based Integration Tutorial

The CLI approach is simpler but more limited. Here’s the complete configuration:

CLI Configuration File

mcpFileVersion: 0.1.0
name: Ollama
version: 0.0.1
runtime:
  streamableHttpConfig:
    port: 7008
  transportProtocol: streamablehttp
tools:
- name: start_ollama
  title: Start Ollama
  description: Start ollama. Only run if not already started.
  inputSchema:
    type: object
  invocation:
    cli:
      command: nohup ollama start > /dev/null 2>&1 &

- name: check_ollama_running
  title: Check if Ollama is Running
  description: Check if Ollama is running.
  inputSchema:
    type: object
  invocation:
    cli:
      command: curl http://localhost:11434 || echo "ollama is not running"

- name: pull_model
  title: Pull model
  description: Pull a model so that Ollama can use it
  inputSchema:
    type: object
    properties:
      model:
        type: string
        description: The name of the model to pull
  invocation:
    cli:
      command: ollama pull {model}

- name: list_models
  title: List models
  description: List all models that Ollama has currently pulled.
  inputSchema:
    type: object
  invocation:
    cli:
      command: ollama list

- name: generate_completion
  title: Generate completion
  description: Generate a completion from Ollama
  inputSchema:
    type: object
    properties:
      model:
        type: string
        description: The model to use
      prompt:
        type: string
        description: The prompt to give the model
  invocation:
    cli:
      command: 'ollama run {model} {prompt}'
      templateVariables:
        prompt:
          format: '"{prompt}"'

CLI Configuration Explained

The key difference is the invocation type:

invocation:
  cli:
    command: ollama pull {model}
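
gen-mcp substitutes each {placeholder} with the corresponding validated argument before running the command. The templateVariables block in generate_completion additionally wraps the prompt in quotes so it reaches the shell as a single argument; with model set to llama2 and prompt set to Explain quantum computing, the command expands to:

ollama run llama2 "Explain quantum computing"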

Running the CLI Integration

genmcp run -f ollama-cli.yaml

Understanding Input Schema Validation

Input schemas ensure tools receive valid data before execution. Here’s how they work:

Basic Schema

inputSchema:
  type: object
  properties:
    model:
      type: string
      description: "Model name"
  required:
  - model

Validation behavior:

- Calls that omit a required property (here, model) are rejected before the tool is invoked.
- Arguments must match their declared types; a non-string model fails validation.
- Property descriptions are exposed to MCP clients, helping the LLM construct valid arguments.
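
For example, this call would be rejected by the generate tool’s schema because the required model and stream fields are missing:

{
  "tool": "generate",
  "arguments": {
    "prompt": "Explain quantum computing in simple terms"
  }
}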

Array Schema

inputSchema:
  type: object
  properties:
    messages:
      type: array
      items:
        type: object
        properties:
          role:
            type: string
          content:
            type: string
        required:
        - role
        - content

This validates complex nested structures like chat messages.
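
A conforming messages value looks like this; each element must carry both role and content:

[
  {"role": "user", "content": "Hello!"},
  {"role": "assistant", "content": "Hi, how can I help?"},
  {"role": "user", "content": "List my models."}
]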

Common Patterns and Best Practices

Pattern 1: Tool Chaining

Design tools to work together:

- name: check_ollama_running
  description: "Check if Ollama is running. Run before other tools."

- name: pull_model
  description: "Download a model. Run check_ollama_running first."

Pattern 2: Safe Defaults

Require explicit values for constrained parameters instead of relying on defaults:

inputSchema:
  properties:
    stream:
      type: boolean
      description: "Must be false for MCP compatibility"
  required:
  - stream

Pattern 3: Clear Descriptions

Help LLMs understand tool usage:

description: "Download a model from the ollama library. This may take several minutes for large models. Always check if the model exists first using tags."

Troubleshooting

Issue: “Connection refused” error

Solution: Ensure Ollama is running:

ollama serve

Issue: “Model not found” error

Solution: Pull the model first:

ollama pull llama2

Issue: Tools not appearing in MCP client

Solution: Check the server is running and the port matches your client configuration.

Next Steps

Resources