Skip to content

Embeddings Configuration

Flexible, vendor-agnostic embeddings generation for MCP Gateway Registry's semantic search functionality.

Overview

The MCP Gateway Registry provides semantic search capabilities across MCP servers, tools, and AI agents. You can choose from three embedding provider options to power this search:

  1. Sentence Transformers (Default) - Local models
  2. OpenAI - Cloud embeddings via API
  3. Any LiteLLM-supported provider - Amazon Bedrock Titan, Cohere, and 100+ other models

Switch between providers with simple configuration changes - no code modifications required.

Features

  • Vendor-agnostic: Switch between embeddings providers with configuration changes
  • Local & Cloud Support: Use local models or cloud APIs (OpenAI, Cohere, Amazon Bedrock, etc.)
  • Backward Compatible: Works seamlessly with existing FAISS indices
  • Easy Configuration: Simple environment variable setup
  • Extensible: Easy to add new providers
  • Production-Ready: Terraform support for AWS deployments

Quick Start

Option 1: Sentence Transformers (Default)

Local embedding models that run on your infrastructure.

# In .env
EMBEDDINGS_PROVIDER=sentence-transformers
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
EMBEDDINGS_MODEL_DIMENSIONS=384

Characteristics: - Runs locally on your infrastructure - No API costs - No external network calls required - Requires CPU/GPU resources - Model files stored locally - Data stays within your infrastructure

Option 2: OpenAI

Cloud-based embedding service via OpenAI API.

# In .env
EMBEDDINGS_PROVIDER=litellm
EMBEDDINGS_MODEL_NAME=openai/text-embedding-ada-002
EMBEDDINGS_MODEL_DIMENSIONS=1536
EMBEDDINGS_API_KEY=sk-your-openai-api-key

Characteristics: - Cloud-based service - Requires API key - API costs per 1K tokens - No local compute resources needed - Network dependency - Data sent to OpenAI

Option 3: Amazon Bedrock Titan

Cloud-based embedding service via AWS Bedrock.

# In .env
EMBEDDINGS_PROVIDER=litellm
EMBEDDINGS_MODEL_NAME=bedrock/amazon.titan-embed-text-v1
EMBEDDINGS_MODEL_DIMENSIONS=1536
EMBEDDINGS_AWS_REGION=us-east-1
# No API key needed - uses IAM

Characteristics: - Cloud-based service - Uses IAM authentication (no API key required) - Integrates with AWS security model - API costs apply - Requires AWS credentials - Available in select AWS regions

Configuration

Environment Variables

Variable Description Default Required
EMBEDDINGS_PROVIDER Provider type: sentence-transformers or litellm sentence-transformers No
EMBEDDINGS_MODEL_NAME Model identifier all-MiniLM-L6-v2 Yes
EMBEDDINGS_MODEL_DIMENSIONS Embedding dimension 384 Yes
EMBEDDINGS_API_KEY API key for cloud provider (OpenAI, Cohere, etc.) - For cloud*
EMBEDDINGS_API_BASE Custom API endpoint (LiteLLM only) - No
EMBEDDINGS_AWS_REGION AWS region for Bedrock (LiteLLM only) - For Bedrock

*Not required for AWS Bedrock - use standard AWS credential chain (IAM roles, environment variables, ~/.aws/credentials)

Terraform Configuration

For AWS ECS deployments, configure embeddings in your terraform.tfvars:

Using Sentence Transformers (Default)

# Local embeddings - no additional configuration needed
# Uses defaults: sentence-transformers with all-MiniLM-L6-v2

Using OpenAI

embeddings_provider         = "litellm"
embeddings_model_name       = "openai/text-embedding-ada-002"
embeddings_model_dimensions = 1536
embeddings_api_key          = "sk-proj-YOUR-OPENAI-API-KEY"

Using Amazon Bedrock

embeddings_provider         = "litellm"
embeddings_model_name       = "bedrock/amazon.titan-embed-text-v1"
embeddings_model_dimensions = 1536
embeddings_aws_region       = "us-east-1"
embeddings_api_key          = ""  # Empty for Bedrock (uses IAM)

See terraform/aws-ecs/terraform.tfvars.example for complete examples.

Supported Models

Sentence Transformers (Local)

Model Dimensions Description
all-MiniLM-L6-v2 384 Fast, lightweight (default)
all-mpnet-base-v2 768 High quality
paraphrase-multilingual-MiniLM-L12-v2 384 Multilingual

Any model from Hugging Face sentence-transformers is supported.

LiteLLM (Cloud-based)

LiteLLM supports 100+ embedding models from various providers:

OpenAI

  • openai/text-embedding-3-small (1536 dimensions)
  • openai/text-embedding-3-large (3072 dimensions)
  • openai/text-embedding-ada-002 (1536 dimensions)

Cohere

  • cohere/embed-english-v3.0 (1024 dimensions)
  • cohere/embed-multilingual-v3.0 (1024 dimensions)

Amazon Bedrock

  • bedrock/amazon.titan-embed-text-v1 (1536 dimensions)
  • bedrock/cohere.embed-english-v3 (1024 dimensions)
  • bedrock/cohere.embed-multilingual-v3 (1024 dimensions)

Other Providers

  • Azure OpenAI
  • Anthropic (Claude)
  • Google Vertex AI
  • Hugging Face Inference API
  • And 100+ more via LiteLLM

Migration Between Providers

Switching Providers

When you switch embedding providers or models with different dimensions, the registry automatically:

  1. Detects dimension mismatch
  2. Rebuilds the FAISS index
  3. Regenerates embeddings for all registered items

Example logs when switching from sentence-transformers (384) to OpenAI (1536):

WARNING: Embedding dimension mismatch detected
  Expected: 384 (from existing index)
  Got: 1536 (from current model)
Rebuilding FAISS index with new dimensions...
Regenerating embeddings for all items...
Index rebuild complete

No Code Changes Required

Just update your environment variables or Terraform configuration:

# From
EMBEDDINGS_PROVIDER=sentence-transformers
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
EMBEDDINGS_MODEL_DIMENSIONS=384

# To
EMBEDDINGS_PROVIDER=litellm
EMBEDDINGS_MODEL_NAME=openai/text-embedding-ada-002
EMBEDDINGS_MODEL_DIMENSIONS=1536
EMBEDDINGS_API_KEY=sk-your-key

Restart the service and the index will be automatically rebuilt.

AWS Bedrock Setup

IAM Permissions

For Amazon Bedrock embeddings, ensure your ECS task role has the following permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/amazon.titan-embed-text-v1"
      ]
    }
  ]
}

Authentication Methods

IAM Roles (Recommended for ECS/EC2/EKS)

# No additional configuration needed
# ECS task, EC2 instance, or EKS pod automatically uses attached IAM role

Architecture

Embeddings Module Design

EmbeddingsClient (Abstract Base Class)
├── SentenceTransformersClient (Local models)
└── LiteLLMClient (Cloud APIs via LiteLLM)

The embeddings module integrates seamlessly with the FAISS search service:

# In registry/search/service.py
from registry.embeddings import create_embeddings_client

class FaissService:
    async def _load_embedding_model(self):
        self.embedding_model = create_embeddings_client(
            provider=settings.embeddings_provider,
            model_name=settings.embeddings_model_name,
            api_key=settings.embeddings_api_key,
            aws_region=settings.embeddings_aws_region,
            embedding_dimension=settings.embeddings_model_dimensions,
        )

Performance Considerations

Local Models (Sentence Transformers)

  • Runs on your infrastructure (CPU/GPU)
  • No external API calls
  • No per-request costs
  • Model files stored locally
  • Network-independent operation

Cloud APIs (LiteLLM)

  • Runs on provider infrastructure
  • Requires network connectivity
  • API costs apply (varies by provider)
  • No local compute requirements
  • Data transmitted to provider

Troubleshooting

LiteLLM Not Installed

RuntimeError: LiteLLM is not installed. Install it with: uv add litellm

Solution:

uv add litellm

Dimension Mismatch

WARNING: Embedding dimension mismatch: expected 384, got 1536

Solution: Update EMBEDDINGS_MODEL_DIMENSIONS to match your model's actual output dimension. The system will automatically rebuild the index.

API Authentication Errors

OpenAI:

# Verify API key is set correctly
echo $EMBEDDINGS_API_KEY
# Should start with sk-

Bedrock:

# Verify AWS credentials
aws sts get-caller-identity

# Check Bedrock access
aws bedrock list-foundation-models --region us-east-1

Missing IAM Permissions

If using AWS ECS and Bedrock, ensure the task execution role has access to the embeddings API key secret:

# Check IAM policy in terraform/aws-ecs/modules/mcp-gateway/iam.tf
# Should include: aws_secretsmanager_secret.embeddings_api_key.arn

API Reference

Factory Function

from registry.embeddings import create_embeddings_client

client = create_embeddings_client(
    provider: str,                    # "sentence-transformers" or "litellm"
    model_name: str,                  # Model identifier
    api_key: Optional[str] = None,    # API key (litellm only)
    aws_region: Optional[str] = None, # AWS region (Bedrock only)
    embedding_dimension: Optional[int] = None,
)

Client Methods

Generate Embeddings:

embeddings = client.encode(["text1", "text2"])
# Returns: numpy array of shape (n_texts, embedding_dim)

Get Dimension:

dim = client.get_embedding_dimension()
# Returns: int (e.g., 384, 1536)

Best Practices

  1. Choose the provider that matches your deployment requirements
  2. Consider IAM authentication if deploying on AWS
  3. Monitor costs when using cloud APIs - implement caching if needed
  4. Keep dimension consistent - changing models requires index rebuild
  5. Test search results after switching providers to ensure they meet your requirements

Further Reading

Contributing

To add a new embeddings provider:

  1. Create a new client class inheriting from EmbeddingsClient
  2. Implement encode() and get_embedding_dimension() methods
  3. Update create_embeddings_client() factory function
  4. Add configuration options to registry/core/config.py
  5. Update this documentation

License

Apache 2.0 - See LICENSE file for details