# Embeddings Configuration
Flexible, vendor-agnostic embeddings generation for MCP Gateway Registry's semantic search functionality.
## Overview
The MCP Gateway Registry provides semantic search capabilities across MCP servers, tools, and AI agents. You can choose from three embedding provider options to power this search:
- Sentence Transformers (Default) - Local models
- OpenAI - Cloud embeddings via API
- Any LiteLLM-supported provider - Amazon Bedrock Titan, Cohere, and 100+ other models
Switch between providers with simple configuration changes - no code modifications required.
## Features
- Vendor-agnostic: Switch between embeddings providers with configuration changes
- Local & Cloud Support: Use local models or cloud APIs (OpenAI, Cohere, Amazon Bedrock, etc.)
- Backward Compatible: Works seamlessly with existing FAISS indices
- Easy Configuration: Simple environment variable setup
- Extensible: Easy to add new providers
- Production-Ready: Terraform support for AWS deployments
## Quick Start

### Option 1: Sentence Transformers (Default)

Local embedding models that run on your infrastructure.

```bash
# In .env
EMBEDDINGS_PROVIDER=sentence-transformers
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
EMBEDDINGS_MODEL_DIMENSIONS=384
```

Characteristics:

- Runs locally on your infrastructure
- No API costs
- No external network calls required
- Requires CPU/GPU resources
- Model files stored locally
- Data stays within your infrastructure
### Option 2: OpenAI

Cloud-based embedding service via the OpenAI API.

```bash
# In .env
EMBEDDINGS_PROVIDER=litellm
EMBEDDINGS_MODEL_NAME=openai/text-embedding-ada-002
EMBEDDINGS_MODEL_DIMENSIONS=1536
EMBEDDINGS_API_KEY=sk-your-openai-api-key
```

Characteristics:

- Cloud-based service
- Requires API key
- API costs per 1K tokens
- No local compute resources needed
- Network dependency
- Data sent to OpenAI
### Option 3: Amazon Bedrock Titan

Cloud-based embedding service via AWS Bedrock.

```bash
# In .env
EMBEDDINGS_PROVIDER=litellm
EMBEDDINGS_MODEL_NAME=bedrock/amazon.titan-embed-text-v1
EMBEDDINGS_MODEL_DIMENSIONS=1536
EMBEDDINGS_AWS_REGION=us-east-1
# No API key needed - uses IAM
```

Characteristics:

- Cloud-based service
- Uses IAM authentication (no API key required)
- Integrates with AWS security model
- API costs apply
- Requires AWS credentials
- Available in select AWS regions
## Configuration

### Environment Variables

| Variable | Description | Default | Required |
|---|---|---|---|
| `EMBEDDINGS_PROVIDER` | Provider type: `sentence-transformers` or `litellm` | `sentence-transformers` | No |
| `EMBEDDINGS_MODEL_NAME` | Model identifier | `all-MiniLM-L6-v2` | Yes |
| `EMBEDDINGS_MODEL_DIMENSIONS` | Embedding dimension | `384` | Yes |
| `EMBEDDINGS_API_KEY` | API key for cloud providers (OpenAI, Cohere, etc.) | - | For cloud* |
| `EMBEDDINGS_API_BASE` | Custom API endpoint (LiteLLM only) | - | No |
| `EMBEDDINGS_AWS_REGION` | AWS region for Bedrock (LiteLLM only) | - | For Bedrock |

*Not required for AWS Bedrock, which uses the standard AWS credential chain (IAM roles, environment variables, `~/.aws/credentials`)
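For illustration, the variables above might be read in application code as follows. This is a minimal stdlib sketch, not the registry's actual configuration loader; the function name `load_embeddings_config` is hypothetical.

```python
import os

def load_embeddings_config() -> dict:
    """Read embeddings settings from the environment, applying the documented defaults."""
    return {
        "provider": os.environ.get("EMBEDDINGS_PROVIDER", "sentence-transformers"),
        "model_name": os.environ.get("EMBEDDINGS_MODEL_NAME", "all-MiniLM-L6-v2"),
        "dimensions": int(os.environ.get("EMBEDDINGS_MODEL_DIMENSIONS", "384")),
        # Empty strings are treated as "not set" for the optional values.
        "api_key": os.environ.get("EMBEDDINGS_API_KEY") or None,
        "aws_region": os.environ.get("EMBEDDINGS_AWS_REGION") or None,
    }
```

With no variables set, this yields the sentence-transformers defaults; setting `EMBEDDINGS_PROVIDER=litellm` switches the provider without any code change.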
### Terraform Configuration

For AWS ECS deployments, configure embeddings in your `terraform.tfvars`:

#### Using Sentence Transformers (Default)

```hcl
# Local embeddings - no additional configuration needed
# Uses defaults: sentence-transformers with all-MiniLM-L6-v2
```

#### Using OpenAI

```hcl
embeddings_provider         = "litellm"
embeddings_model_name       = "openai/text-embedding-ada-002"
embeddings_model_dimensions = 1536
embeddings_api_key          = "sk-proj-YOUR-OPENAI-API-KEY"
```

#### Using Amazon Bedrock

```hcl
embeddings_provider         = "litellm"
embeddings_model_name       = "bedrock/amazon.titan-embed-text-v1"
embeddings_model_dimensions = 1536
embeddings_aws_region       = "us-east-1"
embeddings_api_key          = "" # Empty for Bedrock (uses IAM)
```

See `terraform/aws-ecs/terraform.tfvars.example` for complete examples.
## Supported Models

### Sentence Transformers (Local)

| Model | Dimensions | Description |
|---|---|---|
| `all-MiniLM-L6-v2` | 384 | Fast, lightweight (default) |
| `all-mpnet-base-v2` | 768 | High quality |
| `paraphrase-multilingual-MiniLM-L12-v2` | 384 | Multilingual |

Any model from the Hugging Face sentence-transformers collection is supported.
### LiteLLM (Cloud-based)

LiteLLM supports 100+ embedding models from various providers:

#### OpenAI

- `openai/text-embedding-3-small` (1536 dimensions)
- `openai/text-embedding-3-large` (3072 dimensions)
- `openai/text-embedding-ada-002` (1536 dimensions)

#### Cohere

- `cohere/embed-english-v3.0` (1024 dimensions)
- `cohere/embed-multilingual-v3.0` (1024 dimensions)

#### Amazon Bedrock

- `bedrock/amazon.titan-embed-text-v1` (1536 dimensions)
- `bedrock/cohere.embed-english-v3` (1024 dimensions)
- `bedrock/cohere.embed-multilingual-v3` (1024 dimensions)

#### Other Providers

- Azure OpenAI
- Google Vertex AI
- Hugging Face Inference API
- And 100+ more via LiteLLM
## Migration Between Providers

### Switching Providers

When you switch to an embedding provider or model with a different dimension, the registry automatically:

- Detects the dimension mismatch
- Rebuilds the FAISS index
- Regenerates embeddings for all registered items

Example logs when switching from sentence-transformers (384) to OpenAI (1536):

```
WARNING: Embedding dimension mismatch detected
Expected: 384 (from existing index)
Got: 1536 (from current model)
Rebuilding FAISS index with new dimensions...
Regenerating embeddings for all items...
Index rebuild complete
```
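The rebuild decision comes down to comparing the dimension stored with the existing index against the dimension the current model produces. A minimal sketch of that check (function names are illustrative, not the registry's actual code):

```python
def needs_index_rebuild(index_dim: int, model_dim: int) -> bool:
    """Return True when the existing FAISS index was built with a different
    embedding dimension than the currently configured model produces."""
    return index_dim != model_dim

def check_index(index_dim: int, model_dim: int) -> str:
    """Summarize the decision, mirroring the log messages shown above."""
    if needs_index_rebuild(index_dim, model_dim):
        return (f"Dimension mismatch: index has {index_dim}, "
                f"model produces {model_dim} -> rebuilding index")
    return "Dimensions match -> reusing existing index"
```

Because the check runs at startup, a stale index is never queried with incompatible vectors.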
### No Code Changes Required

Just update your environment variables or Terraform configuration:

```bash
# From
EMBEDDINGS_PROVIDER=sentence-transformers
EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
EMBEDDINGS_MODEL_DIMENSIONS=384

# To
EMBEDDINGS_PROVIDER=litellm
EMBEDDINGS_MODEL_NAME=openai/text-embedding-ada-002
EMBEDDINGS_MODEL_DIMENSIONS=1536
EMBEDDINGS_API_KEY=sk-your-key
```

Restart the service and the index will be rebuilt automatically.
## AWS Bedrock Setup

### IAM Permissions

For Amazon Bedrock embeddings, ensure your ECS task role has the following permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/amazon.titan-embed-text-v1"
      ]
    }
  ]
}
```

### Authentication Methods

**IAM Roles (Recommended for ECS/EC2/EKS)**

```bash
# No additional configuration needed
# ECS task, EC2 instance, or EKS pod automatically uses the attached IAM role
```
## Architecture

### Embeddings Module Design

```
EmbeddingsClient (Abstract Base Class)
├── SentenceTransformersClient (Local models)
└── LiteLLMClient (Cloud APIs via LiteLLM)
```
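The contract this hierarchy implies can be sketched with Python's `abc` module. This is a simplified illustration of the pattern, not the registry's actual class definitions, and `FakeClient` is purely a toy backend:

```python
from abc import ABC, abstractmethod

class EmbeddingsClient(ABC):
    """Common interface shared by local and cloud embedding backends."""

    @abstractmethod
    def encode(self, texts: list[str]) -> list[list[float]]:
        """Return one embedding vector per input text."""

    @abstractmethod
    def get_embedding_dimension(self) -> int:
        """Return the dimensionality of the vectors encode() produces."""

class FakeClient(EmbeddingsClient):
    """Toy backend used only to demonstrate the contract."""

    def encode(self, texts):
        # Every text maps to a fixed 3-dimensional zero vector.
        return [[0.0] * 3 for _ in texts]

    def get_embedding_dimension(self):
        return 3
```

Because callers only depend on the abstract interface, swapping `SentenceTransformersClient` for `LiteLLMClient` requires no changes to the search service.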
### Integration with FAISS Search

The embeddings module integrates seamlessly with the FAISS search service:

```python
# In registry/search/service.py
from registry.embeddings import create_embeddings_client

class FaissService:
    async def _load_embedding_model(self):
        self.embedding_model = create_embeddings_client(
            provider=settings.embeddings_provider,
            model_name=settings.embeddings_model_name,
            api_key=settings.embeddings_api_key,
            aws_region=settings.embeddings_aws_region,
            embedding_dimension=settings.embeddings_model_dimensions,
        )
```
## Performance Considerations

### Local Models (Sentence Transformers)
- Runs on your infrastructure (CPU/GPU)
- No external API calls
- No per-request costs
- Model files stored locally
- Network-independent operation
### Cloud APIs (LiteLLM)
- Runs on provider infrastructure
- Requires network connectivity
- API costs apply (varies by provider)
- No local compute requirements
- Data transmitted to provider
## Troubleshooting

### LiteLLM Not Installed

Solution: Install the optional dependency with `pip install litellm`, then restart the service.

### Dimension Mismatch

Solution: Update `EMBEDDINGS_MODEL_DIMENSIONS` to match your model's actual output dimension. The system will automatically rebuild the index.

### API Authentication Errors

OpenAI: Verify that `EMBEDDINGS_API_KEY` is set to a valid key.

Bedrock:

```bash
# Verify AWS credentials
aws sts get-caller-identity

# Check Bedrock access
aws bedrock list-foundation-models --region us-east-1
```
### Missing IAM Permissions

If using AWS ECS and Bedrock, ensure the task execution role has access to the embeddings API key secret:

```hcl
# Check IAM policy in terraform/aws-ecs/modules/mcp-gateway/iam.tf
# Should include: aws_secretsmanager_secret.embeddings_api_key.arn
```
## API Reference

### Factory Function

```python
from registry.embeddings import create_embeddings_client

client = create_embeddings_client(
    provider,                  # str: "sentence-transformers" or "litellm"
    model_name,                # str: Model identifier
    api_key=None,              # Optional[str]: API key (litellm only)
    aws_region=None,           # Optional[str]: AWS region (Bedrock only)
    embedding_dimension=None,  # Optional[int]: Expected embedding dimension
)
```
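The dispatch behind such a factory can be sketched as follows. This is illustrative only (the real function constructs and returns concrete client instances rather than class names):

```python
def create_embeddings_client_sketch(provider: str, model_name: str, **kwargs):
    """Map a provider string to a backend; unknown providers fail fast.

    Returns (backend_name, model_name) instead of a real client object,
    purely to keep the sketch self-contained.
    """
    backends = {
        "sentence-transformers": "SentenceTransformersClient",
        "litellm": "LiteLLMClient",
    }
    if provider not in backends:
        raise ValueError(f"Unknown embeddings provider: {provider!r}")
    return backends[provider], model_name
```

Failing fast on an unrecognized provider surfaces configuration typos at startup rather than at query time.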
### Client Methods

Generate Embeddings:

```python
embeddings = client.encode(["text1", "text2"])
# Returns: numpy array of shape (n_texts, embedding_dim)
```

Get Dimension: `client.get_embedding_dimension()` returns the embedding dimension as an `int`.
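To see how these vectors feed search, here is a plain-Python sketch of cosine-similarity ranking, the operation FAISS performs far more efficiently at scale. The `cosine` and `rank` helpers are illustrative, not registry APIs:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is all-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec: list[float], item_vecs: list[list[float]]) -> list[int]:
    """Return item indices sorted by descending similarity to the query."""
    scores = [cosine(query_vec, v) for v in item_vecs]
    return sorted(range(len(item_vecs)), key=lambda i: -scores[i])
```

This is why the query and the indexed items must come from the same model: vectors from models with different dimensions (or different training) are not comparable.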
## Best Practices
- Choose the provider that matches your deployment requirements
- Consider IAM authentication if deploying on AWS
- Monitor costs when using cloud APIs - implement caching if needed
- Keep dimension consistent - changing models requires index rebuild
- Test search results after switching providers to ensure they meet your requirements
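The caching advice above can be as simple as memoizing embeddings by text so repeated inputs never trigger repeated API calls. A hypothetical wrapper (not part of the registry; a production cache would also bound memory and key on the model name):

```python
class CachingEmbedder:
    """Wrap any object with an encode(texts) method and cache per-text results."""

    def __init__(self, client):
        self.client = client
        self._cache: dict[str, list[float]] = {}

    def encode(self, texts: list[str]) -> list[list[float]]:
        # Only send texts we haven't embedded before to the underlying client.
        missing = [t for t in texts if t not in self._cache]
        if missing:
            for t, vec in zip(missing, self.client.encode(missing)):
                self._cache[t] = vec
        return [self._cache[t] for t in texts]
```

For cloud providers billed per token, this kind of wrapper directly reduces cost for workloads with repeated queries.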
## Further Reading
- LiteLLM Documentation
- OpenAI Embeddings Guide
- Amazon Bedrock Embeddings
- Sentence Transformers Models
- FAISS Search Implementation
## Contributing

To add a new embeddings provider:

- Create a new client class inheriting from `EmbeddingsClient`
- Implement the `encode()` and `get_embedding_dimension()` methods
- Update the `create_embeddings_client()` factory function
- Add configuration options to `registry/core/config.py`
- Update this documentation
## License
Apache 2.0 - See LICENSE file for details