Ollama Integration
Run open-source large language models locally with Ollama integration. This guide explains how to set up and configure Ollama for your Aikeedo platform.
Introduction
Ollama allows you to run large language models locally on your own hardware. By integrating Ollama with Aikeedo, you can provide AI capabilities without relying on cloud services, ensuring data privacy and reducing costs.
Currently, Ollama integration is available only for the Chat tool in Aikeedo.
About Ollama
Ollama is an open-source project that simplifies running and managing large language models locally. It offers:
- Easy installation and setup
- Support for various open-source models
- Local execution for enhanced privacy
- Custom model configuration
- REST API for integration
For more information, visit the official Ollama website.
Setting Up Ollama
Step 1: Install Ollama
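If you have not installed Ollama yet, the commands below sketch a typical setup: the official install script on Linux, and the Homebrew formula as one option on macOS (you can also download the installer from the Ollama website):

```bash
# Linux: install Ollama with the official install script
curl -fsSL https://ollama.com/install.sh | sh

# macOS (one option): install via Homebrew
brew install ollama

# Verify the installation
ollama --version
```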
For Apple Silicon (M1/M2) Macs, Metal GPU acceleration is supported automatically.
For detailed installation instructions and troubleshooting, refer to the Ollama Installation Guide.
Step 2: Configure Your Model
- Start the Ollama service
- Pull your desired model. For example:
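A minimal command sequence (llama2 is just an example tag; substitute any model from the library):

```bash
# Start the Ollama service if it is not already running
ollama serve

# Pull a model from the Ollama library, e.g. Llama 2
ollama pull llama2

# Confirm the model is available locally
ollama list
```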
View all available models at Ollama Model Library.
Step 3: Configure Server Address
By default, Ollama runs on http://localhost:11434 and accepts connections only from the local machine.
For remote access, you’ll need to:
- Configure Ollama to listen on an external interface (for example via the OLLAMA_HOST environment variable, as sketched below)
- Configure your firewall to allow access to port 11434
- Set up proper security measures, as Ollama doesn’t include authentication by default
When exposing Ollama to remote access, ensure you implement appropriate security measures to protect your server.
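As a sketch, binding Ollama to all interfaces and checking it from another machine might look like this (the hostname is a placeholder; on Linux installs managed by systemd, the variable is usually set in the service unit instead):

```bash
# Bind Ollama to all network interfaces instead of localhost only
export OLLAMA_HOST=0.0.0.0:11434
ollama serve

# From another machine: confirm the API responds (replace the hostname)
curl http://your-ollama-server:11434/api/version
```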
Integrating with Aikeedo
Step 1: Configure Aikeedo
- Log in to your Aikeedo admin panel
- Navigate to Settings → Integrations → Ollama
- Enter your Ollama server address (e.g., http://localhost:11434)
- Add your models:
  - Key: A unique identifier (e.g., ollama/llama2:latest)
  - Name: Display name for the model
  - Provider: Set as “Meta” for Llama models, or appropriate provider
- Click “Save changes”
The server address must include the scheme (http/https), host, and port number.
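A quick way to see which model names the Ollama server has actually pulled is its REST API; the listing shows names such as llama2:latest:

```bash
# List the models pulled on the Ollama server (adjust the address if remote)
curl http://localhost:11434/api/tags
```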
Available Models
Popular models you can use with Ollama include:
- Llama 2: Meta’s open-source model series
- Mistral: High-performance 7B model
- Phi: Microsoft’s compact yet powerful model
- Neural Chat: Optimized for conversation
- Code Llama: Specialized for programming tasks
- Vicuna: Fine-tuned for chat and assistance
For a complete list, check the Ollama Model Library.
Model Configuration
You can customize model behavior using Modelfiles. For example, a minimal Modelfile deriving a custom assistant from Llama 2 might look like this (the name and parameter values are illustrative):
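```
# Base the custom model on Llama 2
FROM llama2

# Sampling and context-window parameters (tune to your use case)
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

# A system prompt applied to every conversation
SYSTEM """
You are a concise, friendly support assistant.
"""
```

Build it with ollama create my-assistant -f Modelfile, then register my-assistant as an additional model key in Aikeedo (my-assistant is just an example name).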
Learn more about model configuration in the Ollama Documentation.
Hardware Requirements
Minimum requirements vary by model:
- 7B models: 8GB RAM
- 13B models: 16GB RAM
- 33B+ models: 32GB+ RAM
GPU acceleration is supported for:
- NVIDIA GPUs with CUDA
- AMD GPUs with ROCm
- Apple Silicon (Metal)
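A quick way to check whether a model is actually running on the GPU is Ollama’s process listing (available in recent versions), together with the vendor’s monitoring tool on NVIDIA systems:

```bash
# Show loaded models and whether they run on CPU or GPU
ollama ps

# On NVIDIA systems, watch GPU memory and utilization
nvidia-smi
```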
Best Practices
- Model Selection:
  - Start with smaller models (7B) and test performance
  - Consider your hardware capabilities
  - Choose models based on your specific use case
- Performance:
  - Enable GPU acceleration when available
  - Monitor system resources
  - Adjust model parameters for optimal performance
- Security:
  - Implement proper firewall rules (for example, as sketched below)
  - Use a reverse proxy for additional security
  - Apply regular system updates
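On a Linux host using ufw, restricting the Ollama port to the Aikeedo server’s address is one way to implement such a rule (the IP is a placeholder; adapt to your firewall tooling):

```bash
# Allow only the Aikeedo server (example address) to reach Ollama
sudo ufw allow from 203.0.113.10 to any port 11434 proto tcp

# Block the port for everyone else
sudo ufw deny 11434/tcp
```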
Troubleshooting
Common issues and solutions:
- Connection Issues:
  - Verify Ollama service is running
  - Check server address format
  - Confirm firewall settings
- Model Loading Failures:
  - Ensure sufficient system resources
  - Verify model is properly pulled
  - Check model compatibility
- Performance Issues:
  - Monitor system resources
  - Consider using a smaller model
  - Adjust model parameters
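A few command-line checks can narrow down most of these issues (the systemctl line assumes a Linux install managed by systemd; adapt as needed):

```bash
# Is the Ollama service running?
systemctl status ollama

# Does the API answer locally?
curl http://localhost:11434/api/version

# Which models are pulled, and how much memory is free?
ollama list
free -h
```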
Additional Resources
Keep Ollama and your models updated for the best performance and security.