Introduction
Ollama allows you to run large language models locally on your own hardware. By integrating Ollama with Aikeedo, you can provide AI capabilities without relying on cloud services, ensuring data privacy and reducing costs.
Currently, Ollama integration is available only for the Chat tool in Aikeedo.
About Ollama
Ollama is an open-source project that simplifies running and managing large language models locally. It offers:
- Easy installation and setup
- Support for various open-source models
- Local execution for enhanced privacy
- Custom model configuration
- REST API for integration
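For instance, the REST API can be exercised directly with curl; a minimal sketch, assuming a locally running instance with the llama2 model already pulled:
```
# Ask a local Ollama instance for a single, non-streamed completion.
# /api/generate with model, prompt, and stream is part of Ollama's
# documented REST API; the model and prompt here are illustrative.
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Summarize what Ollama does in one sentence.",
  "stream": false
}'
```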
Setting Up Ollama
Step 1: Install Ollama
For detailed installation instructions and troubleshooting, refer to the Ollama Installation Guide.
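On Linux, installation is typically a one-line script, mirroring the official instructions; macOS and Windows use the installers linked from the guide:
```
# Official Linux install script from ollama.com
curl -fsSL https://ollama.com/install.sh | sh
```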
Step 2: Configure Your Model
- Start the Ollama service
- Pull your desired model. For example:
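A sketch of both steps, using llama2 as an illustrative model:
```
# Start the Ollama service (skip if it already runs as a system service)
ollama serve

# Download the llama2 model from the Ollama registry
ollama pull llama2
```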
Step 3: Configure Server Address
By default, Ollama runs on http://localhost:11434 and accepts connections from the local machine only. To make it reachable from a remote Aikeedo installation:
- Configure your firewall to allow access to port 11434
- Set up proper security measures as Ollama doesn’t include authentication by default
When exposing Ollama to remote access, ensure you implement appropriate security measures to protect your server.
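One way to expose the service is the OLLAMA_HOST environment variable; a minimal sketch, assuming you start Ollama manually (systemd installs set the variable in the service unit instead):
```
# Bind Ollama to all interfaces on the default port instead of
# localhost only. 0.0.0.0 is an illustrative choice; restrict the
# bind address or front the service with a proxy whenever possible.
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```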
Integrating with Aikeedo
Step 1: Configure Aikeedo
- Log in to your Aikeedo admin panel
- Navigate to Settings → Integrations → Ollama
- Enter your Ollama server address (e.g., http://localhost:11434)
- Add your models:
  - Key: A unique identifier (e.g., ollama/llama2:latest)
  - Name: Display name for the model
  - Provider: Set as “Meta” for Llama models, or the appropriate provider
- Click “Save changes”
The server address must include the scheme (http/https), host, and port number.
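Before saving, you can confirm the address is reachable from the Aikeedo host; a quick check, assuming the default address:
```
# Lists the models available on the server; a JSON response confirms
# the scheme, host, and port are correct.
curl http://localhost:11434/api/tags
```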
Available Models
Popular models you can use with Ollama include:
- Llama 2: Meta’s open-source model series
- Mistral: High-performance 7B model
- Phi: Microsoft’s compact yet powerful model
- Neural Chat: Optimized for conversation
- Code Llama: Specialized for programming tasks
- Vicuna: Fine-tuned for chat and assistance
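Each of these is pulled by name, optionally with a size tag; a sketch assuming the default registry names:
```
# Pull a model by name; the tag defaults to :latest
ollama pull mistral

# Pull a specific size variant
ollama pull llama2:13b

# Show what is installed locally
ollama list
```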
Model Configuration
You can customize model behavior using Modelfiles.
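A minimal sketch of a Modelfile, assuming llama2 as the base model; the parameter values are illustrative rather than recommendations:
```
# Base model to build on
FROM llama2

# Sampling temperature and context window (illustrative values)
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

# System prompt baked into the custom model
SYSTEM You are a concise, helpful assistant.
```
Build and register the customized model with `ollama create my-assistant -f Modelfile`, then reference it in Aikeedo by that name (my-assistant is a hypothetical name).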
Hardware Requirements
Minimum requirements vary by model:
- 7B models: 8GB RAM
- 13B models: 16GB RAM
- 33B+ models: 32GB+ RAM
GPU acceleration is supported on:
- NVIDIA GPUs with CUDA
- AMD GPUs with ROCm
- Apple Silicon (Metal)
Best Practices
- Model Selection:
  - Start with smaller models (7B) and test performance
  - Consider your hardware capabilities
  - Choose models based on your specific use case
- Performance:
  - Enable GPU acceleration when available
  - Monitor system resources
  - Adjust model parameters for optimal performance
- Security:
  - Implement proper firewall rules
  - Use a reverse proxy for additional security (see the sketch after this list)
  - Apply regular system updates
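As an example of the reverse-proxy advice, a minimal nginx sketch that adds TLS and basic authentication in front of Ollama; the hostname, certificate paths, and credential file are assumptions for illustration:
```
server {
    listen 443 ssl;
    server_name ollama.example.com;                  # hypothetical hostname

    ssl_certificate     /etc/ssl/certs/ollama.crt;   # your certificate
    ssl_certificate_key /etc/ssl/private/ollama.key; # your key

    location / {
        # Ollama has no built-in auth, so require it at the proxy
        auth_basic           "Ollama";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:11434;
    }
}
```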
Troubleshooting
Common issues and solutions (see the diagnostic commands after this list):
- Connection Issues:
  - Verify the Ollama service is running
  - Check the server address format
  - Confirm firewall settings
- Model Loading Failures:
  - Ensure sufficient system resources
  - Verify the model is properly pulled
  - Check model compatibility
- Performance Issues:
  - Monitor system resources
  - Consider using a smaller model
  - Adjust model parameters
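A few diagnostic commands covering the checks above, assuming the default local address:
```
# Is the service up and reachable?
curl http://localhost:11434/api/version

# Is the model actually pulled?
ollama list

# Re-pull a model whose download may have been interrupted
ollama pull llama2
```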
Additional Resources
- Ollama website: https://ollama.com
- Ollama GitHub repository: https://github.com/ollama/ollama
Keep Ollama and your models updated for the best performance and security.