Smart Local Model Routing for Privacy-Focused AI Chat


This n8n workflow intelligently routes user prompts to locally hosted large language models (LLMs) served by Ollama. When a chat message arrives via webhook, an LLM router node classifies the task and selects the best-suited Ollama model based on predefined rules and model capabilities. Depending on the classification, the workflow dynamically assigns a text-only, code, vision, or combined model to handle the request, and memory buffers preserve conversation history for context-aware responses.

Because every prompt is processed locally, no data leaves your machine. This makes the workflow well suited to privacy-conscious users and AI enthusiasts who want secure, self-hosted AI deployment with fine-grained control over which model handles each task.

Node Count

11 – 20 Nodes

Nodes Used

@n8n/n8n-nodes-langchain.agent, @n8n/n8n-nodes-langchain.chatTrigger, @n8n/n8n-nodes-langchain.lmChatOllama, @n8n/n8n-nodes-langchain.memoryBufferWindow, stickyNote
