Automated Answer Evaluation for Dataset Questions

This n8n workflow automates evaluating whether answers generated by an AI model match the reference answers in a dataset. It fetches question data from Google Sheets, uses OpenAI's GPT-4 models to generate answers and to judge them against the references, and calculates relevance metrics from the results. It is useful for QA teams, AI training, and benchmarking, since it systematically compares generated responses to authoritative answers. The pipeline has three stages: data retrieval, similarity assessment, and a conditional evaluation step that invokes the more expensive judge model only when needed, reducing cost while keeping answer validation accurate.
| Node Count | 11 – 20 Nodes |
| --- | --- |
| Nodes Used | @n8n/n8n-nodes-langchain.agent, @n8n/n8n-nodes-langchain.chatTrigger, @n8n/n8n-nodes-langchain.lmChatOpenAi, @n8n/n8n-nodes-langchain.openAi, evaluation, evaluationTrigger, noOp, set, stickyNote |
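The cost-saving idea behind the conditional evaluation stage is a "cheap gate, expensive judge" pattern: a fast lexical similarity check accepts near-verbatim matches and rejects clear misses outright, and only ambiguous cases are sent to a GPT-4-class judge. Below is a minimal standalone sketch of that pattern in TypeScript using the OpenAI Node SDK; the function names, thresholds, and model id are illustrative assumptions, not the workflow's actual code.

```typescript
// Sketch of a gated answer-evaluation step, assuming the OpenAI Node SDK.
// Thresholds (0.8 / 0.1) and the model id are illustrative choices.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Cheap first pass: Jaccard overlap between the two answers' word sets.
function tokenOverlap(a: string, b: string): number {
  const tok = (s: string) => new Set(s.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  const sa = tok(a);
  const sb = tok(b);
  const inter = [...sa].filter((w) => sb.has(w)).length;
  const union = new Set([...sa, ...sb]).size;
  return union === 0 ? 0 : inter / union;
}

// Call the GPT-4-class judge only when the cheap score is inconclusive.
async function judgeAnswer(
  question: string,
  reference: string,
  generated: string
): Promise<boolean> {
  const overlap = tokenOverlap(reference, generated);
  if (overlap > 0.8) return true; // near-verbatim match: accept, no API call
  if (overlap < 0.1) return false; // no shared content: reject, no API call

  const resp = await client.chat.completions.create({
    model: "gpt-4o", // assumed model id; the workflow's exact model may differ
    messages: [
      {
        role: "system",
        content:
          "You compare a generated answer to a reference answer. " +
          'Reply with exactly "MATCH" or "NO_MATCH".',
      },
      {
        role: "user",
        content: `Question: ${question}\nReference: ${reference}\nGenerated: ${generated}`,
      },
    ],
  });
  return resp.choices[0].message.content?.trim() === "MATCH";
}
```

Gating the judge this way means most rows in a large evaluation sheet never trigger a model call at all, which is the same trade-off the workflow's conditional evaluation step is designed to make.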