find AI list

The intelligent platform for discovering, comparing, and deploying specialized AI capabilities. Built for the next generation of builders.

© 2026 findAIList. All rights reserved.


7 results
Evaluation

Browse AI tools related to Evaluation.

Core Capabilities

Patronus AI

AI Infrastructure

Simulating the World's Intelligence to accelerate progress toward human-aligned AGI

Updated 13d ago
Has API
Pricing: Free
LLM Testing
Hallucination Detection
AI Evaluation

Stanford HELM

AI Evaluation & Benchmarking

The industry-standard framework for holistic, multi-metric evaluation of large language models.

Updated 13d ago
Has API
Pricing: Free
Automated Model Benchmarking
Bias and Toxicity Detection
Robustness Testing

Verity AI

Development & IT

Enterprise Hallucination Detection and Factual Verification Platform

Updated 13d ago
Has API
Pricing: Enterprise (Free to $850/yr)
Verify factual accuracy
Monitor API outputs
Detect AI hallucinations

Braintrust (bt)

Development

The enterprise-grade stack for evaluating, logging, and refining AI applications with 10x developer velocity.

Updated 13d ago
Has API
Pricing: Freemium (Free to $100/yr)
Automated AI Evaluation
Production LLM Logging
Dataset Management

Argilla

Development

The open-source data curation platform for LLMs and Generative AI alignment.

Updated 13d ago
Has API
Pricing: Open Source (Free to $30/yr)
RLHF Data Collection
Model Evaluation
DPO Preference Ranking

Inspect

AI Safety & Evaluation

The open-source framework for rigorous large language model evaluation and safety testing.

Updated 13d ago
Has API
Pricing: Free
LLM Benchmarking
Safety Red Teaming
Agentic Workflow Testing

Tonic Validate

AI Evaluation

Open-source RAG evaluation tool for assessing accuracy, context quality, and latency of RAG systems.

Updated 13d ago
Has API
Pricing: Free
RAG evaluation
Performance monitoring
Experiment tracking