Nearly every enterprise software vendor now leads with AI. But the term covers a remarkable range of capabilities, from genuine machine learning integrated into core workflows to basic automation relabelled for the current market cycle. Here is how to evaluate AI claims with rigour.

The AI Labelling Problem

The enterprise software market has a labelling problem. In the past three years, AI has been attached to capabilities that range from genuinely transformative to trivially incremental. The same term, “AI-powered”, appears in marketing materials for platforms that use foundational large language models for complex reasoning tasks and for platforms that use rules-based logic to surface pre-defined recommendations.

These are not equivalent. They have different capability profiles, different risk profiles, different cost structures, and different implications for the organization’s data and operations. Evaluating them requires asking questions that vendor marketing materials are not structured to answer.

A Framework for AI Capability Assessment

Step 1: Classify the AI Claim

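Before evaluating any AI capability, classify what type of AI it actually represents. Useful categories include rules-based logic that surfaces pre-defined recommendations, basic automation, machine learning models integrated into core workflows, and foundational large language models used for complex reasoning tasks.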

Each category carries different performance characteristics, different data requirements, different governance implications, and different failure modes. Conflating them produces unrealistic expectations in both directions.
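One way to keep the classification explicit during an evaluation is to record each vendor claim against a fixed set of categories. The sketch below is illustrative only: the category names follow the distinctions drawn earlier in this article (rules-based logic, basic automation, trained machine learning models, large language models), and the vendor names, features, and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class AIClaimType(Enum):
    # Illustrative categories; adjust to the organization's own taxonomy.
    RULES_BASED = "rules-based logic"
    BASIC_AUTOMATION = "basic automation"
    TRAINED_ML_MODEL = "trained machine learning model"
    LARGE_LANGUAGE_MODEL = "large language model"


@dataclass(frozen=True)
class VendorClaim:
    vendor: str              # hypothetical vendor name
    feature: str             # feature as named in the marketing material
    claim_type: AIClaimType  # the evaluator's classification, not the vendor's
    evidence: str            # what the classification is based on


def group_by_type(claims):
    """Group claims by category so like is compared with like."""
    grouped = {}
    for claim in claims:
        grouped.setdefault(claim.claim_type, []).append(claim)
    return grouped


claims = [
    VendorClaim("Vendor A", "Smart routing", AIClaimType.RULES_BASED,
                "Configuration screen exposes fixed if/then rules"),
    VendorClaim("Vendor B", "Contract summaries", AIClaimType.LARGE_LANGUAGE_MODEL,
                "Technical documentation names a hosted language model"),
    VendorClaim("Vendor C", "Next-best action", AIClaimType.RULES_BASED,
                "Recommendations come from a pre-defined lookup table"),
]

grouped = group_by_type(claims)
```

Grouping the claims this way makes it harder to compare a rules engine against a language model as if they were the same kind of product.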

Step 2: Assess Training Data and Model Provenance

For AI capabilities that involve trained models, the provenance of training data is a material evaluation criterion. Questions to ask:

“An AI feature that improves through exposure to your organizational data is also an AI feature that learns from your organizational data. The governance implications are significant.”

Step 3: Test Under Realistic Conditions

Vendor demonstrations of AI capabilities are conducted under optimized conditions: carefully selected input data, use cases where the model performs well, and a presentation environment that smooths over edge cases.

Independent evaluation of AI capabilities requires testing under realistic conditions: using representative samples of the organization’s actual data and use cases, deliberately testing edge cases and ambiguous inputs, and evaluating performance across the full range of scenarios the organization will actually encounter.

This is the critical difference between watching a demo and conducting an evaluation. AI capabilities that look impressive in vendor-controlled demonstrations frequently reveal significant limitations when tested against real organizational requirements.
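A minimal sketch of what such an evaluation might look like in practice: score the capability separately for typical, edge, and ambiguous cases, so that strong results on easy inputs cannot mask weak results elsewhere. Everything here is an assumption made for illustration. The scenario labels, the sample documents, and toy_predict (a stand-in for whatever call the vendor's capability actually exposes) are hypothetical.

```python
from collections import defaultdict


def accuracy_by_scenario(cases, predict):
    """Return {scenario: fraction of cases where predict(input) == expected}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for scenario, model_input, expected in cases:
        totals[scenario] += 1
        if predict(model_input) == expected:
            correct[scenario] += 1
    return {scenario: correct[scenario] / totals[scenario] for scenario in totals}


def toy_predict(text):
    # Hypothetical stand-in for the vendor capability under test:
    # a naive keyword classifier for incoming documents.
    return "invoice" if "invoice" in text.lower() else "other"


# Each case: (scenario label, input drawn from real organizational data, expected answer).
cases = [
    ("typical", "Invoice 1042 from Acme", "invoice"),
    ("typical", "Meeting notes, 3 March", "other"),
    ("edge", "Inv. 1043 (scanned, partly illegible)", "invoice"),
    ("ambiguous", "Credit note referencing invoice 1042", "other"),
]

report = accuracy_by_scenario(cases, toy_predict)
print(report)  # {'typical': 1.0, 'edge': 0.0, 'ambiguous': 0.0}
```

The stand-in classifier is perfect on the typical cases and fails both of the others, which is the pattern a vendor-controlled demonstration is least likely to reveal. A real evaluation would use far more cases per scenario and a scoring rule suited to the task.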

Step 4: Assess Explainability and Governance

For organizations in regulated sectors, AI governance is not optional. Financial services regulators in multiple jurisdictions require that automated decision-making processes be explainable — the organization must be able to demonstrate why a system reached a particular conclusion.
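To make the explainability requirement concrete, the sketch below shows one possible shape for a per-decision audit record. It is an assumption about what an organization might choose to capture, not a regulatory schema or any vendor's actual log format, and every field name and value is hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DecisionRecord:
    decision_id: str
    model_version: str                 # which model or ruleset produced the output
    inputs: dict                       # the data the system actually saw
    output: str                        # the conclusion it reached
    reasons: list                      # factors the system reports for the outcome
    reviewed_by: Optional[str] = None  # human reviewer, if any


def to_audit_line(record):
    """Serialize a record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(record), sort_keys=True)


def has_explanation(record):
    """A record supports an explanation only if it carries at least one reason."""
    return len(record.reasons) > 0


record = DecisionRecord(
    decision_id="D-0001",
    model_version="credit-screen-2.3",
    inputs={"income_band": "B", "existing_accounts": 2},
    output="refer to manual review",
    reasons=["income band below automatic approval threshold"],
)

audit_line = to_audit_line(record)
```

If a vendor's capability cannot supply the equivalent of the reasons field, the organization has no record from which to demonstrate why a particular conclusion was reached.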

Key governance questions for AI capabilities include:

Step 5: Evaluate the AI Roadmap

AI capabilities in enterprise software are evolving rapidly. A platform’s current AI feature set is less important than the credibility and direction of its AI roadmap. Assessment questions include:

The Governance Imperative

Organizations that are embedding AI capabilities into operational workflows are implicitly accepting governance obligations: to monitor performance, to manage errors, to maintain explainability, and to ensure that AI-assisted decisions meet the same standards of accountability as human-made ones.

Whether a vendor’s AI capabilities are mature enough to support those obligations, and not merely impressive in a demonstration, is the central question of rigorous AI capability assessment.