Validation Methodology

How empirical evidence is collected, qualified and weighted in the evolution of the Coaching Trust Framework.


In plain words

CTF claims to grow inductively — its common core is meant to expand only by absorbing what has been observed to work in practice across multiple declinations. For this claim to be more than a slogan, evidence has to be treated as a first-class object: it has to be collected systematically, qualified honestly, and weighed against itself when different observations point in different directions.

This document describes how all that works. It defines what counts as evidence in the context of CTF, what kinds of evidence are recognized, what makes evidence strong or weak, and how the various people involved — adopters, supervisors, researchers, the scientific panel, the caretaker — handle it. The goal is to make the inductive process of CTF as rigorous as it is collaborative.

The methodology is itself revisable. As CTF and its evidence base mature, the methodology will be refined through RFCs.

🆕 The EXPLAINER introduces the inductive philosophy of CTF in plain language. This document goes deeper into how evidence is actually handled.


1. Why empirical evidence matters in CTF

CTF is inductive by design. Its common core does not grow by decree but by validated convergence — when the same requirement is observed to be present and working across multiple declinations. This makes evidence the currency of the framework’s evolution.

Without serious evidence handling:

With serious evidence handling:

2. Types of evidence recognized

CTF recognizes the following types of empirical evidence, in roughly decreasing order of typical evidentiary weight:

2.1 Implementation observation

Documented observations from a platform implementing CTF, describing the actual behaviour of an AI coaching agent in production, the design decisions behind it, the audit findings, and the platform’s interpretation. These observations are submitted through the feedback template. This is the primary form of evidence in CTF; the framework’s whole inductive logic rests on it.

2.2 Supervision incident

A report from a certified supervisor describing a specific situation observed during the audit of a deployed agent, typically a case where the agent behaved unexpectedly or where existing CTF expectations proved unclear. Reports are submitted through the feedback template and explicitly state the supervisor’s qualification and independence.

2.3 Federation review

An analysis or position paper published by a professional federation that has either authored or endorsed a CTF declination, describing what the federation has observed in the application of its declination by adopters in its sphere.

2.4 Academic study

A peer-reviewed academic study that examines AI coaching agents, ethical frameworks, or the application of CTF or comparable frameworks. A submission includes a reference to the publication and notes its methodology, sample, and limitations.

2.5 Meta-analysis or systematic review

A study that aggregates findings from multiple primary sources — particularly valuable for assessing whether a tendency observed in one implementation is generalizable.

2.6 Regulatory or institutional feedback

Communications from regulators, ethics boards, professional ombudspersons, or other institutional actors regarding the application of CTF or comparable frameworks. These are typically less frequent but carry significant weight when they occur.
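For readers who want to track evidence in tooling, the six types above could be represented as a small data structure. The following is a minimal sketch, not part of CTF itself: the names `EvidenceType`, `EvidenceRecord` and `in_document_order`, and the record fields, are assumptions made for illustration, and the actual feedback template may look different. The numbers are ordinal positions that mirror the rough ordering of this section; they are not weights or scores.

```python
from dataclasses import dataclass
from enum import IntEnum


class EvidenceType(IntEnum):
    """The six evidence types of section 2, numbered in document order.

    The numbers are ordinal positions only, not weights or scores.
    """
    IMPLEMENTATION_OBSERVATION = 1
    SUPERVISION_INCIDENT = 2
    FEDERATION_REVIEW = 3
    ACADEMIC_STUDY = 4
    META_ANALYSIS = 5
    REGULATORY_FEEDBACK = 6


@dataclass(frozen=True)
class EvidenceRecord:
    # Hypothetical fields; the actual feedback template may differ.
    evidence_type: EvidenceType
    source: str
    summary: str


def in_document_order(records):
    """Return records sorted by the section 2 ordering (stable within a type)."""
    return sorted(records, key=lambda r: r.evidence_type)


records = [
    EvidenceRecord(EvidenceType.ACADEMIC_STUDY, "university-lab", "Peer-reviewed study"),
    EvidenceRecord(EvidenceType.IMPLEMENTATION_OBSERVATION, "platform-a", "Production observation"),
]
ordered = in_document_order(records)
```

Sorting by type only arranges records for reading. It does not replace the qualitative assessment of each piece of evidence.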

3. Quality criteria

Each piece of evidence is evaluated against the following criteria:

These criteria are guidelines, not a scoring algorithm. Quality assessment is qualitative and is performed by the scientific panel when the panel exists.

4. Weighting in decisions

When the caretaker (or any future deciding body) is faced with a decision on a substantive RFC, evidence is weighted roughly as follows:

5. The evidence log

All evidence accepted into the framework’s reasoning is recorded in the public evidence log. The log is:

6. Roles in evidence handling

6.1 Adopters

Adopters are the primary source of implementation observations. They are expected to send observations through the feedback template at a frequency appropriate to their scale of operation (typically quarterly or semi-annually).

6.2 Certified supervisors

Certified supervisors are encouraged to share observations from their audit work, anonymized appropriately, especially when the audit surfaces something CTF expectations did not anticipate.

6.3 Researchers

Independent researchers studying AI coaching agents are encouraged to publish their work and to submit relevant publications (or summaries thereof) to the evidence log. CTF recognizes that academic timelines are slower than software development; periodic synthesis is welcomed.

6.4 Scientific panel (when constituted)

The scientific panel reviews the quality of evidence cited in substantive RFCs, especially promotion RFCs. Its opinions are non-binding but published. The panel does not collect evidence; it qualifies the evidence already in the log or attached to an RFC.

6.5 Caretaker

The caretaker is responsible for keeping the evidence log in good order: integrating accepted submissions, removing duplicates, flagging contested entries, ensuring that anonymization is respected.
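The caretaker duties listed above (integrating accepted submissions, removing duplicates, flagging contested entries, respecting anonymization) can be sketched as a few operations on an in-memory log. This is an illustrative sketch under stated assumptions: `LogEntry`, `EvidenceLog`, `fingerprint`, the `anonymized` flag and the duplicate rule (same normalized source and summary) are all hypothetical, and CTF does not prescribe any of them.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    # Hypothetical fields, chosen for illustration only.
    source: str
    summary: str
    anonymized: bool
    contested: bool = False


def fingerprint(entry: LogEntry) -> str:
    """Hash of normalized source and summary, used to spot duplicates."""
    text = f"{entry.source.strip().lower()}|{entry.summary.strip().lower()}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class EvidenceLog:
    entries: dict = field(default_factory=dict)  # fingerprint -> LogEntry

    def integrate(self, entry: LogEntry) -> bool:
        """Add an accepted submission; return False if it was rejected."""
        if not entry.anonymized:
            return False  # anonymization must be respected before integration
        key = fingerprint(entry)
        if key in self.entries:
            return False  # duplicate of an existing entry
        self.entries[key] = entry
        return True

    def flag_contested(self, entry: LogEntry) -> None:
        """Mark an existing entry as contested without removing it."""
        self.entries[fingerprint(entry)].contested = True


log = EvidenceLog()
first = LogEntry("platform-a", "Agent declined an out-of-scope request", anonymized=True)
added = log.integrate(first)
duplicate_added = log.integrate(
    LogEntry("Platform-A ", "agent declined an out-of-scope request", anonymized=True)
)
raw_added = log.integrate(LogEntry("platform-b", "Names a client", anonymized=False))
log.flag_contested(first)
```

In this sketch a contested entry stays in the log and is only marked, which keeps the disagreement visible. `flag_contested` raises `KeyError` if the entry was never integrated.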

6.6 Consultative council (when constituted)

The consultative council does not directly handle evidence. It draws on the evidence log when forming its non-binding opinions on substantive RFCs.

7. Two traditions, one method

The methodology of CTF combines two traditions of validation:

7.1 The evidence-based scientific tradition

From this tradition, CTF takes:

7.2 The technical standards tradition (W3C, IETF)

From this tradition, CTF takes:

Combining the two traditions gives CTF a methodology that is rigorous without being academic, and open without being lax.

8. Limits of the methodology

This methodology has limits that should be acknowledged:

9. Revisions

This methodology document is itself subject to RFCs. As CTF and its evidence base mature, the methodology will evolve. Revisions follow the rules in ../GOVERNANCE.md.


The integrity of CTF as a common good rests in part on the integrity of how it handles evidence. This document is the place where that integrity is made explicit and revisable.