A Benchmark Is Only As Good As Its Sample
Not every number with a peer comparison attached deserves the same trust. Four things decide which ones do.
Benchmarking is only as credible as the data behind it. Sample size, how the cohort is built, whether the data is anonymized, and how often it refreshes are what separate a benchmark you can run a decision on from a number dressed up to look like one. Knowing what to ask is its own kind of protection.
Benchmarking has an obvious appeal, which means it also has an obvious failure mode: a number with a peer comparison bolted onto it that looks authoritative and is not. As more of the industry starts talking about benchmarks, it becomes worth being precise about what makes one credible, because the gap between a real benchmark and a convincing-looking one is not visible on the surface. They can read identically on a slide. The difference is entirely in how each was made.
A benchmark is, at bottom, a claim about how you compare to others, and that claim is only as strong as the data underneath it. Four things decide whether it holds up, and all four are things you can ask about in plain language, without any special training. Knowing the four questions lets you tell a number you can act on from one you should set aside.
Sample Size, Stated Plainly
The first question to ask of any benchmark is how many businesses are actually in it. A comparison drawn from a handful of operators is too thin to read much into, however confidently it is presented, because a few unusual businesses can swing a small sample badly. In a group of five, one operator with an unusual year accounts for a fifth of the average. A trustworthy benchmark states its sample on every cut of the data, not just the headline, because the sample size is what gives any particular number its weight. A cohort of fifty tells you something a cohort of five cannot.
If you cannot see how many businesses stand behind a figure, it is hard to know how much to lean on it, and the honest move is to lean on it lightly until you can. This is not a high bar to ask for. A benchmark built with care knows its own sample sizes and is happy to show them. One that hides them, or only reports a single grand total while quietly slicing it thin underneath, is telling you something by what it does not say.
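To make "state the sample on every cut" concrete, here is a minimal sketch in Python. It is an illustration, not a description of how any particular benchmark is built: the `summarize_cuts` function, the `MIN_SAMPLE` threshold of ten, and the example figures are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import median

# Hypothetical threshold, chosen only for illustration.
MIN_SAMPLE = 10


def summarize_cuts(records, min_sample=MIN_SAMPLE):
    """Group (cut, value) pairs and report each cut's median with its sample size."""
    by_cut = defaultdict(list)
    for cut, value in records:
        by_cut[cut].append(value)

    summary = {}
    for cut, values in by_cut.items():
        summary[cut] = {
            "n": len(values),                  # always shown next to the figure
            "median": median(values),
            "thin": len(values) < min_sample,  # flag cuts too small to lean on
        }
    return summary


# Made-up data: twelve businesses in one cut, three in another.
records = [("water", 30 + i) for i in range(12)]
records += [("contents", 40), ("contents", 90), ("contents", 45)]

report = summarize_cuts(records)
for cut, row in report.items():
    print(cut, row)
```

The point of the design is that the sample size travels with every figure, so a cut built on three businesses cannot borrow the credibility of the headline total.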
A Cohort That Actually Fits
The second question is who, exactly, you are being compared to. Industry-wide averages are easy to produce and useful for broad context, but a precise comparison needs more than that. What matters is whether the cohort genuinely fits your situation: similar revenue, similar job mix, similar market conditions. Grouping a regional water-mitigation business together with a national contents operation produces a number, but it compares two things that do not belong side by side, and the result can mislead more than it informs.
The cohort is where the real methodology of a benchmark lives, and building it well takes deliberate effort. It is the difference between "here is the industry average" and "here is where you stand among businesses that actually look like yours." The first is context. The second is something you can make a decision on. When you are handed a benchmark, it is entirely fair to ask how the comparison group was chosen, because that choice quietly determines whether the number means anything for you specifically.
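One way to picture cohort selection is as a filter over candidate peers. The sketch below assumes three matching criteria drawn from the description above (revenue, job mix, market) and simplifies each to a single field. The `Business` record, the `build_cohort` function, the 50 percent revenue tolerance, and the sample businesses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Business:
    name: str
    annual_revenue: float
    primary_job_type: str  # simplified stand-in for "job mix"
    market: str            # simplified stand-in for "market conditions"


def build_cohort(target, candidates, revenue_tolerance=0.5):
    """Return candidates that resemble the target on all three criteria."""
    low = target.annual_revenue * (1 - revenue_tolerance)
    high = target.annual_revenue * (1 + revenue_tolerance)
    return [
        c for c in candidates
        if c is not target
        and low <= c.annual_revenue <= high
        and c.primary_job_type == target.primary_job_type
        and c.market == target.market
    ]


you = Business("You", 2_000_000, "water mitigation", "regional")
candidates = [
    Business("Peer A", 1_500_000, "water mitigation", "regional"),
    Business("Peer B", 2_600_000, "water mitigation", "regional"),
    Business("National Contents Co", 40_000_000, "contents", "national"),
    Business("Peer C", 2_200_000, "water mitigation", "national"),
]

cohort = build_cohort(you, candidates)
print([c.name for c in cohort])
```

Every choice in that filter, from the tolerance to which fields count, changes who ends up in the comparison group, which is why it is fair to ask how those choices were made.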
Anonymized Before It Is Pooled
The third question is about trust, and it runs in both directions. A benchmark that requires operators to expose their data to competitors in order to participate will not attract honest, widespread participation, and it should not. The data behind a credible benchmark is anonymized before it is ever aggregated, so that contributors can see exactly where they stand without revealing their own numbers to anyone.
This is not only an ethics point, though it is that too. It is also what makes the sample large and honest enough to be worth trusting in the first place. People contribute candidly when contributing is safe, and a benchmark drawn from candid, widespread participation is simply more accurate than one drawn from a guarded few. So anonymization is not a nicety layered on top of a good benchmark. It is part of what makes the benchmark good, because it is what makes the underlying sample both large and truthful.
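The idea of anonymizing before pooling can be sketched in a few lines. In this simplified illustration, identities are dropped at the moment the data is pooled, and a contributor then checks their own figure against the pool. The function names and numbers are assumptions for the example, and real anonymization usually needs more than dropping names, such as suppressing groups small enough to be guessed.

```python
from bisect import bisect_left, bisect_right


def pool_anonymized(submissions):
    """Keep only the metric values; identities are discarded before pooling."""
    return sorted(value for _name, value in submissions)


def percentile_rank(pooled, own_value):
    """Percent of the pool below own_value, counting ties as half."""
    below = bisect_left(pooled, own_value)
    equal = bisect_right(pooled, own_value) - below
    return 100.0 * (below + 0.5 * equal) / len(pooled)


# Made-up submissions: (contributor, metric value).
submissions = [("A", 10), ("B", 20), ("C", 30), ("D", 40)]

pooled = pool_anonymized(submissions)   # no names survive this step
rank = percentile_rank(pooled, 30)      # contributor C looks up their own standing
print(pooled, rank)
```

Contributor C learns where 30 falls in the distribution without seeing who submitted any other value.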
Fresh Enough To Act On
The fourth question is about time. A benchmark has to keep up with reality, and a comparison built on data from two years ago describes a market that may no longer exist. Costs move, cycle times move, the whole competitive landscape moves. The entire value of benchmarking is catching where you stand while you can still do something about it, and that means the data has to refresh on a cadence that matches how fast the business actually changes. A number that was accurate when it was collected but is stale now can be quietly worse than no number at all, because it carries the confidence of data while pointing at a market that has moved on.
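A freshness check is simple enough to write down. The sketch below assumes a 90-day limit purely for illustration; the right cadence depends on how fast the business in question changes, and the `is_fresh` function and dates are hypothetical.

```python
from datetime import date, timedelta


def is_fresh(collected_on, as_of, max_age_days=90):
    """True if the data is no older than max_age_days on the as_of date."""
    return (as_of - collected_on) <= timedelta(days=max_age_days)


as_of = date(2024, 3, 1)
recent = is_fresh(date(2024, 1, 1), as_of)    # about two months old
two_years = is_fresh(date(2022, 3, 1), as_of)  # two years old
print(recent, two_years)
```

Passing the comparison date in explicitly, rather than reading the clock inside the function, keeps the check repeatable.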
Key Finding
Sample size, a cohort that fits, anonymization, and a refresh rate that keeps up. Miss any one of the four, and the comparison gets harder to rely on, however good it looks.
The Four Questions, And One That Applies To Us Too
Here is the practical takeaway, and it is meant to be genuinely useful rather than a warning. The next time anyone, including us, puts a benchmark in front of you, you now have four plain questions that cut straight to whether it deserves your trust. How many businesses are in it? Are they actually like mine? Was the data anonymized so participation is safe and honest? And is it current enough to reflect the market I am operating in today?
State the sample. Build the cohort honestly. Anonymize the inputs. Keep it fresh. A benchmark that does all four is an instrument you can run a real decision on. One that skips any of them is a number whose basis is worth questioning before you act on it. And the right response to any benchmark is to ask how it was built, ours included. A benchmark that cannot answer those four questions clearly has not earned a place in your decision, no matter how good it looks on the page. Asking is not skepticism. It is how you tell a real benchmark from a convincing-looking one, and you are entitled to make whoever hands you a number show their work.