What Clients Actually Want
Most senior stakeholders are not asking for more complexity. They want usable answers. If synthetic data is part of the workflow, they want to know what it is doing, what it informs, what it does not inform, and what checks sit behind it.
What is hard to recover from is synthetic data that is not disclosed properly, or that is presented in a way that makes it look like natural data.
“How you’ve collected the data, the methodology should always be available to clients and should always be communicated in the delivery of the data,” says JT.
If synthetic data is being used, that should be stated outright, not implied or left for the client to assume.
“We need to be explicit,” he says. “We need to say where and if synthetic data has been used and why.”
That is a trust issue more than a technical issue. A stakeholder can handle complexity when it is explained, and they can even accept uncertainty when the boundaries are clear. What they cannot accept is surprise.
One useful way to make that land is to call it what it is. It is modeled data. That wording draws attention to the fact that it should not be trusted in the same way as natural data, and it pushes the right follow-up questions.
JT also points to three red flags to watch for.
The first red flag is when synthetic data is not marked as such in reporting. If someone cannot tell whether they are looking at natural data, natural plus synthetic, or synthetic data only, then the work is already on the wrong side of disclosure.
The second red flag is structural. There is no validation or governance process for testing synthetic data against real data. This is the difference between a model being a tool and a model being a gamble. If there is no process to compare synthetic outputs to natural data, you do not have synthetic “insight.” You have synthetic output.
The third red flag is one that can quietly destroy a system over time. Models are trained on their own outputs. This is how you end up with models that agree with themselves, drift away from reality, and still look good in internal testing.
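For teams that want to turn these three red flags into routine checks, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method described by JT: the Record type, the "natural" and "synthetic" labels, the function names, and the use of a simple difference in means with a caller-chosen tolerance are all hypothetical. A real validation process would likely compare much more than averages.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical provenance labels; the article does not prescribe a schema.
NATURAL = "natural"
SYNTHETIC = "synthetic"


@dataclass(frozen=True)
class Record:
    value: float
    source: str  # expected to be NATURAL or SYNTHETIC


def disclosure_label(records):
    """Red flag 1: state plainly what a report is built on."""
    sources = {r.source for r in records}
    if sources == {NATURAL}:
        return "natural data only"
    if sources == {SYNTHETIC}:
        return "synthetic (modeled) data only"
    if sources == {NATURAL, SYNTHETIC}:
        return "natural plus synthetic (modeled) data"
    raise ValueError("no records, or a record has an unknown source")


def validate_against_natural(records, tolerance):
    """Red flag 2: compare synthetic output to natural data.

    Returns (passed, gap), where gap is the absolute difference between
    the natural mean and the synthetic mean. The tolerance is an
    assumption the team has to choose and justify.
    """
    natural = [r.value for r in records if r.source == NATURAL]
    synthetic = [r.value for r in records if r.source == SYNTHETIC]
    if not natural or not synthetic:
        raise ValueError("need both natural and synthetic records to validate")
    gap = abs(mean(natural) - mean(synthetic))
    return gap <= tolerance, gap


def training_set(records):
    """Red flag 3: keep model outputs out of the next round of training."""
    return [r for r in records if r.source == NATURAL]


records = [
    Record(4.0, NATURAL),
    Record(6.0, NATURAL),
    Record(5.5, SYNTHETIC),
    Record(5.0, SYNTHETIC),
]
label = disclosure_label(records)
passed, gap = validate_against_natural(records, tolerance=0.5)
clean = training_set(records)
```

The design choice that matters is that every record carries its source from the start. Once that label exists, disclosure, validation, and keeping synthetic records out of retraining each become a short, auditable step.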
Those three red flags map directly to the most common ways synthetic data causes damage: