Make a testable claim.
Test it through observation and experiment.
Update the claim to match reality.
The common-sense notion that we should assess claims about the world through observation appears in texts dating back to ancient civilizations. But it was the Scientific Revolution of the 16th and 17th centuries that formalized and popularized what we now call the scientific method.
We are all steeped in this paradigm. And, at first glance, it seems entirely reasonable to expect that AI policy be firmly grounded in evidence. Many proponents of this view argue there simply isn’t enough proof to justify alarm over various AI risks: from AI systems perpetuating bias and discrimination, to enabling the development of biological weapons by malicious actors. Precautionary measures, they claim, are often premature, overly restrictive, and even unscientific or unethical — likely to stifle innovation and block the benefits AI could bring to society.
Yet there are pitfalls to this approach — not because solid evidence is a poor basis for decision making, but because getting hold of it is often much easier said than done. For numerous reasons, this may be especially true in the case of AI.
Most AI safety studies are currently conducted in isolation from the real world: in computer labs where researchers assess AI models against standardized tests called “benchmarks.” Benchmarks have their virtues; they poke and prod at a model in structured ways to identify what might trigger unwanted behaviors. And the fact that they are standardized means researchers can use them to test many different models and understand the differences between them.
The catch is that society at large could hardly be less standardized. Benchmarks are necessarily simplified, and AI models, once released, will inevitably encounter a host of scenarios that have not been simulated in the lab.
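To make that simplification concrete, here is a minimal, hypothetical sketch of what a benchmark evaluation loop amounts to; the model_answer function and the three-item question set are stand-ins for a real model and a real benchmark suite, not any particular product.

```python
# Hypothetical sketch of a benchmark evaluation loop (illustrative only).
# "model_answer" stands in for a real AI model; the tiny list below stands
# in for a benchmark that, in practice, contains thousands of fixed cases.

def model_answer(prompt: str) -> str:
    # Placeholder: a real evaluation would query a deployed model here.
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "Paris",
    }
    return canned.get(prompt, "I don't know")

benchmark = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
]

# Score the model: each case is a fixed prompt with one expected answer,
# which is what makes the test standardized and repeatable.
passed = sum(model_answer(q).strip().lower() == a.lower() for q, a in benchmark)
print(f"Pass rate: {passed}/{len(benchmark)}")
```

The fixed prompts and single expected answers are precisely what make such tests comparable across models; they are also what real users, with their open-ended and sometimes adversarial queries, will never reproduce.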
This looks like a recipe for disaster when you consider the sheer complexity of AI models. Although AI research grew from the exact disciplines of mathematics and computer science, today’s models are much like “black boxes,” systems so complex that details of their inner workings elude even their creators. Advanced AI systems are less like clockwork machines and more like the weather or the economy; they can surprise you.
This problem closely resembles a longstanding issue in the biological sciences. When you study an organism in the wild, countless factors can shape its behavior, making it difficult to isolate cause and effect.
In contrast, controlled experiments in the lab provide more reliable, repeatable results, yet they do so in an artificial setting that may not capture natural complexity. For instance, mice in a laboratory often behave differently from mice in their natural habitats.
In much the same way, AI systems that perform well on controlled benchmarks can exhibit unforeseen behaviors – and downstream consequences – when deployed in the dynamic, real world.
Even the best-designed benchmark will struggle to capture every nuance of society. However, the AI research community’s notorious homogeneity – consisting largely of highly educated white and Asian men from a handful of countries – can make that challenge even harder. The benchmarks this group develops may well fail to address safety concerns rooted in cultural contexts other than its own.
For example, countless biases more prominent in the Global South may fly under the radar of safety tests and be perpetuated (or amplified) by AI models. As a 2022 paper focused on the case of India concluded, “analysis reveals that sentiment model predictions are significantly sensitive to regional, religious, and caste identities.”
More broadly, many potential AI risks might go unnoticed because most researchers have similar experiences with technology. This group is likely to have benefited disproportionately from technological development and probably views technology more positively than average.
This positive experience with technology may lead them to give its risks insufficient weight. A 2022 study bears this out: analyzing the values expressed in 100 prominent AI research papers, it found that just two percent mentioned any negative potential of the technology.
The limitations of benchmarks and of those designing them should provide plenty of reason to question whether today's safety evidence is adequate.
But the bigger concern comes from the incentives of AI companies.
AI firms today conduct a large fraction of AI safety research, but they have become increasingly opaque about their results, leaving a dearth of information about any risks they may have discovered. Companies are motivated by profit, and it is reasonable to worry that this influences any industry’s decisions about whether to study and report on risks. Seen in that light, the evolution of the for-profit AI industry should come as no surprise: what had for decades been a relatively open, transparent research community has, since 2020, rapidly drawn the blinds around its work.
The tension between safety and profit isn't new; we've seen it play out in the tobacco and fossil fuel industries, where companies exploited early doubts about the harms of smoking and the reality of climate change.
For decades, the tobacco industry invested zealously in generating “evidence” to cast doubt on the negative health effects of smoking.
Angling to avoid regulation and protect their profits, various industries follow the same playbook today; studies have found strong correlations between industry funding for research and the likelihood that the research reaches pro-industry conclusions.
It's not hard to see why AI firms, pursuing windfall profits, could adopt a similar playbook.
Recent reporting suggests that Meta, one of the world’s leading AI developers, may have gamed a benchmark to make its latest model series look better than it really is, quietly submitting a specially tuned version different from the one it actually released.
This example underscores a broader concern: AI companies are deeply entangled with the research ecosystem. According to a 2024 paper, AI publications by authors exclusively from industry are around 73 percent more likely to be highly cited than those by exclusively academic teams.
So inextricably entwined are AI research and industry that studies coming from academic institutions cannot be assumed to be untouched by industry incentives. A 2020 study of a handful of prestigious universities revealed that 88 percent of faculty members in AI and 97 percent in ethics had at some point either received funding from or been employed by Big Tech. To deny that this corporate influence might skew the evidence base would be naïve at best, willfully ignorant at worst.
The parallels between corporate AI safety research and the tobacco industry’s interference in medical research are informative, but there are limits to any analogy. Some leading figures in AI do in fact highlight the risks associated with the technology their companies are developing.
But this is not necessarily for altruistic reasons. One cynical interpretation posits that, by emphasizing, or even overstating, the risks and the difficulty of addressing them, industry leaders are positioning themselves as the only people who can solve the problem. If this convinces regulators to lean on these companies for guidance, it could give the industry more control over its own oversight and create a false sense that everything’s under control.
In light of all these challenges — the difficulty of getting good evidence, and the forces that may be actively distorting or suppressing it — waiting to implement safety measures until there is clear proof of harm does not make sense. Instead, we should design policy that addresses the current weaknesses in evidence production and proactively incentivizes better data.
Such policies could include the creation of industry-independent institutes responsible for researching risks and how to address them. They could also require that developers register models with governing bodies, document risk-related behaviors, report on mitigation practices, and continuously monitor models’ impacts within society.
Such suggestions are certain to elicit cries of excessive bureaucracy from some corners, but it should be noted that these are process regulations, which relate to how companies can develop AI models. Process regulations are far less of a burden than substantive regulations, which limit what companies may produce. This distinction is far from unique to the AI field. In the food industry, for example, nutrition labeling requirements are process regulations, while ingredient bans are substantive ones. Major AI developers, including OpenAI, Anthropic, and Google DeepMind, have already publicly committed to practices such as risk assessments and staged deployments. But big tech companies have failed to live up to voluntary safety agreements in the past. A key step would simply be to hold them to their own purported standards.
Some worry that any regulation might eventually lead to too many restrictions. However, rules focused solely on process aren’t likely to trigger that “slippery slope.” For major overreach to occur, lawmakers would have to add not just extra process steps, but new substantive regulations — a different ball game entirely. What’s more, the recommendations above are highly focused on obtaining robust evidence – a prerequisite for well-informed policy. This way, any stricter rules would be firmly grounded in data, which is exactly what the evidence-based policy community is asking for.
As the oft-invoked truism teaches: an absence of evidence is not evidence of absence. Risks often lurk for a long time before they manifest, and that quiet period can mislead us into thinking everything is safe. And if our best theories warn us of potential risks, that should be enough to warrant a thorough investigation.
S. J. Green, who headed research at British American Tobacco in the '60s and '70s — back when the company was busy concealing evidence about smoking’s dangers — later reflected that:
“Scientific proof, of course, is not, should not, and never has been the proper basis for legal and political action on social issues. A demand for scientific proof is always a formula for inaction and delay and usually the first reaction of the guilty. The proper basis for such decisions is, of course, quite simply that which is reasonable in the circumstance.”
Stephen Casper’s contributions to this article reflect his personal views and were made in his capacity as a PhD student at the Massachusetts Institute of Technology (MIT). The article is independent of any government-related work and does not represent the views of any government agency or department.