The Limits of Regulating AI Through Compute Thresholds
The recent release of DeepSeek, a Chinese large language model that rivals leading models without relying on massive computational resources, has drawn widespread attention. As artificial intelligence (AI) becomes increasingly embedded in our everyday lives, relying primarily on compute thresholds for regulation risks missing the bigger picture of how these systems actually affect people and society.
Today’s frontier models may depend on vast computational resources, but tomorrow’s breakthroughs could emerge from more efficient architectures, novel training methods, or improved data quality. Algorithmic systems with relatively low computational demands have already been shaping decisions in policing, welfare, immigration, education, and employment, often producing significant harm, especially for marginalized groups.

Relying on technical thresholds, such as compute or dataset size, creates a regulatory system that will quickly become outdated. Instead, we should focus on context-specific, harm-based human rights approaches that evaluate actual impacts on people and communities. Compute thresholds, measured in floating-point operations (FLOPs), are commonly used as a proxy for model capability and as a trigger for regulatory oversight, but the proxy has clear limits: FLOPs alone capture neither the full range of a model’s capabilities nor its potential for harm.
There are two key limitations to using compute as the primary metric for assessing AI risks and harms. First, the assumption that more FLOPs make for more powerful models is flawed: models like DeepSeek show that high performance does not require massive computational resources. Second, the assumption that more FLOPs mean more risk is equally shaky: risk is not a function of model size or training intensity alone, but of the context of deployment, the design of the system, the data used, and the social structures the system interacts with.
The emergence of models like DeepSeek challenges the core assumption behind compute-based regulation: that greater computational resources translate directly into greater capability or risk. It undermines the notion that FLOPs-based thresholds can reliably indicate an AI system’s potential impact. As the link between compute and capability grows more tenuous, it is increasingly clear that smaller, more efficiently engineered models can still pose significant real-world risks.
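To make the proxy problem concrete, here is a minimal, purely illustrative sketch; the model names, numbers, and cut-off value are hypothetical, not drawn from any actual regulation. A compute-only trigger flags a large but unremarkable model while missing a more capable, more efficiently trained one.

```python
# Illustrative sketch only: all names, numbers, and the cut-off are hypothetical.
# It shows why a single FLOPs cut-off can misclassify models whose
# capability no longer tracks training compute.

from dataclasses import dataclass

# Hypothetical regulatory cut-off on training compute (floating-point operations).
FLOP_THRESHOLD = 1e25

@dataclass
class ModelProfile:
    name: str
    training_flops: float   # estimated training compute
    benchmark_score: float  # capability measured by an evaluation suite (0-100)

def triggers_oversight(model: ModelProfile) -> bool:
    """Compute-only trigger: oversight applies iff training compute exceeds the cut-off."""
    return model.training_flops >= FLOP_THRESHOLD

models = [
    ModelProfile("large-but-ordinary", training_flops=3e25, benchmark_score=62.0),
    ModelProfile("efficient-frontier", training_flops=4e24, benchmark_score=78.0),
]

for m in models:
    print(f"{m.name}: oversight={triggers_oversight(m)}, capability={m.benchmark_score}")
# The compute-only rule flags the larger model but not the more capable,
# more efficiently trained one -- the gap described above.
```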
Recommendations for a Stronger Human Rights-Based Framework
- Use FLOPs as one input among many to assess model risk, rather than relying on them as the main indicator.
- Combine compute-based thresholds with capability evaluations and impact assessments that examine real-world effects.
- Introduce dynamic regulatory thresholds that evolve with technological advancement, comparing new models to the current top performers.
- Implement domain-specific requirements that tailor regulation to the distinct risk profile of each sector, such as policing, welfare, or employment (a rough sketch of how these signals might combine follows this list).
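To illustrate how these recommendations might fit together, here is a rough sketch; the criteria, weights, thresholds, and sector labels are hypothetical assumptions, not a proposed standard. Compute enters as one signal among several, the compute reference point is defined relative to current top performers, and deployment in a high-stakes sector triggers scrutiny regardless of model size.

```python
# Illustrative sketch only: criteria, weights, and sector labels are hypothetical.
# It shows compute used as one input among several, with a dynamic reference
# point and sector-specific requirements.

from dataclasses import dataclass

@dataclass
class Assessment:
    training_flops: float    # compute, used as one signal among many
    capability_score: float  # result of capability evaluations (0-100)
    impact_risk: float       # outcome of a human rights / impact assessment (0-1)
    sector: str              # deployment domain, e.g. "employment", "policing"

# Hypothetical high-stakes domains that carry extra obligations regardless of model size.
HIGH_STAKES_SECTORS = {"policing", "welfare", "immigration", "education", "employment"}

def requires_enhanced_oversight(a: Assessment, frontier_flops: float) -> bool:
    """Combine several signals instead of relying on a single static FLOPs cut-off."""
    # Dynamic threshold: compare compute to the current top performers,
    # so the reference point moves as the state of the art moves.
    near_frontier_compute = a.training_flops >= 0.1 * frontier_flops

    high_capability = a.capability_score >= 70.0
    high_impact = a.impact_risk >= 0.5
    high_stakes_domain = a.sector in HIGH_STAKES_SECTORS

    # Any of these routes can trigger oversight; compute alone is never decisive.
    return (near_frontier_compute and high_capability) or high_impact or high_stakes_domain

# A small, efficiently trained model used for hiring decisions still gets scrutiny.
example = Assessment(training_flops=5e23, capability_score=55.0,
                     impact_risk=0.7, sector="employment")
print(requires_enhanced_oversight(example, frontier_flops=5e25))  # True
```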
To develop meaningful, future-proof AI governance, we must look beyond FLOPs and adopt a more nuanced regulatory framework built on adaptive, context-aware assessments of AI systems. Such a framework should account for how model design, training data, deployment context, and affected communities interact, and prioritize the actual impact a system has on people and society.