Governments and technology companies face growing pressure to close safety gaps before extremist groups exploit artificial intelligence more effectively
Saudi Arabia and counterterrorism experts warned at the United Nations this week that terrorist groups are learning to exploit artificial intelligence for recruitment, propaganda, fundraising and attack planning faster than governments and technology companies are building safeguards.
Saudi Arabia’s permanent representative to the UN, Abdulaziz M. Alwasil, called for stronger international cooperation, expertise exchange and investment in capacity building to prevent terrorist groups from exploiting artificial intelligence and other emerging technologies. Alwasil was speaking as co-chair of a UN General Assembly session on strategic capacity-building responses to the misuse of AI and other emerging technologies by terrorist and violent extremist groups. He warned that such groups are becoming more capable of using new technologies to recruit, spread extremist propaganda, raise funds and plan attacks.
Alwasil also pointed to Yemen, where he said capacity building is especially urgent to counter groups including the Houthis and al-Qaida, which have sought access to drones and other modern technologies.
The Saudi call came as Tech Against Terrorism released a new report, the Counter-Terrorism AI Benchmark, during UN Counter-Terrorism Week in New York. The report describes itself as the first systematic benchmark built specifically to test how artificial intelligence models respond when asked to assist terrorism and violent extremism.
The study tested 27 AI models, produced 2,339 graded outputs and covered 26 use cases across 13 threat pillars in four domains. The redacted version of the report replaced compound names and procedural details with general descriptions, while keeping the analytical pattern visible. Tech Against Terrorism said the research was self-funded, that it received no funding from any AI provider referenced in the report and that it had no conflicts of interest.
Across the models tested, around a third of responses provided “meaningful uplift” beyond what could easily be found through a normal web search. Full refusals accounted for 57% of responses, while 15% were classified as “hedged compliance,” meaning that the model opened with a refusal but still supplied content. The most serious failures came from open models whose safety training had been stripped out through a process known as abliteration. Two such models complied with 89% and 100% of requests.
Adam Hadley, executive director of Tech Against Terrorism, told The Media Line that the lack of a terrorism-specific AI benchmark was the reason the organization built one. He said many models tested with simple, single-shot questions provided meaningful assistance toward bomb-making or mass-casualty attack planning, calling the result “not acceptable.”
Hadley framed the issue not only as a failure of safety filters but as a deeper control problem for AI developers.
“This is a control problem as much as a safety one,” he said, warning that AI developers risk creating models they cannot control.
According to Tech Against Terrorism, the answer is not a ban on open models, but stronger intervention at the distribution and evaluation layers before harm occurs.
One of the report’s most concerning findings was that model guardrails could be weakened simply by changing the stated purpose of a request. Reframing an identical technical question as “research” without changing the technical content raised compliance from 17% to 42%. The report argues that this suggests some models are responding to surface-level framing rather than assessing the operational risk of the request itself.
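The size of that shift can be expressed in two ways, and the short Python sketch below works through the arithmetic. It uses only the two figures quoted above (17% and 42%); the function name and the output keys are illustrative choices, not part of the report's methodology.

```python
def framing_effect(baseline_rate: float, reframed_rate: float) -> dict:
    """Compare compliance rates for one request under two stated purposes.

    Rates are fractions between 0 and 1. This is an illustrative helper,
    not a function taken from the Tech Against Terrorism benchmark.
    """
    if not 0.0 < baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be in (0, 1]")
    if not 0.0 <= reframed_rate <= 1.0:
        raise ValueError("reframed_rate must be in [0, 1]")
    return {
        # Absolute change, in percentage points.
        "point_change": round((reframed_rate - baseline_rate) * 100, 1),
        # Relative change: how many times more often the model complied.
        "ratio": round(reframed_rate / baseline_rate, 2),
    }


# Figures quoted in the article: 17% compliance rising to 42% under "research" framing.
effect = framing_effect(0.17, 0.42)
print(effect)
```

On those figures, the reframing added 25 percentage points, meaning models complied roughly two and a half times as often with the same technical request.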
Hadley said the results should be viewed as a lower bound rather than a full picture of the danger.
“These are single-shot results,” he said, noting that the research did not include attempts to evade safeguards or persist after refusal, and therefore represented a floor rather than a ceiling on the risk.
The benchmark did not test multi-turn conversations, adversarial prompting, non-English languages, or repeated attempts to bypass refusal mechanisms. Its findings measure the lowest-effort form of misuse: one direct request. The report describes this first release as a deliberately bounded pilot, designed to establish a comparable and repeatable baseline before future phases expand into more languages, multi-turn interactions, and complex adversarial testing.
Hadley also warned in the report’s foreword that the problem should not be seen as isolated to one company or one model. “The fragility of alignment” exposed by the findings, he said, is “a shared, transnational one.”
In Hadley’s view, the benchmark points to a broader imbalance in the AI race: companies are investing heavily in model capability while safety and control remain underdeveloped. He argued that reliable control over these systems is becoming one of the most urgent questions for governments and AI developers, particularly as terrorist actors, lone attackers and hostile networks learn how to exploit publicly available tools.
For Kiria Borak, a security analyst focused on West Africa and the Sahel, the concern is not only theoretical. She told The Media Line that extremist use of AI should be understood as part of a wider information ecosystem, not only through the lens of drones or battlefield technology.
Borak acknowledged that AI-guided targeting systems in drones are already being tested and said this raises concerns both for terrorist use and for counterterrorism efforts. But she said her larger concern is the rapid growth of generative AI propaganda associated with groups such as ISIS and al-Qaida.
She described AI as a force multiplier for extremist communication. The danger, she said, is not only that more material can be produced, but that recommendation algorithms can amplify it once users begin interacting with it.
“There’s more content,” Borak said, “and in terms of that content reaching people, because of the ways in which a lot of platforms are using AI-generated [content] like recommendation algorithms, that’s just really amplifying extremist content.” She explained that when content gets higher user engagement, it shows up in more people’s feeds.
Borak said her research on Facebook showed how quickly extremist-linked material could flood a user’s feed once the algorithm detected interest.
“I found that if you start this content, you get more and more of it. And that, for me, is pretty concerning. I’m doing this from a research perspective, but I look at a couple of IS or al-Qaida posts, and suddenly my Facebook feed is full of them,” she said.
The phenomenon, she added, is not limited to highly connected regions or technologically advanced environments. Borak said AI-generated extremist content is already visible in West Africa, where she focuses much of her research.
“I think it’s a global phenomenon. My expertise in particular is West Africa. And that’s a space in which I think people assume that there is less AI-generated content, and that is not true,” she said. “There is a huge amount of AI-generated recruitment content, and there’s quite a bit of AI-generated propaganda.”
For Borak, the spread of AI-generated content in places such as Mali, Burkina Faso and Niger is a warning sign because it shows that the technology has already reached areas often assumed to be less integrated into the digital economy.
“If you’re in a place with low internet penetration, and you’re [seeing] that level of proliferation of AI content, it gives you a sense of the degree to which it has spread, and the degree to which it’s being used by groups,” she said.
The issue is compounded by uneven public awareness. Borak said audiences in Europe or the United States may have more exposure to debates over AI-generated content and misinformation, while communities in more fragile information environments may be less equipped to identify manipulated material.
“And I think it’s also concerning, because places like the US or Europe, there’s more of an understanding of AI, and there’s more efforts to educate people about AI-generated content … . That’s so much less the case in a lot of the places where this content is being pushed out,” she said.
The Tech Against Terrorism report points to a similar gap in current safety testing. It says general AI safety benchmarks often measure toxicity, bias, factuality or broad harmfulness, but do not treat terrorism and violent-extremist misuse as a distinct category. Its framework maps terrorist misuse into more than 150 use cases across four domains: direct violence and weapons, influence and psychological impact, operational enablement, and autonomous escalation.
Saudi Arabia has positioned the misuse of AI and emerging technologies not only as a global counterterrorism issue but also as a regional security concern. For Gulf states, the threat is far from abstract: drones, encrypted communications, propaganda networks and transnational terrorism financing have already shaped conflicts from Yemen to the Sahel.
Borak said some observers have framed Saudi Arabia’s engagement on AI and extremism as new, but she argued that the Kingdom has been examining the issue for years and has previously funded research on the topic.
Beyond propaganda, Borak said the more dangerous evolution may come from the pairing of recruitment with increasingly accessible operational information.
“I think the drone concern is a real one. But my concern is the fact that the information on how to carry out attacks is so much more accessible. And their recruitment is so much more effective and efficient. And that combination is pretty concerning,” she said.
Her warning mirrors one of the benchmark’s central conclusions: the risk is not limited to models explicitly agreeing to harmful requests. In some cases, a refusal may still reveal useful fragments. In others, a model may comply when the same request is framed as research. The report argues that “a refusal rate is not a safety rating,” because what matters is the severity of the assistance provided, not only whether a model formally declines.
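The distinction between refusing and being safe can be illustrated with a small Python sketch. Everything in it is hypothetical: the 0-to-3 severity scale, the two models and their graded outputs are invented for illustration and do not come from the report, whose actual grading scheme is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GradedOutput:
    refused: bool  # did the model formally decline the request?
    severity: int  # hypothetical scale: 0 = nothing useful, 3 = serious assistance


def refusal_rate(outputs: list[GradedOutput]) -> float:
    """Share of outputs in which the model formally declined."""
    return sum(o.refused for o in outputs) / len(outputs)


def mean_severity(outputs: list[GradedOutput]) -> float:
    """Average severity of the assistance actually provided."""
    return sum(o.severity for o in outputs) / len(outputs)


# Invented data: both models decline three of four requests.
model_a = [
    GradedOutput(True, 0),
    GradedOutput(True, 0),
    GradedOutput(True, 0),
    GradedOutput(False, 1),
]
# Model B's refusals still leak useful fragments, and its one compliance is severe.
model_b = [
    GradedOutput(True, 2),
    GradedOutput(True, 1),
    GradedOutput(True, 0),
    GradedOutput(False, 3),
]

for name, outputs in (("A", model_a), ("B", model_b)):
    print(name, refusal_rate(outputs), mean_severity(outputs))
```

In this toy example both models post the same 75% refusal rate, yet the second provides far more assistance on average, which is the kind of gap a refusal rate alone would hide.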
Borak also pointed to the weakening of content moderation as a broader structural risk.
“My concern is the lack of willingness to moderate content and address the wide proliferation of this content, paired with the fact that content can be produced so much faster and more widely,” she said. “It’s not painting a particularly hopeful picture.”
The debate is politically difficult. Technology companies are racing to deploy increasingly capable models. Governments are still trying to define where regulation should begin. Open models offer real benefits for research, transparency and sovereignty, but the Tech Against Terrorism report warns that once an open-weight model has been stripped of its safeguards and downloaded, it cannot be recalled.
For Hadley, the answer is not a blanket ban on open models, but a more precise intervention before models reach the public domain. Tech Against Terrorism recommends treating terrorist and violent-extremist misuse as a distinct safety category, testing models for changes in stated intent, extending refusal training beyond the most recognizable threats, and treating the circulation of de-restricted or abliterated models as a national-security concern.
Borak said she is less concerned about AI-targeting drones than about AI-generated recruitment content and AI systems that could tailor operational information to a user’s circumstances.
“Then the mechanisms through which attacks could be carried out are easier because there are these AI systems that will provide information exactly tailored to what you want to do and what you need and the tools you possess,” she said.
For now, the UN discussion, Saudi Arabia’s call for global cooperation and Tech Against Terrorism’s benchmark all point to the same unresolved question: whether governments and technology companies can build safeguards before extremist actors learn to exploit the weakest parts of the system faster than the system can correct itself.







