Thursday, April 2, 2026
New Rowhammer attacks give complete control of machines running Nvidia GPUs

The cost of high-performance GPUs, typically $8,000 or more, means they are frequently shared among dozens of users in cloud environments. Two new attacks demonstrate how a malicious user can gain full root control of a host machine by performing novel Rowhammer attacks on high-performance GPU cards made by Nvidia.

The attacks exploit memory hardware’s increasing susceptibility to bit flips, in which 0s stored in memory switch to 1s and vice versa. In 2014, researchers first demonstrated that repeated, rapid access—or “hammering”—of memory hardware known as DRAM creates electrical disturbances that flip bits. A year later, a different research team showed that by targeting specific DRAM rows storing sensitive data, an attacker could exploit the phenomenon to escalate an unprivileged user to root or evade security sandbox protections. Both attacks targeted DDR3 generations of DRAM.

From CPU to GPU: Rowhammer’s decade-long journey

Over the past decade, dozens of newer Rowhammer attacks have evolved to, among other things:

The last of these feats, last year’s GPUHammer attack, proved that GDDR was susceptible to Rowhammer attacks, but the results were modest. The researchers achieved only eight bitflips, a small fraction of what has been possible on CPU DRAM, and the damage was limited to degrading the output of a neural network running on the targeted GPU.

On Thursday, two research teams, working independently of each other, demonstrated attacks against two cards from Nvidia’s Ampere generation that take GPU rowhammering into new—and potentially much more consequential—territory: GDDR bitflips that give adversaries full control of CPU memory, resulting in full system compromise of the host machine. For the attack to work, IOMMU memory management must be disabled, as is the default in BIOS settings.

“Our work shows that Rowhammer, which is well-studied on CPUs, is a serious threat on GPUs as well,” said Andrew Kwong, co-author of one of the papers, “GDDRHammer: Greatly Disturbing DRAM Rows—Cross-Component Rowhammer Attacks from Modern GPUs.” “With our work, we… show how an attacker can induce bit flips on the GPU to gain arbitrary read/write access to all of the CPU’s memory, resulting in complete compromise of the machine.”

Enter: GDDRHammer, GeForge

The attack demonstrated in the paper is GDDRHammer, with the first four initials standing for both “Graphics DDR” and “Greatly Disturbing DRAM Rows.” It works against the RTX 6000 from Nvidia’s Ampere generation of architecture. The attack doesn’t work against the RTX 6000 models from the more recent Ada generation because they use a newer form of GDDR that the researchers didn’t reverse-engineer.

Using novel hammering patterns and a technique called memory massaging, GDDRHammer induced an average of 129 flips per memory bank, a 64-fold increase over the previously mentioned GPUHammer from last year. More consequentially, GDDRHammer can manipulate the memory allocator to break the isolation between user data stored on the GPU and GPU page tables, which, like CPU page tables, are the data structures that store mappings between virtual addresses and physical DRAM addresses. The result is that the attacker acquires the ability to both read and write to GPU memory.
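
To illustrate why a single flipped bit in a page table is so damaging, here is a minimal Python sketch of a one-level page table. It is a toy model, not the GPU's actual page table format: the 4 KB page size, the frame numbers, and the helper names `translate` and `flip_bit` are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # bytes per page in this toy model (an assumption)


def translate(page_table, vaddr):
    """Map a virtual address to a physical address via a one-level table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[vpn]
    return frame * PAGE_SIZE + offset


def flip_bit(value, bit):
    """Return value with one bit inverted, standing in for a Rowhammer flip."""
    return value ^ (1 << bit)


# Virtual page 0 belongs to the attacker and maps to physical frame 5.
page_table = {0: 5}
before = translate(page_table, 0x10)

# One flipped bit in the stored frame number (bit 1: 0b101 -> 0b111)
# redirects the same virtual address to a different physical frame.
page_table[0] = flip_bit(page_table[0], 1)
after = translate(page_table, 0x10)

print(hex(before), hex(after))  # 0x5010 0x7010
```

The virtual address never changes, yet it now resolves to memory the process was never given. In the toy model the new frame is arbitrary; the attacks described here go further by arranging for the corrupted entry to point somewhere useful.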

In an email, Kwong continued:

What our work does that separates us from prior attacks is that we uncover that Rowhammer on GPU memory is just as severe of a security consequence as Rowhammer on the CPU and that Rowhammer mitigations on CPU memory are insufficient when they do not also consider the threat from Rowhammering GPU memory.

A large body of work exists, both theoretical and widely deployed, on both software and hardware level mitigations against Rowhammer on the CPU. However, we show that an attacker can bypass all of these protections by instead Rowhammering the GPU and using that to compromise the CPU. Thus, going forward, Rowhammer solutions need to take into consideration both the CPU and the GPU memory.

The second paper—“GeForge: Hammering GDDR Memory to Forge GPU Page Tables for Fun and Profit”—does largely the same thing, except that instead of exploiting the last-level page table, as GDDRHammer does, it manipulates the last-level page directory. It was able to induce 1,171 bitflips against the RTX 3060 and 202 bitflips against the RTX 6000.

GeForge, too, uses novel hammering patterns and memory massaging to corrupt GPU page table mappings in GDDR6 memory to acquire read and write access to the GPU memory space. From there, it acquires the same privileges over host CPU memory. The GeForge proof-of-concept exploit against the RTX 3060 concludes by opening a root shell window that allows the attacker to issue commands that run with unfettered privileges on the host machine. The researchers said that both GDDRHammer and GeForge could do the same thing against the RTX 6000.

“By manipulating GPU address translation, we launch attacks that breach confidentiality and integrity across GPU contexts,” the authors of the GeForge paper (which currently isn’t available publicly) wrote. “More significantly, we forge system aperture mappings in corrupted GPU page tables to access host physical memory, enabling user-to-root escalation on Linux. To our knowledge, this is the first GPU-side Rowhammer exploit that achieves host privilege escalation.”

Memory massaging: therapy for GPU exploitation

Nvidia’s GPU driver stores page tables in a reserved region of low-level memory where stored bits can’t be flipped by Rowhammering. To work around this design, both GDDRHammer and GeForge steer the tables into regions that aren’t protected against the electrical disturbance. For GDDRHammer, the massaging is accomplished by using Rowhammer to flip bits that allocate access to the protected region.

“Since these page tables dictate what memory is accessible, the attacker can modify the page table entry to give himself arbitrary access to all of the GPU’s memory,” Kwong explained by email. “Moreover, we found that an attacker can modify the page table on the GPU to point to memory on the CPU, thereby giving the attacker the ability to read/write all of the CPU’s memory as well, which of course completely compromises the machine.”

Meanwhile, Zhenkai Zhang, co-author of the GeForge paper, described the massaging process this way:

Given a steering destination, we first isolate the 2 MB page frame containing it. We then use sparse UVM [unified virtual memory] accesses to drain the driver’s default low-memory page-table allocation pool and free the isolated frame at exactly the right moment so it becomes the driver’s new page-table allocation region. Next, we carefully advance allocations so that a page directory entry lands on the vulnerable subpage inside that frame. Finally, we trigger the bit flip so the corrupted page directory entry redirects its pointer into attacker-controlled memory, where a forged page table can be filled with crafted entries.

In an email, an Nvidia representative said users seeking guidance on whether they’re vulnerable and what actions they should take can view this page published in July in response to the previous GPUHammer attack. The representative didn’t elaborate.

So where do we go now?

The researchers said that both the RTX 3060 and RTX 6000 cards are vulnerable. Changing BIOS defaults to enable IOMMU closes the vulnerability, they said. Short for input-output memory management unit, IOMMU maps device-visible virtual addresses to physical addresses in host memory. It can be used to make certain parts of memory off-limits.

“In the context of our attack, an IOMMU can simply restrict the GPU from accessing sensitive memory locations on the host,” Kwong explained. “IOMMU is, however, disabled by default in the BIOS to maximize compatibility and because enabling the IOMMU comes with a performance penalty due to the overhead of the address translations.”
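
The role Kwong describes can be sketched as a lookup table that sits between a device and host memory. The following Python model is purely illustrative: the class `ToyIOMMU`, its methods, and the frame numbers are hypothetical and do not correspond to any real driver or firmware interface.

```python
class ToyIOMMU:
    """Hypothetical model of an IOMMU: device pages map to host frames."""

    def __init__(self, enabled):
        self.enabled = enabled
        self.mappings = {}  # device-visible page -> permitted host frame

    def map(self, device_page, host_frame):
        """Grant the device access to one host frame."""
        self.mappings[device_page] = host_frame

    def translate(self, device_page):
        """Resolve a device access to a host frame, or refuse it."""
        if not self.enabled:
            # Disabled: the device address is used as a host frame directly,
            # so the device can reach any frame it names.
            return device_page
        if device_page not in self.mappings:
            raise PermissionError(f"device page {device_page} is not mapped")
        return self.mappings[device_page]


SENSITIVE_FRAME = 0x1234  # stand-in for a host frame holding kernel data

off = ToyIOMMU(enabled=False)
print(hex(off.translate(SENSITIVE_FRAME)))  # access goes straight through

on = ToyIOMMU(enabled=True)
on.map(0, 0x80)  # only this one host frame is exposed to the device
try:
    on.translate(SENSITIVE_FRAME)
    blocked = False
except PermissionError:
    blocked = True
print(blocked)
```

The extra lookup on every access is also a rough picture of where the translation overhead Kwong mentions comes from.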

A separate mitigation is to enable Error Correcting Codes (ECC) on the GPU, which Nvidia lets users do from the command line. Like IOMMU, enabling ECC incurs some performance overhead because it reduces the overall amount of available workable memory. Further, some Rowhammer attacks can overcome ECC mitigations.
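
Both the memory cost of ECC and its limits can be seen in a small worked example. The Python sketch below uses a textbook Hamming(7,4) code, which is an assumption chosen for brevity and not the scheme Nvidia's GPUs actually use. It stores 4 data bits in 7, repairs any single flipped bit, and silently returns wrong data when two bits flip in the same codeword.

```python
def encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the suspected error
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
stored = encode(data)

# One flipped bit: the decoder locates and repairs it.
single = list(stored)
single[4] ^= 1
print(decode(single) == data)  # True

# Two flipped bits: the decoder "repairs" the wrong position
# and hands back corrupted data without any warning.
double = list(stored)
double[2] ^= 1
double[4] ^= 1
print(decode(double) == data)  # False
```

Production ECC uses wider codes that can also detect double-bit errors, but the same principle holds: once flips exceed what the code was designed for, correction can fail.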

GPU users should understand that the only cards known to be vulnerable to Rowhammer are the RTX 3060 and RTX 6000 from the Ampere generation, which were introduced in 2020. It wouldn’t be surprising if newer generations of graphics cards from Nvidia and others are susceptible to the same types of attacks, but because the pace of academic research typically lags far behind the faster speed of product rollouts, there’s no way now to know.

Top-tier cloud platforms typically provide security levels that go well beyond those available by default on hobbyist and consumer machines. Another thing to remember: There are no known instances of Rowhammer attacks ever being actively used in the wild.

The true value of the research is to put GPU makers and users alike on notice that Rowhammer attacks on these platforms have the potential to upend security in serious ways. More information about GDDRHammer and GeForge is available here.