Detecting Copy Fail (CVE-2026-31431): Phenomenal Power, Itty Bitty Script

Splunk Threat Research Team, Raven Tait

CVE-2026-31431, dubbed "Copy Fail," is one of the most severe Linux privilege escalation vulnerabilities to emerge in a very long time, affecting nearly all versions of the Linux kernel since 2017. The vulnerability originates in a logic bug in the Linux kernel's authencesn cryptographic template, which, when chained with AF_ALG sockets and the splice() system call, lets an unprivileged local user trigger a deterministic, controlled 4-byte write into the page cache of any readable file on the system. Because the page cache is the temporary in-memory copy of a file that the kernel reads when loading a binary for execution, an attacker can target the cached copy of any setuid-root binary such as /usr/bin/su, altering its execution context to grant superuser privileges without ever touching the file on disk.

What makes Copy Fail particularly alarming is its combination of reliability, stealth, and portability. The exploit is a straight-line logic flaw that fires without race conditions, retries, or crash-prone timing windows, and a single 732-byte Python script using only standard library modules works unmodified across every major Linux distribution tested, including Ubuntu, Amazon Linux, RHEL, and SUSE. For threat actors, the implications extend well beyond a single compromised host. Because the page cache is shared across containers and the host, successful exploitation can facilitate container breakout, multi-tenant compromise, and lateral movement within shared environments, making it particularly dangerous in cloud, CI/CD, and Kubernetes environments.

In this blog, the Splunk Threat Research Team analyzes this vulnerability, explains how to get relevant logs into Splunk, and provides detection strategies for the attacks.

Affected Systems

Copy Fail has an exceptionally wide blast radius. The vulnerability affects Linux kernel versions 4.14 through 6.19.12, spanning virtually every mainstream distribution, including Ubuntu, Amazon Linux, Red Hat Enterprise Linux, Debian, SUSE, Fedora, Arch Linux, and AlmaLinux. Notably, Ubuntu 26.04 (Resolute) and later kernels are not affected, as they ship with a kernel that postdates the upstream fix. The vulnerability carries a CVSS score of 7.8 (High).

The upstream fix (mainline commit a664bf3d603d) was merged on April 1, 2026, but vendor distributions have been racing to backport and ship it at varying speeds.
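As a rough triage aid, the version range above can be turned into a quick check. The following is a minimal sketch, not an authoritative test: the helper names `parse_release` and `in_affected_range` are hypothetical, and because vendors backport fixes without changing the upstream version number, a kernel that falls inside the range may already be patched. Treat the result as a prompt to check the vendor advisory.

```python
import re

# Affected range as stated in this post: 4.14 through 6.19.12 (inclusive).
AFFECTED_MIN = (4, 14, 0)
AFFECTED_MAX = (6, 19, 12)


def parse_release(release: str) -> tuple:
    """Extract (major, minor, patch) from a string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def in_affected_range(release: str) -> bool:
    """True if the upstream version number falls inside the affected range.

    This ignores distribution backports, so True means "check the vendor
    advisory", not "definitely vulnerable".
    """
    return AFFECTED_MIN <= parse_release(release) <= AFFECTED_MAX
```

On a live host, the release string could come from `platform.release()` or `uname -r`.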

Patch Status

| Distribution | Patch Status | Notes |
| --- | --- | --- |
| Ubuntu 18.04 / 20.04 / 22.04 / 24.04 | Patched | Kernel updates available; kmod mitigation shipped via USN-8226-1 |
| Ubuntu 26.04+ | Not Affected | Kernel version postdates the vulnerable code path |
| Debian sid / forky | Patched | Fixed packages available in current repositories |
| Debian bookworm (stable) | In Progress | Backport not yet confirmed |
| RHEL 8 / 9 | In Progress | Fixes being staged via gradual rollout |
| AlmaLinux 8 / 9 / 10 | Patched | Proactively shipped fixes ahead of upstream RHEL errata |
| Fedora | In Progress | Recent kernel builds likely include the fix |
| SUSE / SLES | In Progress | Availability varies by service pack |
| Amazon Linux 2023 | In Progress | Kernel updates in active rollout |
| CloudLinux 8 / 9 | Patched | Patched kernels and rebootless KernelCare livepatches available |
| Arch Linux | Likely Patched | Rolling release model typically tracks upstream fixes quickly |

Note: Status current as of early May 2026. Check vendor advisories for the latest updates.

How Copy Fail Works

At its core, Copy Fail is the product of three independent kernel changes, made between 2011 and 2017, that nobody ever connected until now. No single change was a bug on its own. The vulnerability only exists at their intersection.

The Three Pieces

The first piece dates to 2011, when the authencesn algorithm was added to the kernel to support IPsec extended sequence numbers. The algorithm had an unusual habit of using the caller's destination memory buffer as temporary scratch space while rearranging sequence number bytes during cryptographic operations. At the time, this was harmless because the only callers were internal kernel components that never exposed this behavior to the outside.

The second piece arrived in 2015, when the kernel gained the ability to expose its cryptographic subsystem directly to unprivileged userspace processes through a socket interface called AF_ALG. This opened the door for any local user, without special permissions, to send data into the kernel's crypto engine. Also in 2015, authencesn was adapted to work with this new interface, but at this stage the crypto operations still used separate source and destination memory regions, so the scratch-write behavior remained contained.

The third and final piece landed in 2017 as a performance optimization. To save memory copying overhead, the kernel was changed to run certain AEAD decryption operations "in-place," meaning the source and destination were pointed at the same combined memory region. As part of this, page cache pages handed in via the splice() system call were chained directly into the writable destination scatterlist. This placed file-backed memory pages somewhere authencesn's scratch write could now reach them.

The Write Primitive

When an attacker opens an AF_ALG socket, binds to authencesn, and feeds in a crafted decryption request paired with pages from a target file via splice(), the sequence number rearrangement logic writes four attacker-controlled bytes past the legitimate output boundary and directly into the kernel's in-memory cache of that file. The decryption fails and returns an error, but the write has already happened and is not rolled back.

The attacker controls three things precisely: which file gets written to, which four-byte offset within the file is targeted, and what value gets written there. The only requirement is that the file be readable by the current user.

From Write to Root

The kernel's page cache holds the in-memory copy of files that the system actually uses when executing binaries. When a setuid-root binary like /usr/bin/su is loaded for execution, the kernel reads from the page cache, not from disk. By repeatedly applying the four-byte write primitive across the binary's code section, an attacker can inject shellcode into the cached copy of su without ever touching the file on disk. When su is then executed, it runs the attacker's code with root privileges.

While the default exploit path targets setuid binaries like /usr/bin/su, the same write primitive can be applied to /etc/passwd for a different route to privilege escalation. Since /etc/passwd is world-readable on virtually all Linux systems, an unprivileged user can splice its page cache into the exploit's crypto operation just as they would with any other readable file. The attacker locates the byte offset of their own UID field within the file, then uses the four-byte write to overwrite it with 0000 in the kernel's in-memory copy, making the system resolve their account to UID 0 without touching the file on disk or /etc/shadow. When su is then called for the same user, PAM authenticates normally against the untouched shadow file, but the UID lookup now returns root, completing the escalation.

Because the page cache is never marked dirty by this write path, standard file integrity monitoring tools that compare on-disk checksums will see nothing wrong. The disk file is untouched. The compromise exists entirely in RAM and disappears on reboot.

Figure 1: Exploitation Flow

Why This Is Worse Than Previous Linux LPEs

Earlier high-profile Linux privilege escalation bugs like Dirty COW and Dirty Pipe required either winning a race condition window or targeting specific kernel versions with precise memory offsets. Copy Fail has none of those constraints. The exploit is deterministic, succeeds on the first attempt, fits in a small script using only standard library modules, requires no compilation, and runs unmodified across every major distribution tested. For a threat actor who has already obtained any foothold on a Linux system, escalating to root is now a single script execution away.

Pre-Patch Remediation Steps

Until a patched kernel is available for your distribution, there are a small number of interim mitigations worth knowing about, along with some important caveats about their limitations.

The most widely recommended stopgap is disabling the algif_aead kernel module, which removes the vulnerable code path entirely without affecting the vast majority of common system functions. Services like SSH, IPsec, dm-crypt/LUKS, kTLS, and standard OpenSSL or GnuTLS builds do not rely on AF_ALG and will continue to operate normally. The practical impact of this mitigation is limited to applications that are explicitly configured to use hardware-accelerated AEAD via the kernel's userspace crypto API, which is uncommon outside of specialized networking and storage workloads.

However, the method for applying this mitigation differs significantly depending on your distribution family.

Debian and Ubuntu Systems

On Debian and Ubuntu systems, algif_aead ships as a loadable module and a modprobe blacklist approach works as intended:

echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf 

rmmod algif_aead

RHEL-Family Systems

On RHEL-family systems, algif_aead is compiled directly into the kernel rather than packaged as a separate module. The modprobe commands above will execute without errors but have no effect whatsoever on these systems. The correct approach is to blacklist the initcall via grubby and reboot:

grubby --update-kernel=ALL --args="initcall_blacklist=algif_aead_init"

reboot
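After applying either mitigation, it is worth confirming that it actually took effect. The sketch below is one possible way to do that by inspecting the same files the two mitigations touch. The function names are hypothetical, and it assumes the modprobe override lives in a `*.conf` file under /etc/modprobe.d and that the live kernel command line is readable at /proc/cmdline.

```python
from pathlib import Path


def modprobe_blocks_algif(conf_text: str) -> bool:
    """True if modprobe.d content overrides algif_aead with /bin/false or /bin/true."""
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if fields[:2] == ["install", "algif_aead"] and len(fields) >= 3:
            if fields[2].rsplit("/", 1)[-1] in ("false", "true"):
                return True
    return False


def cmdline_blocks_initcall(cmdline: str) -> bool:
    """True if the kernel command line blacklists algif_aead_init."""
    for arg in cmdline.split():
        if arg.startswith("initcall_blacklist="):
            if "algif_aead_init" in arg.split("=", 1)[1].split(","):
                return True
    return False


def module_loaded(proc_modules: str, name: str = "algif_aead") -> bool:
    """True if `name` is listed as a loaded module in /proc/modules content."""
    return any(line.split()[:1] == [name] for line in proc_modules.splitlines())


def check_host() -> dict:
    """Read the live system files (Linux only); missing files count as empty."""
    def read(path: Path) -> str:
        return path.read_text(errors="replace") if path.is_file() else ""

    conf_dir = Path("/etc/modprobe.d")
    conf_files = sorted(conf_dir.glob("*.conf")) if conf_dir.is_dir() else []
    conf = "\n".join(read(p) for p in conf_files)
    return {
        "modprobe_install_override": modprobe_blocks_algif(conf),
        "initcall_blacklisted": cmdline_blocks_initcall(read(Path("/proc/cmdline"))),
        "algif_aead_loaded": module_loaded(read(Path("/proc/modules"))),
    }
```

On a Debian or Ubuntu host the expected post-mitigation state is an install override with the module not loaded; on a RHEL-family host it is the initcall blacklist appearing on the command line after the reboot.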

Monitoring for Copy Fail

Required TA Installation

The foundation for the detection pipeline is the Splunk Add-on for Unix and Linux, available on Splunkbase. This TA handles parsing and CIM normalization of auditd SYSCALL, EXECVE, and PROCTITLE event types, which are the primary log sources for detecting Copy Fail activity.

Configuring auditd

Before any Splunk queries will produce meaningful results, auditd needs to be configured to capture the right syscalls. Set log_format=ENRICHED in /etc/audit/auditd.conf so that UIDs, GIDs, syscall names, and socket address information are resolved before being written to disk rather than stored as raw numeric values. This is a Splunk best practice and significantly improves CIM field mapping.

For auditd rules, a best practice configuration can be found here.

Monitoring auditd for Copy Fail

Important Rules

There are a few auditd rules that need to be enabled to capture Copy Fail exploitation.

## Kernel crypto sockets (AF_ALG)
### Low-noise on most hosts. AF_ALG usage typically requires socket creation,
### SOL_ALG configuration, and bind() with a SOCKADDR record that downstream
### tooling can decode to recover salg_type / salg_name. Audit cannot match
### those strings in-kernel, so keep the collection generic here.

-a always,exit -F arch=b64 -S socket -F a0=38 -F success=1 -F auid>=1000 -F auid!=unset -k af_alg

-a always,exit -F arch=b32 -S socket -F a0=38 -F success=1 -F auid>=1000 -F auid!=unset -k af_alg

-a always,exit -F arch=b64 -S bind -F a2=88 -F success=1 -F auid>=1000 -F auid!=unset -k af_alg

-a always,exit -F arch=b32 -S bind -F a2=88 -F success=1 -F auid>=1000 -F auid!=unset -k af_alg

-a always,exit -F arch=b64 -S setsockopt -F a1=279 -F success=1 -F auid>=1000 -F auid!=unset -k af_alg

-a always,exit -F arch=b32 -S setsockopt -F a1=279 -F success=1 -F auid>=1000 -F auid!=unset -k af_alg
## Enable only if splice/vmsplice are rare in your estate.
## Correlate short bursts of these events with recent af_alg activity from the
## same pid/auid when investigating unusual kernel crypto socket usage.

-a always,exit -F arch=b64 -S splice -S vmsplice -F success=1 -F auid>=1000 -F auid!=unset -k splice_user

-a always,exit -F arch=b32 -S splice -S vmsplice -F success=1 -F auid>=1000 -F auid!=unset -k splice_user
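The numeric filters in these rules are written in decimal, while auditd records print syscall arguments in hexadecimal, which is an easy source of confusion when matching rules to logs. The short sketch below shows where the three values come from. The struct layout string is an assumption based on the Linux `struct sockaddr_alg` definition (a 16-bit family, a 14-byte type, two 32-bit fields, and a 64-byte name).

```python
import socket
import struct

AF_ALG = 38    # address family, matched by "-F a0=38" on socket()
SOL_ALG = 279  # socket option level, matched by "-F a1=279" on setsockopt()

# Assumed layout of struct sockaddr_alg: family, type[14], feat, mask, name[64].
SOCKADDR_ALG_FORMAT = "H14sII64s"
sockaddr_alg_size = struct.calcsize(SOCKADDR_ALG_FORMAT)  # matched by "-F a2=88" on bind()

# Cross-check against the running Python build where these constants exist
# (Linux only); elsewhere the comparison falls back to the literals above.
assert getattr(socket, "AF_ALG", AF_ALG) == AF_ALG
assert getattr(socket, "SOL_ALG", SOL_ALG) == SOL_ALG

print(f"AF_ALG  = {AF_ALG} (0x{AF_ALG:x})")    # AF_ALG  = 38 (0x26)
print(f"SOL_ALG = {SOL_ALG} (0x{SOL_ALG:x})")  # SOL_ALG = 279 (0x117)
print(f"sizeof(struct sockaddr_alg) = {sockaddr_alg_size} (0x{sockaddr_alg_size:x})")  # 88 (0x58)
```

The hexadecimal forms (26, 117, and 58) are what appear in the a0, a1, and a2 fields of the example records in the stages that follow.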

Stage 1: AF_ALG Socket Creation

The exploit opens an AF_ALG socket using socket(AF_ALG, SOCK_SEQPACKET, 0). In the auditd log this appears as syscall 41 with a0=0x26 (decimal 38, the AF_ALG address family constant) and a1=0x80005 (SOCK_SEQPACKET with SOCK_CLOEXEC). This is the mandatory first step of every known Copy Fail PoC.

Example log entry:

type=SYSCALL msg=audit(1777901306.989:7236): arch=c000003e syscall=41 success=yes exit=4 a0=26 a1=80005 a2=0 ppid=2415 pid=2673 auid=1000 uid=1000 euid=1000 comm="python3" exe="/usr/bin/python3.10" key="af_alg"

Key field breakdown:

| Field | Value | Meaning |
| --- | --- | --- |
| syscall | 41 | socket() system call |
| a0 | 0x26 (38) | AF_ALG address family |
| a1 | 0x80005 | SOCK_SEQPACKET with SOCK_CLOEXEC (AF_ALG sockets are created as SOCK_SEQPACKET) |
| exit | 4 | File descriptor assigned to the new socket |
| uid/euid | 1000/1000 | Unprivileged user (no elevation at this stage) |
| key | af_alg | auditd rule key set in audit.rules |
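For offline triage of raw audit logs, the field breakdown above can be applied programmatically. This is a minimal sketch, assuming x86_64 syscall numbering (socket is 41) and the key=value record layout shown in the example entry. `parse_audit_record` and `is_af_alg_socket` are hypothetical helper names, not part of any auditd tooling.

```python
import re

AF_ALG = 0x26
SOCK_SEQPACKET = 5
SOCK_TYPE_MASK = 0xF  # low bits carry the type; SOCK_CLOEXEC (0x80000) is a flag


def parse_audit_record(line: str) -> dict:
    """Split an auditd record into a dict of field -> value, stripping quotes."""
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    return {key: value.strip('"') for key, value in pairs}


def is_af_alg_socket(record: dict) -> bool:
    """True for a socket() call (x86_64 syscall 41) that creates an AF_ALG socket."""
    if record.get("type") != "SYSCALL" or record.get("syscall") != "41":
        return False
    family = int(record["a0"], 16)  # auditd prints a0..a3 in hex
    sock_type = int(record["a1"], 16) & SOCK_TYPE_MASK
    return family == AF_ALG and sock_type == SOCK_SEQPACKET


# The example entry from this section.
SAMPLE = (
    'type=SYSCALL msg=audit(1777901306.989:7236): arch=c000003e syscall=41 '
    'success=yes exit=4 a0=26 a1=80005 a2=0 ppid=2415 pid=2673 auid=1000 '
    'uid=1000 euid=1000 comm="python3" exe="/usr/bin/python3.10" key="af_alg"'
)

record = parse_audit_record(SAMPLE)
print(is_af_alg_socket(record), record["comm"], record["auid"])  # True python3 1000
```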

Stage 2: Socket Bind to authencesn Algorithm

Immediately after socket creation, the exploit binds the socket to the authencesn(hmac(sha256),cbc(aes)) algorithm string. In auditd this appears as syscall 49 (bind) with a0=4 (the socket file descriptor from stage 1) and a2=0x58 (88 bytes, matching the length of the algorithm name struct). This binding is what selects the vulnerable authencesn code path.

Example log entry:

type=SYSCALL msg=audit(1777901306.989:7237): arch=c000003e syscall=49 success=yes exit=0 a0=4 a1=7ffef25d53b0 a2=58 ppid=2415 pid=2673 auid=1000 uid=1000 euid=1000 comm="python3" exe="/usr/bin/python3.10" key="af_alg"

| Field | Value | Meaning |
| --- | --- | --- |
| syscall | 49 | bind() system call |
| a0 | 4 | AF_ALG socket fd from stage 1 |
| a2 | 0x58 (88) | Length of struct sockaddr_alg, which carries the algorithm type and name (here authencesn) |
| key | af_alg | Same auditd rule as socket creation |
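The SYSCALL record alone does not show which algorithm was bound. As the rule comments note, that information lives in the accompanying SOCKADDR record, which downstream tooling has to decode. The sketch below shows one way to do so, assuming the record exposes the raw structure as a hex string and that the layout follows the Linux `struct sockaddr_alg` on a little-endian host. The sample value is synthetic, built in the code itself, not taken from a real log.

```python
import struct

# Assumed little-endian layout of struct sockaddr_alg:
# u16 salg_family, u8 salg_type[14], u32 salg_feat, u32 salg_mask, u8 salg_name[64]
SOCKADDR_ALG = struct.Struct("<H14sII64s")


def decode_sockaddr_alg(saddr_hex: str) -> dict:
    """Recover salg_type and salg_name from a hex-encoded sockaddr_alg."""
    raw = bytes.fromhex(saddr_hex)
    if len(raw) < SOCKADDR_ALG.size:
        raise ValueError(f"expected at least {SOCKADDR_ALG.size} bytes, got {len(raw)}")
    family, salg_type, _feat, _mask, salg_name = SOCKADDR_ALG.unpack(raw[:SOCKADDR_ALG.size])

    def text(field: bytes) -> str:
        # Fields are NUL-padded C strings.
        return field.split(b"\0", 1)[0].decode("ascii", "replace")

    return {"family": family, "salg_type": text(salg_type), "salg_name": text(salg_name)}


# Synthetic sample: pack the values the exploit is described as binding to.
sample_hex = SOCKADDR_ALG.pack(
    38, b"aead", 0, 0, b"authencesn(hmac(sha256),cbc(aes))"
).hex().upper()

decoded = decode_sockaddr_alg(sample_hex)
print(decoded["salg_type"], decoded["salg_name"])
```

A decoded salg_name beginning with authencesn from an unprivileged process is the piece that ties a generic AF_ALG bind to this specific code path.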

Stage 3: setsockopt Calls — Key and Operation Setup

Each write iteration requires two setsockopt calls before the splice. The first (a2=1, ALG_SET_KEY) loads the AES key into the crypto context. The second (a2=5, ALG_SET_AEAD_AUTHSIZE) sets the authentication tag size. These appear in pairs in the log, one pair per loop iteration. The logs will contain hundreds of these entries, one pair per 4-byte chunk written to the target file.

Example log entries (one pair from a single iteration):

type=SYSCALL msg=audit(1777901306.989:7238): arch=c000003e syscall=54 success=yes exit=0 a0=4 a1=117 a2=1 a3=765d6f98c5f0 comm="python3" key="af_alg"

type=SYSCALL msg=audit(1777901306.989:7239): arch=c000003e syscall=54 success=yes exit=0 a0=4 a1=117 a2=5 a3=0 comm="python3" key="af_alg"

| Field | Value | Meaning |
| --- | --- | --- |
| syscall | 54 | setsockopt() system call |
| a1 | 0x117 (279) | SOL_ALG socket option level |
| a2=1 | ALG_SET_KEY | First call in each pair, which loads the crypto key |
| a2=5 | ALG_SET_AEAD_AUTHSIZE | Second call in each pair, which sets the AEAD tag size |

Detection note: The combination of setsockopt at SOL_ALG (a1=117 in the hex-encoded audit record, decimal 279) with ALG_SET_AEAD_AUTHSIZE (a2=5) from a non-root process is highly specific to AEAD operations and has essentially no legitimate non-root use case. This pairing alone is a strong indicator.
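The pairing described in the detection note can be checked mechanically. The following is a small sketch, assuming x86_64 syscall numbering (setsockopt is 54) and hex-encoded arguments. Only the two option values discussed in this post are named, and `alg_option_name` is a hypothetical helper.

```python
import re
from typing import Optional

SOL_ALG = 0x117  # 279 decimal
ALG_OPTIONS = {1: "ALG_SET_KEY", 5: "ALG_SET_AEAD_AUTHSIZE"}


def alg_option_name(line: str) -> Optional[str]:
    """Name the SOL_ALG option set by a setsockopt record, or None if not applicable."""
    record = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    if record.get("syscall") != "54":
        return None
    if int(record.get("a1", "0"), 16) != SOL_ALG:
        return None
    option = int(record["a2"], 16)
    return ALG_OPTIONS.get(option, f"unknown({option})")


# The example pair from this section.
PAIR = [
    'type=SYSCALL msg=audit(1777901306.989:7238): arch=c000003e syscall=54 '
    'success=yes exit=0 a0=4 a1=117 a2=1 a3=765d6f98c5f0 comm="python3" key="af_alg"',
    'type=SYSCALL msg=audit(1777901306.989:7239): arch=c000003e syscall=54 '
    'success=yes exit=0 a0=4 a1=117 a2=5 a3=0 comm="python3" key="af_alg"',
]

names = [alg_option_name(line) for line in PAIR]
print(names)  # ['ALG_SET_KEY', 'ALG_SET_AEAD_AUTHSIZE']
```

Counting how many times this pair repeats for a single pid gives a rough estimate of how many 4-byte chunks were written.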

Stage 4: splice() — Page Cache Write Loop

This is the write primitive itself. Each loop iteration produces two splice calls. In splice(), a0 is the input descriptor and a2 is the output descriptor, so the first call moves the target file's page cache pages into the crypto pipe (a0=3, the open file fd; a2=0x55, the pipe's write end in the example below), and the second moves them from the pipe into the AF_ALG accept socket (a0=0x54, the pipe's read end; a2=5, the accept fd). The exit value of each splice call increments by 4 bytes across iterations, directly reflecting the advancing write offset into the target file.

Example log entries (one pair from a single iteration):

type=SYSCALL msg=audit(1777901306.989:7240): arch=c000003e syscall=275 success=yes exit=160 a0=3 a1=7ffef25d53e0 a2=55 a3=0 comm="python3" key="splice_user"

type=SYSCALL msg=audit(1777901306.989:7241): arch=c000003e syscall=275 success=yes exit=160 a0=54 a1=0 a2=5 a3=0 comm="python3" key="splice_user"

| Field | Value | Meaning |
| --- | --- | --- |
| syscall | 275 | splice() system call |
| a0=3 | File fd | First splice reads from the target file (fd 3 = open target binary) |
| a0=0x50-0x54 | Pipe read fd | Second splice reads from the pipe and feeds the AF_ALG accept socket (the fd in a2) |
| exit | Incrementing | Return value advances by 4 per iteration (tracks bytes written into target) |
| key | splice_user | auditd rule key |
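When reading these records by hand, it helps to map the arguments back to the splice() signature, in which the first argument is the input descriptor and the third is the output descriptor. The sketch below does that for the two example entries. It assumes x86_64 numbering (splice is 275), hex-encoded arguments, and a decimal exit value, and `describe_splice` is a hypothetical helper name.

```python
import re
from typing import Optional


def describe_splice(line: str) -> Optional[dict]:
    """Return fd_in, fd_out and bytes moved for a splice() record, else None."""
    record = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    if record.get("syscall") != "275":
        return None
    return {
        "fd_in": int(record["a0"], 16),   # splice(fd_in, off_in, fd_out, off_out, len, flags)
        "fd_out": int(record["a2"], 16),
        "bytes_moved": int(record["exit"]),
    }


# The example pair from this section.
PAIR = [
    'type=SYSCALL msg=audit(1777901306.989:7240): arch=c000003e syscall=275 '
    'success=yes exit=160 a0=3 a1=7ffef25d53e0 a2=55 a3=0 comm="python3" key="splice_user"',
    'type=SYSCALL msg=audit(1777901306.989:7241): arch=c000003e syscall=275 '
    'success=yes exit=160 a0=54 a1=0 a2=5 a3=0 comm="python3" key="splice_user"',
]

first, second = (describe_splice(line) for line in PAIR)
print(first)   # {'fd_in': 3, 'fd_out': 85, 'bytes_moved': 160}
print(second)  # {'fd_in': 84, 'fd_out': 5, 'bytes_moved': 160}
```

In this pair, descriptors 84 and 85 are consecutive, which is consistent with the two ends of a single pipe sitting between the file and the socket.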

Stage 5: Privilege Escalation Completion

The final entry in the attack sequence is the execve of /usr/bin/su. The record shows uid=1000 (the original unprivileged user) with euid=0 (effective root). A uid and euid mismatch is normal for any setuid-root binary, so it proves nothing on its own. It becomes the post-exploitation confirmation when the execve follows the AF_ALG and splice() activity above and the parent (ppid=2674) is a shell spawned by the exploit script. Note that this execution as root occurs within milliseconds of the first log entry.

Example log entry:

type=SYSCALL msg=audit(1777901306.991:7243): arch=c000003e syscall=59 success=yes exit=0 ppid=2674 pid=2675 auid=1000 uid=1000 gid=1000 euid=0 suid=0 fsuid=0 comm="su" exe="/usr/bin/su" key="process_creation"

| Field | Value | Meaning |
| --- | --- | --- |
| syscall | 59 | execve() |
| uid | 1000 | Original unprivileged user identity |
| euid | 0 | Effective root |
| exe | /usr/bin/su | Target binary whose page cache was corrupted |
| ppid | 2674 | Shell spawned by exploit script as parent |
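The final-stage record can be flagged with a simple predicate. This is a minimal sketch, assuming x86_64 numbering (execve is 59), and `is_setuid_root_exec` is a hypothetical name. Every legitimate run of su by a regular user also satisfies it, so it is meant as one input to correlation with the earlier stages, not as a standalone alert.

```python
import re


def is_setuid_root_exec(line: str) -> bool:
    """True for an execve record where a non-root uid runs with euid 0."""
    record = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    return (
        record.get("syscall") == "59"
        and record.get("euid") == "0"
        and record.get("uid") not in (None, "0")
    )


# The example entry from this section.
SAMPLE = (
    'type=SYSCALL msg=audit(1777901306.991:7243): arch=c000003e syscall=59 '
    'success=yes exit=0 ppid=2674 pid=2675 auid=1000 uid=1000 gid=1000 euid=0 '
    'suid=0 fsuid=0 comm="su" exe="/usr/bin/su" key="process_creation"'
)

print(is_setuid_root_exec(SAMPLE))  # True
```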

Detection: Linux Auditd Copy Fail Privilege Escalation

By correlating these stages and looking for any setuid binary executed in this manner, we can detect this privilege escalation.

`linux_auditd`
type=SYSCALL
key IN (
    "af_alg",
    "process_creation",
    "splice_user"
)
| eval setuid_binary = case(
    name IN (
        "/usr/bin/chfn",
        "/usr/bin/chsh",
        "/usr/bin/fusermount3",
        "/usr/bin/gpasswd",
        "/usr/bin/mount",
        "/usr/bin/newgrp",
        "/usr/bin/passwd",
        "/usr/bin/su",
        "/usr/bin/sudo",
        "/usr/bin/umount",
        "/usr/lib/dbus-1.0/dbus-daemon-launch-helper",
        "/usr/lib/landscape/apt-update",
        "/usr/lib/openssh/ssh-keysign",
        "/usr/lib/polkit-1/polkit-agent-helper-1"
    ), name,
    exe IN (
        "/usr/bin/chfn",
        "/usr/bin/chsh",
        "/usr/bin/fusermount3",
        "/usr/bin/gpasswd",
        "/usr/bin/mount",
        "/usr/bin/newgrp",
        "/usr/bin/passwd",
        "/usr/bin/su",
        "/usr/bin/sudo",
        "/usr/bin/umount",
        "/usr/lib/dbus-1.0/dbus-daemon-launch-helper",
        "/usr/lib/landscape/apt-update",
        "/usr/lib/openssh/ssh-keysign",
        "/usr/lib/polkit-1/polkit-agent-helper-1"
    ), exe,
    true(), null()
)
| eval indicator = case(
    key="af_alg", "AF_ALG socket",
    key="splice_user", "splice syscall",
    isnotnull(setuid_binary), "setuid_exec:" . setuid_binary,
    true(), null()
)
| where isnotnull(indicator)
| stats
    dc(indicator) as unique_signals
    max(_time) as lastTime
    min(_time) as firstTime
    values(comm) as comm
    values(exe) as exe
    values(name) as name
    values(host) as dest
    values(indicator) as signals
    values(setuid_binary) as setuid_binaries
    values(pid) as pid
    values(ppid) as ppid
    values(uid) as uid
    by auid
| where unique_signals >= 3
| eval risk_score_factor = unique_signals * 25
| sort - risk_score_factor
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`

Figure 2: Auditd Copy Fail Privilege Escalation
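To make the correlation logic easier to reason about outside of Splunk, the sketch below mirrors the same steps in Python: label each event with an indicator, count distinct indicators per auid, and keep accounts with at least three. It is an illustrative approximation, not the shipped detection. The event dictionaries, the shortened setuid list, and the function names are all assumptions.

```python
from collections import defaultdict

WATCHED_KEYS = {"af_alg", "process_creation", "splice_user"}
# Shortened, illustrative subset of the setuid list used in the search above.
SETUID_BINARIES = {"/usr/bin/su", "/usr/bin/sudo", "/usr/bin/passwd"}


def indicator(event: dict):
    """Label one audit event, mirroring the case() logic of the search."""
    key = event.get("key")
    if key not in WATCHED_KEYS:
        return None
    if key == "af_alg":
        return "AF_ALG socket"
    if key == "splice_user":
        return "splice syscall"
    for field in ("name", "exe"):
        if event.get(field) in SETUID_BINARIES:
            return "setuid_exec:" + event[field]
    return None


def correlate(events, threshold: int = 3):
    """Group indicators by auid and keep accounts with enough distinct signals."""
    signals = defaultdict(set)
    for event in events:
        label = indicator(event)
        if label is not None:
            signals[event.get("auid")].add(label)
    hits = [
        {
            "auid": auid,
            "signals": sorted(labels),
            "unique_signals": len(labels),
            "risk_score_factor": len(labels) * 25,
        }
        for auid, labels in signals.items()
        if len(labels) >= threshold
    ]
    return sorted(hits, key=lambda hit: hit["risk_score_factor"], reverse=True)


events = [
    {"auid": "1000", "key": "af_alg", "exe": "/usr/bin/python3.10"},
    {"auid": "1000", "key": "splice_user", "exe": "/usr/bin/python3.10"},
    {"auid": "1000", "key": "process_creation", "exe": "/usr/bin/su"},
    {"auid": "1001", "key": "process_creation", "exe": "/usr/bin/su"},  # ordinary su use
]
results = correlate(events)
print(results)
```

Requiring three distinct signals is what keeps an ordinary su invocation (auid 1001 above) from surfacing on its own.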

Capturing Malformed auth.log Entries

Beyond syscall-level telemetry, there is a post-exploitation artifact unique to the /etc/passwd exploit path that surfaces in auth.log. When su runs against page-cache-corrupted account data, it can fail to resolve the identity of the calling user. Under normal conditions, su logs both the target account and the invoking user. When exploitation has occurred via this path, the invoking username field is absent.

Normal auth.log entry (pre-exploitation):

2026-05-05T09:59:10.392153+00:00 hostname su[1765]: (to root) ewaugh on pts/1

Malformed auth.log entry (post-exploitation):

2026-05-05T10:04:13.223113+00:00 hostname su[1781]: (to root)  on pts/1

The invoking username is missing entirely, leaving a double space between the closing parenthesis and 'on'.
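The difference between the two entries can be tested with a single regular expression. The sketch below is one possible formulation, assuming the su message layout shown in the two examples (with an optional `[pid]` after the program name). `missing_invoker` is a hypothetical helper name.

```python
import re

# "(to <target>) <invoker> on <tty>", where <invoker> may be empty.
SU_LINE = re.compile(
    r"su(?:\[\d+\])?:\s+\(to\s+(?P<target>\S+)\)\s(?P<invoker>\S*)\s+on\s+(?P<tty>\S+)"
)


def missing_invoker(line: str) -> bool:
    """True if the line is an su session entry with an empty invoking username."""
    match = SU_LINE.search(line)
    return match is not None and match.group("invoker") == ""


NORMAL = "2026-05-05T09:59:10.392153+00:00 hostname su[1765]: (to root) ewaugh on pts/1"
MALFORMED = "2026-05-05T10:04:13.223113+00:00 hostname su[1781]: (to root)  on pts/1"

print(missing_invoker(NORMAL), missing_invoker(MALFORMED))  # False True
```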

Forwarding logs to Splunk

Add the following to your inputs.conf if you are not already forwarding linux_secure logs:

[monitor:///var/log/auth.log]
disabled = false
sourcetype = linux_secure
index = linux

On RHEL-family systems the equivalent file is /var/log/secure (adjust the monitor path accordingly). The linux_secure sourcetype is provided by the Splunk Add-on for Unix and Linux and handles field extraction for PAM and su log entries out of the box.

Detection: Linux Missing Invoking Username

sourcetype=linux_secure process=su

| rex "su(?:\[\d+\])?:\s+\(to\s+(?<target_user>\S+)\)(?<source_user>\s{2,})on\s+(?<terminal>\S+)"
| where len(ltrim(source_user)) == 0
| stats
count as total_attempts,
min(_time) as firstTime,
max(_time) as lastTime,
values(target_user) as target_users,
values(host) as dest
by process
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`

Figure 3: Missing Invoking Username

Important Caveats

This signal is specific to certain exploit variants. More common PoC implementations corrupt su in a way that prevents PAM logging from running at all, producing no auth.log entry rather than a malformed one. A missing invoking username can also occur under rare legitimate conditions unrelated to this CVE. This indicator is most valuable when correlated with the auditd-based AF_ALG and splice() signals documented in the previous section rather than used in isolation.

Monitoring kern.log and syslog for Copy Fail Activity

In addition to auditd and auth.log, two further log sources provide useful signals for Copy Fail activity: kern.log, which captures kernel-level messages, and syslog, which aggregates system-wide events including kernel messages on many distributions. Neither source provides the same fidelity as auditd syscall records, but they offer a lower-overhead detection layer and are valuable for correlation.

The PF_ALG Protocol Family Registration Message

The primary indicator available in both kern.log and syslog is the kernel message produced when the AF_ALG protocol family is registered. When the exploit triggers the AF_ALG interface for the first time on a system where the module has not yet been loaded, the kernel logs the following:

NET: Registered PF_ALG protocol family

This message is written to both kern.log and syslog. On its own it is not a reliable indicator of exploitation since it also appears during system boot and when legitimate applications first invoke the crypto API. However, loading during boot is common and expected, while on-demand loading that occurs several minutes or more after boot was observed almost exclusively on systems exposed to Copy Fail activity after the CVE became public.

Note: This message will not appear on systems where AF_ALG is built directly into the kernel rather than compiled as a loadable module, which is the case on most RHEL-family distributions. On those systems kern.log and syslog will not produce this specific artifact.

Forwarding kern.log and syslog to Splunk

Add the following stanzas if you are not already forwarding these logs:

[monitor:///var/log/kern.log]
disabled = false
sourcetype = linux_messages_syslog
index = linux

[monitor:///var/log/syslog]
disabled = false
sourcetype = linux_messages_syslog
index = linux

The linux_messages_syslog sourcetype is included in the Splunk Add-on for Unix and Linux and correctly handles field extraction for kernel and system messages.

Detection: Linux PF_ALG Registration Outside of Boot Window

This search identifies PF_ALG registration events that occur more than five minutes after system boot, the window in which on-demand loading triggered by the exploit is most easily distinguished from normal boot-time loading:

sourcetype="linux_messages_syslog" "NET: Registered PF_ALG protocol family"

| rex field=_raw "kernel: \[\s*(?<uptime_seconds>[\d\.]+)\]"
| eval uptime_seconds=tonumber(uptime_seconds)
| where uptime_seconds > 300
| eval uptime_readable=tostring(round(uptime_seconds/60,1)) . " minutes after boot"
| table _time host uptime_seconds uptime_readable _raw
| sort -uptime_seconds

Figure 4: PF_ALG Registration Outside of Boot Window
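The same boot-window logic can be applied directly to a kern.log file when Splunk is not available. This is a minimal sketch, assuming the kernel timestamp appears in square brackets after `kernel:` as the search above expects. The two sample lines are synthetic, and the function names are hypothetical.

```python
import re
from typing import Optional

PF_ALG_MESSAGE = "NET: Registered PF_ALG protocol family"
UPTIME = re.compile(r"kernel: \[\s*(?P<uptime>\d+(?:\.\d+)?)\]")


def registration_uptime(line: str) -> Optional[float]:
    """Seconds since boot at which PF_ALG was registered, or None if not applicable."""
    if PF_ALG_MESSAGE not in line:
        return None
    match = UPTIME.search(line)
    return float(match.group("uptime")) if match else None


def outside_boot_window(line: str, window_seconds: float = 300.0) -> bool:
    """True if the registration happened later than the boot window."""
    uptime = registration_uptime(line)
    return uptime is not None and uptime > window_seconds


# Synthetic examples (not taken from a real host).
AT_BOOT = "May  5 09:30:02 hostname kernel: [    4.218731] NET: Registered PF_ALG protocol family"
LATE = "May  5 10:04:12 hostname kernel: [ 1843.551204] NET: Registered PF_ALG protocol family"

print(outside_boot_window(AT_BOOT), outside_boot_window(LATE))  # False True
```

The 300-second default matches the threshold used in the search and may need tuning for hosts with slow or staged boot sequences.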

Learn More

This blog helps security analysts, blue teamers, and Splunk customers identify Copy Fail and similar attacks. You can implement the detections in this blog using the Splunk Add-on for Unix and Linux. To view the Splunk Threat Research Team's complete security content repository, visit research.splunk.com.

Feedback

Any feedback or requests? Feel free to put in an issue on GitHub and we'll follow up. Alternatively, join us on the Slack channel #security-research. Follow these instructions if you need an invitation to our Splunk user groups on Slack.

Contributors

We would like to thank Raven Tait for authoring this post, as well as the Splunk Threat Research Team (Lou Stella, Bhavin Patel, Rod Soto, Eric McGinnis, Nasreddine Bencherchali, Teoderick Contreras, and Patrick Bareiss) for their contributions to the detection content and analysis.
