Security-Portal.cz is an internet portal focused on computer security, hacking, anonymity, computer networks, programming, encryption, exploits, and Linux and BSD systems. It runs many interesting services and supports its fans in interesting projects.

Categories

BIOS Disconnect: New High-Severity Bugs Affect 128 Dell PC and Tablet Models

The Hacker News - 1 hour 7 min ago
Cybersecurity researchers on Thursday disclosed a chain of vulnerabilities affecting the BIOSConnect feature within Dell Client BIOS that could be abused by a privileged network adversary to gain arbitrary code execution at the BIOS/UEFI level of the affected device. "As the attacker has the ability to remotely execute code in the pre-boot environment, this can be used to subvert the operating
Category: Hacking & Security

Reduce Business Risk By Fixing 3 Critical Endpoint-to-Cloud Security Requirements

The Hacker News - 1 hour 24 min ago
Enterprise applications used to live securely in data centers and office employees connected to internal networks using company-managed laptops or desktops. And data was encircled by a walled perimeter to keep everything safe. All that changed in the last 18 months. Businesses and employees had to adapt quickly to cloud technology and remote work. The cloud gave businesses the agility to respond
Category: Hacking & Security

One-Click Exploit Could Have Let Attackers Hijack Any Atlassian Account

The Hacker News - 1 hour 32 min ago
Cybersecurity researchers on Wednesday disclosed critical flaws in the Atlassian project and software development platform that could be exploited to take over an account and control some of the apps connected through its single sign-on (SSO) capability. "With just one click, an attacker could have used the flaws to get access to Atlassian's publish Jira system and get sensitive information,
Category: Hacking & Security

Malicious spam campaigns delivering banking Trojans

Kaspersky Securelist - 1 hour 36 min ago

In mid-March 2021, we observed two new spam campaigns. The messages in both cases were written in English and contained ZIP attachments or links to ZIP files. Further research revealed that both campaigns ultimately aimed to distribute banking Trojans. The payload in most cases was IcedID (Trojan-Banker.Win32.IcedID), but we have also seen a few QBot (Backdoor.Win32.Qbot, also known as QakBot) samples. During campaign spikes we observed increased activity of these Trojans: more than a hundred detections a day.

IcedID is a banking Trojan capable of web injects, VM detection and other malicious actions. It consists of two parts – the downloader and the main body that performs all the malicious activity. The main body is hidden in a PNG image, which is downloaded and decrypted by the downloader.

QBot is also a banking Trojan. It’s a single executable with an embedded DLL (main body) capable of downloading and running additional modules that perform malicious activity: web injects, email collection, password grabbing, etc.

Neither of these malware families is new – we’ve seen them being distributed before via spam campaigns and different downloaders, like the recently taken-down Emotet. However, in the recent campaign we observed several changes to the IcedID Trojan.

Technical details

Initial infection

DotDat

The first campaign we called ‘DotDat’. It distributed ZIP attachments that claimed to be some sort of cancelled operation or compensation claims with the names in the following format [document type (optional)]-[some digits]-[date in MMDDYYYY format]. We assume the dates correspond with the campaign spikes. The ZIP archives contained a malicious MS Excel file with the same name.

The Excel file downloads a malicious payload via a macro (see details below) from a URL with the following format [host]/[digits].[digits].dat and executes it. The URL is generated during execution using the Excel function NOW(). The payload is either the IcedID downloader (Trojan.Win32.Ligooc) or QBot packed with a polymorph packer.
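The URL shape described above can be turned into a simple triage check for proxy or mail-gateway logs. The following is a minimal sketch, assuming only the `[host]/[digits].[digits].dat` format from the text; the pattern, the function name `looks_like_dotdat_url` and the sample URLs are illustrative and not part of the original analysis, and a match is only a hint, not a verdict.

```python
import re

# Hypothetical pattern for the URL shape described in the text:
# [host]/[digits].[digits].dat (the scheme is treated as optional).
DOTDAT_URL = re.compile(r"^(?:https?://)?[^/\s]+/\d+\.\d+\.dat$", re.IGNORECASE)


def looks_like_dotdat_url(url):
    """Return True if the URL has the host/digits.digits.dat shape."""
    return DOTDAT_URL.match(url.strip()) is not None


if __name__ == "__main__":
    # Made-up sample URLs on a reserved test domain.
    for candidate in ("http://example.test/44274.5.dat",
                      "http://example.test/summer.gif"):
        print(candidate, looks_like_dotdat_url(candidate))
```

Such a pattern will also match unrelated `.dat` downloads, so it is better suited to narrowing down logs than to blocking.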

Excel macro details (3e12880c20c41085ea5e249f8eb85ded)

The Excel file contains obfuscated Excel 4.0 macro formulas to download and execute the payload (IcedID or QBot). The macro generates a payload URL and calls the WinAPI function URLDownloadToFile to download the payload.

Macro downloads IcedID downloader

After a successful download, the payload is launched using the EXEC function and Windows Rundll32 executable.

Macro starts payload

Summer.gif

The spam emails of the second campaign contained links to hacked websites with malicious archives named “documents.zip”, “document-XX.zip”, “doc-XX.zip” where XX stands for two random digits. Like in the first campaign, the archives contained an Excel file with a macro that downloaded the IcedID downloader. According to our data, this spam campaign peaked on 17/03/2021. By April, the malicious activity had faded away.
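The archive names listed above follow a small, fixed set of shapes, so they can be checked with one expression. This is a rough sketch under the assumption that the names are exactly “documents.zip”, “document-XX.zip” and “doc-XX.zip” with two digits; the function name `is_campaign_archive_name` is made up for illustration.

```python
import re

# Hypothetical pattern built from the three archive name shapes in the text.
ARCHIVE_NAME = re.compile(r"^(?:documents|document-\d{2}|doc-\d{2})\.zip$",
                          re.IGNORECASE)


def is_campaign_archive_name(name):
    """Return True if the file name matches one of the described shapes."""
    return ARCHIVE_NAME.match(name) is not None
```

Because names like “documents.zip” are common in legitimate mail, a match is best combined with other signals such as the hosting site or the macro content.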

Excel macro details (c11bad6137c9205d8656714d362cc8e4)

Like in the other case, Excel 4.0 macro formulas and the URLDownloadToFile function are used in this campaign. The main difference in the download component is that the URL is stored in a cell inside the malicious file.

Payload download

Though the URL seems to refer to a file named “summer.gif”, the payload is an executable, not a GIF image. To execute the payload, the macro uses WMI and regsvr32 tools.

Macro starts payload

IcedID

As mentioned above, IcedID consists of two parts – the downloader and the main body. The downloader sends some user information (username, MAC address, Windows version, etc.) to the C&C and receives the main body. In the past, the main body was distributed as shellcode hidden in a PNG image. The downloader gets the image, decrypts the main body in memory and executes it. The main body maps itself into memory and starts to perform its malicious actions such as web injects, data exfiltration to the C&C, download and execution of additional payloads, exfiltration of system information and more.

IcedID new downloader

Besides the increase in infection attempts, the IcedID authors also changed the downloader a bit. In previous versions it was compiled as an x86 executable and the malware configuration after decryption contained fake C&C addresses. We assume this was done to complicate analysis of the samples. In the new version, the threat actors moved from x86 to an x86-64 version and removed the fake C&Cs from the configuration.

Configuration of the old version of IcedID downloader

New version configuration

We also observed a minor change in the malware’s main body. While it’s still distributed as a PNG image, and the decryption and C&C communication methods remain the same, the authors decided not to use shellcode. Instead, IcedID’s main body is distributed as a standard PE file with some loader-related data in the beginning.

Geography of IcedID attacks

Geography of IcedID downloader detections, March 2021 (download)

In March 2021, the greatest number of users attacked by Ligooc (IcedID downloader) were observed in China (15.88%), India (11.59%), Italy (10.73%), the United States (10.73%) and Germany (8.58%).

Qbot

Unlike IcedID, QBot is a single executable with an embedded DLL (main body) stored in the PE resource section. In order to perform traffic interception, steal passwords, perform web injects and take remote control of the infected system, it downloads additional modules: web inject module, hVNC (remote control module), email collector, password grabber and others. All the details on QBot, as well as IoCs, MITRE ATT&CK framework data, YARA rules and hashes relating to this threat are available to users of our Financial Threat Intelligence services.

Geography of Qbot attacks

Geography of QBot attacks, March 2021 (download)

In March 2021, QBot was also most active in China (10.78%), India (10.78%) and the United States (4.66%), but we also observed it in Russia (7.60%) and France (7.60%).

Indicators of compromise

File Hashes (MD5)
Excel with macros

042b349265bbac709ff2cbddb725033b
054532b8b2b5c727ed8f74aabc9acc73
1237e85fe00fcc1d14df0fb5cf323d6b
3e12880c20c41085ea5e249f8eb85ded

Documents.zip

c11bad6137c9205d8656714d362cc8e4

Trojan.Win32.Ligooc

997340ab32077836c7a055f52ab148de

Trojan-Banker.Win32.QBot

57f347e5f703398219e9edf2f31319f6

Domains/IPs

Apoxiolazio55[.]space
Karantino[.]xyz
uqtgo16datx03ejjz[.]xyz
188.127.254[.]114

Atlassian Bugs Could Have Led to 1-Click Takeover

Threatpost - 1 hour 37 min ago
A supply-chain attack could have siphoned sensitive information out of Jira, such as security issues on Atlassian cloud, Bitbucket and on-prem products.
Category: Hacking & Security

30M Dell Devices at Risk for Remote BIOS Attacks, RCE

Threatpost - 1 hour 37 min ago
Four separate security bugs would give attackers almost complete control and persistence over targeted devices, thanks to a faulty update mechanism.
Category: Hacking & Security

Feared Triada trojan is back, will rob smartphone owners of their money

Novinky.cz - security - 2 hours 1 min ago
Security experts from the antivirus company Eset have warned that in recent weeks cybercriminals have again begun widely deploying the Triada banking trojan, which threatens devices running the Android operating system, primarily smartphones and tablets. It can even bypass online banking security based on SMS messages.
Category: Hacking & Security

Critical Auth Bypass Bug Affects VMware Carbon Black App Control

The Hacker News - 3 hours 37 min ago
VMware has rolled out security updates to resolve a critical flaw affecting Carbon Black App Control that could be exploited to bypass authentication and take control of vulnerable systems. The vulnerability, identified as CVE-2021-21998, is rated 9.4 out of 10 in severity by the industry-standard Common Vulnerability Scoring System (CVSS) and affects App Control (AppC) versions 8.0.x, 8.1.x,
Category: Hacking & Security

Russia will work with the USA to catch cybercriminals

Novinky.cz - security - 3 hours 1 min ago
Russia will cooperate with the United States in an effort to catch cybercriminals, the head of Russia's FSB secret service, Alexander Bortnikov, said on Wednesday. He made the statement a week after US President Joe Biden and his Russian counterpart Vladimir Putin agreed to develop cooperation in certain areas, the Reuters agency reported.
Category: Hacking & Security

Your account has been compromised, transfer your money to bitcoin immediately. Scammers are making hundreds of thousands

Novinky.cz - security - 3 hours 42 min ago
So far unidentified scammers have cheated trusting individuals in the Ostrava region out of hundreds of thousands of crowns. Posing as bank employees over the phone, they deliberately raised such serious fears about the victims' money that they pressured them into transferring it to other people's accounts. Similar cases are not unusual nationwide.
Category: Hacking & Security

Antivirus Pioneer John McAfee Found Dead in Spanish Jail

The Hacker News - 4 hours 53 min ago
Controversial mogul and antivirus pioneer John McAfee on Wednesday died by suicide in a jail cell in Barcelona, hours after reports that he would be extradited to face federal charges in the U.S. McAfee was 75. He is said to have died by hanging "as his nine months in prison brought him to despair," according to McAfee's lawyer Javier Villalba, Reuters reported. Security personnel at the Brians
Category: Hacking & Security

Cyber espionage by Chinese hackers in neighbouring nations is on the rise

The Hacker News - 5 hours 11 min ago
A string of cyber espionage campaigns dating all the way back to 2014 and likely focused on gathering defense information from neighbouring countries have been linked to a Chinese military-intelligence apparatus. In a wide-ranging report published by Massachusetts-headquartered Recorded Future this week, the cybersecurity firm's Insikt Group said it identified ties between a group it tracks as "
Category: Hacking & Security

Pakistan-linked hackers targeted Indian power company with ReverseRat

The Hacker News - 5 hours 11 min ago
A threat actor with suspected ties to Pakistan has been striking government and energy organizations in the South and Central Asia regions to deploy a remote access trojan on compromised Windows systems, according to new research. "Most of the organizations that exhibited signs of compromise were in India, and a small number were in Afghanistan," Lumen's Black Lotus Labs said in a Tuesday
Category: Hacking & Security

A huge comet is hurtling toward the inner Solar System. Don't panic, build a probe

Zive.cz - security - 5 hours 42 min ago
Astronomers have discovered a giant comet near Neptune. The object 2014 UN271 will make its closest approach to the Sun in 2031. The comet is about 200 kilometers in diameter.
Category: Hacking & Security

Iran Media Websites Seized by U.S. in Disinformation Campaign

Threatpost - 23 June, 2021 - 21:23
DoJ uses sanctions laws to shut down an alleged Iranian government malign influence campaign.
Category: Hacking & Security

“Ban killer drones before we all die,” demands a group of academics

Zive.cz - security - 23 June, 2021 - 19:55
As we recently reported, according to a UN Security Council report, a small Turkish-made kamikaze drone attacked retreating soldiers during the civil conflict in Libya. Several scientists responded to the event by writing a piece for IEEE Spectrum, in which they called it a reason for great ...
Category: Hacking & Security

Pandemic-Bored Attackers Pummeled Gaming Industry

Threatpost - 23 June, 2021 - 18:53
Akamai's 2020 gaming report shows that cyberattacks on the video game industry skyrocketed, shooting up 340 percent in 2020.
Category: Hacking & Security

Critical Palo Alto Cyber-Defense Bug Allows Remote ‘War Room’ Access

Threatpost - 23 June, 2021 - 17:39
Remote, unauthenticated cyberattackers can infiltrate and take over the Cortex XSOAR platform, which anchors unified threat intelligence and incident responses.
Category: Hacking & Security

REvil Ransomware Code Ripped Off by Rivals

Threatpost - 23 June, 2021 - 17:11
The LV ransomware operators likely used a hex editor to repurpose a REvil binary almost wholesale, for their own nefarious purposes.
Category: Hacking & Security

How to confuse antimalware neural networks. Adversarial attacks and protection

Kaspersky Securelist - 23 June, 2021 - 14:16

Introduction

Nowadays, cybersecurity companies implement a variety of methods to discover new, previously unknown malware files. Machine learning (ML) is a powerful and widely used approach for this task. At Kaspersky we have a number of complex ML models based on different file features, including models for static and dynamic detection, for processing sandbox logs and system events, etc. We implement different machine learning techniques, including deep neural networks, one of the most promising technologies that make it possible to work with large amounts of data, incorporate different types of features, and boast a high accuracy rate. But can we rely entirely on machine learning approaches in the battle with the bad guys? Or could powerful AI itself be vulnerable? Let’s do some research.

In this article we attempt to attack our product anti-malware neural network models and check existing defense methods.

Background

An adversarial attack is a method of making small modifications to objects in such a way that the machine learning model begins to misclassify them. Neural networks (NN) are known to be vulnerable to such attacks. Research into adversarial methods historically started in the field of image recognition. It has been shown that minor changes in pictures, such as the addition of insignificant noise, can cause remarkable changes in the predictions of the classifiers and even completely confuse ML models[i].

The addition of inconspicuous noise causes NN to classify the panda as a gibbon

Furthermore, the insertion of small patterns into the image can also force models to change their predictions in the wrong direction[ii].

Adding a small patch to the image makes NN classify the banana as a toaster

After this susceptibility to small data changes was highlighted in the image recognition of neural networks, similar techniques were demonstrated in other data domains. In particular, various types of attacks against malware detectors were proposed, and many of them were successful.

In the paper “Functionality-preserving black-box optimization of adversarial Windows malware”[iii] the authors extracted data sequences from benign portable executable (PE) files and added them to malware files either at the end of the file (padding) or within newly created sections (section injection). These changes affected the scores of the targeted classifier while preserving file functionality by design. A collection of these malware files with inserted random benign file parts was formed. Using genetic algorithms (including mutations, cross-over and other types of transformations) and the malware classifier for predicting scores, the authors iteratively modified the collection of malware files, making them more and more difficult for the model to classify correctly. This was done by optimizing an objective function that contains two conflicting terms: the classification output on the manipulated PE file, and a penalty function that evaluates the number of bytes injected into the input data. Although the proposed attack was effective, it did not use state-of-the-art ML adversarial techniques and relied on public pre-trained models. Also, the authors measured the average effectiveness of the attack against VirusTotal anti-malware engines, so we don’t know for sure how effective it is against the cybersecurity industry’s leading solutions. Moreover, since most security products still use traditional methods of detection, it’s unclear how effective the attack was against the ML component of anti-malware solutions, or against other types of detectors.

Another study, “Optimization-guided binary diversification to mislead neural networks for malware detection”[iv], proposed a method for functionality-preserving assembler operand changes in functions, and adversarial attacks based on it. The algorithm randomly selects a function and transformation type and tries to apply selected changes. The attempted transformation is applied only if the targeted NN classifier becomes more likely to misclassify the binary file. Again, this attack lacks ML methods for adversarial modification, and it has not been tested on specific anti-malware products.

Some papers proposed gradient-driven adversarial methods that use knowledge about model structure and features for malicious file modification[v]. This approach provides more opportunities for file modifications and results in better effectiveness. Although the authors conducted experiments in order to measure the impact of such attacks against specific malware detectors (including public models), they don’t work with product anti-malware classifiers.

For a more detailed overview of the various adversarial attacks on malware classifiers, see our whitepaper and “A survey on practical adversarial examples for malware classifiers”.

Our goal

Since Kaspersky anti-malware solutions, among other techniques, rely on machine learning models, we’re extremely interested in investigating how vulnerable our ML models are to adversarial attacks. Three attack scenarios can be considered:

White-box attack. In this scenario, all information about a model is available. Armed with this information, attackers try to convert malware files (detected by the model) to adversarial samples with identical functionality but misclassified as benign. In real life this attack is possible when the ML detector is a part of the client application and can be retrieved by code reversing. In particular, researchers at Skylight reported such a scenario for the Cylance antivirus product.

Gray-box attack. Complex ML models usually require a significant amount of both computational and memory resources. Therefore, the ML classifiers may be cloud-based and deployed on the security company servers. In this case, the client applications merely compute and send file features to these servers. The cloud-based malware classifier responds with the predictions for given features. The attackers have no access to the model, but they still have knowledge about feature construction, and can get labels for any file by scanning it with the security product.

Black-box attack. In this case, feature computation and model prediction are performed on the cybersecurity company’s side. The client applications just send raw files, or the security company collects files in another way. Therefore, no information about feature processing is available. There are strict legal restrictions for sending information from the user machine. This approach also involves traffic limitation. This means the malware detection process usually can’t be performed for all user files on the go. Therefore, an attack on a black-box system is still the most difficult.

Consequently, we will focus on the first two attack scenarios and investigate their effectiveness against our product model.

Features and malware classification neural network

We built a simple but well-functioning neural network similar to our product model for the task of malware detection. The model is based on static analysis of executable files (PE files).

Malware classification neural network

The neural network model works with the following types of features:

  • PE Header features – features extracted from PE header, including physical and virtual file size, overlay size, executable characteristics, system type, number of imported and exported functions, etc.
  • Section features – the number of sections, physical and virtual size of sections, section c
  • Section statistics – various statistics describing raw section data: entropy, byte histograms of different section parts, etc.
  • File strings – strings parsed from the raw file using a special utility. The extracted strings are packed into a bloom filter

Let’s take a brief look at the bloom filter structure.

Scheme of packing strings into bloom filter structure. Bits related to strings are set to 1

The bloom filter is a bit vector. For each of the k strings, n predefined hash functions are calculated. The value of each hash function determines the position of a bit to be set to 1 in the bloom filter vector. Note that different strings can be mapped to the same bit. In this case the bit remains in the set position (equal to 1). This way we can pack all file strings into a vector of a fixed size.
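The packing scheme can be sketched in a few lines. The article does not name the hash functions or the vector size, so this sketch assumes salted SHA-256 digests as the n hash functions and a 1024-bit vector; `bloom_positions` and `pack_strings` are illustrative names, not the product code.

```python
import hashlib


def bloom_positions(text, n_hashes, size):
    """Bit positions for one string, one position per hash function.

    Assumption: hash function i is SHA-256 over the string salted with i.
    """
    positions = []
    for i in range(n_hashes):
        digest = hashlib.sha256("{}:{}".format(i, text).encode("utf-8")).digest()
        positions.append(int.from_bytes(digest[:8], "big") % size)
    return positions


def pack_strings(strings, n_hashes=3, size=1024):
    """Pack any number of strings into a fixed-size bit vector."""
    bits = [0] * size
    for text in strings:
        for position in bloom_positions(text, n_hashes, size):
            bits[position] = 1  # a bit that is already set simply stays set
    return bits


if __name__ == "__main__":
    vector = pack_strings(["MessageBoxA", "GetProcAddress", "VirtualAlloc"])
    print(len(vector), sum(vector))
```

Note that the mapping is one-way: adding a string can only turn bits on, which is why the attack described later can add bloom bits but not clear them.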

We trained the aforementioned neural network on approximately 300 million files – half of them benign, the other half malware. The classification quality of this network is displayed in the ROC curve. The X-axis shows the false positive rate (FPR) in logarithmic scale, while the Y-axis corresponds to the true positive rate (TPR) – the detection rate for all the malware files.

ROC curve for trained malware detector

In our company, we focus on techniques and models with very low false positive rates. So, we set a threshold for a 10⁻⁵ false positive rate (we rate 1 false positive as 100 000 true detections). Using this threshold, we detected approximately 60% of the malware samples from our test collection.
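One way to pick such a threshold is to sort the scores of known benign files and allow only the permitted share of them to exceed it. The sketch below assumes that a higher score means “more likely malware” and that a file is detected when its score is strictly above the threshold; the function names and the tiny score lists are illustrative only.

```python
def threshold_for_fpr(benign_scores, target_fpr):
    """Threshold such that at most target_fpr of benign scores exceed it."""
    ordered = sorted(benign_scores, reverse=True)
    # Number of benign files we tolerate being flagged.
    allowed = int(target_fpr * len(ordered))
    if allowed >= len(ordered):
        return float("-inf")
    # Everything strictly above the (allowed + 1)-th highest score is flagged.
    return ordered[allowed]


def detection_rate(malware_scores, threshold):
    """Share of malware files whose score is strictly above the threshold."""
    return sum(score > threshold for score in malware_scores) / len(malware_scores)


if __name__ == "__main__":
    benign = [0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.6, 0.9]
    malware = [0.2, 0.5, 0.8, 0.95]
    cutoff = threshold_for_fpr(benign, 0.25)
    print(cutoff, detection_rate(malware, cutoff))
```

With a target as low as 10⁻⁵, the benign collection has to contain far more than 100 000 files for the estimate to be meaningful.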

Adversarial attack algorithm

To attack the neural network, we use the gradient method described in “Practical black-box attacks against machine learning”. For a malware file we want to change the score of the classifier to avoid detection. To do so, we calculate the gradient for the final NN score and back-propagate it through all the NN layers to the file features. The main difficulty of creating an adversarial PE is preserving the functionality of the original file. To achieve this, we define a simple strategy. During the adversarial attack we only add new sections, while existing sections remain intact. In most cases these modifications don’t affect the file execution process.

We also have some restrictions for features in the new sections:

  • Different size-defining features (related to file/section size, etc.) should be in the range from 0 to some not very large value.
  • Byte entropy and byte histograms should be consistent. For example, the values in a histogram for a buffer with the size S should give the value S when combined.
  • We can add bits to the bloom filter, but can’t remove them (it is simple to add new strings to the file, but difficult to remove).
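The histogram and entropy restriction can be illustrated as follows. This is a minimal sketch, assuming a histogram is a list of per-value byte counts and that entropy is the Shannon entropy in bits; `rescale_histogram` and `entropy_from_histogram` are hypothetical helper names, not functions from the article.

```python
import math


def rescale_histogram(histogram, size):
    """Turn arbitrary real values into non-negative integer counts summing to size."""
    clipped = [max(0.0, value) for value in histogram]
    total = sum(clipped)
    if total == 0:
        # Nothing usable left after clipping: fall back to a flat histogram.
        clipped = [1.0] * len(histogram)
        total = float(len(histogram))
    scaled = [value * size / total for value in clipped]
    counts = [int(math.floor(value)) for value in scaled]
    # Hand out the rounding remainder to the bins with the largest fractions.
    remainder = size - sum(counts)
    by_fraction = sorted(range(len(scaled)),
                         key=lambda i: scaled[i] - counts[i], reverse=True)
    for i in by_fraction[:remainder]:
        counts[i] += 1
    return counts


def entropy_from_histogram(counts):
    """Shannon entropy (bits per byte) implied by the byte counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
```

Recomputing the entropy from the repaired histogram, rather than taking a gradient step on the entropy feature independently, keeps the two features describing the same hypothetical section.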

To satisfy these restrictions we use an algorithm similar to the one described in “Deceiving end-to-end deep learning malware detectors using adversarial examples” but with some modifications (described below). Specifically, we moved the “fix_restriction” step into the “while” loop and expanded the restrictions.

Here dF(x,y)⁄dx is the gradient of the model output with respect to the features, fix_restrictions projects the features onto the permitted value area described above, and the remaining parameter is the step size.

The adversarial-generating loop contains two steps:

  • We calculate the gradient of the model score with respect to the features and shift the feature vector x in the direction of the gradient for all non-bloom features.
  • We update the feature vector x to meet the existing file restrictions: for example, integer file features are rounded and put into the required interval.

For bloom filter features we just set the one bit corresponding to the largest gradient. Strictly speaking, we should also find the string for this bit and set the other bits corresponding to it. However, in practice, this level of precision is not necessary and has almost no effect on the process of generating adversarial samples. For simplicity, we will skip the addition of other corresponding string bits in further experiments.
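The loop can be sketched on a toy model. The sketch below is not the product network: it assumes a logistic-regression stand-in whose score rises with maliciousness, so the features are moved against the score gradient to lower it, and all weights, limits and names (`score`, `fix_restrictions`, `adversarial_features`) are made up for illustration.

```python
import math


def score(w_cont, w_bits, bias, cont, bits):
    """Toy malware score in (0, 1): a logistic model over both feature groups."""
    z = bias
    z += sum(w * x for w, x in zip(w_cont, cont))
    z += sum(w * b for w, b in zip(w_bits, bits))
    return 1.0 / (1.0 + math.exp(-z))


def fix_restrictions(cont, max_value):
    """Round size-like features and keep them inside [0, max_value]."""
    return [float(min(max_value, max(0, round(x)))) for x in cont]


def adversarial_features(w_cont, w_bits, bias, cont, bits,
                         threshold=0.5, step=20.0, max_steps=50, max_value=100):
    """Lower the score below the threshold while respecting the restrictions."""
    cont, bits = list(cont), list(bits)
    for _ in range(max_steps):
        current = score(w_cont, w_bits, bias, cont, bits)
        if current < threshold:
            break
        # For a logistic model, d(score)/d(x_i) = score * (1 - score) * w_i.
        slope = current * (1.0 - current)
        cont = [x - step * slope * w for x, w in zip(cont, w_cont)]
        cont = fix_restrictions(cont, max_value)  # projection inside the loop
        # Bloom bits can only be added: set the single unset bit whose
        # gradient promises the largest score decrease.
        candidates = [i for i, b in enumerate(bits) if b == 0 and w_bits[i] < 0]
        if candidates:
            bits[min(candidates, key=lambda i: w_bits[i])] = 1
    return cont, bits
```

With a real network the gradient would come from back-propagation rather than a closed-form expression, but the structure of the loop (gradient step, projection, one bloom bit per iteration) stays the same.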

White-box attack

In this section we investigate the effectiveness of the algorithm for the white-box approach. As mentioned above, this scenario assumes the availability of all information about the model structure, as is the case when the detector is deployed on the client side.

By following the algorithm of adversarial PE generation, we managed to confuse our classification model for about 89% of the malicious files.

Removed detection rate. X-axis shows the number of steps in algorithm 1; Y-axis shows the percentage of adversarial malicious files that went undetected by the NN classifier (while their original versions were detected).

Thus, it is easy to change files in order to avoid detection by our model. Now, let us take a closer look at the details of the attack.

To understand the vulnerabilities of our NN we implement the adversarial algorithm for different feature types separately. First, we tried to change string features only (bloom filter). Doing so confuses the NN for 80% of the malware files.

Removed detection rate for string changing only

We also explore which bits of the bloom filter are often set to 1 by the adversarial algorithm.

The histogram of bits, added by the adversarial algorithm to the bloom filter. Y-axis corresponds to the ratio of files that the current bit is added to. A higher rate means that bit is important for decreasing the model score

The histogram shows that some bits of the bloom filter are more important for our classifier, and setting them to 1 often leads to a decrease in the score.

To investigate the nature of such important bits we reversed the popular bits back to the string and obtained a list of strings likely to change the NN score from malware to benign:

Pooled mscoree.dll CWnd MessageBoxA SSLv3_method assembly manifestVersion="1.0" xmlns="urn… SearchPathA AVbad_array_new_length@std Invalid color format in %s file SHGetMalloc Setup is preparing to install [name] on your computer e:TScrollBarStyle{ssRegular,ssFlat,ssHotTrack SetRTL VarFileInfo cEVariantOutOfMemoryError vbaLateIdSt VERSION.dll GetExitCodeProcess mUnRegisterChanges ebcdic-Latin9--euro GetPrivateProfileStringA XPTPSW cEObserverException LoadStringA fFMargins SetBkMode comctl32.dll fPopupMenu1 cTEnumerator<Data.DB.TField cEHierarchy_Request_Err fgets FlushInstructionCache GetProcAddress NativeSystemInfo sysuserinfoorg uninstallexe RT_RCDATA textlabel wwwwz

We also tried to attack the model to force it to misclassify benign files as malware (inverse problem). In this case, we obtained the following list:

mStartTls Toolhelp32ReadProcessMemory mUnRegisterChanges ServiceMain arLowerW fFTimerMode TDWebBrowserEvents2DownloadCompleteEvent CryptStringToBinaryA VS_VERSION_INFO fFUpdateCount VirtualAllocEx Free WSACreateEvent File I/O error %d VirtualProtect cTContainedAction latex VirtualAlloc fFMargins set_CancelButton FreeConsole ntdll.dll mHashStringAsHex mGetMaskBitmap mCheckForGracefulDisconnect fFClientHeight mAddMulticastMembership remove_Tick ShellExecuteA GetCurrentDirectory get_Language fFAutoFocus AttributeUsageAttribute ImageList_SetIconSize URLDownloadToFileA CopyFileA UPX1 Loader

These sets of “good” and “bad” strings look consistent and plausible. For instance, the strings ‘MessageBoxA’ and ‘fPopupMenu1’ are actually often used in benign files. And vice versa, strings like ‘Toolhelp32ReadProcessMemory’, ‘CryptStringToBinaryA’, ‘URLDownloadToFileA’ and ‘ShellExecuteA’ look suspicious.

We also attempted to confuse our model using only binary section statistics.

Removed detection rate for section added, without bloom features. X-axis corresponds to the number of added sections, Y-axis to the percentage of malware files that become “clean” during adversarial attacks

The graph shows that it is possible to remove detection for about 73% of malware files. The best result is achieved by adding 7 sections.

At this point, the question of a “universal section” arises. That is, a section that leads to the incorrect classification and detection removal of many different files when added to them. Taking this naïve approach, we simply calculated mean statistics for all sections received during the adversarial algorithm and created one “mean” section. Unfortunately, adding this section to the malware files removes just 17% of detections.
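Computing such a “mean” section amounts to averaging each statistic across all adversarial sections. A minimal sketch, assuming every section is represented by a feature vector of the same length; `mean_section` is an illustrative name.

```python
def mean_section(sections):
    """Element-wise mean of equally long section feature vectors."""
    if not sections:
        raise ValueError("at least one section is required")
    count = len(sections)
    return [sum(column) / count for column in zip(*sections)]
```

Averaging can blur the specific feature combinations that made each individual section effective, which is one plausible reading of the weak result reported above.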

Byte histogram of “mean” section: for its beginning and ending. X-axis corresponds to the byte value; Y-axis to the number of bytes with this value in the section part

So, the idea of one universal section failed. Therefore, we tried to divide the constructed adversarial sections into compact groups (using the L2 metric).

Adversarial sections dendrogram. Y-axis shows the Euclidean distance between section statistics

Separating the adversarial sections into clusters, we calculated a “mean” section for each of them. However, the detection prevention rate did not increase rapidly. As a result, in practice, only 25-30% of detection cases can be removed by adding such “universal mean sections”.

The dependence of the removed detection share on the number of clusters for “mean” sections computation

The experiments showed that we do not have a “universal” section for making a file look benign for our current version of NN classifier.

Gray-box attack

All previous attacks were made with the assumption that we already have access to the neural network and its weights. In real life, this is not always the case.

In this section we consider a scenario where the ML model is deployed in the cloud (on the security company’s servers), but features are computed and then sent to the cloud from the user’s machine. This is a typical scenario for models in the cybersecurity industry because sending user files to the company side is difficult (due to legal restrictions and traffic limitations), while specifically extracted features are small enough for forwarding. It means that attackers have access to the mechanisms of feature extraction. They can also scan any files using the anti-malware product.

We created a number of new models with different architectures. To be precise, we changed the number of fully connected layers and their sizes in comparison with the original model. We also gathered a large collection of malware and benign files that were not in the original training set. Then we extracted features from the new collection – this can be done by reversing the code of the anti-malware application. Then we labeled the collection in two different ways: by the full anti-malware scan and using just the original model verdicts. To clarify the difference, with the selected threshold the original model detects about 60% of malware files compared to the full anti-malware stack. The new (proxy) models were trained on this dataset. After that, the adversarial attack described in the previous sections was carried out against the proxy models. The resulting adversarial samples built for the proxy model were tested on the original one. Despite the fact that the architectures and training datasets of the original and proxy models were different, it turned out that attacks on the proxy model can produce adversarial samples for the original model. Surprisingly, attacking the proxy model could sometimes lead to better attack results.
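The proxy idea can be shown end to end on a toy problem. The sketch below makes strong simplifying assumptions: the “original” model is a hidden linear rule over two features, the proxy is a perceptron trained only on verdicts queried from it, and the attack ignores the file restrictions discussed earlier. All functions, weights and the grid of samples are hypothetical.

```python
import math


def original_verdict(x):
    """Stand-in for the cloud model: the attacker only sees this 0/1 label."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] - 1.0 > 0 else 0


def train_proxy(samples, epochs=2000):
    """Perceptron trained on labels queried from the original model."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x in samples:
            target = 1 if original_verdict(x) == 1 else -1
            if target * (w[0] * x[0] + w[1] * x[1] + b) <= 0:
                w[0] += target * x[0]
                w[1] += target * x[1]
                b += target
                mistakes += 1
        if mistakes == 0:  # every queried label is reproduced
            break
    return w, b


def proxy_verdict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def attack_proxy(w, b, x, step=0.25, max_steps=200):
    """Walk against the proxy gradient until the proxy says 'benign'."""
    norm = math.hypot(w[0], w[1])
    x = list(x)
    for _ in range(max_steps):
        if proxy_verdict(w, b, x) == 0:
            break
        x = [x[0] - step * w[0] / norm, x[1] - step * w[1] / norm]
    return x


def transfer_rate(w, b, samples):
    """Share of detected samples whose proxy-built variant fools the original."""
    detected = [x for x in samples if original_verdict(x) == 1]
    fooled = sum(original_verdict(attack_proxy(w, b, x)) == 0 for x in detected)
    return fooled / len(detected)


if __name__ == "__main__":
    grid = [(float(i), float(j)) for i in range(6) for j in range(6)]
    weights, bias = train_proxy(grid)
    print(transfer_rate(weights, bias, grid))
```

Samples that stop just past the proxy boundary may still sit on the detected side of the original boundary, which mirrors the observation that the gray-box attack needs more gradient steps than the white-box one.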

Gray-box attack results compared to white-box attack. Y-axis corresponds to the percentage of malware files with removed detections of the original model. The effectiveness of the gray-box attack in this case is better than that of the white-box attack.

The experiment shows that a gray-box attack can achieve similar results to the white-box approach. The only difference is that more gradient steps are needed.
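The proxy-model procedure can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "original" model is a small logistic scorer that the attacker can only query, the features are random vectors, and the proxy is another logistic model trained on the original model's verdicts. The real experiment used neural networks with different architectures and real file features.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical "original" model: the attacker can query it but not read w_orig.
w_orig = rng.normal(size=8)

def original_score(x):
    return sigmoid(x @ w_orig)

# Attacker's dataset: locally extracted features, labeled by the original model's verdicts.
X = rng.normal(size=(2000, 8))
y = (original_score(X) > 0.5).astype(float)

# Train the proxy model with plain gradient descent on cross-entropy.
w_proxy = np.zeros(8)
for _ in range(500):
    w_proxy -= 0.5 * X.T @ (sigmoid(X @ w_proxy) - y) / len(X)

def attack_proxy(x, steps=50, lr=0.2):
    """Gradient descent on the proxy's loss towards the 'benign' label."""
    x = x.copy()
    for _ in range(steps):
        p = sigmoid(x @ w_proxy)
        x -= lr * p * w_proxy  # gradient of -log(1 - p) with respect to x
    return x

malware = X[y == 1][:100]
adversarial = np.array([attack_proxy(x) for x in malware])

# Share of samples whose detection by the ORIGINAL model was removed.
evasion_rate = np.mean(original_score(adversarial) < 0.5)
print(f"evasion rate against the original model: {evasion_rate:.2f}")
```

The attack only ever uses gradients of the proxy; the original model is consulted for labels and for the final check, which mirrors the gray-box setting described above.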

Attack transferability

We don’t have access to the machine learning models of other security companies, but we do have reports[vi] of gray-box and white-box adversarial attacks being successful against publicly available models. There are also research papers[vii] about the transferability of adversarial attacks in other domains. Therefore, we presume that the production ML detectors of other companies are also vulnerable to the described attack. Note that neural networks are not the only vulnerable type of machine learning model. For example, another popular machine learning algorithm, gradient boosting, is also reported[viii] to have been subjected to effective adversarial attacks.

Adversarial attack protection

As part of our study, we examined several proposed algorithms for protecting models from adversarial attacks. In this section, we report some results on how well they protected our model.

The first approach was described in “Distillation as a defense to adversarial perturbations against deep neural networks”. The authors propose training a new “distilled” model on the scores of the first model. They show that for some tasks and datasets this method reduces the effectiveness of gradient-based adversarial attacks. Unfortunately, the idea does not guarantee successful model protection: in our case, the model trained according to the proposed approach was still easily confused by the adversarial algorithm.
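As a rough illustration of the distillation idea, the sketch below trains a "teacher" on hard labels and then a "student" on the teacher's softened scores. It is a simplified variant under stated assumptions: both models are logistic regressions on random data, and the temperature T and all other names are hypothetical. The paper itself works with deep networks and softmax outputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, targets, steps=300, lr=0.5):
    """Gradient descent on cross-entropy; targets may be hard labels or soft scores in [0, 1]."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - targets) / len(X)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))
w_true = rng.normal(size=6)
y_hard = (X @ w_true > 0).astype(float)

T = 5.0  # distillation temperature (hypothetical value)

# Step 1: train the first model on hard labels.
w_teacher = fit_logistic(X, y_hard)

# Step 2: soften its scores with the temperature and train the distilled model on them.
soft_scores = sigmoid((X @ w_teacher) / T)
w_student = fit_logistic(X, soft_scores)

# The distilled model should reproduce the first model's verdicts on most samples.
agreement = np.mean((X @ w_student > 0) == (X @ w_teacher > 0))
print(f"student/teacher agreement: {agreement:.2f}")
```

Training on softened scores tends to produce smoother decision functions, which is what the defense relies on. As noted above, this did not stop the attack in our case.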

We also tried adding noise to the training data:

  • For continuous features, we calculated the mean and standard deviation values. During model training, we added a random number of sections to each file, with the parameters of the generated sections drawn from a Gaussian distribution with that mean and standard deviation.
  • For the bloom filter structure, we also added a 5% chance of setting each bit to 1.

The idea behind this method is to try to expand the set of potential file sections, making the network more stable and resistant to attacks that add sections to the end of a file. However, this method was not effective either.
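The two noise steps can be sketched as a training-time augmentation function. The array layout (one row of continuous features per section, a flat 0/1 array for the bloom filter) and the names augment, max_extra and flip_p are assumptions made for this example; only the Gaussian sampling and the 5% bit-setting probability come from the description above.

```python
import numpy as np

def augment(section_feats, bloom_bits, mean, std, rng, max_extra=3, flip_p=0.05):
    """Append randomly generated sections and randomly set bloom-filter bits.

    section_feats: (n_sections, n_feats) continuous section features of one file
    bloom_bits:    1-D array of 0/1 values
    mean, std:     per-feature statistics computed on the training set
    """
    n_extra = rng.integers(0, max_extra + 1)  # random number of extra sections
    noise = rng.standard_normal((n_extra, section_feats.shape[1]))
    extra = mean + std * noise                # Gaussian section parameters
    noisy_sections = np.vstack([section_feats, extra])

    set_mask = rng.random(bloom_bits.shape) < flip_p  # 5% chance per bit
    noisy_bits = np.where(set_mask, 1, bloom_bits).astype(np.uint8)
    return noisy_sections, noisy_bits

rng = np.random.default_rng(0)
train_sections = rng.normal(loc=2.0, scale=0.5, size=(500, 4))  # stand-in training data
mean, std = train_sections.mean(axis=0), train_sections.std(axis=0)

file_sections = train_sections[:3]
file_bits = np.array([1, 0, 0, 1, 0, 0, 0, 1], dtype=np.uint8)
noisy_sections, noisy_bits = augment(file_sections, file_bits, mean, std, rng)
print(noisy_sections.shape, noisy_bits)
```

Bits are only ever switched on, never off, which matches an attacker who can add content to a file but not remove it.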

We also tried the protection approach described in “Detecting adversarial samples from artifacts”. In many cases, neural networks contain a dropout mechanism: a layer that randomly replaces features from the previous layer with zeros. The idea of the paper is to keep dropout enabled at the inference stage (it is usually disabled there because it introduces randomness and noise into predictions for the same object) and to calculate the score multiple times for each file. With dropout enabled, each run produces a slightly different score, so we get a score distribution. According to the authors, the standard deviation of this distribution is higher for adversarial samples than for ordinary ones. We tried this approach, computing the standard deviations of scores for adversarial malware and original malware files. Remarkably, the standard deviations of the adversarial scores turned out to be significantly higher than those of the initial file scores.

Standard deviation distribution for adversarial and initial sample model scores

Using this difference, we can set a threshold on the standard deviation that, for example, detects 50% of adversarial samples while misclassifying around 10% of initial samples as adversarial.

X-axis shows the share of original samples detected as adversarial; Y-axis shows the share of detected adversarial samples
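The detection mechanism can be sketched as follows: score the same feature vector many times with dropout left on, then threshold the standard deviation of the scores. The tiny one-hidden-layer network, the random weights and the names mc_dropout_scores and looks_adversarial are assumptions for this example; in practice the threshold would be chosen from a trade-off curve like the one above.

```python
import numpy as np

def mc_dropout_scores(x, W1, w2, rng, n_runs=100, drop_p=0.5):
    """Score one feature vector n_runs times with dropout enabled at inference."""
    scores = np.empty(n_runs)
    for i in range(n_runs):
        h = np.maximum(x @ W1, 0.0)           # hidden ReLU layer
        keep = rng.random(h.shape) >= drop_p  # randomly zero out hidden units
        if drop_p > 0:
            h = h * keep / (1.0 - drop_p)     # usual inverted-dropout scaling
        scores[i] = 1.0 / (1.0 + np.exp(-(h @ w2)))
    return scores

def looks_adversarial(x, W1, w2, rng, std_threshold):
    """Flag a sample when its score distribution is unusually wide."""
    return mc_dropout_scores(x, W1, w2, rng).std() > std_threshold

rng = np.random.default_rng(0)
W1 = 0.3 * rng.normal(size=(16, 32))  # stand-in weights, not a trained model
w2 = 0.3 * rng.normal(size=32)
x = rng.normal(size=16)

scores = mc_dropout_scores(x, W1, w2, rng)
print(f"mean score {scores.mean():.3f}, standard deviation {scores.std():.3f}")
```

With random weights the sketch only demonstrates how the score distribution is obtained; whether adversarial samples really show a larger spread depends on the trained model and the attack.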

Another approach proposes the use of monotonic networks (see “Monotonic Networks” and “Monotonic models for real-time dynamic malware detection”). The principle behind this method is to create a neural network with positive layer weights and monotonic activation functions. Such models are, by design, resistant to the addition of new sections and strings: any addition can only increase the model detection score, making the attack described in this article impracticable.
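A minimal sketch of the monotonicity property, assuming binary "feature present" inputs and a one-hidden-layer network. Taking the absolute value of the raw parameters is just one simple way to keep weights non-negative and is an assumption of this example, as are all names and sizes.

```python
import numpy as np

def monotonic_score(x, W1_raw, w2_raw, b1, b2):
    """Network whose score never decreases when any input feature increases."""
    W1 = np.abs(W1_raw)       # non-negative weights in every layer
    w2 = np.abs(w2_raw)
    h = np.tanh(x @ W1 + b1)  # tanh is monotonically increasing
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))  # so is the sigmoid

rng = np.random.default_rng(0)
n_features = 20
W1_raw = rng.normal(size=(n_features, 8))
w2_raw = rng.normal(size=8)
b1 = rng.normal(size=8)  # biases are unconstrained
b2 = -1.0

x = np.zeros(n_features)
x[:5] = 1.0              # features present in the original file
x_more = x.copy()
x_more[10:13] = 1.0      # attacker adds new sections or strings

score_before = monotonic_score(x, W1_raw, w2_raw, b1, b2)
score_after = monotonic_score(x_more, W1_raw, w2_raw, b1, b2)
print(f"before: {score_before:.4f}, after adding features: {score_after:.4f}")
```

Because every weight is non-negative and every activation is increasing, switching on extra features can only push the score up, so an append-only attack cannot lower it.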

Adversarial attack difficulties in the real world

Currently, there is no approach in the field of machine learning that can protect against all the various adversarial attacks, meaning methods that rely heavily on ML predictions are vulnerable. Kaspersky’s anti-malware solution provides a complex multi-layered approach. It contains not only machine learning techniques but a number of different components and technologies to detect malicious files. First, detection relies on different types of features: static, dynamic or even cloud statistics. Complex detection rules and diverse machine learning models are also used to improve the quality of our products. Finally, complex and ambiguous cases go to the virus analysts for further investigation. Thus, confusion in the machine learning model will not, by itself, lead to misclassification of malware for our products. Nevertheless, we continue to conduct research to protect our ML models from existing and prospective attacks and vulnerabilities.

[i] Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial examples.” arXiv preprint arXiv:1412.6572 (2014).

[ii] Brown, Tom B., et al. “Adversarial patch.” arXiv preprint arXiv:1712.09665 (2017).

[iii] Demetrio, Luca, et al. “Functionality-preserving black-box optimization of adversarial windows malware.” IEEE Transactions on Information Forensics and Security (2021).

[iv] Sharif, Mahmood, et al. “Optimization-guided binary diversification to mislead neural networks for malware detection.” arXiv preprint arXiv:1912.09064 (2019).

[v] Kolosnjaji, Bojan, et al. “Adversarial malware binaries: Evading deep learning for malware detection in executables.” 2018 26th European signal processing conference (EUSIPCO). IEEE, 2018;

Kreuk, Felix, et al. “Deceiving end-to-end deep learning malware detectors using adversarial examples.” arXiv preprint arXiv:1802.04528 (2018).

[vi] Park, Daniel, and Bülent Yener. “A survey on practical adversarial examples for malware classifiers.” arXiv preprint arXiv:2011.05973 (2020).

[vii] Liu, Yanpei, et al. “Delving into transferable adversarial examples and black-box attacks.” arXiv preprint arXiv:1611.02770 (2016).

Tramèr, Florian, et al. “The space of transferable adversarial examples.” arXiv preprint arXiv:1704.03453 (2017).

[viii] Chen, Hongge, et al. “Robust decision trees against adversarial examples.” International Conference on Machine Learning. PMLR, 2019.

Zhang, Chong, Huan Zhang, and Cho-Jui Hsieh. “An efficient adversarial attack for tree ensembles.” arXiv preprint arXiv:2010.11598 (2020).
