How real and present is the malware threat from AI?

Over the last few months, we have seen a number of proofs of concept (PoCs) demonstrating ways ChatGPT and other generative AI platforms can be used to perform many of the tasks involved in a typical attack chain. And since November 2022, white hat researchers and hacking forum users have discussed using ChatGPT to produce Python-based infostealers, encryption tools, cryptoclippers, cryptocurrency drainers, crypters and malicious VBA code, among many other use cases.

In response, OpenAI has tried to prevent terms-of-use violations. But because the functions of malicious software are often indistinguishable from those of legitimate software, its safeguards rely on inferring intent from the prompts submitted. Many users have adapted and developed approaches for bypassing this. The most common is “prompt engineering”, the trial-and-error process by which both legitimate and malicious users tailor the language of their prompts to achieve a desired response.

For example, instead of using a blatantly malicious prompt such as “generate malware to circumvent vendor X’s EDR platform”, an attacker submits several seemingly innocent prompts and then combines the resulting code snippets into custom malware. This was recently demonstrated by security researcher codeblue29, who successfully leveraged ChatGPT to identify a vulnerability in an EDR vendor’s software and produce malware code – reportedly ChatGPT’s first bug bounty.

Similar success has been achieved via brute force-oriented strategies. In January 2023, researchers from CyberArk published a report demonstrating how ChatGPT’s content filters can be bypassed by “insisting and demanding” that ChatGPT carry out requested tasks.

Others have found ways of exploiting differences in the content policy enforcement mechanisms across OpenAI products.

Cyber criminal forum users were recently observed advertising access to a Telegram bot they claim leverages direct access to OpenAI’s GPT-3.5 API as a means of circumventing the more stringent restrictions placed on users of ChatGPT.

Several posts made on the Russian hacking forums XSS and Nulled promote the tool’s ability to submit prompts to the GPT-3.5 API directly via Telegram. According to the posts, this method allows users to generate malware code, phishing emails and other malicious outputs without needing to engage in complex or time-consuming prompt engineering.
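For context, “direct API access” here simply means sending authenticated HTTP requests to OpenAI’s chat completions endpoint rather than typing into the ChatGPT web interface. The minimal sketch below (Python with the requests library, a placeholder API key and a deliberately benign prompt) illustrates the mechanism only; it is not a reconstruction of the bot described in the posts.

```python
import os
import requests

# Illustrative only: a direct request to OpenAI's chat completions endpoint,
# which is what "direct API access" means in practice. The prompt is benign
# and OPENAI_API_KEY is a placeholder environment variable.
API_KEY = os.environ["OPENAI_API_KEY"]

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Summarise what an API endpoint is."}],
    },
    timeout=30,
)
print(response.json()["choices"][0]["message"]["content"])
```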

Arguably the most concerning examples of large language model (LLM)-enabled malware are those produced via a combination of the above tactics. For example, a PoC published in March 2023 by HYAS demonstrated the capabilities of an LLM-enabled keylogger, BlackMamba, which includes the ability to circumvent standard Endpoint Detection and Response (EDR) tools.

Yet despite its impressive abilities, ChatGPT still has accuracy issues. Part of this is due to the way generative pre-trained transformers (GPTs) function. They are prediction engines and are not specifically trained to detect factual errors, so they simply produce the most statistically probable response based on available training data.
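To make that concrete, the toy sketch below (with invented tokens and probabilities, not taken from any real model) shows what “producing the most statistically probable response” looks like: the decoder samples from a probability distribution over candidate tokens, and at no point does it check whether the chosen continuation is true.

```python
import random

# Toy illustration of next-token prediction: the model assigns probabilities
# to candidate continuations and the decoder samples from them. Nothing in
# this process checks whether the chosen continuation is factually correct.
# (Tokens and probabilities below are invented for illustration.)
next_token_probs = {
    "Paris": 0.62,
    "Lyon": 0.21,
    "Narnia": 0.17,  # fluent-sounding but false options still carry probability mass
}

def sample_next_token(probs):
    tokens = list(probs.keys())
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

prompt = "The capital of France is"
print(prompt, sample_next_token(next_token_probs))
```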

This can lead to answers that are patently untrue – often referred to as “hallucinations” or “stochastic parroting” – a key barrier to deploying GPT-enabled services in unsupervised settings. Similar concerns apply to the quality of code produced by ChatGPT – so much so that the developer Q&A site Stack Overflow banned ChatGPT-generated answers almost immediately after the chatbot’s initial release.

Current-generation GPT models don’t effectively and independently validate the code they generate, regardless of whether prompts are submitted through the ChatGPT GUI or directly via API call. This is a problem for would-be polymorphic malware developers, who would need to be skilled enough to validate every possible code mutation themselves in order to produce exploit code that actually executes.
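To illustrate what even the most minimal external validation looks like, the hedged sketch below (a hypothetical helper, not part of any OpenAI tooling) does nothing more than check whether generated text parses as Python; everything beyond that, from logic to runtime behaviour, still falls to a skilled human reviewer.

```python
import ast

def passes_basic_syntax_check(generated_code: str) -> bool:
    """Return True if the text at least parses as Python.

    This is the weakest possible validation gate: it catches none of the
    logical or runtime errors discussed above, and exists only to show that
    any checking has to happen outside the model, which does not execute or
    verify what it emits.
    """
    try:
        ast.parse(generated_code)
        return True
    except SyntaxError:
        return False

# A plausible-looking but syntactically broken snippet of the kind an LLM might return
print(passes_basic_syntax_check("def add(a, b):\n    return a +"))    # False
print(passes_basic_syntax_check("def add(a, b):\n    return a + b"))  # True
```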

This makes the barriers to entry for lower-skilled threat actors prohibitively high. As Trend Micro’s Bharat Mistry argues, “Though ChatGPT is easy to use on a basic level, manipulating it so that it was able to generate powerful malware may require technical skill beyond a lot of hackers.”

The NCSC also assesses that even actors with significant ability are likely to find it more efficient to develop malicious code from scratch than to use generative AI.

Further iterations of GPT models have already begun expanding the capabilities of commercially available LLM-enabled products, and continued development may lower the technical threshold required for motivated threat actors to conduct adversarial operations above their natural skill level.

For now, however, while current-generation LLMs present both considerable promise and considerable risk, their broader security impact is muted by limitations in the underlying technology. The pace of innovation is rapid, though, and future advances will expand what the average generative AI user can do, increasing the potential for further misuse.
