The Great AI Swindle

Renaud Bidou 02/17/2024

Making predictions

January is the month of coming year predictions and expected trends in cybersecurity. But how can one make such predictions? Theoretically, there are two approaches to predict the future.

The first one is to find a genius (with a crystal ball preferably) who will really try to guess what the emerging threats will be next year. The success rate is usually similar to that of a broken clock, that gives the right time twice a day. It is then no surprise to see the industry moving to a more reliable technique.

Whoever spends enough time every day to read blog and articles about attack techniques, incident response technical reports, malware analysis, exploit details, red team operations and other technically relevant topics can easily see trends. When December comes, making predictions is just a question of considering the topics with the most noticeable "growth" and label them with terms such as "rise of", "emergence of", etc. Add the commonly high recurrence topics (usually ransomware, APT, IoT) and label them with "evolution of", "increased use of". Spour with a little bit of buzz words matching the services and products you sell, and you have your "CyberSecurity 2024 Predictions" report.

This year is no exception, with a special and confusing guest: "Artificial Intelligence".

Debunking the AI Apocalypse

No Future

The public release of ChatGPT in the end of last year opened a new playground for security professionals. An (almost) green field to test, break and exploit. In the meantime, the apparent "magic" of the Large Language Models (LLM) and its trivial accessibility to the masses, provided exceptional exposure to the topic, and consequently to any security-related finding.

In this context, the lack of understanding of the technology - empowered by usual prejudices versus disruptive-looking innovation - lead to a common fear-of-the-unknown phenomenon, leveraged by all the actors in search for 15 minutes of fame. We've then been flooded with lapidary expressions such as "AI helps creating undetectable malware", "AI finds 0-days", "AI can generate deep fakes that can fool anyone", and felt desperately vulnerable to those out-of-control, malevolent intelligent entities. Skynet 1 is real and IT infrastructures are doomed.

AI is not Generative AI

It could be worth stepping back a little bit and having a close look at what we are really dealing with. Subtly dodging the debate about the real meaning of AI (if any), we can simply consider that we call AI the capability of a computer to perform tasks a human would, once properly trained. Making an addition, sorting elements of a list and classifying data is AI as well as summarizing a text or generating a vocal speech with a custom voice.

Obviously, we do not refer to AI in the case of a bubble sort algorithm (although we could), and usually focus on machine learning-based operations: classification, regression, reinforced learning, deep learning and natural language processing. Text Generative AI (ChatGPT-like engines) belong to a sub-category (actually sub-sub-category) of this last one.

Therefore, considering Generative AI as the only potential risk is definitely wrong. Machine learning is now part of almost every application (to different levels of course) and Generative AI only represents a small proportion of the potential attack surface.

AI does not innovate (at least in CyberSecurity)

That said, one should always keep in mind that machine learning purpose is to make predictions based on dataset previously ingested. Predictions can be a single value (such as a temperature or a volume of data), a category (car vs. bike, benign vs. malicious file), the next probable word in a sentence etc.

Therefore, there is no way an "AI" can invent a new evasion technique. It will always rely on existing ones. If your security is properly set up to detect and mitigate existing evasion attempts, there is no additional risk. There may be more attempts as asking ChatGPT is easier that making a Google search and reading an article, but in the end the result should be the same. So no, AI is not going to alter the already-existing risk of evasive malware or exploit.

Similarly, AI is accused of reducing threats development time. We have 2 options here. In the first case, we think of creating a new malware/C2/exploit/etc. Same as evasion techniques those "new" threats will just rely on already published PoC or malicious code and will not bring any real innovation. So yes, maybe someone will come with a meterperter generated from scratch in a matter of seconds. So what? As (hopefully) you do not entirely rely on deterministic engines (such as signatures) to protect you infrastructure, other mechanisms such as behavioral analysis, sandboxing or ML engines will catch the malware just like any other variants, AI generated or not.

The second case deals with 0-days. Can AI help finding vulnerabilities? Yes. Several modules have been developed for the most popular reverse engineering tools to speed up the research of vulnerabilities in applications code. Same for the fuzzing engines 7, empowered with machine learning to make operations faster and more reliable. In this field, AI is a game changer, on both sides. Indeed, code testing is available to both attackers and authors. That means that more vulnerabilities will (should...) be found before the release of a product. No more 15 years old bugs in the Linux kernel 3... From this standpoint, I am not sure AI is to be considered as a threat to cybersecurity.

We also read that AI helps weaponizing 0-days faster. Actually, once 0-days are published they are most of the time provided with a PoC. Meaning that they are already weaponized anyway...

AI is not (yet) smart enough

Deep fake is another story. We probably all heard about this recent story of 25M$ theft through fake video meetings 2. Many could say "I said it, it was in my predictions". And that's true. But it would be very valuable to go deeper into this story. To run this attack, offenders organized 20 Teams meetings (number differs a few but you get the idea) involving the victim and AI generated participants, with no interaction between the victim and the other participants. Those meetings led to 15 payments based on verbal instructions with no written trace, for the total amount provided above. Doesn't this ring a bell somewhere?

We had a similar story in France a few months ago. The secretary of a political party was abused by sellers who took advantage of her temporary psychological weakness and her will to help noble cause. Skipping the details, she eventually bought for 250k€ (200k$) of over overpriced office supplies, such as 500 paper sheets for 2.500€ (~2.000$). No AI involved, not even a scam, just a dishonest selling technique that turned into harassment in the end.

Except for the amount of the fraud, the second technique is much more dangerous than the first one... and prevention remains the same: education and processes.

Embracing the reality of AI

Old risks, new techniques

With a little bit of research on the topic 5, one may have noticed that there are real threats related to machine learning-based applications. We mainly talk about adversarial attacks, that aim at deceiving the engine to output something undesired. This is a common process manipulation, that can lead to malicious code injection on the client, data leak, user misleading, application internals access or operation disruption.

Most of those risks are not new. But the techniques are, and the real danger lies in the fact that these techniques are constantly evolving.

Prompt injections

The most popular type of adversarial attack is prompt injection, simply because almost anyone can successfully try it on LLMs. They are largely documented and dozens of different techniques, from the simple context manipulation to indirect requests stealthy embedded in pictures, are available to have the engine answer (or do) almost anything. No doubt this is an issue. Let's see what the potential security impacts are.

The first one is the capability to have the engine reveal internal information - such as its own code, administrator credentials, components nature and versions - that could be leveraged by an intruder to compromise the application or the whole system. This is a typical jailbreaking attack. In its most severe form, it would allow remote command injections on the hosting system, with the LLM as an easy-to-use UI. This is definitely a risk and has to be secured, otherwise your new customer support bot will turn into a splendid trojan horse with a human language interface.

The second is the ability to access content submitted by other users, potentially giving access to compromising information to anyone (usually with malicious intents). It could be a piece of code, a configuration files, some security logs, a user list, or anything that has been submitted to the engine as content to be processed for any reason. Any highly sensitive piece of information should not be used in public facing LLMs. This represents a high risk of having them exposed to unintended actors and leveraged to prepare an intrusion. Similarly, GPT uploaded data are shared across all GPT engines 6. Any vulnerability in one of the hosting application will expose data submitted to any other... Clearly, don't upload anything sensitive.

Another potential risk lies in LLMs plugins. Some are designed to apply specific actions based on the answer of the application. A prompt injection attack could lead to have a specific plugin execute malicious code, or upload user data. In the end it is nothing more than a second order injection attack or even a usual Cross-Site Scripting, but again the vector is new and not necessarily getting the attention it should from a security standpoint.

AI is not perfect, not even close

Another risk is the belief that AI engines always provide the best possible response. Aside of the model manipulation attacks we will discuss later, we consider a more human issue. Indeed, asking the engine to provide the right configuration for an EDR, to generate a code snippets for a web application, or to triage some security logs may be a good idea. As long as you know what you are doing, and are able to perform an educated review of the provided answer.

There is no guarantee that a security configuration designed by an AI is the most appropriate in the context of a specific security infrastructure, as well as you should not trust the code provided for your web application or blindly rely on its analysis of your security logs. AI saves time and helps making decisions, but it should only be used as a starting point, never as a definitive solution. This has been known for quite sometime (since the design of first machine learning engines actually) and summarized in one of the golden rules: "all models are wrong, but some are useful". That's the reason why you will (should) never find a 100% AI/ML-based solution.

Moreover, it is possible to have the engine "hallucinating", that is providing inaccurate (or completely meaningless) answers. Indirect prompt injections and token manipulation are common techniques against LLM, but other models are also vulnerable to malicious inputs, such as those created through Generative Adversarial Networks (GAN) as an example. Once a user (or an automated system) is tricked to use such type of manipulated input, he can be easily abused if he blindly takes the response as valid. In this scenario, a human can be led to deploy vulnerable code or apply erroneous recommendation, while a SOAR platform would miss event proper qualification or apply irrelevant reaction (or no reaction at all).

Attacking the model

The core of a machine-learning engine is the model. Trained with relevant dataset, the model will make the predictions upon further submitted inputs. Prediction can be a value, a category, a text, an image, etc. The model then clearly appears as the cornerstone of the whole system. Aside of the "hallucination" phenomenon we discussed above, we can distinguish 2 types of attacks: model corruption and model algorithm identification.

Model corruption occurs when a learning loop is setup. That is the model continuously learn either by regularly enriching its sources (in Natural Language Processing, including LLM) or via human revision of inaccurate predictions. If an analyst considers a real security incident as false-positive and the model is fed with this new invalid piece of information, the model will lower the risk of similar subsequent events. Even more subtle, an exceptional but innocuous event frequently repeated can provoke overfitting of the model (modifying its algorithm to provide valid prediction for this exceptional event). And overfitting is bad, very bad. In short, exceptions should always generate wrong predictions. If the model is twisted in such a way that exceptions are considered as standard data, it's total accuracy will be dramatically lowered, impacting the quality subsequent predictions.

Identifying the algorithm enables an attacker to generate crafted input data to get a specific output. It is much more subtle than injection, as the input is entirely legitimate. Although the internal prediction process is rarely known (and even more rarely understood), key features involved in prediction may be found via techniques similar to fuzzing or via GANs. Once those variables and relevant values are identified, it becomes possible to evade machine learning-based security engines by crafting a malicious input that would match the main features values of innocuous content. Exposure to this type of attack highly depends on the quality of the model. Models built without proper feature scaling, or created with collinear or dummy features are particularly vulnerable to this type of attack, as some features may get prominent weight in the prediction.

The supply chain

The previous attacks are external ones. The attacker doesn't have access to the model itself. Obviously, if the attacker was to get his hands on the model or (even worst) the code used for its generation, this could be considered as a "Game Over", unless integrity check is regularly performed on the model file (should be a no-brainer considering the criticality of this component). Nothing new here, just ensuring the confidentiality and integrity of a critical process component.

Here is the trick, although AI supply chains are subject to the same attacks as any other, the risk is higher as multiple new (for the cybersecurity standpoint) components are involved, from the libraries to the model format or the model file itself. Several quite critical vulnerabilities have already been found, and it is an easy guess that more will be discovered and exploited in the coming weeks and months.

Conclusion & 2024 Predictions

We usually say that AI is new. Actually, Warren McCulloch and Walter Pitts designed the perceptron - a single layer neural network 4 - in 1943, later implemented (in 1958) by Franck Rosenbalt. So AI is not that new... But it is now accessible to anyone, revealing its capability to dramatically augment (and therefore change) the way things work. AI makes cyberdefense mechanisms more efficient, cyberattacks as well. Security engineers have access to enhanced research and analysis capabilities, so have hackers. Actually, even the paradigm is not new.

Still, AI remains quite mythical (if not mystical) to many people, and it is easy to present the low-hanging fruit (eg. Generative AI) as the main risk to cybersecurity. This is definitely a fraud. Most applications (including security ones) embed machine learning models, LLM-powered assistants and other AI components. Unfortunately there is a skill gap. And that leads to this never-ending story of old security issues rising again each time a technology gets widely adopted.

That's why, in 2024, we should see a rise in exploitation of poorly designed AI engines, a continuous growth of errors due to misused AI and an emergence of attacks against supply chains of AI-based applications.

[1] Skynet - Wikipedia [2] Finance worker pays out $25 million after video call with deepfake 'chief financial officer' - Article on CNN [3] New Old Bugs in the Linux Kernel - Blog Article [4] Perceptron - Wikipedia [5] ParaCyberBellum Library - Machine Learning & AI [6] ChatGPT: Lack of Isolation between Code Interpreter sessions of GPTs - Blog Article [7] KernelGPT: Enhanced Kernel Fuzzing via Large Language Models - Research Paper

← Back to Blog