As more organizations embrace Artificial Intelligence (AI) and Machine Learning (ML) to optimize their operations and gain a competitive advantage, attention is growing on how best to keep this powerful technology secure. At the center of this is the data used to train ML models, which has a fundamental impact on how they behave and perform over time. Organizations therefore need to pay close attention to what goes into their models and stay constantly vigilant for signs of anything untoward, such as data corruption.
Unfortunately, as the popularity of ML models has risen, so too has the risk of malicious backdoor attacks, in which criminals use data poisoning techniques to feed ML models compromised data, causing them to behave in unforeseen or harmful ways when triggered by specific inputs. While such attacks can take a long time to execute (often requiring large amounts of poisoned data over many months), they can be incredibly damaging when successful. For this reason, organizations need to protect against them, particularly at the foundational stage of any new ML model.
A good example of this threat landscape is the Sleepy Pickle technique. As the Trail of Bits blog explains, it takes advantage of the pervasive and notoriously insecure Pickle file format used to package and distribute ML models. Whereas previous exploit techniques targeted an organization's systems at the point an ML model is deployed, Sleepy Pickle surreptitiously compromises the ML model itself. Over time, this allows attackers to target the organization's end users of the model, which can cause major security issues if successful.
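The underlying weakness here is that Python's Pickle format can embed instructions to run arbitrary code at load time. The sketch below is not the Sleepy Pickle technique itself, just a minimal illustration of the mechanism it builds on: the hypothetical `MaliciousModel` class uses `__reduce__` to make `pickle.loads` call a function of the attacker's choosing, here a harmless `os.path.join` standing in for something destructive.

```python
import os
import pickle

class MaliciousModel:
    """Illustrates why Pickle is unsafe for distributing ML models:
    __reduce__ tells pickle to call an arbitrary function with arbitrary
    arguments during deserialization."""
    def __reduce__(self):
        # Harmless stand-in; a real attacker could return
        # (os.system, ("<malicious command>",)) instead.
        return (os.path.join, ("attacker", "payload"))

blob = pickle.dumps(MaliciousModel())  # what a poisoned model file contains
result = pickle.loads(blob)           # attacker code runs at load time
print(result)                         # we got back a string, not a model
```

Note that the victim never has to call anything beyond `pickle.loads`, which is exactly what common model-loading paths do under the hood. This is why safer serialization formats, or at minimum loading models only from trusted, integrity-checked sources, matter so much.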
Senior Solutions Architect at HackerOne.
The emergence of MLSecOps
To combat threats like these, a growing number of organizations have started to implement MLSecOps as part of their development cycles.
At its core, MLSecOps integrates security practices and considerations into the ML development and deployment process. This includes ensuring the privacy and security of data used to train and test models and protecting models already deployed from malicious attacks, along with the infrastructure they run on.
Some examples of MLSecOps activities include conducting threat modelling, implementing secure coding practices, performing security audits, establishing incident response processes for ML systems and models, and ensuring transparency and explainability to prevent unintended bias in decision-making.
The core pillars of MLSecOps
What differentiates MLSecOps from other disciplines like DevOps is that it’s exclusively concerned with security issues within ML systems. With this in mind, there are five core pillars of MLSecOps, popularized by the MLSecOps community, which together form an effective risk framework:
Supply chain vulnerability
ML supply chain vulnerability can be defined as the potential for security breaches or attacks on the systems and components that make up the supply chain for ML technology. This can include issues with things like software/hardware components, communications networks, data storage and management. Unfortunately, all these vulnerabilities can be exploited by cybercriminals to access valuable information, steal sensitive data, and disrupt business operations. To mitigate these risks, organizations must implement robust security measures, which include continuously monitoring and updating their systems to stay ahead of emerging threats.
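One concrete, low-cost mitigation the paragraph above points toward is integrity-checking artifacts (datasets, model weights, dependencies) against digests pinned at build time, so a tampered file in the supply chain is caught before deployment. The helper below is an illustrative sketch using only Python's standard library; the function name `verify_artifact` is our own, not from any particular tool.

```python
import hashlib

def verify_artifact(path, expected_sha256, chunk_size=8192):
    """Compare a file's SHA-256 digest against a digest pinned at build time.

    Returns True only if the file on disk matches the expected digest,
    i.e. it has not been swapped or tampered with in transit.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large model files don't need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

In practice the pinned digests would live in version control alongside the deployment configuration, and a mismatch would fail the pipeline rather than silently proceed.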
Governance, risk and compliance
Maintaining compliance with a wide range of laws and regulations like the General Data Protection Regulation (GDPR) has become an essential part of modern business, preventing far-reaching legal and financial consequences as well as potential reputational damage. However, with the popularity of AI growing at an exponential rate, the increasing reliance on ML models is making it ever harder for businesses to keep track of their data and ensure compliance is maintained.
MLSecOps can quickly identify altered code and components and situations where the underlying integrity and compliance of an AI framework may come into question. This helps organizations ensure compliance requirements are met, and the integrity of sensitive data is maintained.
Model provenance
Model provenance means tracking the handling of data and ML models in the pipeline. Record keeping should be secure, integrity-protected, and traceable. Access and version control of data, ML models, and pipeline parameters, logging, and monitoring are all crucial controls that MLSecOps can effectively assist with.
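The "secure, integrity-protected, and traceable" record keeping described above can be sketched as an append-only log in which each entry commits to the one before it, so any retroactive edit breaks the chain. This is a minimal illustration under our own assumptions, not a prescription of a specific tool; the function name `record_provenance` is hypothetical.

```python
import hashlib
import json
import time

def record_provenance(log, artifact_name, artifact_bytes):
    """Append a tamper-evident provenance record to an in-memory log.

    Each entry stores the artifact's digest plus the hash of the previous
    entry, forming a simple hash chain: altering any earlier record
    invalidates every entry after it.
    """
    prev = log[-1]["entry_hash"] if log else "0" * 64
    entry = {
        "artifact": artifact_name,
        "sha256": hashlib.sha256(artifact_bytes).hexdigest(),
        "timestamp": time.time(),
        "prev_hash": prev,
    }
    # Hash the entry itself (with sorted keys for a stable serialization).
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    log.append(entry)
    return entry

# Record each stage of the pipeline as it is produced.
log = []
record_provenance(log, "train.csv", b"raw training data")
record_provenance(log, "model-v1.bin", b"serialized model weights")
```

A production system would persist this log to write-once storage and tie entries to access-controlled identities, but the chaining idea is the same.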
Trusted AI
Trusted AI is a term used to describe AI systems that are designed to be fair, unbiased, and explainable. To achieve this, Trusted AI systems need to be transparent and have the ability to explain any decisions they make in a clear and concise way. If the decision-making process by an AI system can’t be understood, then it can’t be trusted, but by making it explainable, it becomes accountable and, therefore, trustworthy.
Adversarial ML
Defending against malicious attacks on ML models is crucial. However, as discussed above, these attacks can take many forms, which makes identifying and preventing them extremely challenging. The goal of adversarial ML is to develop techniques and strategies to defend against such attacks, improving the robustness and security of machine learning models and systems along the way.
To achieve this, researchers have developed techniques that can detect and mitigate attacks in real time. Some of the most common techniques include using generative models to create synthetic training data, incorporating adversarial examples in the training process, and developing robust classifiers that can handle noisy inputs.
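To make "incorporating adversarial examples in the training process" concrete, the sketch below generates one with the well-known fast gradient sign method (FGSM), using a toy linear logistic model so the loss gradient has a simple closed form. The weights and inputs are made up for illustration; real adversarial training would compute gradients through the actual model and mix such examples back into the training set.

```python
import numpy as np

# Toy linear classifier: p = sigmoid(w . x + b), with made-up weights.
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(w @ x + b)

def fgsm_example(x, y, eps=0.25):
    """Craft an FGSM adversarial example.

    For cross-entropy loss L on a linear logistic model, dL/dx = (p - y) * w,
    so the attack nudges each input feature by eps in the direction that
    increases the loss: x_adv = x + eps * sign(dL/dx).
    """
    p = predict(x)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

x = np.array([0.5, -0.5, 1.0])
y = 1.0  # true label
x_adv = fgsm_example(x, y)
print(predict(x))      # confidence on the clean input
print(predict(x_adv))  # confidence drops on the perturbed input
```

Training on a mix of clean inputs and such perturbed copies (with their correct labels) is the essence of adversarial training, one of the robustness techniques mentioned above.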
In a bid to quickly capitalize on the benefits offered by AI and ML, too many organizations are putting their data security at risk by not focusing on the elevated cyber threats that come with them. MLSecOps offers a powerful framework that can help ensure the right level of protection is in place while developers and software engineers become more accustomed to these emerging technologies and their associated risks. While it may not yet be a formal requirement, it will be invaluable over the next few years, making it well worth investing in for organizations that take data security seriously.
This article was produced as part of TechRadarPro’s Expert Insights channel, where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc.