Prepare to explore the complex world of deep reinforcement learning and cybersecurity backdoors with experts Vas Mavroudis and Jamie Gawith at the upcoming Black Hat Conference 2024 in Las Vegas.
Guests:
Vas Mavroudis, Principal Research Scientist, The Alan Turing Institute
Website | https://mavroud.is/
At BlackHat | https://www.blackhat.com/us-24/briefings/schedule/speakers.html#vasilios-mavroudis-34757
Jamie Gawith, Assistant Professor of Electrical Engineering, University of Bath
On LinkedIn | https://www.linkedin.com/in/jamie-gawith-63560b60/
At BlackHat | https://www.blackhat.com/us-24/briefings/schedule/speakers.html#jamie-gawith-48261
____________________________
Hosts:
Sean Martin, Co-Founder at ITSPmagazine [@ITSPmagazine] and Host of Redefining CyberSecurity Podcast [@RedefiningCyber]
On ITSPmagazine | https://www.itspmagazine.com/sean-martin
Marco Ciappelli, Co-Founder at ITSPmagazine [@ITSPmagazine] and Host of Redefining Society Podcast
On ITSPmagazine | https://www.itspmagazine.com/itspmagazine-podcast-radio-hosts/marco-ciappelli
____________________________
Episode Notes
As Black Hat Conference 2024 approaches, Sean Martin and Marco Ciappelli are gearing up for a conversation about the complexities of deep reinforcement learning and the potential cybersecurity threats posed by backdoors in these systems. They will be joined by Vas Mavroudis from the Alan Turing Institute and Jamie Gawith from the University of Bath, who will be presenting their cutting-edge research at the event.
Setting the Stage: The discussion begins with Sean and Marco sharing their excitement about the upcoming conference. They set a professional and engaging tone, seamlessly leading into the introduction of their guests, Jamie and Vas.
The Core Discussion: Sean introduces the main focus of their upcoming session, titled "Deep Backdoors in Deep Reinforcement Learning Agents." Expressing curiosity and anticipation, he invites Jamie and Vas to share more about their backgrounds and the significance of their work in this area.
Expert Introductions: Jamie Gawith explains how his work in power electronics and nuclear fusion led him to questions of cybersecurity. His collaboration with Vas arose from a shared interest in the use of reinforcement learning agents to control nuclear fusion reactors. He describes the crucial role these agents play and the potential risks associated with their deployment in critical environments.
Vas Mavroudis introduces himself as a principal research scientist at the Alan Turing Institute, leading a team focused on autonomous cyber defense. His work involves developing and securing autonomous agents tasked with defending networks and systems from cyber threats. The conversation highlights the vulnerabilities of these agents to backdoors and the need for robust security measures.
Deep Dive into Reinforcement Learning: Vas offers an overview of reinforcement learning, highlighting its differences from supervised and unsupervised learning. He emphasizes that these agents learn from experience rather than from a data set, usually in simulation, improving their decisions through trial and error. The conversation also touches on the use of deep neural networks, which enhance the capabilities of reinforcement learning models but also introduce complexities that can be exploited.
Security Concerns: The discussion then shifts to the security challenges associated with reinforcement learning models. Vas explains the concept of backdoors in machine learning and the unique challenges they present. Unlike traditional software backdoors, these are hidden within the neural network layers, making detection difficult.
Real-World Implications: Jamie discusses the practical implications of these security issues, particularly in high-stakes scenarios like nuclear fusion reactors. He outlines the potential catastrophic consequences of a backdoor-triggered failure, underscoring the importance of securing these models to prevent malicious exploitation.
Looking Ahead: Sean and Marco express their anticipation for the upcoming session, highlighting the collaborative efforts of Vas, Jamie, and their teams in tackling these critical issues. They emphasize the significance of this research and its implications for the future of autonomous systems.
Conclusion: This pre-event conversation sets the stage for a compelling session at Black Hat Conference 2024. It offers attendees a preview of the insights and discussions they can expect about the intersection of deep reinforcement learning and cybersecurity. The session promises to provide valuable knowledge on protecting advanced technologies from emerging threats.
Be sure to follow our Coverage Journey and subscribe to our podcasts!
____________________________
This Episode’s Sponsors
LevelBlue: https://itspm.ag/levelblue266f6c
Coro: https://itspm.ag/coronet-30de
SquareX: https://itspm.ag/sqrx-l91
Britive: https://itspm.ag/britive-3fa6
AppDome: https://itspm.ag/appdome-neuv
____________________________
Follow our Black Hat USA 2024 coverage: https://www.itspmagazine.com/black-hat-usa-2024-hacker-summer-camp-2024-event-coverage-in-las-vegas
On YouTube: 📺 https://www.youtube.com/playlist?list=PLnYu0psdcllRo9DcHmre_45ha-ru7cZMQ
Be sure to share and subscribe!
____________________________
Resources
Deep Backdoors in Deep Reinforcement Learning Agents: https://www.blackhat.com/us-24/briefings/schedule/index.html#deep-backdoors-in-deep-reinforcement-learning-agents-39550
Learn more about Black Hat USA 2024: https://www.blackhat.com/us-24/
____________________________
Catch all of our event coverage: https://www.itspmagazine.com/technology-cybersecurity-society-humanity-conference-and-event-coverage
To see and hear more Redefining CyberSecurity content on ITSPmagazine, visit: https://www.itspmagazine.com/redefining-cybersecurity-podcast
To see and hear more Redefining Society stories on ITSPmagazine, visit: https://www.itspmagazine.com/redefining-society-podcast
Deep Backdoors in Deep Reinforcement Learning Agents | A Black Hat USA 2024 Conversation with Vas Mavroudis and Jamie Gawith | On Location Coverage with Sean Martin and Marco Ciappelli
Please note that this transcript was created using AI technology and may contain inaccuracies or deviations from the original audio file. The transcript is provided for informational purposes only and should not be relied upon as a substitute for the original recording, as errors may exist. At this time, we provide it “as it is,” and we hope it can be helpful for our audience.
_________________________________________
[00:00:00] Sean Martin: There we go, Marco.
[00:00:03] Marco Ciappelli: Sean.
[00:00:04] Sean Martin: Woof woof, I mean uh, vroom vroom.
[00:00:06] Marco Ciappelli: Vroom vroom, woof woof is in the back there. Woof woof is in the back. The coyote, I call her the coyote. Oh, vroom vroom is in the car.
[00:00:12] Sean Martin: Before we recorded you said I had a barking face. Well, I don't know, I never know. A face only a dog can love.
[00:00:23] Marco Ciappelli: I don't know. I never know how you're going to start a podcast. So I don't even know how we get guests still. Like, if they had never seen what we do, maybe they come. Otherwise, they're like, these guys are weird.
[00:00:36] Sean Martin: Exactly. Well, what is, what's not weird is, uh, all the cool stuff that's going on at Black Hat. And I don't know, maybe it's weird, weird in a good way because, uh, if we're not doing this research, things can really get strange in our world, but, uh, yeah, lots of great talks and I spent some time picking a few that caught my attention.
And this was certainly one of them. Uh, it's called Deep Backdoors in Deep Reinforcement Learning Agents. And, uh, I was like, I don't know what that is, but it sounds cool. Let's find out. And, uh, it's a panel of folks, uh, Vas, Jamie, Sanyam, and Chris. And we were fortunate, Marco, to have, uh, Jamie and Vas on with us today.
Welcome guys.
[00:01:24] Jamie Gawith: Thanks, Sean. Yeah, really pleased to be here.
[00:01:27] Sean Martin: Thanks very much. Good to have you on. Good to have you on. And if you haven't figured it out, those who are listening, this is part of our chats on the road to Black Hat, uh, where we pick topics and, uh, have a good chat with some of the presenters, uh, to learn more and kind of figure out what's being talked about there, and then hopefully you go and listen to the session as well and connect with the speakers. So that's what this is all about. A quick moment from each of you to share a bit about, uh, what you're each up to at your respective, uh, universities and institutes, and, uh, then we'll go from there and learn more about the topic.
So Jamie, I'll start with you.
[00:02:07] Jamie Gawith: Yeah, thanks Sean. So, uh, I'm a lecturer here at the University of Bath. So we're down on the southwest of England, about an hour out of London or so, and I'm, uh, in the electrical engineering department. So I guess full disclosure for the Black Hat Conference, you know, I'm not a cyber security expert.
I'm sort of almost piggybacking on Vas's, uh, expertise, but I guess, um, you know, my main field of research is power electronics and electrical power conversion. But I guess relevant to the podcast and relevant to the Black Hat talk, um, is that I've been working in the field of nuclear fusion since about 2020.
And so back in, I think it must've been about 2021, Vas, we kind of met and we both got discussing, I think you were quite interested in nuclear fusion. I was very interested in cyber security. Uh, and yeah, we stayed in touch ever since or caught up quite a bit ever since. And then earlier this year, Vas said, hey, look, look at all of this cool stuff happening with the control of nuclear fusion reactors using these reinforcement learning agents, um, and yeah, I got talking about that and security and I think that's where we, uh, got this whole thing going.
[00:03:20] Sean Martin: I'm getting the chills, but, but happy that you're talking as well. Vas, what are you up to?
[00:03:27] Vas Mavroudis: Yeah, um, Nice to meet you all.
Thanks for coming. Um, I, I am a principal research scientist at the Alan Turing Institute. I lead the team here on autonomous cyber defense, and we play both sides, both attack and defense using autonomous agents, and as a result, we also care about the security and robustness of those agents because now we're introducing machine learning into cybersecurity, so these models become part of the attack surface.
Uh, so we started looking into their security and this is how we discovered that reinforcement learning was actually pretty vulnerable to backdoors as we've seen them in software.
[00:04:08] Marco Ciappelli: Alright, so like I said, when I joined, before we started recording, my head, it's already exploding. Like Jamie, I am not a cyber security person.
I've been piggybacking on Sean for the past 15 years on this, but I look at society and technology. And, uh, I want to start with the application that you already went there. Fusion, driving health care, and maybe for people listening that may not know as much as you do for sure, what is actually deep reinforcement learning and an overview of how it's already being used.
Any of you.
[00:04:46] Vas Mavroudis: Yeah. So, uh, reinforcement learning, and I'll get later to the deep part, um, is a, uh, machine learning paradigm that is a bit less common. So when we're talking about language models or, um, models that recognize objects in pictures, this is usually either supervised or unsupervised. Um, in simple terms, it means that there is a data set to learn from.
Now, the real world usually doesn't have data sets about lots of things. There are some things we have good data sets for, but in most cases we have systems we interact with and there's no data set. It's just experience that you learn through. Now, reinforcement learning is doing exactly that. Um, you get to interact with a system, whatever this is.
This can be a, um, a vehicle in a simulation, a drone in a simulation or in the real world, but usually we use simulations because it's faster and you can parallelize training. Uh, it can be, um, games, board games for example, um, can be lots of things, and you learn what's a good strategy to follow in those games, in that case, uh, through trial and error and playing lots of what's called episodes. This is the reinforcement learning paradigm. There is no data set to learn from, but there is an environment to, um, experiment with.
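Editor's illustration (not part of the recording): the trial-and-error loop Vas describes can be sketched with tabular Q-learning, one of the simplest reinforcement learning methods. This is a minimal sketch under stated assumptions, not the speakers' code: the five-cell corridor environment, the reward of 1 for reaching the rightmost cell, and every name and hyperparameter below are hypothetical. It is also not deep reinforcement learning; a deep agent would replace the table with a neural network.

```python
import random


def train_corridor_agent(n_states=5, episodes=500, alpha=0.5,
                         gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values for a toy corridor purely by interaction."""
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][action]: action 0 moves left, action 1 moves right.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):              # one episode = one attempt
        state = 0
        for _ in range(100):               # cap the episode length
            left, right = q[state]
            if rng.random() < epsilon or left == right:
                action = rng.randrange(2)  # explore, or break a tie at random
            else:
                action = 0 if left > right else 1

            # Environment step (hypothetical): move, then observe the reward.
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action per state (ties resolve to 'right')."""
    return [0 if left > right else 1 for left, right in q]
```

Calling `greedy_policy(train_corridor_agent())` should give a policy that moves right in every cell. No data set appears anywhere in the sketch; the agent only sees the rewards that come back from its own actions.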
[00:06:08] Sean Martin: And this is pulling, if we look at something like a car, this is pulling data from sensors and hubs and controllers and buses and all that stuff.
Right? And the cloud, users, I don't know, whatever.
[00:06:19] Vas Mavroudis: Exactly. So the, uh, the way it learns is, without anthropomorphizing, uh, very similar to what humans do. So you take an action and you observe what effect this action had on the environment. And later on, you might think a bit, uh, backwards, trying to figure out which of your actions were, um, good for your goal and which didn't actually contribute as you might have expected, and so you adapt your strategy. And if you do this for long enough, you actually end up with something that's pretty decent.
[00:06:53] Marco Ciappelli: scary, so
[00:06:57] Sean Martin: So this, this... So maybe describe the, the cyber environment, because we have had two chats already where one was very policy oriented, on measuring national cyber security strategies. Um, so not much tech stack to evaluate there, but maybe there, I don't know, we'll leave that one there for a second.
The other is more around, uh, yeah, measuring how well programs work. And then I just had another, another conversation, it's coming from BSides actually, looking at metrics of, of incident response. And so I don't know where, where your environment sits, um, and what the, uh, learning is evaluating in terms of.
Can you kind of give us an overview of that? Because there's a lot to, a lot to consider, right?
[00:07:55] Vas Mavroudis: Yeah, absolutely. Um, as a side note, because I promised earlier and then I didn't deliver, uh, when we are talking about deep reinforcement learning, it refers to reinforcement learning based on neural networks, um, uh, deep neural networks to be precise.
Um, there are other ways you can do reinforcement learning using more traditional statistical techniques, but I would say that every capable reinforcement learning model nowadays is using deep neural networks. So it's almost an unnecessary detail to specify that it's deep. In most cases, 99 percent of the cases, it is. Uh, going back to your question.
Now, um, my team here at the Turing is actually focused, as I said, on autonomous cyber defense. And this means we, uh, work a lot with, um, digital environments. This includes both systems and networks, computer networks. And the goal in those cases is, uh, assuming you have an intrusion, this can be, um, some scripted attack.
It can be a, uh, a human adversary actually trying to exploit your services and traverse your network, escalating their access. And, um, you as a defender, uh, you would like to offload your human operators that respond to those alerts by training an agent that does at least some of their tasks. So the agent could actually, uh, fuse, um, alerts from various, uh, network monitoring tools that might reveal that there is potentially something going on in the network and then go on on its own and investigate a bit more.
Uh, and in some cases, and depending on how much, uh, capability you want to give to such an agent, um, take on some, um, remediation actions that try to, um, boot the adversary from your network.
[00:09:50] Marco Ciappelli: All right. So I'm reading the intro for this panel. So Jamie, I mean, you, you do work with atomic energy.
You are an expert in that. Um, I see professor and researcher. Um, and I also read things like Vas just said, like, it takes actually some decision, but also at superhuman speed. Yeah. And that's where I kind of like, okay, yeah, that could go wrong. Um, so tell me a little bit about it.
[00:10:24] Jamie Gawith: Yeah, yeah, I think that that's exactly it.
So from my point of view, I'd sort of describe myself as like a hardware systems engineer. I guess in the case of a nuclear fusion reactor, I'll just give a very brief background to a very complicated area of technology. But basically nuclear fusion, it's kind of the opposite of a nuclear fission reactor that we have operating today. So for nuclear fission, we've got this uranium, uh, material that sort of fissions into two elements, uh, to produce energy. Fusion's the opposite. You'll take light elements like hydrogen and it'll fuse together to form helium and produce a bunch of energy. So we're, we're trying to develop these experimental reactors, uh, and in order to do this, so this is the reaction that goes on in the center of the sun, in the center of stars.
Uh, but it's a very, very difficult reaction to make happen. So we have to heat up the, this hydrogen, uh, this hydrogen fuel to sort of literally over a hundred million degrees, uh, before it does this, we have to recreate the conditions inside the center of a star. And when we get the fuel up to these very high temperatures, uh, basically if you've ever seen the surface of the sun, if you've ever seen sort of an image of that, it's not constant.
It's sort of bubbling away every so often. It's flaring, there's coronal mass ejections, and so on. Similarly in the machines that we design, uh, the plasma is unstable. It's always trying to escape. And basically how we run these reactors is we have a bunch of sensors, sort of dozens or even maybe up to a hundred sensors, looking at the plasma, looking at its position, its temperature and density and so on, and then it's using the actuators, which are usually magnetic field coils and the fueling rate and the heating rate, uh, to decide how best to contain it.
So what you've got, I guess, similarly to autonomous vehicles, where if you've got cameras and lidars, you've got all these sensor, all the sensor input that you guys were talking about, and you have to make decisions on whether to accelerate, brake and steer. It's the same sort of thing with these plasmas.
So you need to, uh, yeah, you've basically got a very complex control problem. And this is what really lends itself to these AI techniques, these machine learning techniques, like Vas was saying. So, rather than explicitly programming, like in case A do X or something, this isn't feasible in something like this.
So we use these agents to learn and respond to control these plasmas optimally. And what's sort of been shown, at least in the past few years, and this is all quite recent, is that, um, yeah, these, these things can operate obviously at superhuman speed, but also with incredibly good accuracy. So they're sort of outperforming any, any other control systems.
And then I guess, yeah, I guess where all of this work came from earlier this year was, well, hold on a second, if we're going to give control of, uh, you know, very expensive reactors that can create these incredibly hot plasmas, uh, over to an AI agent, well, you know, what could possibly go wrong? Uh, and that's kind of, uh, where this all started, and obviously we can have backdoors on these, and what do we need to think about and how do we mitigate them?
So from my point of view, it's, it's solving a really important problem. It looks like we're going to see more of these agents in these really important use cases. So it's going to be incredibly important to, to make sure that we, we understand these threats, I suppose. So that's, what do you think, Vas? That's sort of how, how I view this work.
[00:14:15] Vas Mavroudis: Yeah. So, um, if it's useful, I can actually provide some more depth into what backdoors mean for such a model and why, why it is a problem. Um, perhaps unsurprisingly, backdoors in machine learning, they're implemented differently, but they are exactly what you expect them to be if you know about software backdoors.
So it's hidden functionality. Um, if you inspect this piece of code, it's not obviously flawed. Uh, but someone that knows that it's there could exploit it to get access to, uh, if we're talking about the cryptographic function, for example, perhaps retrieve a private key, um, if it's software, maybe exploit a, um, an online service.
Now, in machine learning, all the functionality is encoded in the neurons of that model, so the backdoor lives there, which is a, which is a headache, because at least when the backdoor is written in the source code, you can audit this, and if you look carefully enough, you can actually uncover that there is something wrong here.
Uh, whereas with, um, machine learning models, the backdoor is encoded somewhere in the neurons that are not explainable at all. We don't even know what every neuron, what role every neuron plays in the, in the agent. We just know that it works because we test it. And the problem is that it's very hard to uncover the existence of a backdoor, um, before it gets triggered to do the, you know, express the malicious functionality.
So, obviously, in the case of a reactor, a backdoor could lie dormant for years up until, uh, the agent observes a predefined malicious trigger that then, um, makes it do something that's not, uh, good for the, um, for the reactor.
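Editor's illustration (not part of the recording): a toy example of why a backdoor stored in weights is harder to audit than one written in source code. This is a hand-built sketch under stated assumptions, not the attack or the agents from the talk: the two-input controller, the "auxiliary sensor" reading, the trigger threshold of 0.95, and every weight below are hypothetical. The policy function contains no suspicious branch; the malicious behavior sits entirely in the numbers of one hidden unit.

```python
def relu(value):
    return value if value > 0.0 else 0.0


# Hypothetical weights. Observation = [position_error, aux_sensor].
HIDDEN_WEIGHTS = [
    [-1.0, 0.0],   # unit 0: active when the error is negative
    [1.0, 0.0],    # unit 1: active when the error is positive
    [0.0, 10.0],   # unit 2: the backdoor, watches only aux_sensor
]
# Unit 2 outputs zero until aux_sensor exceeds 0.95.
HIDDEN_BIASES = [0.0, 0.0, -9.5]
# Units 0 and 1 together produce -position_error (a corrective action);
# unit 2, once active, swamps that correction.
OUTPUT_WEIGHTS = [1.0, -1.0, 50.0]


def policy(observation):
    """Tiny one-hidden-layer network that returns a control action."""
    hidden = [
        relu(sum(w * x for w, x in zip(row, observation)) + bias)
        for row, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    return sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden))


normal_action = policy([0.2, 0.3])     # about -0.2: pushes the error back to zero
triggered_action = policy([0.2, 1.0])  # about 24.8: drives the system away
```

For every aux_sensor reading below the threshold, the network behaves as a plain proportional controller, so testing on normal data would not reveal what the third unit is for. In a trained deep network the equivalent behavior would be spread across many weights rather than isolated in one commented unit, which is the auditing problem Vas describes.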
[00:16:14] Sean Martin: And could, does that trigger have to be internal or could it be an external force?
Because I'm thinking two, two cases. One, it, it analyzes the environment and I've reached the state where I need to wake up and do something. Or it could be I've produced data that's sent somewhere else, which then I receive a response that says now it's time to do something. Is it one or the other or both or something else even I'm not thinking of.
[00:16:43] Vas Mavroudis: The, the concern is that the agent might be, uh, it's an area no one thinks deeply about. Uh, the agent is receiving information from sensors, and not all these sources are necessarily sanitized. So it can be either, like, internal or external. Obviously, the problem comes when external information sources are processed by the agent and they are not reviewed beforehand.
Having said that, for a sophisticated adversary the trigger could be such that it's very hard to detect. It could be a specific value in, uh, some reading, uh, it could be a lot of things.
[00:17:29] Sean Martin: So the work that you're doing, Vas, and, uh, Sanyam looks, looks like he's doing similar stuff at Cardiff, but, so are you both looking at how to leverage this for detection and response as you noted, but then also, Is part of that, you're looking at the vulnerabilities in the back doors, both, both you and Sanyam?
So Cardiff and, and Alan Turing are both doing this, right?
[00:17:56] Vas Mavroudis: Yes. So, um, yeah, this is correct. Uh, Sanyam was a visitor, uh, here in my group at the Alan Turing Institute. And, uh, now we keep collaborating, with Jamie as well. Um, what we started looking into is, um, sophisticated triggers that are hard to detect, because the literature before our work was mostly very obvious triggers that were easy to detect.
So they, they didn't necessarily make a good case about how, um, how big of a threat backdoors could be, because we're essentially talking about supply chain attacks, which the security community is very familiar with, but now they come from a different angle. They come from the angle of models.
Um, our concern is obviously, um, backdoors might make it to end products. And I know that's one bit. I don't want to go too much into detail, but, um, in our Black Hat talk, we will reveal a solution that we have designed and open sourced.
[00:18:57] Marco Ciappelli: Well, I guess my head is going. But, uh, I mean, the, the fact is that the, the, the, where I focus is, is the word agent, like, because agent is when actually it takes action, right?
So it, and, and it, it does it in a way that it believes, like in a human, right? Like, I believe I'm taking the right action. A human can screw up and so can an agent. And how do we actually ever have full control over that? I mean, and that's why, you know, people point the finger at a car that hit a person on a bike.
Big deal. There's been, you know, 3,000 probably only in L.A. that got hit by a car that was driven by a person, a human, and nobody, you know, made a big fuss about it. So I think we are in that phase where, you know, sure, we need to learn, but we also need to understand that this is the future. So what you guys are doing, it's, it's, I mean, again, it's a sci-fi movie in my head right now.
So I think it's going to be great.
[00:20:12] Sean Martin: I think the difference is the, your 300 to 1 example is when it becomes 300 human to 3 million, right? Because the neurons all, all trigger at the same time across a fleet of neurons. I don't know, I just think when we start talking about technology, the scale, the scalability of this stuff is really, it's the speed perhaps, where we can't contain and the scalability where it just goes out of control, we can't contain.
And I don't know, Jamie, if you have any, I can visualize perhaps what happens if you can't contain the, uh, the, the field there, right. But yeah, it's, uh, it's something you didn't describe.
[00:21:00] Jamie Gawith: I guess, yeah, it is something that lends itself to animations and images, but yeah, essentially you've got, to boil it down, you've got this very, very hot plasma, and if you lose control of it, you lose control of its position.
It will run into the closest part of the reactor around it, and it will essentially just melt and damage it. So it's something called a plasma disruption. And you know, this is very much not on the same scale as a fission reactor where you have a meltdown. There's no kind of risk of that. In fact, it's one of the large benefits of nuclear fusion compared to fission.
But yeah, you absolutely, uh, can damage your reactor. And if you're building one of these things, these prototypes can cost sort of, yeah, the real ones cost upwards of a billion dollars, pounds, euros. Uh, so you can be put out of action for months or years, or even have to decommission the machine. So yeah, some pretty, pretty real consequences.
And I guess this kind of gets to the point of, yeah, if we're going to give control over these important, uh, applications, from autonomous vehicles to fusion reactors to anything else in the real world, yeah, we better really understand the consequences.
[00:22:16] Marco Ciappelli: Well, uh, I'm coming to the presentation, which will be Wednesday, August 7th at 2:30 PM to 3 PM at the South Sea AB Level 3. Of course, you don't have to remember this. We will write it. And, uh, it will be, of course, and it's already on the blackhat.com website. I want to thank you guys for this. I think that you'll have as many geeks that want to understand this stuff right there.
Black Hat and everything else happening around it, DEF CON, BSides and all of that. So I want to invite everybody to take part in this. I think it's important to understand, even if regular people like us never understand it as deeply as you do, but I think it's important to make our voice heard anyway in our society, and, uh, at least understand the gist of it.
So, impressive. Um, Sean will be there, and, absolutely, absolutely, I encourage everybody. We'll let them know; that's why we're going to cover and, uh, share all this information before, during, and after Black Hat. So Jamie, Vas, thank you so much. We'll see you in Las Vegas.
[00:23:38] Jamie Gawith: Thanks guys. Yeah.
[00:23:39] Sean Martin: Appreciate it.
Appreciate you doing this work and, uh, more importantly, sharing it, uh, with folks and I'm excited to learn about, uh, the open source stuff that you put together so people can check that out too. So you get, you get that at the session, uh, safe journey to both of you and your, uh, your two counterparts or co co, uh, presenters.
And, uh, we'll see you all in Vegas shortly. Thanks everybody.
[00:24:07] Vas Mavroudis: See you in Vegas.