ITSPmagazine Podcasts

Hacking Deepfake Image Detection System with White and Black Box Attacks | A SecTor Cybersecurity Conference Toronto 2024 Conversation with Sagar Bhure | On Location Coverage with Sean Martin and Marco Ciappelli

Episode Summary

In this SecTor 2024 episode, Sean Martin, Marco Ciappelli, and security researcher Sagar Bhure discuss the escalating threat of deepfake technology and its implications for misinformation, financial fraud, and cybersecurity. Tune in to explore real-world examples and learn about innovative detection methods that aim to stay ahead of this complex challenge.

Episode Notes

Guest: Sagar Bhure, Senior Security Researcher, F5 [@F5]

On LinkedIn | https://www.linkedin.com/in/sagarbhure/

At SecTor | https://www.blackhat.com/sector/2024/briefings/schedule/speakers.html#sagar-bhure-45119

____________________________

Hosts: 

Sean Martin, Co-Founder at ITSPmagazine [@ITSPmagazine] and Host of Redefining CyberSecurity Podcast [@RedefiningCyber]

On ITSPmagazine | https://www.itspmagazine.com/sean-martin

Marco Ciappelli, Co-Founder at ITSPmagazine [@ITSPmagazine] and Host of Redefining Society Podcast

On ITSPmagazine | https://www.itspmagazine.com/itspmagazine-podcast-radio-hosts/marco-ciappelli

____________________________


The authenticity of audio and visual media has become an increasingly significant concern. This episode explores this critical issue, featuring insights from Sean Martin, Marco Ciappelli, and guest Sagar Bhure, a security researcher from F5 Networks.

Sean Martin and Marco Ciappelli engage with Bhure to discuss the challenges and potential solutions related to deepfake technology. Bhure reveals intricate details about the creation and detection of deepfake images and videos. He emphasizes the constant battle between creators of deepfakes and those developing detection tools.

The conversation highlights several alarming instances where deepfakes have been used maliciously. Bhure recounts the case in 2020 where a 17-year-old student successfully fooled Twitter’s verification system with an AI-generated image of a non-existent political candidate. Another incident involved a Hong Kong firm losing $20 million due to a deepfake video impersonating the CFO during a Zoom call. These examples underline the serious implications of deepfake technology for misinformation and financial fraud.

One core discussion point centers on the challenge of distinguishing between real and artificial content. Bhure explains that the advancement in AI and hardware capabilities makes it increasingly difficult for the naked eye to differentiate between genuine and fake images. Despite this, he mentions that algorithms focusing on minute details such as skin textures, mouth movements, and audio sync can still identify deepfakes with varying degrees of success.
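
For readers who want to see the shape of such a detection pipeline, here is a minimal sketch in Python of scoring sampled video frames and aggregating the results. The score_frame function is a hypothetical stand-in for a trained frame-level detector (nothing from Bhure's actual tooling); only the sampling and aggregation logic is shown.

```python
# Minimal sketch: aggregate per-frame "fake" scores over a video clip.
# score_frame() is a hypothetical stand-in for a trained frame-level
# detector (e.g., a CNN trained on skin-texture or blending artifacts);
# it is NOT part of any real library.
import cv2
import numpy as np

def score_frame(frame_bgr: np.ndarray) -> float:
    """Hypothetical detector: return a fake-probability in [0, 1]."""
    raise NotImplementedError("plug in a trained model here")

def video_fake_score(path: str, sample_every: int = 10) -> float:
    cap = cv2.VideoCapture(path)
    scores, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:        # sample frames to keep it cheap
            scores.append(score_frame(frame))
        idx += 1
    cap.release()
    # The clip is treated as suspicious if the average sampled frame looks synthetic.
    return float(np.mean(scores)) if scores else 0.0
```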

Marco Ciappelli raises the pertinent issue of how effective detection mechanisms can be integrated into social media platforms like Twitter, Facebook, and Instagram. Bhure suggests a 'secure by design' approach, advocating for pre-upload verification of media content. He suggests that generative AI should be regulated to prevent misuse while recognizing that artificially generated content also has beneficial applications.

The discussion shifts towards audio deepfakes, highlighting the complexity of their detection. According to Bhure, combining visual and audio detection can improve accuracy. He describes a potential method for audio verification, which involves profiling an individual’s voice over an extended period to identify any anomalies in future interactions.
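
As a rough illustration of that profiling idea, the sketch below builds a simple MFCC-based "voiceprint" from a known-genuine recording and compares new audio against it with a cosine distance. The file names and the 0.3 threshold are placeholder assumptions, not values from the episode.

```python
# Sketch of the voice-profiling idea described above: build a reference
# voiceprint from known-genuine recordings, then compare new audio
# against it. File names and thresholds are illustrative only.
import numpy as np
import librosa
from scipy.spatial.distance import cosine

def voiceprint(path: str) -> np.ndarray:
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)   # (20, frames)
    return mfcc.mean(axis=1)                             # crude speaker profile

reference = voiceprint("known_genuine_calls.wav")   # enrolled speaker
candidate = voiceprint("incoming_call.wav")         # audio to verify

distance = cosine(reference, candidate)
print(f"cosine distance: {distance:.3f}")
if distance > 0.3:        # threshold would be tuned on real data
    print("voice deviates from the enrolled profile - flag for review")
```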

Businesses are not immune to the threat of deepfakes. Bhure notes that corporate sectors, especially media outlets, financial institutions, and any industry relying on digital communication, must stay vigilant. He warns that deepfake technology can be weaponized to bypass security measures, perpetuate misinformation, and carry out sophisticated phishing attacks.

As technology forges ahead, Bhure calls for continuous improvement in detection techniques and the development of robust systems to mitigate risks associated with deepfakes. He points to his upcoming session at SecTor in Toronto, 'Hacking Deepfake Image Detection System with White and Black Box Attacks,' where he will offer more comprehensive insights into combating this pressing issue.
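
The session title refers to white box attacks, where the attacker can see the detector's internals. A classic example of that setting is the fast gradient sign method (FGSM); the hedged sketch below shows how a single gradient step against a hypothetical differentiable PyTorch detector could nudge a fake image toward a "real" verdict. It illustrates the general technique only and is not taken from Bhure's briefing.

```python
# Illustrative white-box evasion in the spirit of the session title:
# an FGSM-style perturbation that nudges a "fake" image toward being
# classified as "real". `detector` is a hypothetical differentiable
# PyTorch model with outputs [real, fake]; epsilon is tiny so the
# change is invisible to the eye.
import torch

def fgsm_evasion(detector: torch.nn.Module,
                 image: torch.Tensor,          # shape (1, 3, H, W), values in [0, 1]
                 epsilon: float = 2 / 255) -> torch.Tensor:
    detector.eval()
    image = image.clone().requires_grad_(True)
    logits = detector(image)
    target = torch.tensor([0])                 # class 0 = "real" (assumed label order)
    loss = torch.nn.functional.cross_entropy(logits, target)
    loss.backward()
    # Step against the loss gradient so the detector leans toward "real".
    adv = image - epsilon * image.grad.sign()
    return adv.clamp(0, 1).detach()
```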

____________________________

This Episode’s Sponsors

HITRUST: https://itspm.ag/itsphitweb

____________________________

Follow our SecTor Cybersecurity Conference Toronto 2024 coverage: https://www.itspmagazine.com/sector-cybersecurity-conference-2024-cybersecurity-event-coverage-in-toronto-canada

On YouTube: 📺 https://www.youtube.com/playlist?list=PLnYu0psdcllSCvf6o-K0forAXxj2P190S

Be sure to share and subscribe!

____________________________

Resources

Hacking Deepfake Image Detection System with White and Black Box Attacks: https://www.blackhat.com/sector/2024/briefings/schedule/#hacking-deepfake-image-detection-system-with-white-and-black-box-attacks-40909

Learn more about SecTor Cybersecurity Conference Toronto 2024: https://www.blackhat.com/sector/2024/index.html

____________________________

Catch all of our event coverage: https://www.itspmagazine.com/technology-cybersecurity-society-humanity-conference-and-event-coverage

Are you interested in sponsoring our event coverage with an ad placement in the podcast?

Learn More 👉 https://itspm.ag/podadplc

Want to tell your Brand Story as part of our event coverage?

Learn More 👉 https://itspm.ag/evtcovbrf

To see and hear more Redefining CyberSecurity content on ITSPmagazine, visit: https://www.itspmagazine.com/redefining-cybersecurity-podcast

To see and hear more Redefining Society stories on ITSPmagazine, visit:
https://www.itspmagazine.com/redefining-society-podcast

Episode Transcription

Hacking Deepfake Image Detection System with White and Black Box Attacks | A SecTor Cybersecurity Conference Toronto 2024 Conversation with Sagar Bhure | On Location Coverage with Sean Martin and Marco Ciappelli

Please note that this transcript was created using AI technology and may contain inaccuracies or deviations from the original audio file. The transcript is provided for informational purposes only and should not be relied upon as a substitute for the original recording, as errors may exist. At this time, we provide it “as it is,” and we hope it can be helpful for our audience.

_________________________________________

[00:00:00]  
 

Sean Martin:
 

Marco Ciappelli: Sean, 
 

Sean Martin: I feel that my brain is tangled. 
 

Marco Ciappelli: your brain is what? 
 

Sean Martin: My brain is tangled. Just like your headset gets tangled. 
 

Marco Ciappelli: Well, I untangle my headset. So can you do the same with your brain? 
 

Sean Martin: I'm going to, I'm going to try it. That's the, that's the objective today. The question is, will anybody notice a difference if I'm successful? 
 

Marco Ciappelli: It depends. It 
 

Sean Martin: depends. It depends who, who's presenting me and if this is actually real or not. 
 

Marco Ciappelli: And if it's you, it's who you say you are. And how do I know? 
 

Sean Martin: I'm actually curious. I mean, I doubt anybody's deepfaking me at this point, but I don't know, who knows, in five years, if everybody has a twin or a triplet that's used for something. But that's kind of the topic today, right? So this is part of our chats on the road to SecTor in [00:01:00] Canada. 
 

The, uh, it's the Informa-led slash Black Hat, uh, style security conference in Toronto, and I'm excited to, to cover this event and I'm thrilled to have Sagar Bhure, uh, he's from F5, a security researcher there, and he's doing a lot of work evidently with deepfakes and, and hacking the validation tools that determine whether or not someone is a deepfake or an entity is a deepfake or not. 
 

So I'm thrilled to have this chat. Uh, he has a session called Hacking Deepfake Image Detection System with White and Black Box Attacks, uh, that's on Wednesday, the 23rd, just, uh, just a few weeks away. So Sagar, maybe a few words about some of the things you get to research as part of your role, including, uh, the, the deepfake stuff that we're talking about today. 
 

Right? 
 

Sagar Bhure: of mine. I'll give you a [00:02:00] slight overview of the topic and the research that I'm going to present at SecTor Toronto next month. So, deepfakes, we all know what deepfakes are, like they're created using advanced techniques like generative adversarial networks, uh, so, uh, where there are generators to craft the realistic images while a discriminator tries to spot the fake ones. 
 

So it's always going to be a cat and mouse game where hackers or manipulators are trying to create a realistic image. Uh, there has been a lot of widespread, uh, knowledge base available that helps you to, uh, identify whether an image is a deepfake or, or a genuine, realistic one. But the algorithms recently have become, uh, so sophisticated, powered by the, uh, hardware growth, 
 

so that with the bare eyes, it's very hard to detect that it's a deepfake or, or a genuine [00:03:00] image. For instance, um, uh, based on skin textures, uh, I can identify that, hey, is it a deepfake or a realistic image; based on the mouth movements and the audio sync, I can figure out that, hey, is that, you know, video call or video, uh, deepfaked or not. 
 

I mean, a simple change, like just altering one bit of an image can, can trick the classifiers, uh, into labeling a fake as real. So what, uh, this cat and mouse game is all about, or what the hacker is trying to do, is try to bypass the image-based detectors, uh, to classify their fake image as a real one. So for example, uh, in 2020, a 17 year old student, uh, created a fake Twitter account, uh, of a fictional candidate for Rhode Island, I guess, uh, named Andrew Walz, uh, and got it verified on Twitter, uh, with an AI generated image. 
 

That person did not even exist and he filed [00:04:00] for, uh, for candidature to fight elections, uh, back in 2020. So this incident highlights how easily misinformation can spread. These are not just, uh, um, you know, based on candidatures or based on just generating images. Uh, we have also seen, uh, deepfakes in serious fraud cases. 
 

Uh, those, uh, I mean, deepfakes are not just to cause harm to someone's public image. It also really, uh, leads towards financial frauds. That's where the Hong Kong firm case comes in, where a Hong Kong firm lost at least, uh, $20 million, uh, due to a deepfake video impersonation of their CFO joining in the Zoom call and approving a transaction, or a, or a call. 
 

So, although it, it was a, it was an example of an audio deepfake. So in this session, I'll, I'll share such case studies demonstrating how a pixel manipulation or a noise, [00:05:00] uh, can compromise a detection system and bypass, um, a sophisticated deepfake image-based detection system. 
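
The "one bit" remark maps onto the black box setting, where an attacker can only query the detector and watch its output. A minimal sketch of that idea, assuming a hypothetical classify callable that returns "real" or "fake", might look like this:

```python
# Rough black-box version of the "one bit" idea from the conversation:
# query an opaque detector while flipping the least-significant bit of
# random pixels until the label changes. `classify` is a hypothetical
# callable; no gradients or model internals are needed.
import numpy as np

def lsb_flip_attack(image: np.ndarray, classify, max_queries: int = 1000,
                    rng: np.random.Generator = np.random.default_rng(0)):
    img = image.copy()                      # uint8 array, shape (H, W, 3)
    for _ in range(max_queries):
        if classify(img) == "real":         # detector already fooled
            return img
        y = rng.integers(img.shape[0])
        x = rng.integers(img.shape[1])
        c = rng.integers(img.shape[2])
        img[y, x, c] ^= 1                   # flip the least-significant bit
    return None                             # gave up within the query budget
```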
 

Marco Ciappelli: Sean, you are not kidding when you talk about tangling things, like when you have the deepfake that, you know, is it the detector or the deepfake. It has to actually fight the hacking of itself. So it's always that good and evil battle that I like to talk about, and how sometimes it is what actually makes us evolve as a society. 
 

You know, it's like there is a problem, we find the solution, and then there is another problem and there is another solution, and that's how we develop. But it's kind of worrisome, uh, Sagar, because maybe you can have the tool to do something like that, and the company and the expert like you, that you can detect the texture in the skin or the example that you gave, but how do you [00:06:00] address these when they go into the public space? 
 

Twitter or X or whatever it's called nowadays. And how are we able to mark it as such so that the regular user that doesn't have the tools that you have can be protected? 
 

Sagar Bhure: I mean, I'll, I'll, I'll, I'll use the word secure by design in the way that, um, uh, let's say you, you, you talked about Twitter, or let's say for example, Facebook or Instagram. So they should have a way to avoid miscommunications. I mean, I have, I mean, I have, when I present this idea to people, they have a different view. 
 

Maybe we can talk about that. But you should stop uploading a video if it is artificially generated. So, so that, so in short, the line between the fake and the real is getting blurred, um, by this generative way, I think. Uh, and I think we need to have some rule to [00:07:00] control, not to stop, but to control how this artificially generated data. 
 

um, be it text or images, is used. Um, I mean, I can find 10 good use cases of generative AI as well, right? Of artificially generated images. But there, there should be a way to control, uh, how the generated AI content is used for the good, not for causing financial or reputational damage to anyone. 
 

Sean Martin: Yeah, because I think Instagram, I haven't looked at video of whether or not you can flag it as generated by, but, uh, certainly images, you, you have the option to opt in to say this was supported by some artificial intelligence or completely, I guess, if you wanted to do that. Um, I, I haven't done that cause I don't do that and I just, so I have no idea what the use case or the flow after. 
 

Saying this is AI, um, I don't know that I've come across [00:08:00] anything that, that has that flag on it either. It's about, it's not just about that it exists, it's about what, what happens with it next. Does somebody see something and think it's real? Does somebody see somebody and think they're real? Um, do they look at the message along with it and, and decide to take some action or go somewhere or try to meet somebody or try to take advantage of somebody based on, I don't know, 
 

the information that they're seeing. So I think it all ultimately goes back to the bad actors trying to initiate or instigate an action that, that benefits them right at the expense of, of others. And so it, I know that the, the tools have to follow all the models that create this stuff so that they can then validate whether it's artificially created or not. 
 

And I know some of the early text-based tools would flag manually written, person-written content as [00:09:00] computer-generated, AI-generated, so falsely, the false detection thing, which is another thing that we, we struggle with just in security. So how do we work through, and I guess this is what your topic is about as well, how do we work through some of the false positives, right? 
 

Where, um, well there's two cases, one slipping through and becoming a false negative. So presenting, proving something is real when it shouldn't be, but then also the reverse, flagging something that's, that's, uh, real as fake. I don't know, I was a little mouthful there, but how do we know what's real and what's not? 
 

If the, if we're relying on tools to do this and the tools can't keep up with each other. 
 

Sagar Bhure: So, so there are two ways to do it. Like the first way is, um, using, uh, the detection tools like visual detection, uh, which is, which not everyone can do. And with, uh, obviously with the, uh, improvised [00:10:00] algorithms and the hardware space, it will be, it's really going to be tough to detect if it's a deepfake or not. 
 

Um, but at, at present, uh, we have not reached till, till that age. I can, by visually looking, uh, I mean, we can see whether it's a deepfake or not, it's tough, but it's, it's possible. But over five years, or not even five years, over two years, this is, this is not going to be the case. We have to strengthen our tools, so we have to strengthen our, uh, detection technology so that we, uh, we check or we scrutinize every pixel in an image or, or every byte in our audio file, 
 

to see that if it is, um, uh, if it is a deepfake or if it is artificially generated, uh, data, or if it is real data. Um, and, and it's, it's going to be a cat and mouse game for some time till we decide, uh, on, on how robust a detection technology is. And, and it's, it's [00:11:00] also going to be evolving for at least four to five years, uh, because we have, we'll be having, like, once we have a detection system, we'll have vulnerabilities. 
 

Uh, just like what I'll be talking about, uh, in Toronto next month. So it is going to evolve. In short, right now, there is no perfect detection technology, because deepfakes are evolving over the period and so are the detection technologies. So 
 

Marco Ciappelli: I know you do the video, right? But, you know, the audio, it's a pretty big deepfake and has consequences as well, because, you know, there is a video Zoom call, but it's also the phone call, which is much easier if you're doing like phishing and, and, uh, you know, vishing in this case. Um, how, how do you go with that one where you don't actually have, so you have data, I guess, but you don't have the pixels. 
 

So the pixel is the key. That's where I'm going. [00:12:00] Can we actually stay ahead of the game with, uh, with audio as well? 
 

Sagar Bhure: with audio, it's, it's, it's a different game because the data, uh, channel over there is, it's, it's different, uh, but, but when you combine, uh, both of them, it, it becomes easier because you have to synchronize between images and, and audios. It gets easier there, but when you treat both of them separately, uh, you have a data loss, or you have that loss of synchronization between image and audio files. 
 

So it 
 

Marco Ciappelli: the video is easier to detect. 
 

Sagar Bhure: right? 
 

Marco Ciappelli: Okay. Mm hmm. Mm hmm. 
 

Sean Martin: And so the audio is harder to detect, but if you interesting, okay. You mix them together and yeah. 
 

Sagar Bhure: in simple terms, if you want to detect audio deepfakes, think of it like that. I have a different way of speaking, maybe I'm stuttering somewhere. Uh, maybe my voice level goes up and down in some [00:13:00] words. So if you, if you give me a 10 minute video or audio of yours, I can do a profiling on that and I can keep it in my database. 
 

Whenever you call me, I'll match it just for example, I match it with that database of how you speak, like how your pitch levels are, uh, I mean, different pitch levels at different moods also, let's say when you are happy, your pitch levels are like that, like a profiling of everything. And when I get a new call, I can probably do a detection. 
 

I mean, in the rough terms, a detection system based on the profiling that I've stored, uh, earlier. And then that, I mean, you know, on a rough idea, the algorithm works like that for detecting fake audios. I 
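
A toy version of the pitch-profiling Sagar describes could look like the following, using librosa's pYIN pitch tracker to summarize a known speaker's fundamental-frequency statistics and compare a new call against them. The file names, frequency range, and z-score threshold are illustrative assumptions, not a production detector.

```python
# Sketch of the pitch-profiling idea: summarize a known speaker's
# fundamental frequency (f0), then flag new audio that deviates far
# from the enrolled profile. Values shown are placeholders.
import numpy as np
import librosa

def pitch_profile(path: str) -> dict:
    y, sr = librosa.load(path, sr=16000)
    f0, voiced_flag, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
    f0 = f0[voiced_flag & ~np.isnan(f0)]          # keep voiced, valid frames
    return {"mean": float(np.mean(f0)), "std": float(np.std(f0))}

enrolled = pitch_profile("ten_minutes_of_known_speech.wav")
incoming = pitch_profile("new_call.wav")

# Flag the call if its average pitch falls far outside the enrolled range.
z = abs(incoming["mean"] - enrolled["mean"]) / max(enrolled["std"], 1e-6)
print("suspicious" if z > 3 else "consistent with the enrolled profile")
```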
 

Marco Ciappelli: That's interesting. So the more you have, um, of the real person, the real voice, or the real face. Cause you can have a certain rhythm in the voice and you can have a way you breathe, maybe a way you, you pitch the voice, also in the way you move, maybe you [00:14:00] have a twitch in your eyes or a blinking. But do you think that, fed enough video, fed enough audio of the original person, the AI, the AI will be able to produce 
 

a very legit deepfake, or do you think there is always going to be a way for the good guys to stay ahead of the game? 
 

Sagar Bhure: mean, I mean, I think of a lot of, I mean, right now you and me have very little to do less, but let's say there are personalities like Tom Cruise and big, you know, celebrities, they have huge data set available on YouTube and, you know, media platform. I can download their audio files. Uh, and then I can, you know, it's easier for me because I have huge data set, even more than 10 minutes of their data set. 
 

Marco Ciappelli: Right. 
 

Sagar Bhure: To create profiling and create deepfakes also. I 
 

Sean Martin: So we're, we're talking a lot about the, [00:15:00] the societal, the general user and things that may interact with social media and, and videos online and things like that. And we're talking about celebrities and politicians and, and. Maybe some fraud from a financial perspective against an individual or group of individuals. 
 

I'm curious your perspective on the impact to business. Um, because I know you do some, some things with OWASP, the, uh, the LLM Top 10, if I'm not mistaken, which is all about building applications that leverage AI. So I'm wondering what, what are businesses attempting to do with audio and video and what risks do they face, uh, with the deepfakes and, and the hacking of the detection systems to, to say that this is, this is, this has been compromised. 
 

Are there business use cases that, that organizations need to be concerned with? I guess that's really 
 

Sagar Bhure: mean, [00:16:00] it depends on the business, uh, industry or the segment. Uh, if you talk about, uh, the media outlets or news organizations, it's pretty obvious. Uh, I mean, they have to stop deepfakes, uh, providing misinformation and all that thing, because that's the only source of information people tend to believe. For financial, uh, businesses, uh, let's say, 
 

a deep, uh, a deepfake, uh, like it's popular for organizations to fall into phishing, let's say. Uh, if you remember 20 years or 15 years down the line, if you get a spam mail, it's pretty obvious: the typos, links, junk links, that and all. But I can use ChatGPT or any LLM API, create a nice spam email, which will bypass every spam filter. 
 

So in the same way, it can bypass getting into your inboxes, any employee clicking on that [00:17:00] malicious URL. And then the, then the, I mean, any deepfake image also can have a link hidden inside it or, or a payload hidden inside it. Uh, so, so those are all the flows, like bypassing the spam filters or any firewall for any use case. 
 

Marco Ciappelli: Um, 
 

Sean Martin: so interesting, isn't it? 
 

Marco Ciappelli: yeah. Also Sean, I may not be a celebrity at that large a scale, but I think you have enough podcasts and 
 

Sean Martin: saying. I think there's certainly plenty of food to chew on. 
 

Marco Ciappelli: Yeah, for sure. For sure. Tell me. Tell me about this. This person does not exist dot com. I never heard about it before, but I'm intrigued. What is that? 
 

Sagar Bhure: So if you open thispersondoesnotexist.com, it will show you a realistic person, but it's not, that person does not exist. It's created by a generative adversarial network, [00:18:00] a GAN, so, which generates images of persons who never existed in this world. 
 

Marco Ciappelli: So do you use that for training your system as well? Because you can see how AI is actually building the image. 
 

Sagar Bhure: Yes. So to create, I mean, to detect deepfakes, one of the sophisticated approaches is supervised machine learning algorithms, where you have a labeled dataset of real persons and fake persons. And the fake person dataset comes from here. And then you probably train on it to create a deepfake detection model. 
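
A minimal sketch of that supervised setup, assuming folders of real photos and GAN-generated faces (for example, images saved from thispersondoesnotexist.com), might look like this; a real detector would use a CNN rather than logistic regression on raw pixels, and the directory names are placeholders.

```python
# Minimal sketch of the supervised setup described: real face images
# labeled 0, GAN-generated faces labeled 1, and a simple classifier
# fit on flattened pixels. Illustrative only.
from pathlib import Path
import numpy as np
from PIL import Image
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def load_dir(folder: str, label: int, size=(64, 64)):
    xs, ys = [], []
    for p in Path(folder).glob("*.jpg"):
        img = Image.open(p).convert("RGB").resize(size)
        xs.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
        ys.append(label)
    return xs, ys

real_x, real_y = load_dir("data/real_faces", 0)   # placeholder paths
fake_x, fake_y = load_dir("data/gan_faces", 1)
X, y = np.array(real_x + fake_x), np.array(real_y + fake_y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```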
 

Marco Ciappelli: and, and you find the pattern that will tell you what the difference could be between one that is created and one that is not. Fascinating, scary. 
 

Sean Martin: I see a world where we have to submit DNA with our, uh, pin, pinprick of the finger, submit some blood with our videos 
 

Marco Ciappelli: Yeah. And, and then, and then there'll be a deep fake DNA and then 
 

Sean Martin: And that's fine. 
 

Marco Ciappelli: Then we go from there. 
 

Sean Martin: Then I'll just, [00:19:00] uh, I'll go sail the 
 

Marco Ciappelli: then you just, judges just have relationship with, uh, you know, it's kind of like, uh, it's kind of like the, the, the world of. When, uh, you know, the, the, the, the, the, the robots that are going, uh, undetected in the, in the real world, we're in the Mad, Mad Max world and, uh, you know, there's always a way to find it, but it's going to be harder and harder, so might as well just live with it. 
 

Yeah. 
 

Sean Martin: always about, I mean, what we're doing is we're making it bigger, better, faster, right? So more people have access, they can do it. And we're making decisions based on this stuff faster than ever as well. So keeping up with the scalability of all this. That's 
 

Marco Ciappelli: Yeah, that was going to be actually the last thing I wanted to mention, like how scalable are the systems that you're using and how fast are they in detecting, so is it [00:20:00] something that could be implemented at a large scale, like YouTube channels, I mean, an entire YouTube or a Twitter or Facebook social media, and how fast can they actually start marking content that is deepfake generated. 
 

Sagar Bhure: So images are real time, but videos take some time to process, because it depends, like is it an hour long video or a 10 minute video. So I'm working towards buffering, like chunking it down, uh, a 10 minute video to some quantifical Kimberley mode so that it's close to real time, so that even if there is a deepfake video inserted between a long video, it's bypassed or it's blocked by a firewall, so that you see an abrupt change in the video, uh, 10 minutes to let's say 15 minutes, but you skip the deepfake part of it. 
 

So that any viewer as a parental control, let's say, um, you'll [00:21:00] get the deep fake or the misinformation amount of that particular hour long movie or a video. 
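
A rough sketch of that chunking approach: split a long video into fixed-length windows and score each window, so a spliced-in deepfake segment can be located and skipped. The chunk_score function is a hypothetical stand-in for the detection model; chunk length and threshold are illustrative.

```python
# Sketch of the chunking idea described above: score a long video in
# fixed-length windows so a deepfake segment inside an otherwise
# genuine recording can be located. chunk_score() is hypothetical.
import cv2

def chunk_score(frames: list) -> float:
    """Hypothetical per-chunk detector: fake-probability in [0, 1]."""
    raise NotImplementedError("plug in a real detector here")

def flag_chunks(path: str, chunk_seconds: int = 600, threshold: float = 0.5):
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    frames_per_chunk = int(fps * chunk_seconds)
    flagged, frames, start = [], [], 0.0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
        if len(frames) == frames_per_chunk:
            if chunk_score(frames) > threshold:
                flagged.append((start, start + chunk_seconds))
            start += chunk_seconds
            frames = []
    cap.release()
    return flagged          # list of (start_s, end_s) suspicious windows
```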
 

Marco Ciappelli: well, that's cool. 
 

Sean Martin: Everything you say, I think of something else. I'm just like, 99 percent of it could be authentic. And just one little nugget dropped in there that's not real, right? How do you, how do you spot that? 
 

Marco Ciappelli: Yeah, 
 

Sean Martin: Ah, crazy stuff. Well, the best way is to start by, uh, connecting. Uh, with Cigar at Sector in October, October 23rd, 1015 is, uh, your session. 
 

It's a 45 minute briefing, which is cool. A lot, a lot of deep, uh, deep conversations there: Hacking Deepfake Image Detection System with White and Black, Black Box Attacks, uh, which means from outside the system and inside the system, for those who don't know what black and white box, uh, testing is. But anyway, Sagar, thanks so much for, uh, for taking the time to share with us today. 
 

And, uh, I'm sad [00:22:00] I didn't untangle my mind anymore. I think it may have tangled it a bit more with this conversation, but, uh, hopefully people join you in, in Toronto and, uh, hopefully I get to meet some folks there as well and keep the conversation going. Uh, appreciate it. So yeah, thank you. 
 

Sagar Bhure: Thank you. See you in Toronto. Bye. 
 

Sean Martin: Thanks everybody for listening. Please do stay tuned. There are a few other chats from SecTor, uh, already available, and I don't know, we might pull a couple more together, and hopefully we'll see everybody in Toronto in October. Uh, stay tuned for more chats on the road. Lots of events coming up. We're actually doing one later today on autonomous vehicles. 
 

So, uh, a break from security. This is also security, but, uh, also society. So, stay tuned. Thanks everybody. 
 

Marco Ciappelli: take care.