
Episode 30

Teaching AI to Protect CUI


About This Episode

June 16, 2026 - 21 mins

Manufacturing sits at the heart of the Defense Industrial Base, yet many machine shops are only now grappling with what CMMC truly means for handling engineering data. In this episode, Paul Van Metre explains the practical challenges shops face, from managing drawings and CAD models that clearly qualify as CUI to updating decades-old processes never designed for cybersecurity.

We discuss why CUI is so widespread on the shop floor, how traditional “print-and-post” workflows increase exposure, and why moving to digital, paperless processes can significantly reduce both scope and cost. Paul highlights the mindset barriers many shops encounter, the operational pressures on small manufacturers, and the hesitation some have in accepting that CMMC applies to them at all.

We also explore:

  • The shift toward cloud-based ERP systems and FedRAMP equivalency
  • How modern ERP platforms eliminate local CUI storage, secure endpoints, and streamline compliance
  • The high cost and risk of old on-prem ERP systems
  • Strategies for lowering long-term CMMC costs
  • Lessons from hundreds of ERP deployments, including when to migrate historical data versus starting fresh

Listen to get a clear, realistic look at what CMMC means for machine shops, the operational decisions that matter most, and how manufacturers can modernize without overburdening the business.


Episode Transcript

Cole French:

AI isn’t just another shiny new tool; it’s rapidly becoming the lens through which organizations see, understand, and control their most sensitive data. As CUI moves across cloud apps, email threads, shared drives, and even AI engines themselves, the stakes get higher and the blind spots get bigger. In this episode, we explore how AI-powered data intelligence is reshaping how organizations discover, label, and protect critical information, and why guardrails, context, and visibility matter more than ever when your security program is drowning in noise.

Welcome to the Cyber Compliance & Beyond podcast, a Kratos podcast that brings clarity to compliance, helping you leverage compliance as a tool to drive your business’s ability to compete in any market. I’m your host, Cole French. Kratos is a leading cybersecurity compliance advisory and assessment organization, providing services to both government and commercial clients across varying sectors, including defense, space, satellite, financial services, and healthcare. Now, let’s get to today’s episode and help you move cybersecurity forward.

In today’s episode, recorded live at CUI-CON in Orlando, we explore what CMMC really means in an AI-driven world where sensitive information is scattered across cloud apps, email threads, shared drives, and even generative AI engines themselves. We start with one of the core challenges organizations face, simply knowing what CUI they have, where it lives, and who has access to it, a problem made exponentially harder by sprawling SaaS platforms, constant file sharing, and workers pasting information into AI tools without understanding the risk.

From there, we dig into how AI-powered data intelligence can finally bring clarity to environments buried in unclassified, mislabeled, or unknown sensitive data. We look at how modern models can learn an organization’s unique definition of CUI, classify data at scale, and reveal the true boundary of a compliant enclave, often shrinking scope dramatically for large enterprises that previously assumed everyone needed access. We then examine why enforcement tools like DLP, firewalls, and endpoint security have historically failed: noisy alerts, poorly tuned policies, and a lack of context. And we explore how feeding these tools higher-quality intelligence can transform them from background noise into actionable, precise controls that actually prevent risky behavior.

Finally, we look ahead to the new frontier of AI guardrails, from preventing sensitive data from being fed into AI engines, to controlling what AI tools can output back to users, to understanding how autonomous bots might learn new skills or traverse your environment if given too much access. We break down why organizations must be intentional with permissions, model inputs, and AI behavior before these tools become deeply embedded in operational workflows. Joining us for today’s conversation is Tom Evgey, field CISO of Secuvy, a leader in AI-driven data intelligence. Tom specializes in helping organizations discover, classify, and protect their most sensitive information, from CUI to personal data, and brings deep expertise in how AI can enhance both security and operational efficiency at scale.

Tom is a seasoned cybersecurity professional with over 20 years of experience in leadership, project management, negotiation, and conflict resolution, developing and driving sales enablement strategies, improving customer success and product adoption and increasing mind share and awareness. We hope you enjoy this episode.

Tom, just want to thank you for stopping by our booth here at CUI-CON in Orlando to chat about AI. It’s kind of what we’re going to get into today. We talked a little bit before hitting record on this. You guys, you work with AI solutions and AI has become this thing that’s all around us that we talk about all the time, and really talking about it, there’s a lot of scary things with it. But hopefully through our conversation today, we’ll dive into some of the really good use cases and good uses for it. So maybe you could just talk to us, get us started on how do you use AI? What are you seeing out there and what are the benefits you’re seeing with AI?

Tom Evgey:

Yeah, absolutely. Thanks for having me and glad to be here. So Secuvy is a data intelligence platform that is driven by AI engines for data discovery and classification and has a couple of different use cases, obviously around security but also operational efficiency. And really the main driver behind the platform is being able to identify and discover very sensitive information, CUI obviously being one of them, and getting that contextual awareness around data to be able to identify what’s really sensitive and help the security team build appropriate policies around that data.

Cole French:

So when folks are working with your particular solution, you mentioned the context, but how does the AI model or tools that you’re working with, how does that tool get the context to know, “Okay, this is sensitive information, this isn’t,” things like that?

Tom Evgey:

So we go through an onboarding phase with a customer where they provide some contextual data around what CUI data means to them. It obviously comes in many different flavors and colors. And so by training the models to identify what’s CUI to the customer, they’re able to very quickly learn what type of data is around the environment and be able to identify it and mark it. So we definitely need some, what we call marks, data marks, from the customer to be able to build those policies, but the AI models are, like I mentioned, able to quickly identify and learn what that type of data looks like.

Cole French:

I know there’s standard CUI markings, for instance, or even if you’re doing classifications for other types of information or sensitive data. Does that tool come baked in with some of those known across the board markings for different sensitive types of data, or is it required that I go feed it anything and everything that it would then go and identify and use context?

Tom Evgey:

Yeah. So it’s actually a combination of both. We have some basic policies that we start off with, but the context from the customer is extremely important because that’s how the models learn. But to your point, there are some baseline policies that we provide, and then the additional context comes from the customers.

Cole French:

So is it just sensitive data, or are you guys also looking at other elements potentially? Because for instance, CMMC, right? So yeah, I need to know where my CUI is, who’s accessing it, all that kind of stuff, which I assume it can identify that sensitive information. Does it also have the capability to say, “Hey, here’s all the CUI I found and here’s all the people that have access to it or interact with it,” things like that? Is that something that it’s able to do as well?

Tom Evgey:

Absolutely. I think the first three questions when you’re thinking about CMMC are understanding what type of data you do classify as CUI, where that data resides, and eventually who’s got access to it so you can build that policy around it. I think once you answer those three main questions, you’re well on your way, and probably further than most organizations are today. But yeah, understanding where that data is, who’s got access to it, and then taking that data and pushing it down to your enforcement points. So your DLPs, your firewalls, your intrusion detection, your endpoint security. A lot of organizations are Microsoft shops, so they use Purview to do the labeling and then the enforcement of that data.

Cole French:

And it’s interesting. I’ve had this conversation with some other folks around this decision about, “With CMMC, I can do this for the whole enterprise or my whole organization” or, “I can build an enclave where I store everything.” And a lot of times it’s thought of from the angle of what’s the easiest solution, what’s the easiest thing for the business, et cetera. And it’s much more difficult to say, “Maybe I need to make that decision based on what I actually have.” But I think a problem a lot of organizations face, especially larger organizations, is, “I don’t know how much I have. I’m not exactly sure where it is, and I’m not exactly sure who accesses it.” So they might come to it and say, “I have 1,000 users so let’s just do an enclave. We’ll put everybody in there or, just because we have that many, we’ll do the enterprise and apply this security baseline across the entire enterprise.”

But you might come in with a tool like yours and be able to say, “Okay, we’ll actually learn and find out what you guys have, who’s accessing what.” And then you might find, “Oh, in a 1,000-person organization, it’s only like 100 people within my organization that are actually accessing and interacting with and working with CUI.” And you can take that as a scoping point and say, “All right, maybe I’ll build an enclave for those 100 users.” And it can help you make a really well-informed decision, I think.
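The scoping idea described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AccessRecord` schema, the `is_cui` flag, and the sample paths are hypothetical, and a real discovery tool would supply the classification and access data in its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    """One observed access: which user touched which file (hypothetical schema)."""
    user: str
    path: str
    is_cui: bool  # classification result, assumed to come from a discovery tool


def enclave_scope(records):
    """Return the set of users who touched at least one file classified as CUI."""
    return {r.user for r in records if r.is_cui}


def scope_ratio(records):
    """Fraction of all observed users who would need to sit inside the enclave."""
    all_users = {r.user for r in records}
    if not all_users:
        return 0.0
    return len(enclave_scope(records)) / len(all_users)


# Made-up sample data for illustration only.
records = [
    AccessRecord("alice", "/eng/drawing-001.pdf", True),
    AccessRecord("bob", "/hr/handbook.pdf", False),
    AccessRecord("carol", "/eng/drawing-001.pdf", True),
    AccessRecord("dave", "/sales/brochure.pdf", False),
]
in_scope = enclave_scope(records)
```

With the sample data, only the two users who touched the CUI drawing land in scope, which is the kind of result that could support building an enclave for a subset of users rather than the whole enterprise.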

Tom Evgey:

Absolutely. It will help you define the scope and it will help you define the boundaries of your enclave. So where does that bubble need to end for that data, and also what data comes in and out of it. Think of the sheer volume of digital content that we’re sharing on a daily basis and how it traverses through your typical organization: the emails that we’re sending out, the Slack messages, all of the SaaS applications that we’re using, files being shared, content being downloaded, input from users across different AI engines and search engines. So just an incredible amount of data, and you are required to maintain some of that data or keep track of it.

And so when you’re able to identify it, not only are you able to classify it, but you’re also able to tag it. And then those policies are moving around with that data, those files that you tagged and have labels on them, now when they move around your environment, your organization, those policies, the labels come with it. And so you’re able to really follow that data through and understand how it traverses through your enclave.

Cole French:

That’s actually what I was going to ask next: is it interactive in nature? Let’s say I’m a user and I’m going to move this file. There may be ramifications to moving that file that I’m just not aware of, or nothing gets brought to my attention. But is there an interactive component where it’s like, “Hey, I’m a user and I move this file or I send this file,” and before doing it, it provides context: “Hey, if you do this, there are potential ramifications”?

Tom Evgey:

Yeah, absolutely. That’s when the enforcement points within your environment kick in. So when you have those labels and that data defined, and somebody tries to access that data, your DLP, your Purview, your endpoint will block it. Or when they’re trying to download it or they’re trying to send it out, those enforcement points are going to be the ones that take action and stop that from happening. The value that we bring is feeding intelligence to those tools, which have either been failing or have just been so noisy over the past decade or so, creating a lot of different alerts and not functioning properly, so we have this alert fatigue. Being able to reduce that noise, making your policies shorter and more concise, that’s really the value that we bring.
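As a rough illustration of how a label that travels with a file could drive an enforcement decision, here is a minimal Python sketch. It is not how Purview or any particular DLP product works; the `Action` values, the operation names, and the `CUI_POLICY` table are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


# Hypothetical policy table for CUI-labeled files: operation -> action.
CUI_POLICY = {
    "send_external": Action.BLOCK,
    "download": Action.WARN,
    "move_internal": Action.ALLOW,
}


def decide(label, operation):
    """Return the action an enforcement point would take for a labeled file.

    Files without a CUI label are allowed. CUI-labeled files follow the policy
    table, and any operation the table does not list is blocked (fail closed).
    """
    if label != "CUI":
        return Action.ALLOW
    return CUI_POLICY.get(operation, Action.BLOCK)
```

For example, `decide("CUI", "send_external")` returns `Action.BLOCK`, while `decide(None, "send_external")` returns `Action.ALLOW`. Failing closed on unlisted operations is one possible design choice; a short, explicit table like this is also one way to keep policies concise and alerts meaningful.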

Cole French:

Absolutely. Like you mentioned and touched on, the amount of digital information that’s out there is astounding in many organizations. So the ability to have sort of an interactive guide, if you will, that cuts down the noise and really gives you actual decision points is valuable. Because, yeah, I’ve worked in an operations capacity in the past, and at a certain point you get a certain number of alerts or a certain amount of information and you’re like, “No,” you just don’t even look at it anymore because it’s just noise. But if it’s something that’s actually tuned and gives you proper context, you see that it actually helps you make actual decisions and improve your enforcement mechanisms.

Tom Evgey:

Yeah, I think that’s what security programs have been missing. The last frontier, if you will, in a cybersecurity program is that intelligence to feed all of your security tools. In a typical organization with 500 to 1,000 users, the number of security tools, especially if they need to adhere to any sort of compliance framework like CMMC or PCI or ITAR or HIPAA, is just astounding. At least 11 tools, from firewall to intrusion detection to logging to encryption to endpoint security, and you’re ingesting all that information, but at the same time you need to tune them.

You need to make sure that the policies are in place, that you’re firing off on when there are actual events happening. At the end of the day, what we’re trying to do, everyone here, is we’re trying to protect our data. Whether it’s personal information, whether it’s CUI, whether it’s customer information, we’re all trying to achieve the same goal here. And if you know where that data is and you know who’s got access to it, all of your tools can be smarter and less noisy.

Cole French:

Absolutely. And so I’m curious, so we’re talking mostly about CUI to start here, but I’m sure you guys are looking at, you mentioned all sorts of security stacks, all the different tools. So I’m assuming that your capability can actually look at from a layer zero all the way up and can evaluate the context in between those different solutions and highlight and illuminate, “Hey, this could be a problem area,” or things like that. So yeah, talk a little bit about how maybe thinking more broadly than just data categorization or sensitive data, what other types of security benefits have you guys found?

Tom Evgey:

So from a security perspective, again, from looking at sensitive information and what’s been classified or unclassified, there’s gaping holes within some of those environments, but let me take you to maybe a different lens here and looking at through operational efficiency where some organizations have petabytes of data and they’re delivering content to their customers, whether it’s a streaming platform, whether it’s a company like Netflix, again, a streaming platform, that is either delivering videos or music or they’re delivering other services, they have petabytes of data. And part of using AI, generative AI is that it continuously tries to think what your next question will be so we can answer it.

And so it brings data to the front, closer to the user, so it’s going to be able to answer more efficiently. And so if we can identify stale data, if we can identify fresh data and data that needs to be moved into archive, there is massive ROI just from a storage perspective and in how content delivery is handled. So we’ve seen a lot of requests for data security posture management being used in a couple of different ways, and operational efficiency has definitely been one of them.
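The stale-versus-fresh distinction can be illustrated with a small Python sketch that partitions files by last-access time. The one-year threshold, the function name, and the input mapping are assumptions made for this example, not anything described in the conversation.

```python
from datetime import datetime, timedelta


def split_by_freshness(last_access, now, stale_after=timedelta(days=365)):
    """Partition paths into (fresh, stale) lists by last-access time.

    last_access maps a path to the datetime it was last read. Paths not read
    within stale_after of now are treated as candidates for archive storage.
    """
    fresh, stale = [], []
    for path, accessed in sorted(last_access.items()):
        if now - accessed > stale_after:
            stale.append(path)
        else:
            fresh.append(path)
    return fresh, stale
```

A caller would pass its own inventory, for example `split_by_freshness(inventory, datetime.now())`, and then review the stale list before moving anything to cheaper storage.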

Cole French:

I can really see a lot of value in that because I think that’s a great description that you just gave, that it’s always thinking of what’s the next question to ask. And I like that because I think when it comes to how we operate, sometimes I think we have a limited capacity to ask the right question, which asking the right question is what gets us to what the next thing we’re going to do is, and what that thing is and how it’s important. And in a lot of cases, I think there might be even questions we don’t know how to ask.

But you mentioned the stale data, the fresh data, all that kind of stuff. I think as humans, there’s an element we don’t even really think about with that kind of thing. It is an important thing to consider and to think about. So even just having that presence in your environment that can bring those questions to the surface so that you can actually wrestle with, “Should I do this? Should I do that? What do I do about this particular thing that I know maybe I didn’t know before?”

Tom Evgey:

Yeah, 100%. So we’re definitely seeing a lot of that. And again, we started with security, but our data is everywhere, especially at the dawn of AI, where organizations are racing toward that proficiency in leveraging AI. They’re just feeding it so much data, so much information, because training happens through data ingestion. And so with the data you’re providing the AI engines, do you know if it has any sort of personal information, any customer information, any identifiable information, or whether you might be violating policies? And also from a guardrail perspective, what information are your AI engines or bots providing? Do they need guardrails to prevent them from providing API keys, or from doing operational functions in the backend that might jeopardize keys or passwords or users?

Cole French:

So guardrails. So how are you guys approaching the problem of guardrails? Because I touched on it at the beginning. The AI is this scary thing, I guess, in some respects. Guardrails I think are important. So what are you guys seeing as far as guardrails or what kind of guardrails do you guys put in place when it comes to AI, things like that?

Tom Evgey:

So we actually have an MCP server that we’re going to be launching, and that sits between the user and your OpenAI, ChatGPT, and Perplexity. It looks at the data that you’re sending and, based on policies, will be able to block anything that’s CUI-related or confidential information. So we’re certainly going in that direction. But also at the backend, looking at the data, like I mentioned, that is being provided to the AI engines and blocking it before it starts sending all that information.

Cole French:

So you’re essentially configuring it so that what the user is putting into it doesn’t violate any sort of policy. And I’m assuming policies and things like that are things that you have to configure within the solution first, or is it similar to what you were talking about with sensitive information and the context and stuff like that? Is it able to learn over time from a policy standpoint or provide nuance and things like that as well?

Tom Evgey:

Yeah, it’s able to learn over time. And it’s the feedback coming from the AI engine that we’re blocking. As for what the user is going to cut and paste from their file system or from their desktop, which may contain CUI, we are specifically not there just yet, because that’s going to be more of a browser functionality. But the information being sent or provided by the AI engine, that is where we block.

Cole French:

Making sure the AI engine isn’t providing information that it shouldn’t be providing?

Tom Evgey:

Exactly. Exactly. Yeah.
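To make the output-side guardrail concrete, here is a deliberately crude Python sketch of a filter applied to an AI engine’s response before it reaches the user. It is not Secuvy’s implementation: the two patterns are assumptions for illustration (a banner-style CUI marking on its own line, and `key=value` style secrets), and a real guardrail would rely on trained classification rather than regular expressions.

```python
import re

# Assumed pattern: a banner line such as "CUI" or "CUI//SP-CTI" on its own line.
CUI_BANNER = re.compile(r"(?m)^\s*(?:CUI|CONTROLLED)(?://[A-Z0-9/-]+)?\s*$")

# Assumed pattern: secrets written as "api_key=...", "password: ...", and so on.
SECRET = re.compile(r"(?i)\b(api[_-]?key|password|secret)\b(\s*[:=]\s*)(\S+)")


def guard_output(text):
    """Return (blocked, safe_text) for a model response.

    A response containing a CUI banner line is withheld entirely. Otherwise,
    secret values are redacted and the rest of the text passes through.
    """
    if CUI_BANNER.search(text):
        return True, "[response withheld: contains CUI-marked content]"
    redacted = SECRET.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)
    return False, redacted
```

For example, `guard_output("Use api_key=abc123 to connect")` returns `(False, "Use api_key=[REDACTED] to connect")`, while a response whose first line is `CUI//SP-CTI` is withheld in full.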

Cole French:

Tom, again, I really appreciate you stopping by to chat with us. One final question here as we wrap up and just want to know what you think, and we’ve talked about some of this, but when it comes to AI and guardrails and things like that, what do you think is the most important thing organizations need to keep in mind as they’re wrestling with, “How do I use AI in my environment and how do I do it in a way that leverages all the benefits but also prevents some of the scary stuff that’s out there?”

Tom Evgey:

Yeah, that’s a great question that I think a lot of organizations are struggling with today. So there are two parts to that. The first one is what information are we providing AI engines like OpenAI or Anthropic or Perplexity, and be mindful of what information we’re sharing. With cloudbots and MCP servers and the LLMs that we’re building internally, we really need to be mindful of the access that we’re providing them and what can they do. Because they’re learning, they’re not just acting on a specific command that you give them, they’re trying to fulfill a more broad function. And what I mean by that is that when you ask them to deploy a VM, for example, within your environment or you’re asking them to run a task, they’re going to try in multiple different directions to achieve that task with the access that you provide them.

So for example, if you want your cloud bot or your chatbot to speak in a certain way, it’ll go out on the internet and try to grab packages that will allow it to change its voice or learn new skills. The bots today learn new skills. They’re able to update themselves so they can gain access to different environments, so they can understand how to access Slack, for example, or how to run through your shopping list or book travel for you. So I think being very mindful of the access that you give them and how it’s being used is super critical. And just be mindful that everything that you are giving your bot potentially can be leaked. So I would just caution organizations to be very mindful of that.

Cole French:

I think that’s great advice. I think that’s in line with advice we give even for non-AI solutions, which is to really think through this and plan. Plan it. Make sure you’re talking to the right people and not just going it alone. You definitely want to make sure you’re working with those who can help you make these decisions and make them in a wise way. So again, Tom, appreciate you stopping by.

Tom Evgey:

Absolutely. Thank you.

Cole French:

I really enjoyed this conversation. I think AI being such a pertinent, prominent topic out there, I think this will really be beneficial to our listeners. So again, I appreciate it.

Tom Evgey:

Yes. Thank you so much for having me and looking forward to hearing more of your podcast and some of the content you have to share.

Cole French:

Thank you for joining us on the Cyber Compliance & Beyond podcast. We want to hear from you. What unanswered questions would you like us to tackle? Is there a topic you’d like us to discuss or you just have some feedback for us? Let us know on LinkedIn and Twitter at Kratos Defense or by email at ccbeyond@kratosdefense.com. We hope you’ll join us again for our next episode, and until then keep building security into the fabric of what you do.
