
When Your Cyber “Check Engine” Light Is On: IT Ops Is the New Cyber Continuous Monitoring

January 4th, 2022



The typical approach to cyber continuous monitoring, while well intended, has real shortcomings. As agencies transition to the Risk Management Framework (RMF), understanding these shortcomings and how to avoid them is more important than ever.

Here’s why: It’s far too common (and far too easy) to adopt a periodic monitoring approach that involves reviewing cybersecurity systems in a controlled, as-designed lab environment. In these checklist-based scenarios, teams run scans, call them good, update their documentation and move on.

This approach is not continuous, and it’s not really “monitoring” either. It’s periodic, and it reflects only how things should be in a lab, not how they actually are under real-world, as-maintained conditions.

It happens in government and private-sector settings all the time. But our adversaries don’t operate like that. They don’t limit their attacks to a documented technical baseline; they adapt as capabilities advance and as our environments evolve. The recent Log4j vulnerability is just the latest example of why we must get better at continually looking at IT infrastructure from an operational perspective.

Another reason the traditional continuous monitoring approach is problematic: Outside of these tests, most continuous monitoring is alert-based, like a “check engine” light in a car. Does the lack of an alert mean nothing is wrong? What if your alerting capability is broken? What if it’s not even on?

[M]ost continuous monitoring is alert-based, like a 'check engine' light in a car. Does the lack of an alert mean nothing is wrong? What if your alerting capability is broken? What if it’s not even on?

John Sahlin, Ph.D.

Director, Cyber Solutions – Defense

Extending the analogy, when a “check engine” light comes on, we’re supposed to pull over, call a tow truck, find the problem and remedy it immediately. But how often – and for how long – is that light ignored? Do we always immediately stop driving? Of course not. And this happens in cybersecurity too. Once you get an alert, something is already wrong. You’re responding too late. These alerts don’t provide a positive indication of system health, and they tell you nothing about the green systems that are still within normal operating conditions but trending toward yellow or red. They also fail to give us a picture of our risk exposure: how far can we drive with the “check engine” light on to support that critical mission?
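To make the “is the light even on?” question concrete, here is a minimal Python sketch of a positive health check. It assumes each monitoring component reports a periodic heartbeat; the `MonitorState` type, the `assess` function and the five-minute silence limit are hypothetical names and values chosen for illustration, not part of any GDIT product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class MonitorState:
    """Hypothetical record of what one monitoring component last told us."""
    name: str
    last_heartbeat: Optional[datetime]  # None means it has never reported
    alert_active: bool


def assess(state: MonitorState, now: datetime,
           max_silence: timedelta = timedelta(minutes=5)) -> str:
    """Classify a monitor without treating silence as good news."""
    # A monitor that has gone quiet cannot vouch for the system it watches.
    if state.last_heartbeat is None or now - state.last_heartbeat > max_silence:
        return "unknown"
    if state.alert_active:
        return "alerting"
    # Only a recent heartbeat with no alert counts as a positive indication.
    return "healthy"


now = datetime(2022, 1, 4, 12, 0, tzinfo=timezone.utc)
stale = MonitorState("ids-sensor", now - timedelta(hours=2), alert_active=False)
print(assess(stale, now))  # the sensor shows no alert, but it is not "healthy"
```

The design choice is the third state: “no alert” and “no data” are reported separately, so a broken or disabled alerting capability shows up as a finding rather than as a clean bill of health.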

So, what’s the alternative?

Cyber teams should leverage IT Ops to actively assess the true health of production IT systems. GDIT applies its IT Service Management and Digital Engineering capabilities to conduct testing, troubleshooting and analysis for customers against an “as maintained” system baseline. This is particularly helpful for distributed networks of networks, such as tactical military units or forward-deployed and remote teams. We have also developed an intelligent service management platform for enterprise IT Ops, called Atlas. Atlas can help teams move from pure IT Ops to using data to make better decisions.

In tandem, cyber teams must also develop a security health analysis framework that reviews IT Ops status and trends and links both to key RMF security controls. GDIT regularly applies our cybersecurity expertise to design resilience into solutions and, with Atlas, to feed data into security controls. This allows teams to find trends and early indicators of potential problems. With that information, they can evaluate overall risk and decide, as they operate, how to manage or tolerate those risks in the real world and in real time.
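One way to picture linking IT Ops status to security controls is a simple rollup, sketched below in Python. The metric names and the pairing of metrics to NIST SP 800-53 control identifiers are assumptions made for illustration only; they are not an authoritative mapping and do not describe how Atlas works.

```python
# Hypothetical pairing of IT Ops metrics with NIST SP 800-53 control IDs.
# The metric names and the pairings are illustrative assumptions.
METRIC_TO_CONTROLS = {
    "patch_compliance_pct": ["SI-2"],            # flaw remediation
    "config_drift_count": ["CM-6"],              # configuration settings
    "log_ingest_lag_minutes": ["AU-6", "SI-4"],  # audit review, system monitoring
}

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def control_health(metric_status: dict) -> dict:
    """Roll metric statuses up to controls, keeping the worst status per control."""
    worst = {}
    for metric, status in metric_status.items():
        for control in METRIC_TO_CONTROLS.get(metric, []):
            if control not in worst or SEVERITY[status] > SEVERITY[worst[control]]:
                worst[control] = status
    return worst


ops_snapshot = {
    "patch_compliance_pct": "green",
    "config_drift_count": "yellow",
    "log_ingest_lag_minutes": "red",
}
print(control_health(ops_snapshot))
```

Taking the worst status per control is a deliberately conservative choice: a control is only as healthy as the weakest operational signal feeding it.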

Lastly, cyber teams should use artificial intelligence and machine learning to provide predictive security health assessments. In this model, AI can automate tasks, and ML can learn from the systems that are still green, and from what keeps them that way, before they turn red. GDIT’s Mission Insights platform uses AI/ML to look at things more holistically. This approach allows teams to prevent issues rather than react to them, and that’s the real value of true continuous monitoring.
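As a rough illustration of catching a green system before it turns red, the Python sketch below fits a straight line to recent readings and estimates how many sampling intervals remain before a threshold is crossed. Ordinary least-squares regression is used here as a simple stand-in for the AI/ML the article describes; the `steps_until_threshold` function, the sample readings and the threshold are all hypothetical.

```python
from typing import Optional, Sequence


def steps_until_threshold(values: Sequence[float],
                          threshold: float) -> Optional[float]:
    """Estimate sampling intervals left before a rising metric hits threshold.

    Fits a least-squares line to evenly spaced readings. Returns None when
    there are too few readings or the metric is flat or falling.
    """
    n = len(values)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    if slope <= 0:
        return None  # not trending toward the threshold
    intercept = mean_y - slope * mean_x
    crossing = (threshold - intercept) / slope
    # Measure from the most recent reading; 0.0 means already at or past it.
    return max(0.0, crossing - (n - 1))


# Disk usage (%) sampled daily; still "green" but climbing toward an 80% limit.
readings = [50.0, 55.0, 60.0, 65.0]
print(steps_until_threshold(readings, threshold=80.0))
```

A real predictive assessment would need to account for noise, seasonality and non-linear behavior; the point of the sketch is only that trend data on healthy systems can surface a problem before any alert fires.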