The ability of commanders to know and understand an organizational attack surface, its vulnerabilities, and its associated risks is a fundamental aspect of command decision-making. In the cyberspace domain, it is paramount to maintain ongoing monitoring sufficient to ensure and assure the effectiveness of security controls related to systems, networks, and cyberspace, by assessing security control implementation and organizational security status in accordance with organizational risk tolerance, and within a reporting structure designed to enable real-time, data-driven risk management decisions.
The National Institute of Standards and Technology (NIST) Special Publication (SP) 800-137, Information Security Continuous Monitoring (ISCM) for Federal Information Systems and Organizations, defines ISCM as “maintaining ongoing awareness of information security, vulnerabilities, and threats to support organizational risk management decisions.”
The Risk Management Framework (RMF) is the unified information security framework for the entire federal government. According to the Office of Management and Budget (OMB), by institutionalizing the RMF, “agencies can improve the effectiveness of the safeguards and countermeasures protecting federal information and information systems in order to keep pace with the dynamic threat landscape.”[1] The RMF, developed by NIST, describes a disciplined and structured process that integrates information security and risk management activities into the system development life cycle. ISCM is a critical part of the RMF process. As such, a foundational component of the ISCM strategy is the need not only to focus on monitoring, but also to support risk management decisions across the multiple mission areas of operations affected by the cyberspace domain.
To assist with the operationalization of ISCM across the entire federal government, the OMB released Memorandum M-14-03, Enhancing the Security of Federal Information and Information Systems. The memorandum provides guidance for implementing ISCM across the federal government and for managing information security risk on a continuous basis. In response to M-14-03, the U.S. Army Research Laboratory (ARL) team initiated a program to develop risk scoring at the scale and complexity needed for the DoD. This project, named Information Security Continuous Monitoring (ISCM), is intended to provide a capability that not only identifies a system’s risk, but also allows that risk to be adjusted dynamically based on threat or mission need. This project required a novel approach to risk scoring, as well as a platform that could ingest and visualize the various data types needed, all while fostering collaboration with our federal, academic, and industry partners.
This article discusses the history of ISCM at ARL; the approach and current status of ARL’s ISCM capability; the data, entity creation, and risk scoring processes and models; and the next steps and way ahead for ARL’s ISCM capability.
History of ARL ISCM and Initial Approach
In 2011, at the request of the DoD, the ARL team began investigating how to enhance the situational awareness provided by the cyber security tools used in the defense of transactions on DoD information networks. This was the DoD’s first major thrust into continuous monitoring based on the success of the State Department’s efforts [2]. The ARL team approached ISCM with the primary goal of developing a capability that could continuously correlate and aggregate disparately formatted events generated by intrusion detection, vulnerability assessment, and host-based security tools. At the outset, the minimum bar for success required that ISCM satisfy the following:
- Enhanced cyber situational awareness – The ability to ingest, aggregate, correlate and enrich cyber data from a variety of sources and provide an interface or dashboard view that enables commanders and mission owners to make higher confidence decisions.
- Continuous monitoring – The ability to transform the historically static security control assessment and authorization process into an integral part of a dynamic enterprise-wide risk management process, providing the Army with an ongoing, near-real-time cyber defense awareness and asset assessment capability.
- Technical transfer – The ability for ISCM to be packaged and transitioned to other organizations with a similar cyber security mission and data sets. In particular, it is important that ISCM be transferable with minimal software refactoring and systems reengineering.
Building on the success of the State Department’s continuous monitoring program, in 2011 its source code was transitioned to the Defense Information Systems Agency (DISA) and the National Security Agency (NSA); it was further developed and transitioned to ARL in 2012. This initial ISCM prototype was named JIGSAW and was built atop Splunk [3]. JIGSAW was a collaborative effort between ARL and the DoD High Performance Computing Modernization Program and consisted of a Red Hat Enterprise Linux server running Splunk on 12 CPU cores, 48GB of RAM, and 3TB of disk storage. The JIGSAW pilot ran for the majority of 2012 and consisted of a variety of experiments ingesting and exploring approximately 50 gigabytes per day of vulnerability assessment data, host-based security logs, intrusion detection events, and network flow data from the DoD Defense Research and Engineering Network.
JIGSAW provided good insights into each individual data set, but its correlation and aggregation capabilities were not robust enough for our long-term vision. In JIGSAW there was no entity construct; every stored row was an event. We could perform aggregations such as: for all data sources indexed, show all results per hostname. However, if we then attempted to associate a risk value with the hostname, it was not possible; the aggregation per hostname existed only in the context of the original query results. We could potentially have exported the results to a relational database, established the hostname-risk association in a separate table, and then exported the hostname/risk object back to JIGSAW as a new event. However, we were not willing to contend with the complexity of a transaction that required layering database upon database in order to shore up the deficiencies of each. Furthermore, the cost associated with scaling to the 1TB+ daily volume of data we expected to ingest and index made the JIGSAW solution unsuitable for our use case.
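To make the entity-construct gap concrete, the minimal Python sketch below (hypothetical field names and a placeholder scoring rule, not JIGSAW code) contrasts a query-time aggregation, whose per-hostname rollup exists only in the result set, with a persistent entity object that can carry a risk value updated as new events arrive.

```python
# Minimal, hypothetical sketch (not JIGSAW code): why an event-only store
# cannot carry a persistent risk value per host.
from collections import defaultdict

# Event-oriented view: every stored row is an event.
events = [
    {"hostname": "host-a", "source": "ids", "severity": 3},
    {"hostname": "host-a", "source": "vuln_scan", "severity": 7},
    {"hostname": "host-b", "source": "hbss", "severity": 2},
]

# Query-time aggregation: the per-hostname rollup exists only in this result.
counts = defaultdict(int)
for e in events:
    counts[e["hostname"]] += 1
print(dict(counts))  # {'host-a': 2, 'host-b': 1}

# Entity construct: one persistent object per host that can carry a risk
# score and be updated dynamically as new events arrive.
entities = {}
for e in events:
    ent = entities.setdefault(e["hostname"], {"hostname": e["hostname"], "risk": 0})
    ent["risk"] += e["severity"]  # placeholder scoring rule
print(entities["host-a"])  # {'hostname': 'host-a', 'risk': 10}
```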
Splunk was removed from the ISCM solution and replaced with a relational database backend, PostgreSQL [4]. A Python [5] frontend with a variety of Python libraries was adopted for visualizations, and custom Python scripts were developed to perform data parsing, correlation, and aggregation. This new configuration addressed the entity construct and cost concerns associated with Splunk; however, the scalability issues persisted. In relational databases, clustering and horizontal scaling exist to support high availability rather than scalability or sharding of large data sets. Achieving server parallelism, higher data ingest rates, and the storage and processing of terabytes of data proved difficult. Furthermore, historical and trending analysis over several months’ worth of cyber data was next to impossible, again due to these scalability limitations.
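As an illustration of the kind of custom script this configuration relied on, the following sketch (hypothetical schema, file name, and field names) parses a scanner export and loads it into PostgreSQL via psycopg2. It is a minimal example under those assumptions, not ARL’s actual code.

```python
# Illustrative sketch only (hypothetical schema and file format): parsing a
# vulnerability-scan export and loading it into a PostgreSQL events table.
import csv
import psycopg2

conn = psycopg2.connect(dbname="iscm", user="iscm", host="localhost")
cur = conn.cursor()
cur.execute(
    """CREATE TABLE IF NOT EXISTS events (
           hostname TEXT, source TEXT, severity INTEGER, observed TIMESTAMP)"""
)

with open("vuln_scan.csv") as fh:  # assumed CSV export from a scanner
    for row in csv.DictReader(fh):
        cur.execute(
            "INSERT INTO events (hostname, source, severity, observed) "
            "VALUES (%s, %s, %s, %s)",
            (row["hostname"], "vuln_scan", int(row["severity"]), row["timestamp"]),
        )

conn.commit()
cur.close()
conn.close()
```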
There were many lessons learned from the ISCM prototypes, and they helped refine our minimum bar for success to include the following requirement:
- Scalable architecture – ISCM needed a scalable architecture that could quickly be augmented with minimal impact to uptime and could support the storage and processing of large data sets at the petabyte scale.
ARL needed to adopt an architecture that could easily scale horizontally and support several months (100TB+) of historical and trending data. Additionally, we needed to consistently ingest and process terabytes of semi-structured data in parallel. With JIGSAW, flow data could only be stored for a couple of weeks before we had to delete older data to preserve disk space. This led us to investigate distributed computing and NoSQL architectures, specifically the Apache Hadoop [6] ecosystem. Several entities in the DoD had already begun to engineer big data frameworks using Hadoop to address their mission needs. The ARL team made technology transfer requests in order to build upon existing source code and lessons learned.
ARL evaluated two distributed computation frameworks. The first came from the U.S. Army Intelligence & Security Command and is called Red Disk [7]. The second came from DISA and is called the Big Data Platform (BDP) [8][9]. Many of the components of Red Disk and the BDP are similar. At their core, both are Hadoop clusters providing a distributed computing framework, with software components capable of ingesting, storing, processing, and visualizing large volumes of data from an assortment of information sources. Both environments are composed of open source and unclassified components, and both leverage technology transfer from other DoD entities. During our evaluations, we compared the streaming ingest capabilities of each framework for ingesting cyber events via topology constructs (graphs of computations that contain data processing logic) in Apache Storm [10]. Red Disk experienced performance issues when attempts were made to ingest ARL’s sensor data into Apache Accumulo [11]; its custom data processing framework and data-modeling construct averaged less than 1MB/s ingest rate. The BDP performed substantially better, with ingest rates that averaged 50MB/s and peak rates near 100MB/s.
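For readers unfamiliar with the topology construct, the sketch below conveys the general idea, a graph of processing steps wired together in the spirit of Storm’s spout/bolt model, expressed as plain Python generators. It is conceptual only and does not use the Apache Storm API; the step names and the enrichment value are invented for illustration.

```python
# Conceptual sketch of a topology: a graph of processing steps, in the spirit
# of Storm's spout/bolt model. Illustrative only; not the Apache Storm API.
import json

def sensor_spout():
    """Emit raw sensor events (stubbed here with a fixed list)."""
    yield from ['{"hostname": "host-a", "sig": "scan"}',
                '{"hostname": "host-b", "sig": "malware"}']

def parse_bolt(stream):
    """Parse raw lines into structured records."""
    for line in stream:
        yield json.loads(line)

def enrich_bolt(stream):
    """Attach additional context before records are written to storage."""
    for rec in stream:
        rec["site"] = "DREN"  # placeholder enrichment
        yield rec

# Wire the graph: spout -> parse -> enrich -> (storage step omitted).
for record in enrich_bolt(parse_bolt(sensor_spout())):
    print(record)
```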
In the latter part of 2014, the ARL team adopted the BDP to build our ISCM solution as well as future cyber analytic capabilities. Based upon our evaluations, we determined that doing so would substantially reduce the amount of time the ARL team had to spend architecting a custom Hadoop solution for the ingest, storage, and processing of our cyber data sources. Additionally, adopting the BDP helped to satisfy the requirement for technical transfer and enables a federated approach toward the creation of cyber analytic capabilities among other entities using the BDP. With the BDP acting as the core framework for data ingest, storage, and processing, cyber security researchers, scientists, and engineers can focus less on systems engineering and systems integration tasks and more on data modeling and the application of statistical, algorithmic, and analytic methods to the data in order to glean deeper insight. In the next section, we discuss the current status of ISCM and its supporting hardware.