Current Status of ISCM and Supporting Hardware
The current ARL ISCM solution, built atop the BDP, comprises five individual widget/analytic capabilities that work in conjunction to provide a dynamic cyber hunting capability and enhanced decision support via risk categorization and prioritization. As illustrated in Fig. 1 below, those capabilities are asset management, antivirus compliance, network management, vulnerability management, and risk management. The first four serve as the building blocks that generate the risk picture in the fifth.
The ISCM capabilities are based upon attributes and events primarily correlated from four cyber data sources:
- Vulnerability scanning reports collected via the DISA Assured Compliance Assessment Solution (ACAS) [12]
- Host Based Security System (HBSS) reports from tools such as Anti-Virus/Anti-Spyware, collected via Intel/McAfee Enterprise Security products [13]
- Network flow information from ARL’s Interrogator Intrusion Detection System [14]
- National Vulnerability Database [15]
Fig. 1 ISCM capabilities summary
Producing the ISCM capability required building multiple instances of the BDP to support the various stages of the software development lifecycle. We acquired hardware to support four medium-size (~30 node) clusters. The clusters served the following roles: a testing and evaluation environment; an external collaboration environment for cyber researchers from academia and other government entities; a pre-production environment for capabilities nearing release; and a production environment for cyber security analysts at ARL and other stakeholders. The hardware specifications for the various cluster environments are outlined in Fig. 2 below:
Fig. 2 ARL ISCM cluster hardware specifications
The dramatic increase in capability between production 1.0 and production 2.0 was primarily driven by a desire for enhanced historical and trending analysis, which required lengthening the window of time before data was deleted. Furthermore, we had experienced I/O bottlenecks with production 1.0 and determined that more spindles per node would alleviate that problem. Finally, we wanted to leverage in-memory computation engines like Apache Spark [16], so we nearly tripled the available random access memory per node. With the hardware in place, we were ready to tackle the ISCM data modeling and knowledge engineering tasks.
Data modeling
Figure 3 below illustrates the heart of the ISCM data model: the entity construct. At the core of the model is the host element, defined as a computer, printer, or other managed network device (e.g., switch, router, or firewall).
Fig. 3 ISCM entity model
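As a concrete rendering of Fig. 3, the sketch below expresses the entity construct as plain Java types. The field names (id, type, attributes, relationships) are illustrative assumptions; the actual ISCM schema is not reproduced here.

```java
// A minimal sketch of the ISCM entity construct as plain Java types.
// Field names are illustrative assumptions, not the published schema.
import java.util.List;
import java.util.Map;

public final class Entity {
    public final String id;         // e.g., a host ID derived from source data
    public final String type;       // e.g., "host", "software"
    public final long timestamp;    // ingest time; the latest version wins on scan
    public final Map<String, String> attributes;   // e.g., os -> "Windows 10"
    public final List<Relationship> relationships; // edges to other entities

    public Entity(String id, String type, long timestamp,
                  Map<String, String> attributes,
                  List<Relationship> relationships) {
        this.id = id;
        this.type = type;
        this.timestamp = timestamp;
        this.attributes = attributes;
        this.relationships = relationships;
    }
}

final class Relationship {
    public final String targetId; // ID of the related entity (it may not exist yet)
    public final String label;    // e.g., "installed-on"

    Relationship(String targetId, String label) {
        this.targetId = targetId;
        this.label = label;
    }
}
```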
ISCM leverages Storm's stream processing capabilities to create entities and store them in an entity-modeling construct in Accumulo. Each line of input or raw data is processed for any relevant entity data, along with the attributes and relationships that make up that entity. Entities, attributes, and relationships are created without first querying the entity model. This may seem counterintuitive because duplicate entities may already exist in Accumulo; however, duplicates are expected, and if an entity is repeated, only the entity with the most recent timestamp is returned during a table scan. Furthermore, only three versions of the same entity identifier are preserved after the database has gone through compaction (been written from memory to disk). This shifts the responsibility for handling excessive duplication onto Accumulo, allowing Storm to ingest and parse the data as quickly as possible. Finally, storing the state of the entity object (including some duplicates) over time provides the basis for historical and trending analysis in ISCM; currently, ISCM keeps three months' worth of entity state.
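One way to approximate this retention behavior with Accumulo's built-in iterators is sketched below, assuming the Accumulo 2.x client API: a versioning iterator bounds each key to three versions at compaction time, and an age-off filter drops entries older than roughly three months. The table name "entities", the client.properties path, and the iterator priorities are assumptions for illustration. Pushing retention into Accumulo's compaction path is what lets the Storm topology stay stateless and write at full speed.

```java
// A sketch (Accumulo 2.x client API) of bounding duplication and retention
// with built-in iterators, approximating the behavior described above.
// Table name, properties path, and priorities are illustrative assumptions.
import java.util.EnumSet;
import org.apache.accumulo.core.client.Accumulo;
import org.apache.accumulo.core.client.AccumuloClient;
import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.iterators.IteratorUtil.IteratorScope;
import org.apache.accumulo.core.iterators.user.AgeOffFilter;
import org.apache.accumulo.core.iterators.user.VersioningIterator;

public class EntityTableRetention {
    public static void main(String[] args) throws Exception {
        try (AccumuloClient client =
                 Accumulo.newClient().from("client.properties").build()) {
            // New tables ship with a default versioning iterator named "vers"
            // that keeps one version; replace it with one that keeps three.
            client.tableOperations().removeIterator(
                "entities", "vers", EnumSet.allOf(IteratorScope.class));
            IteratorSetting vers =
                new IteratorSetting(20, "vers", VersioningIterator.class);
            VersioningIterator.setMaxVersions(vers, 3); // 3 versions survive compaction
            client.tableOperations().attachIterator("entities", vers);

            // Age entries off after ~3 months to bound historical state.
            IteratorSetting ageoff =
                new IteratorSetting(25, "ageoff", AgeOffFilter.class);
            AgeOffFilter.setTTL(ageoff, 90L * 24 * 60 * 60 * 1000); // TTL in ms
            client.tableOperations().attachIterator("entities", ageoff);
        }
    }
}
```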
Some data sources, such as ACAS vulnerability assessment reports or HBSS system properties, allow for the creation of two entities and a bidirectional relationship between them. This is the simple case: both entities are available when the relationship is created, and each is known to explicitly exist.
Other data sources, such as the HBSS asset configuration compliance module (ACCM), create only one entity plus a relationship to a host that is presumed to be reported by another feed type (system properties, in this example). Although no host entity is present, we can derive the host ID from the given data, so the software entity is created with all of its normal attributes and the relationship is added. Furthermore, we can create the opposite direction of the relationship by creating an entity representing that host and adding the relationship to it. This works even when the ACCM data is processed before the system properties data, because the entity model allows for hanging relationships (i.e., relationships whose host is not actually stored yet). A query that encounters such a relationship reveals the ID of the host and some of its software relationships, even though the full host information is not yet available. A minimal sketch of this write pattern appears below.
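The sketch assumes the Accumulo 2.x client and an illustrative row/column layout (row = entity ID, "attr" and "rel:*" column families); none of these names come from the published ISCM schema. Processing an ACCM record writes the software entity, its edge to the host, and the reverse edge on the host row, whether or not the host entity has been ingested yet.

```java
// A hedged sketch of the "hanging relationship" write pattern described
// above. Table name, column layout, and method names are assumptions.
import java.nio.charset.StandardCharsets;
import org.apache.accumulo.core.client.AccumuloClient;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;

public class AccmIngest {
    /** Writes a software entity plus both directions of its host relationship. */
    public static void writeSoftwareRecord(AccumuloClient client, String softwareId,
            String softwareName, String hostId) throws Exception {
        try (BatchWriter writer = client.createBatchWriter("entities")) {
            // Software entity with its normal attributes and an edge to the host.
            Mutation software = new Mutation(softwareId);
            software.put("attr", "name",
                new Value(softwareName.getBytes(StandardCharsets.UTF_8)));
            software.put("rel:installed-on", hostId, new Value(new byte[0]));

            // Reverse edge on the host row. The host entity may not have been
            // ingested yet; writing the edge anyway leaves a "hanging"
            // relationship that a later system-properties record completes.
            Mutation host = new Mutation(hostId);
            host.put("rel:has-software", softwareId, new Value(new byte[0]));

            writer.addMutation(software);
            writer.addMutation(host);
        }
    }
}
```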
In March of 2015, when the asset management widget (see Fig. 1) was completed, we noticed that when querying the entity model from the web frontend, certain types of queries took several minutes to return results. We realized this was because Accumulo was not designed to support query-focused datasets. Operations such as order by, group by, and count could not be accomplished without pre-computing the queries via a Hadoop MapReduce [17] job. This was unacceptable because our vision was to allow adopters of ISCM to ask any questions of the data that interested them. Questions such as:
- How many assets are in a given enclave?
- Which assets are in a specific VLAN?
- Which assets have outdated anti-virus signatures?
- Which assets have more than 200 vulnerabilities?
- Which assets have outdated scan results?
- Which assets have communicated with a foreign country recently?
As a result of this limitation in Accumulo, we further refined our initial minimum bar for success to include the following requirement:
- Low-latency queries – As a query-focused capability, ISCM needs to provide rapid responses to simple and compound queries from both end users and statistical/analytic processes.
The ARL team petitioned the DISA BDP change advisory board to incorporate Elasticsearch [18] into the BDP baseline. Elasticsearch enhanced the user experience, allowing users to issue dynamic queries and, in many cases, receive sub-second responses.
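As a hedged illustration of the kind of query Elasticsearch made fast, the sketch below issues one of the questions listed above (assets with more than 200 vulnerabilities) through the Elasticsearch low-level Java REST client. The index name "assets", the field name "vulnerability_count", and the host/port are assumptions; ISCM's actual index mappings are not shown in this paper.

```java
// A sketch of a low-latency range query against Elasticsearch. The index
// name, field name, and endpoint are illustrative assumptions.
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class AssetQuery {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(
                new HttpHost("localhost", 9200, "http")).build()) {
            // Range query: assets whose vulnerability count exceeds 200.
            Request request = new Request("GET", "/assets/_search");
            request.setJsonEntity(
                "{\"query\":{\"range\":{\"vulnerability_count\":{\"gt\":200}}}}");
            Response response = client.performRequest(request);
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}
```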
At this point, the BDP architecture satisfied all of our requirements and the remainder of 2015 was spent developing and integrating the three remaining widgets: antivirus compliance, network management and vulnerability management. In November 2015, we began our investigation of how to illustrate the risk picture.