Operational Technology: Infrastructural Awareness

Operational Technology. It is everywhere. Did you know that OT powers our most critical infrastructure? Electricity grids, water purification plants, airports, transportation, and the food and healthcare industries all depend on this domain. Talk of IT and OT convergence is booming, and with it, the need for improved cybersecurity measures grows.

This blog was written in the context of Adam Drong's graduate research.

Constraints on traditional penetration testing within Operational Technology

During my time at Warpnet B.V. I heard stories of customers wanting to secure their OT environment, and with it, their operations. For an IT cybersecurity company, OT cybersecurity is a new domain, with new challenges and constraints that offer different attack paths and vectors. Wanting a broader scope for their penetration tests, these customers asked if their OT environments could be included in their annual tests. Penetration testing OT, however, comes with added risks. According to recent articles, outdated hardware, outdated protocols, and a lower priority on cybersecurity are still the standard in many of these companies. An added crux we've encountered is that the hardware used in these environments is often not owned by the companies themselves, but by the original manufacturers. As a result, any penetration testing involving this hardware requires the manufacturers' prior approval before testing can commence.

Next to these customer requests, research has been done on penetration testing of OT and its devices. Studies such as those by Coffey and Pospisil have focused on uncovering the fragility of OT devices. Both articles describe experiments that highlight certain weaknesses of these devices, and both conclude that PLCs are very sensitive to active scanning methodologies, with effects ranging from small delays to complete, unrecoverable crashes. Pospisil's authors conclude their research by stating that active scanning methods should only be used in controlled environments, where device crashes do not gravely impact operations. Coffey's researchers raise the question of which aspect of active scanning disrupts device processes, highlighting that the cause behind the disruptions is still unknown.

Outdated hardware, manufacturer-imposed limits on what can and cannot be tested, and the unknown, fragile architecture within devices: these constraints are the tip of the iceberg when it comes to the challenges within OT. How do we make sure that, with these constraints in mind, we can help companies in this domain stay on top of their security? Or, at least, acquire a better understanding of their infrastructure and the dangers it poses?

The old and the new

So, how do we assist customers with an OT environment without being limited by these constraints? Rather than spending all our time and effort solving every one of these constraints, which would stall our progress and our general ability to assist customers, we looked for a solution that addresses the problem in a quicker, more effective way. Passive scanning methodologies within the field of OT have been expanding over the last several years; methodologies for device identification, anomaly detection, and honeypots are some interesting topics, to name a few. To effectively answer our customers' questions, and with that acquire a broader perspective on the do's and don'ts in OT, we must start from a top-down approach. Looking at how penetration testing of an IT environment is done from a higher perspective, you can separate the process into phases. You start with reconnaissance: finding your entry points and surfacing weaknesses that help you traverse laterally across the network. Next, you try to exploit known vulnerabilities or attack vectors in an attempt to gain access to potential targets. The approach in an OT environment can be structured in much the same way: reconnaissance, gaining access, and so on. For our research, our focus starts at the first phase: reconnaissance.

Passive Identification

If the fragility of OT devices, the unknown cause behind it, and the issues with ownership limit us in doing our work, we should utilize non-intrusive techniques. One of these techniques is passive fingerprinting. By collecting device-specific characteristics, creating a fingerprint, and building a database with unique fingerprints for different devices, I checked whether it is feasible to consistently determine a device's identity. I checked this across several (self-)curated datasets. Based on these datasets, I was able to successfully differentiate two devices from the same vendor as well as two devices from different vendors. Based on an F1 score calculation, I identified the devices with an average accuracy of 87% across all datasets.
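As a refresher on the metric mentioned above, the F1 score is the harmonic mean of precision and recall. The sketch below shows the calculation; the counts used in the example are invented for illustration, not the study's actual data.

```python
# Minimal sketch: computing an F1 score for one device class from
# true positives (tp), false positives (fp), and false negatives (fn).
# The counts below are made up for illustration only.

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)   # of all identifications made, how many were right
    recall = tp / (tp + fn)      # of all devices present, how many were found
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(87, 10, 16), 2))  # → 0.87
```

Averaging this score over every device class in every dataset gives a single accuracy figure like the 87% reported above.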

TCP/IP: The approach

Now, how did I do this? Let us take a look at how signatures are created and at the protocol-specific identification techniques used in the scanner.

Fingerprinting: creating signatures

The creation of signatures is based on an older technique invented by the creator of the passive-fingerprinting tool p0f. This tool saw its full release around 2012 (the idea for it dates back to around the year 2000!) and is used to this day to identify the operating systems of clients on a network. It works by looking at TCP/IP traffic, specifically at characteristics that are consistent enough to be used in the constitution of a signature. I took inspiration from this and utilized some of p0f's techniques in my scanner. To name a few, take a look at the example signature below:

"00:1D:9C.v4.64.256.44818.*.*.df.id+.pushf+"

Each characteristic is period-separated. In this example we can see (from left to right) the MAC address prefix, the IP packet version of the client, the TTL (Time-To-Live), the IP Identification Step Size, and several other characteristics that are consistent enough per client to be used as a determining factor. A characteristic like the IP Identification Step Size is especially interesting because it utilizes a client's TCP stream: the signature uses the median value found during runtime. Characteristics like these are great because they are so specific per type of device.
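To make the layout concrete, here is a small sketch that splits the example signature into its period-separated characteristics. Only the first few field names are given in the text above; the remaining labels ("char_4" and so on) are placeholders of my own, not the scanner's real field names.

```python
# Sketch: splitting an example fingerprint signature into named fields.
# Labels past "ttl" are illustrative placeholders, not the real layout.

SIG = "00:1D:9C.v4.64.256.44818.*.*.df.id+.pushf+"

def parse_signature(sig: str) -> dict:
    labels = ["mac_prefix", "ip_version", "ttl",
              "char_4", "src_port",      # 256 and 44818 in this example
              "char_6", "char_7",        # wildcarded (*) in this example
              "flag_df", "flag_id", "flag_pushf"]
    return dict(zip(labels, sig.split(".")))

sig = parse_signature(SIG)
print(sig["mac_prefix"])  # → 00:1D:9C
print(sig["char_6"])      # → *
```

Note that the MAC prefix survives the split intact because it uses colons internally, while the characteristics themselves are separated by periods.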

Fingerprinting: detecting signatures

That's all fun and interesting of course, but I hear you thinking: "How do we detect these characteristics and correctly use them in the build-up of a signature?" 'Correctly' is an interesting term here. In machine learning there are several techniques for measuring the distance between variables. In my case, however, different things matter when detecting a signature. While a signature is being built, every value of each characteristic undergoes a consistency check. For every client on the network, several values per characteristic are recorded during runtime. What is essential for the build-up of a signature is knowing whether the value of a characteristic is consistent enough to be used. A value found in a network packet can only be used in the signature if it is consistent enough; if it turns out to be too inconsistent, we use a wildcard (*) for that value instead.

Take a look at the following example, which uses TCP source port number 44818. From a client on the network we know that, since epoch, the recorded values for the TCP source port are 44818, 2000, 1234, and 9876. The algorithm has been keeping track of the number of times each of these values has been seen attached to this client.

"port": {
    "values": {
        "44818": {
            "consistency": 0.53,
            "occurrence": 100
        },
        "2000": {
            "consistency": 0.24,
            "occurrence": 45
        },
        "1234": {
            "consistency": 0.18,
            "occurrence": 35
        },
        "9876": {
            "consistency": 0.05,
            "occurrence": 10
        }
    }
},

In the snippet above we can see the number of occurrences per value of the TCP source port characteristic, alongside the consistency rate per value. This consistency rate is used to determine whether a value surpasses a certain threshold. The formula for this check is very straightforward: Count(y) >= (n / 2). Here, y refers to the value the threshold check is being performed for, and n stands for the total number of occurrences for the TCP source port. The threshold is reached when the current value y accounts for at least 50% of the sum of all occurrences. So, in the example above, port number 44818 has a consistency rate of 0.53. This is more than 50% of the sum of occurrences for this characteristic, which means that port number 44818 is allowed to be used in the build-up of a signature.

EtherNet/IP: The approach

EtherNet/IP, DeviceNet, and ControlNet are among the most widely used industrial protocols, together making up around 30% of the total market. Every protocol has its own perks and use cases, but they all use the same messaging interface: CIP. CIP stands for Common Industrial Protocol and defines extensive rules for the structure of messages and the way clients should act during a conversation. These rules are what makes CIP interesting. Now, what if some of these rules are vendor-specific? Or what if some rules tell us the identity of a device?

CIP

The CIP [documentation](https://ia902308.us.archive.org/4/items/ovda_cip_docs/THE%20CIP%20NETWORKS%20LIBRARY%20Volume%201%20Common%20Industrial%20Protocol%20V3.3.pdf) offers several rules and tells us how clients should act if they want to be able to converse on the network. The build-up of a CIP packet is as follows:

  • Service ID: This is a hexadecimal value, unique per CIP service. The service field is a single byte: its most significant bit gives the direction of the packet, meaning the packet is either a response or a request, while the remaining bits define the type of service being requested or responded to. The example below shows a Wireshark capture of the CIP part of a packet, specifically the Service ID. Here we can clearly see that this packet is a request and that it is requesting the service Get_Attributes_Single.
  • Request-Path: The Request-Path defines three segments: class, instance, and attribute. These segments give further information about a request. For example, in the screenshot below, we can see that the attribute segment states attribute number 5, which refers to the Status attribute of a client's Identity object.
  • Data: The data part is only present when the service code's response bit is set. It contains the requested data for that type of response.
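The Service ID structure described above can be sketched as a small decoder. Per the CIP specification, the response bit is the most significant bit (0x80) of the service byte, and the remaining bits carry the service code; the service names here follow the ones used in this article.

```python
# Sketch: decoding a CIP service byte. Bit 7 (0x80) marks a response;
# the remaining bits are the service code.

SERVICES = {
    0x01: "Get_Attributes_All",
    0x0E: "Get_Attributes_Single",
}

def decode_service(byte: int) -> tuple:
    direction = "response" if byte & 0x80 else "request"
    code = byte & 0x7F
    return direction, SERVICES.get(code, f"unknown (0x{code:02X})")

print(decode_service(0x81))  # → ('response', 'Get_Attributes_All')
print(decode_service(0x0E))  # → ('request', 'Get_Attributes_Single')
```

This is exactly the direction/service combination a passive scanner can filter on without ever sending a packet itself.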

Detecting CIP services

Now that we know how CIP messages are built up, we can look at the part of the CIP architecture that is interesting for our use case. In the previous section I explained the different parts and segments that make up a CIP packet, alongside the example of a service code. There are several different service codes defined within CIP. Next to the Get_Attributes_Single service used in the example, the Get_Attributes_All service is the most interesting of all: upon request, it returns all identity attributes and data relevant to the identity of a device. As mentioned previously, the Service ID part of a CIP packet defines both the direction of a packet and the type of service used. The approach to gathering device data from CIP devices is very straightforward:

  1. Listen for the right direction/service combination in the Service ID. By filtering on specific bits within the service byte, we can find the exact combination of direction and service code we're looking for. In this case we filter for the value 0x81: the response bit (0x80) set, combined with the exact service code we need, 0x01: Get_Attributes_All.
  2. Once we have the response we want, we can extract any data relevant to the device. According to the documentation, there's quite a lot to choose from. Take a look at the Wireshark capture below:

This capture shows a lot of info, from the vendor ID and product name all the way to the firmware major and minor revision numbers. Extracting this type of data is especially interesting, as it is extremely difficult to acquire through TCP/IP fingerprinting alone.
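A sketch of the extraction step: the data portion of a Get_Attributes_All response on the Identity object follows a fixed, little-endian layout in the CIP specification (vendor ID, device type, product code, revision, status, serial number, and a length-prefixed product name). The sample bytes below are fabricated for illustration.

```python
import struct

# Sketch: unpacking identity attributes from the data portion of a
# Get_Attributes_All response on the Identity object (CIP spec layout,
# little-endian). The sample device below is made up.

def parse_identity(data: bytes) -> dict:
    vendor, dev_type, prod_code, major, minor, status, serial = \
        struct.unpack_from("<HHHBBHI", data, 0)      # first 14 bytes
    name_len = data[14]                               # SHORT_STRING length
    name = data[15:15 + name_len].decode("ascii")
    return {"vendor_id": vendor, "device_type": dev_type,
            "product_code": prod_code, "revision": f"{major}.{minor}",
            "serial": serial, "product_name": name}

# Fabricated response data for a fictional device.
sample = struct.pack("<HHHBBHI", 1, 14, 65, 20, 11, 0x0060, 0xDEADBEEF)
sample += bytes([8]) + b"Demo PLC"

print(parse_identity(sample)["product_name"])  # → Demo PLC
```

Combined with the service-byte filter from the previous step, this is all a passive listener needs to harvest vendor, product, and firmware details from traffic that is already on the wire.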

Continuous research

Now, how do we continue this research, possibly expand it, or branch off from it? For this project, passive fingerprinting, there are several interesting features to add:

  • Machine learning. A machine learning model could support the creation of signatures by discovering features like timing differences and other hard-to-spot characteristics.
  • Shannon entropy. By expanding (existing) features to look not only at a single value for a characteristic but, for example, at a range of values, we can possibly gather more defining features for the build-up of a signature.
  • An existing tool. NetworkMiner is an existing tool that is consistently updated and improved by Netresec, and one of the better passive analysis tools. It supports a wide range of techniques and protocols, like p0f for OS detection. However, it does not have a signature database for OT devices, which makes it a little less applicable in this context.
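The Shannon-entropy idea from the list above could be sketched like this: instead of recording a single dominant value per characteristic, score how spread out the observed values are. A low entropy suggests a stable, signature-worthy characteristic; a high entropy suggests noise. The thresholds implied by the example are illustrative.

```python
import math
from collections import Counter

# Sketch: Shannon entropy over the observed values of one
# characteristic. Low entropy = stable characteristic, high = noisy.

def shannon_entropy(values: list) -> float:
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

stable = [44818] * 95 + [2000] * 5     # almost always the same port
noisy = list(range(49152, 49252))      # 100 different ephemeral ports

print(round(shannon_entropy(stable), 2))  # → 0.29 (signature-worthy)
print(round(shannon_entropy(noisy), 2))   # → 6.64 (wildcard candidate)
```

This would generalize the 50% consistency threshold used earlier into a smoother measure of how trustworthy a characteristic is.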

Outside the scope of this project, the HoneyNet project is an interesting collection of honeypot projects. Next to it, the MITRE ICS Matrix is a handy framework that gives detailed insight into the different attack paths and attack vectors relevant to ICS.

OT environments present unique challenges that make traditional penetration testing approaches difficult and, possibly, very dangerous. The critical nature of industrial processes, combined with legacy systems that were never designed with cybersecurity in mind, demands methods that prioritize safety and continuity. Projects like passive fingerprinting, the HoneyNet project, and the MITRE ICS Matrix deserve our time and attention in the effort to better secure and redesign these industrial processes and legacy systems. Our focus should be clear: enable infrastructural awareness without risk. Continued research is what the domain of OT needs to evolve in the aspect of cybersecurity.

This article was written by Adam

Adam Drong
Security Specialist (internship)