Abstract— Following attacks such as Stuxnet and BlackEnergy that targeted the control systems of critical infrastructure, strengthening the security of facilities such as industrial CPS (Cyber Physical System) has become increasingly important. In this paper, reflecting the characteristics of industrial CPS, we propose a packet diversity-based anomaly detection model that can be trained and can perform detection more effectively than existing anomaly detection systems. To enhance the sensitivity of detection, the proposed system groups the data of an industrial CPS by packet structure, based on features of the packet header, and constructs a separate detection model for each group. The proposed system targets single-packet anomaly detection to cope with threats such as injection attacks and the malformed packets used in fuzzing. For the architecture of the anomaly detection system, we propose a two-stage structure that applies both a whitelist and a learning-based detection model. We also propose measuring packet diversity, using the payload variation of packets and entropy-based uncertainty, to select which learning-based detection model is appropriate for a dataset. As learning-based detection models, the system uses a model constructed with the well-known learning method OCSVM (One Class SVM) and a newly proposed representative detection model designed to overcome a limitation of OCSVM.
Keywords—Industrial Control System; Security; Anomaly Detection; Packet Diversity; Entropy; Cyber Physical System
Introduction
An industrial CPS is a system that effectively monitors and controls systems operated in industrial infrastructure such as electric power, gas, water resources, and transportation, as well as in industrial processes such as industrial energy, finance, and factories. An industrial CPS can take the form of SCADA (Supervisory Control And Data Acquisition) or DCS (Distributed Control System), and also includes control systems such as PLCs (Programmable Logic Controllers). Such systems were long considered safe from cyber threats because they operate in a separated, closed environment, unlike general IT environments. However, beginning with Stuxnet in 2010, cyberattacks have occurred in fields such as power grids, gas, transportation, and nuclear power, and they have changed security awareness for industrial CPS at home and abroad.
Meanwhile, ICT (Information & Communication Technology) has been incorporated into control systems to provide new services and improve existing systems. More recently, with the adoption of CPS, physical devices such as sensors and actuators, controllers, and even networks are being combined. As a result, current control systems, unlike earlier ones, are losing their closed nature as they adopt open environments such as IoT and the Internet together with wired and wireless communication technologies. As vulnerabilities increase, we must therefore be able to defend critical infrastructure against new security threats in addition to existing attacks.
Recently, various security techniques have been studied to detect attacks on industrial CPS. Many of them build on research into network intrusion detection systems and detection models, and they apply detection of attack patterns in network packet data, whitelists based on protocol specifications, or detection of abnormal behavior through machine learning.
While general IT networks carry many types of data and protocols and a large amount of irregular data, industrial CPSs use relatively limited types of data and protocols, and much of this data is repeated with a specific cycle. In particular, in segments that perform automated communication, data is exchanged in even more formulaic forms.
In this paper, we propose a detection system that is effective in industrial CPS because it exploits this formulaic data communication. The detection system groups packets by extracting features from the headers of the training packets and constructs a whitelist from them. It then constructs a separate learning-based detection model for each packet group. The learning-based detection model mainly uses OCSVM to learn normal behavior, but problems can arise when constructing such a model in an industrial CPS, where packet diversity is relatively low. Therefore, before constructing a learning-based detection model, we propose a method to measure the diversity of the training packets in a packet group, and we present a new representative learning model for low-diversity packet groups. Lastly, we verify the proposed detection system by applying it to the SCADA network of an electric power control system, which is a representative industrial CPS.
The main contributions of this paper are as follows:
- Improve the sensitivity of detection by generating a separate detection model for each packet type through grouping.
- Present a method to measure packet diversity.
- Propose a new representative detection model and verify its effectiveness.
This paper is organized as follows: Section 2 describes related work; Section 3 proposes an intrusion detection system and explains its overall structure and each detection model; Section 4 describes an experiment using actual packets from a Korean SCADA network and its results; and Section 5 concludes the paper.
Related Works
Intrusion detection techniques based on whitelists, communication patterns, traffic, and machine learning have been published in various studies at home and abroad.
In the study by J. Hong et al., a whitelist was generated from the specification of the IEC61850 protocol used in substations and was utilized for detection [1]. However, this study covered only part of the protocol, and it could not detect attacks conducted without violating the specification. The paper by Hyunguk Yoo et al. [2] proposed a machine learning-based detection model to detect abnormal behavior in the SCADA protocol IEC61850. It is similar to our proposed detection model in that it builds a detection model on single packets, and it generates the model with OCSVM using packet fields and part of the message payload of IEC61850. Leandros A. Maglaras et al. presented an intrusion detection system using OCSVM in a SCADA system [3]; in another paper by the same author, a distributed intrusion detection system was combined with OCSVM, and the detection rate was increased by applying ensemble methods and social network metrics [4]. Jianmin Jiang et al. demonstrated that OCSVM effectively detects abnormal behavior in SCADA systems [5]. Wenli Shang et al. introduced an improved OCSVM for industrial communication that uses a particle swarm algorithm to optimize the OCSVM parameters [6]. Taeshik Shon et al. proposed a new SVM approach, named Enhanced SVM, which combines the soft-margin SVM and OCSVM approaches to provide unsupervised learning and a low false alarm capability; Enhanced SVM showed a lower false positive rate than Bro and Snort [7]. In general, OCSVM is effective for learning normal behavior from a normal dataset. However, when the diversity of a dataset is low, there is a concern that the constructed detection model may treat a number of normal data items as abnormal. Therefore, the packet diversity of an industrial CPS must be considered carefully before OCSVM is applied.
In the paper by R. R. R. Barbosa et al. [8], a detection method based on network flow and periodicity was proposed, building on the idea that many cyberattacks targeting control systems change traffic characteristics such as the periodicity, size, and noise of network data. Its limitation is that it can be applied to only some attack types. In the paper by A. Valdes et al. [9], a flow-based abnormal behavior detection method was proposed that measures the average packet size and average inter-arrival time in a specific time interval.
Intrusion detection for an industrial CPS needs a structure in which several detection techniques are applied in combination, so that the right technique is applied in the right place. In particular, detection becomes more effective when it takes into account the features of an industrial CPS that differ from those of a general IT network.
Proposed Intrusion Detection System for Industrial Control System
Target Domain and Detection System Overview
The main target of the proposed detection model is a network that performs automated communication in an industrial CPS. Fig. 1 shows the general architecture of a SCADA network, a typical network in industrial CPS. The proposed detection model is most effective when low-diversity packet groups exist, and it can be applied to some of the protocols used in field networks such as the substation LAN, the network between SCADA and its sub-nodes, and the SCADA LAN shown in Fig. 1.
Fig. 1. Typical SCADA architecture [10]
Fig. 2. An Overview of Anomaly Detection System
The overall architecture of the proposed anomaly detection system is shown in Fig. 2. Rather than learning from and detecting on all the training data at once, the intrusion detection system first preprocesses the packets of the training dataset into packet groups so that learning is more refined, then measures the diversity of the packets in each group and constructs a learning-based detection model suited to each packet group.
Data Preprocessing
The learning-based part of the proposed detection system groups packets that have the same form, as in Fig. 3, and constructs an individual detection model for each packet group. This increases detection sensitivity by exploiting the fact that communication in industrial CPSs is regular. Individual models are built per packet group because, when the number of payload fields and their variable types differ between packets, learning proceeds without regard to the meaning of the data, which can lower the sensitivity of the resulting detection model.
The features used as grouping criteria are extracted as follows. First, the protocol is identified using basic information such as the source IP/port and destination IP/port together with the packet type value in the header of the target protocol. Once the protocol is identified, features that determine the field structure of the payload, such as header information describing the object type carried in the payload, are extracted from the application header used by that protocol. Packet groups are generated using these features as criteria.
An additional advantage of grouping packets is that a packet header-based whitelist detection model is easy to generate: the extracted features of every packet group are simply registered as whitelist entries. One point requiring care during preprocessing is that packets may be fragmented, depending on the protocol. In that case, the fragments must be reassembled in advance during data preprocessing.
Because all fields of each packet are used when the learning-based detection model is constructed, all information must be recorded without altering the packet data, and the packet data must be stored in a separate container for each packet group.
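The grouping and whitelist steps described above can be sketched as follows. This is only an illustrative sketch: the `Packet` record, its field names, and the choice of which header features form the group key are assumptions made for the example, not the authors' implementation (which is written in C and C++).

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical, already-parsed packet record; field names are assumptions.
    src: str          # source IP:port
    dst: str          # destination IP:port
    protocol: str     # protocol identified from addressing and header type value
    object_type: str  # application-header feature that fixes the payload layout
    raw: bytes        # unmodified packet bytes, kept for later model training


def group_key(pkt):
    """Header features used as the grouping criterion (assumed selection)."""
    return (pkt.src, pkt.dst, pkt.protocol, pkt.object_type)


def build_groups(packets):
    """Store the unmodified packet data in one container per packet group."""
    groups = defaultdict(list)
    for pkt in packets:
        groups[group_key(pkt)].append(pkt.raw)
    return dict(groups)


def build_whitelist(groups):
    """Every group key seen in training becomes one whitelist entry."""
    return set(groups)


def passes_whitelist(pkt, whitelist):
    """Header-based check applied before any learning-based model."""
    return group_key(pkt) in whitelist


training = [
    Packet("10.0.0.1:20000", "10.0.0.2:40000", "DNP3", "g30v1", b"\x01\x02"),
    Packet("10.0.0.1:20000", "10.0.0.2:40000", "DNP3", "g30v1", b"\x01\x03"),
    Packet("10.0.0.1:20000", "10.0.0.2:40000", "DNP3", "g1v2", b"\x00"),
]
groups = build_groups(training)
whitelist = build_whitelist(groups)
unknown = Packet("10.0.0.9:20000", "10.0.0.2:40000", "DNP3", "g30v1", b"\x01")
```

In this sketch a packet whose header features were never seen in training (`unknown`) fails the whitelist check and never reaches a learning-based model.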
Measuring Diversity of Packet Group
Fig. 3. Data Preprocessing (Grouping)
Fig. 4. Detection Model Decision Flow using Packet Diversity
In an industrial CPS, when a packet carries a payload, the payload sometimes always contains the same value, as in a data response, but it may also contain frequently changing values such as analog measurements. Considering this, we construct the learning-based detection model by selectively applying either the well-known machine learning method OCSVM or the representative model proposed in this paper. The selection criterion is the payload diversity of the packets, which is measured per packet group using the payload variation and the payload uncertainty. Only the packet payload is considered, for two reasons: header values within a group are almost identical because packets are grouped by their headers, and abnormal behavior in headers can be detected sufficiently by the whitelist-based detection model.
Payload variation is the number of distinct payload values present in a packet group of the whitelist. When the payload variation is greater than the pre-set variation threshold Tvariation, an OCSVM model is constructed. When the payload variation is not greater than Tvariation, the payload uncertainty of the packet group is calculated.
Payload uncertainty is another way to measure the diversity of a packet group. When the payload variation is low, the normalized entropy of the payload is calculated and then refined with the payload variation; the refined value is used as the uncertainty. In this paper, the uncertainty is obtained by multiplying the normalized entropy by a scaling value that reflects the relation between the number of packets and the payload variation.
As with the variation, when the uncertainty of a packet group is greater than the pre-defined uncertainty threshold Tuncertainty, an OCSVM model is constructed for the packet group, and when it is equal to or smaller than Tuncertainty, the representative model is constructed.
Fig. 4 summarizes how a detection model is selected by measuring the packet diversity of a packet group.
When the diversity of a packet group is relatively high, SVM, a well-known classification algorithm with excellent performance, is used to construct the learning-based detection model. Among SVMs we use OCSVM, which can construct a model from the normal data class alone. However, because it builds a normal model from a single class, some data in the training dataset is always classified as abnormal. When the diversity of the training dataset is low, a packet type that has enough data to be judged normal can therefore be classified as abnormal during learning. Because this phenomenon is likely to occur in industrial CPS, whose packet diversity is lower than that of general IT networks, we propose a representative model to cope with it.
This paper differs from existing OCSVM-based detection systems in that it proposes a new learning-based model in addition to OCSVM and selects between the two based on the diversity of each packet group.
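The decision flow of Fig. 4 can be written as a short sketch. The function names below are hypothetical, and the uncertainty value is passed in by the caller rather than computed here, since this sketch does not assume any particular formula for it.

```python
def payload_variation(payloads):
    """Number of distinct payload values in one packet group."""
    return len(set(payloads))


def select_model(variation, uncertainty, t_variation, t_uncertainty):
    """Choose the learning-based model for one packet group.

    `uncertainty` must be computed beforehand by the caller; the threshold
    arguments correspond to Tvariation and Tuncertainty in the text.
    """
    if variation > t_variation:
        return "OCSVM"            # high variation: diversity is high enough
    if uncertainty > t_uncertainty:
        return "OCSVM"            # low variation but high uncertainty
    return "representative"       # low variation and low uncertainty


example_payloads = [b"\x00", b"\x00", b"\x01"]
example_variation = payload_variation(example_payloads)
```

Note that the uncertainty only matters for groups whose variation does not exceed Tvariation, which matches the order of the two checks in the text.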
Proposed Representative Detection Model
The representative model is based on the frequency of packets in a packet group and, unlike OCSVM, it includes all packets of the learned group in the normal category. The procedure for constructing the representative model is as follows.
We regard the packets in a packet group whose payloads are identical as one packet type, and calculate the frequency of each type. Packet types whose frequency is greater than the pre-set representative threshold Trepr, that is, types that occur relatively more often than the others, are included in the representative model. These packet types are denoted PI1,PI2,… ,PIn-1,PIn. The relatively low-frequency packet types are denoted PO1,PO2,… ,POn-1,POn, and we calculate the distance between each of them and the representative model. The distance between two packet types is calculated as follows: to reflect the meaning of the data in the payload, the difference of each field is computed according to its variable type, and the differences are summed. The distance between a packet type and the representative model is the shortest distance to any packet type within the representative model.
D(POx) = min[D(POx,PI1),…,D(POx,PIn)]
For all packet types outside the model we calculate this distance, and we set the longest one, max[D(PO1),…, D(POn)], as the base radius of the model so that all training packets fall inside the normal boundary. When the normal boundary needs to be adjusted, a margin coefficient m can be used, so that the final radius of the normal boundary is m×max[D(PO1),…, D(POn)].
Fig.5. A Proposed Representative Detection Model
The representative detection model is constructed as shown in Fig. 5.
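The construction procedure can be illustrated with a minimal sketch. Several points here are assumptions made for the example: payloads are represented as tuples of numeric fields with the same layout, the per-field difference is taken as an absolute difference, and Trepr is compared against relative frequency. All function names are hypothetical.

```python
from collections import Counter


def field_distance(a, b):
    """Sum of per-field differences (assumes numeric fields, same layout)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_to_model(payload, inside):
    """D(PO_x): shortest distance to any packet type inside the model."""
    return min(field_distance(payload, pi) for pi in inside)


def build_representative_model(payloads, t_repr, margin):
    """Return (inside packet types, radius of the normal boundary)."""
    counts = Counter(payloads)
    total = len(payloads)
    # Packet types whose relative frequency exceeds Trepr form the model.
    inside = [p for p, c in counts.items() if c / total > t_repr]
    if not inside:
        raise ValueError("no packet type exceeds the representative threshold")
    outside = [p for p in counts if p not in inside]
    # Base radius: the longest of the shortest distances of outside types.
    base_radius = max((distance_to_model(po, inside) for po in outside), default=0)
    return inside, margin * base_radius


def is_normal(payload, model):
    inside, radius = model
    return distance_to_model(payload, inside) <= radius


# Toy group: three packet types with relative frequencies 0.6, 0.3 and 0.1.
toy_group = [(0, 0)] * 6 + [(1, 0)] * 3 + [(0, 2)] * 1
toy_model = build_representative_model(toy_group, t_repr=0.2, margin=2)
```

With a margin coefficient of 1 every training packet lies inside the boundary by construction; a larger coefficient widens the boundary to tolerate unseen but similar payloads.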
Experimental Result
Training Dataset and Experimental Environment
In this paper, to verify the effectiveness of the proposed detection model in an industrial CPS, the training dataset is made of real packets captured from an operating SCADA-SA (Substation Automation) network in Korea. The characteristics of the collected dataset are as follows.
- Protocols in Dataset: ARP, IEC61850, DNP3, and some other IP packets
- Detection Target Protocol: DNP3
- DNP3 Packets for Training: 165,007 packets
- DNP3 Packets for Testing: 187,200 packets
The proposed anomaly detection system was developed in C and C++ on Ubuntu Linux 16.04 LTS 64-bit. The pcap library is used to handle captured packets, and libsvm v3.21 is used to construct the OCSVM-based detection model and to perform detection with it.
Packet Preprocessing
As described in Section 3, packets with the same payload structure should be grouped together to increase the sensitivity of the detection model. This experiment targets the DNP3 protocol, and our grouping strategy is as follows.
- Target Protocol Extraction: Extract all DNP3 packets by checking the known TCP port number (for SCADA, port 20000 is generally used for DNP3) and the DNP3 start bytes (0x0564) in the DNP3 header.
- Packet Grouping: Group together the packets that have the same values for selected DNP3 header features. The DNP3 payload structure is determined by the pair of the ‘Object group’ field and the ‘Object variation’ field in the DNP3 object header.
- Group Exception Rule 1: A packet group that has too few packets is excluded from the training dataset. In this experiment, the threshold is set to 20.
- Group Exception Rule 2: Packet groups that do not have a DNP3 payload are excluded because such packets are out of scope for the proposed payload learning-based detection model. In DNP3, a payload is generally present in two kinds of header contents. In the first, the ‘qualifier’ field of the object header is set to 0x28 and the ‘range1’ field is not empty. In the second, the ‘qualifier’ field is set to 0x00 and the ‘range1’ and ‘range2’ fields are not empty. However, the DNP3 master also sends these two types of packets to the DNP3 outstation, and in that case they carry no payload because they are only data requests. Sufficient knowledge of the target protocol is therefore required to reflect its characteristics when constructing the detection model.
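The extraction and exception rules above can be sketched as follows. The sketch assumes that the ports, TCP payload bytes, and object header fields have already been parsed out of each packet; the function names and the reading of the threshold of 20 as a minimum packet count are assumptions for illustration.

```python
DNP3_PORT = 20000          # TCP port generally used for DNP3 in SCADA
DNP3_START = b"\x05\x64"   # DNP3 start bytes


def is_dnp3(src_port, dst_port, tcp_payload):
    """Target protocol extraction: port check plus start-byte check."""
    return DNP3_PORT in (src_port, dst_port) and tcp_payload[:2] == DNP3_START


def dnp3_group_key(object_group, object_variation):
    """Packets are grouped by the (object group, object variation) pair."""
    return (object_group, object_variation)


def drop_small_groups(groups, min_packets=20):
    """Exception Rule 1: exclude groups with fewer than `min_packets` packets."""
    return {key: pkts for key, pkts in groups.items() if len(pkts) >= min_packets}


def has_payload(qualifier, range1, range2, from_master):
    """Exception Rule 2, as read from the text: requests from the master carry
    no payload; otherwise the qualifier and range fields decide."""
    if from_master:
        return False
    if qualifier == 0x28:
        return range1 is not None
    if qualifier == 0x00:
        return range1 is not None and range2 is not None
    return False


sample_groups = {
    dnp3_group_key(30, 1): [b"..."] * 25,   # enough packets, kept
    dnp3_group_key(1, 2): [b"..."] * 3,     # too few packets, dropped
}
kept_groups = drop_small_groups(sample_groups)
```

Only groups that survive both rules would be passed on to the diversity measurement step.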
Fig.6. Top 20 packet groups ordered by packet frequency
Fig. 6 shows the result of packet grouping. First, we grouped packets by packet header information. Next, the groups ranked below group 15 were excluded because of their low packet frequency. Finally, we chose groups 5, 8, 10, 12, and 15, because the other groups have no payload. The anomaly detection system then constructs a learning-based model for each selected packet group.
Measuring Packet Diversity
Before a learning-based detection model is constructed, the packet diversity of each packet group must be measured for model selection. To measure diversity, the payload variation and the payload uncertainty are calculated. Table 1 shows the packet diversity of each packet group.
For packet group 5, the payload variation and uncertainty are very high. Packet group 5 consists of packets whose ‘object group’ value in the DNP3 object header is 30, which means that the objects hold analog input values. This is why its packet diversity is close to the maximum. The other four groups, in contrast, hold binary input, binary output, and device object values. These types of DNP3 payload values are limited to a narrow value range.
Table 2. 10-fold Cross Validation Accuracy using OCSVM
| Packet Group | Detection Accuracy |
| --- | --- |
| 5 | 99.93% |
| 8 | Model can’t be created using OCSVM |
| 10 | Model can’t be created using OCSVM |
| 12 | 90.19% |
| 15 | 76.43% |
Table 1. Packet Diversity of Selected Packet Groups
| Packet Group | Payload variation | Payload entropy (0 to 1) | Payload uncertainty (0 to 1) |
| --- | --- | --- | --- |
| 5 | 17916 | 0.999768 | 0.999098 |
| 8 | 1 | 0.000000 | 0.000000 |
| 10 | 1 | 0.000000 | 0.000000 |
| 12 | 8 | 0.490289 | 0.006581 |
| 15 | 26 | 0.763109 | 0.139724 |
Although packet diversity is determined from the payload variation and uncertainty, Table 1 also lists the payload entropy to show why uncertainty matters more than entropy alone. When the uncertainty is calculated, the entropy is multiplied by an additional scaling value, because we want to reflect the relation between the number of packets and the payload variation. For example, consider two packet groups. One group has ten packets of ten packet types. The other has a thousand packets of ten packet types, with the same number of packets for each type. These two groups obtain the same entropy, but they have different meanings when a learning-based model is constructed. For packet groups 12 and 15, the payload entropy alone is too ambiguous to say whether diversity is high or low, whereas the payload uncertainty clearly indicates low diversity. In this case, the representative detection model is more useful than the OCSVM model.
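The two-group example can be checked numerically with a small sketch. It assumes that the payload entropy is the Shannon entropy of the packet-type distribution normalized by the logarithm of the number of distinct types; this normalization is an assumption chosen because it yields values between 0 and 1, and it may differ from the one used for Table 1.

```python
import math
from collections import Counter


def normalized_entropy(payloads):
    """Shannon entropy of the packet-type distribution, scaled to [0, 1].

    Assumption: normalization by log(number of distinct types); a group with
    a single packet type is defined to have entropy 0.
    """
    counts = Counter(payloads)
    n_types = len(counts)
    if n_types <= 1:
        return 0.0
    total = len(payloads)
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(n_types)


small_group = list(range(10))          # ten packets, ten packet types
large_group = list(range(10)) * 100    # thousand packets, ten types, 100 each

h_small = normalized_entropy(small_group)
h_large = normalized_entropy(large_group)
```

Both groups have a uniform type distribution, so their normalized entropy is identical even though the second group has a hundred times more evidence per packet type, which is the gap the additional scaling value in the uncertainty is meant to close.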
For our training dataset, the payload variation threshold Tvariation must therefore be set between 26 and 17916, and the payload uncertainty threshold Tuncertainty between 0.139724 and 0.999098.
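As a quick check of these ranges, the sketch below applies the decision rule of Section 3 to the values in Table 1. The concrete thresholds (100 and 0.5) are hypothetical values picked from inside the stated ranges, not the values used in the experiment.

```python
# (payload variation, payload uncertainty) per packet group, from Table 1.
TABLE_1 = {
    5: (17916, 0.999098),
    8: (1, 0.0),
    10: (1, 0.0),
    12: (8, 0.006581),
    15: (26, 0.139724),
}


def choose(variation, uncertainty, t_variation, t_uncertainty):
    """OCSVM if either measure exceeds its threshold, else representative."""
    if variation > t_variation or uncertainty > t_uncertainty:
        return "OCSVM"
    return "representative"


def assign(t_variation, t_uncertainty):
    return {g: choose(v, u, t_variation, t_uncertainty)
            for g, (v, u) in TABLE_1.items()}


inside_range = assign(100, 0.5)          # hypothetical thresholds in range
lower_bounds = assign(26, 0.139724)      # lower ends of both ranges
below_range = assign(25, 0.5)            # Tvariation just below the range
```

With thresholds inside the ranges, only group 5 is assigned to OCSVM; lowering Tvariation below 26 would move group 15 to OCSVM as well.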
Test Result and Comparison
The lower the packet diversity, the lower the accuracy of the OCSVM-based detection model. In such cases the proposed representative model can be a reasonable detection model. To verify our approach, we tried to create an OCSVM-based detection model for all five packet groups. Table 2 shows the results.
Packet group 5 fits the OCSVM-based detection model, and its accuracy is sufficient for such a model. Packet groups 8 and 10 are not suitable for constructing an OCSVM detection model: their payload variation is 1 and their uncertainty is 0, which means that each group contains only one type of packet. That single packet type is itself the normal packet, so OCSVM cannot be used as the detection model. For packet groups 12 and 15, the detection accuracy is not acceptable for a learning-based detection model. The proposed representative model is suitable for these groups.
Representative models were then constructed with the training dataset. Trepr was set to 0.2 and the representative margin coefficient to 2, as moderate values; these values can be modified to calibrate the normal boundary. To validate the constructed detection models, the testing dataset was used for detection tests. The representative model was tested on 5,340 of the 187,200 DNP3 packets in the normal testing dataset. The model raised false alarms on 171 packets that fell outside the normal boundary of the representative model. In this experiment, the accuracy of the representative model is approximately 99.97%. This result shows that the representative model can cover packet groups that have low packet diversity. Applying the two kinds of detection models together provides broader detection coverage than a detection system that applies OCSVM only.
Conclusion
Packet diversity is measured so that the characteristics of industrial CPS are taken into account when selecting a learning-based detection model. Generally, OCSVM is the best solution for learning training packets as normal behavior when establishing an anomaly detection system. However, when packet diversity is low, OCSVM has a problem in setting its hyperplane. We pointed out this limitation of OCSVM and proposed a new anomaly detection model for industrial CPS, named the ‘representative model’. Because the representative model is intended for packet groups with low diversity, we also proposed a way to measure packet diversity based on payload variation and uncertainty. The proposed anomaly detection model was verified with a real packet dataset from a SCADA-SA network of a Korean industrial CPS. By comparing the detection accuracy of OCSVM alone with that obtained when the proposed representative model is also used, the experimental results show the need for the proposed model on low-diversity packet datasets. Refining the coefficient of the representative model and the thresholds remains as future work to improve our anomaly detection system.
Acknowledgement
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (2015R1A1A1A05001238) and supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry and Energy (MOTIE) of the Republic of Korea (No. 20162220200010).
References
[1] J. Hong, C. C. Liu and M. Govindarasu, "Detection of cyber intrusions using network-based multicast messages for substation automation," Innovative Smart Grid Technologies Conference (ISGT), 2014 IEEE PES, Washington, DC, 2014, pp. 1-5.
[2] Hyunguk Yoo and Taeshik Shon, "Novel approach for detecting network anomalies for substation automation based on IEC61850," Multimedia Tools and Applications, Volume 74, Issue 1, 2015, pp. 303-318.
[3] L. A. Maglaras and J. Jiang, "Intrusion detection in SCADA systems using machine learning techniques," Science and Information Conference (SAI), 2014, London, 2014, pp. 626-631. doi: 10.1109/SAI.2014.6918252
[4] Leandros A. Maglaras, Jianmin Jiang and Tiago J. Cruz, "Combining ensemble methods and social network metrics for improving accuracy of OCSVM on intrusion detection in SCADA systems," Journal of Information Security and Applications (Elsevier), available online, May 2016. doi: 10.1016/j.jisa.2016.04.002
[5] J. Jiang and L. Yasakethu, "Anomaly detection via One Class SVM for protection of SCADA systems," Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2013 International Conference on, Beijing, 2013, pp. 82-88.
[6] Wenli Shang, Lin Li, Ming Wan and Peng Zeng, "Industrial communication intrusion detection algorithm based on improved one-class SVM," 2015 World Congress on Industrial Control Systems Security (WCICSS), London, 2015, pp. 21-25.
[7] Taeshik Shon and Jongsub Moon, "A hybrid machine learning approach to network anomaly detection," Information Sciences, Volume 177, Issue 18, 2007, pp. 3799-3821.
[8] R. R. R. Barbosa, R. Sadre and A. Pras, "Towards periodicity based anomaly detection in SCADA networks," Proceedings of 2012 IEEE 17th International Conference on Emerging Technologies & Factory Automation (ETFA 2012), Krakow, 2012, pp. 1-4.
[9] A. Valdes and S. Cheung, "Communication pattern anomaly detection in process control systems," Technologies for Homeland Security, 2009. HST '09. IEEE Conference on, Boston, MA, 2009, pp. 22-29.
[10] Automatic Control Laboratory of ETH, A Figure of Typical Architecture of SCADA System, VIKING Project, http://control.ee.ethz.ch/~viking/