Traffic characterization in a communications channel for monitoring and control in real-time systems

Received Oct 10, 2019; Revised Dec 20, 2020; Accepted Dec 22, 2020

The response time for remote monitoring and control in real-time systems is a sensitive issue in device interconnection elements. Therefore, it is necessary to analyze the traffic of the communication system in pre-established time windows. In this paper, a methodology based on computational intelligence is proposed for identifying the availability of a data channel and the variables or characteristics that affect its performance and data transfer. The methodology is made up of four stages: a) integration of a communication system with an acquisition module and a final control structure; b) communication channel characterization by means of traffic variables; c) relevance analysis of the characterization space using SFFS (sequential forward floating selection); and d) channel congestion classification as Low or High using a classifier based on the Naive Bayes algorithm. The experimental setup emulates a real process using an on/off remote control of a DC motor on an Ethernet network. The communication time between the client and server was integrated with the operation and control times to study the whole response time. The proposed approach supports decisions about channel availability and allows predictions to be established about the length of the time window when the availability conditions are unknown.

tools have frequently been used to determine relevant parameters like jitters, delays in transmission, throughput, and periods without connectivity of the network [30]. Likewise, other variables that are part of the communication system must be considered, such as processing, memory, applications, and the number of available or occupied resources.

Influence of the network architecture on traffic behavior
Automation architectures consist of intelligent devices connected by a local or global communication network, which simplifies their complexity and expands the development of network control systems. Initially, it was essential to reduce the system wiring to ease the information exchange among the different components of the system. Later, the trend was to use the same network technology at all levels of industrial organizations [31] [32]. These network architectures are mainly based on the Ethernet standard, which is used for its advantages in interconnecting industrial devices and for the possibility of improving performance in real-time applications [33] [34].
Improving or changing the performance of a communications network requires analysis at three levels: architecture, technology, and services [35]. The communication architecture consists of the physical organization and logical functioning of nodes linked by switches, where these links are modeled by buffers. However, every architecture requires a communication protocol and interfaces made up of applications and users [36]. The network calculus, introduced by Rene L. Cruz in [37][38] [39], only assumes that the number of bytes sent on the network links does not exceed an arrival curve (traditionally, a leaky bucket), which can represent both periodic and aperiodic traffic. The network calculus then determines the maximum end-to-end packet delay or jitter from exchange matrices that model the traffic between the industrial devices. A widely accepted model of real-time industrial communications [40] [41] establishes that changes in the network infrastructure considerably affect the real-time performance and traffic conditions [42] [43]. Thus, the impact of varying parameters such as communication mechanism, network architecture, network traffic, data size, number of connections, and processor load on the peer-to-peer interlocking performance must be analyzed as a real-time operating condition [44]. The technology refers to the characteristics of the devices and the channel and, similarly, the services involve the kind of traffic and the applications that the communications system can support [45].
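As a brief illustration of the leaky-bucket bound mentioned above, the following is a standard network calculus result written under assumptions that are not stated in the text: the burst size $\sigma$, the sustained rate $\rho$, and a rate-latency service curve with rate $R$ and latency $T$ are symbols introduced here only for the example.

```latex
% Leaky-bucket arrival curve (assumed symbols: burst \sigma, rate \rho)
\alpha(t) = \sigma + \rho\, t, \qquad t > 0
% Rate-latency service curve offered by a node (assumed symbols: rate R, latency T)
\beta(t) = R \,\max(0,\; t - T)
% If \rho \le R, the delay and backlog of the flow at that node are bounded by
D_{\max} \le T + \frac{\sigma}{R}, \qquad B_{\max} \le \sigma + \rho\, T
```

Under these assumptions the delay bound depends on the burst size and the service rate but not on whether the traffic is periodic or aperiodic, which is why a single arrival curve can cover both cases.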

Characterization of the system for remote monitoring
Remote access for controlling and monitoring various devices in an industrial environment is of value to engineers and automation plants [46] [47]. Although Ethernet is the dominant communications technology, industrial applications require it to meet real-time performance requirements, traffic restrictions, and information security. Therefore, tunnels are incorporated into the network platform for monitoring and control via the Internet [48] [49], which demonstrates how sensitive the conditions of availability, integrity, and confidentiality are to the channel and data traffic. In addition, technologies are combined to remotely monitor and control processes through the GSM network [50][51], which, together with Ethernet and TCP/IP, brings solutions to control plants that exploit new technologies in automation and process control. In monitoring, remote access to the industrial network requires security [52], because Internet-based accessibility increases the vulnerability of these systems.
Remote communication is based on the synchronization between local control and remote monitoring, and the quality of the data transmission depends on the characteristics of the channel and the configuration of the system. These configurations, such as quality of service [53] and temporary memory technologies, compensate for the time delay in order to minimize the uncertainty in the transmission of data within a critical time interval. Likewise, the implementation of a robust network [54] [55] protects against data distortion during transmission, taking into account that industrial networks are specially designed for critical data transmission in multiple industrial applications [23].
The identification of the communication system involves analyzing all the characteristics related to the aforementioned aspects of security and architecture, where the related variables include traffic behavior and the response of the network [56]. This monitoring process seeks to identify network latency, data loss, routes, bandwidth, occupation percentage, and resource use. The process is active when additional traffic is injected into the network, and passive when it is based only on the observed resources [57]. In both cases, the Simple Network Management Protocol (SNMP) is used, which gives access to the operating conditions of each of the system's devices [58]. In the case study of the experiment, the channel-related characteristics considered were transfer rate, bandwidth, and packet loss, and the device-related performance characteristics were memory and processor use.
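A minimal sketch of how the monitored quantities named above (latency, packet loss, occupation percentage) could be derived from raw observations is given below. It is illustrative only: the function names, the probe list, and the use of `None` for an unanswered probe are assumptions made for this example and do not describe the monitoring tools used in the study.

```python
from statistics import mean


def summarize_probes(rtts_ms):
    """Summarize active probes; None marks a probe that received no reply."""
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    latency_ms = mean(replies) if replies else None
    return {"loss_pct": loss_pct, "latency_ms": latency_ms}


def utilization_pct(bytes_observed, interval_s, capacity_bps):
    """Channel occupation from a passive byte counter read over one interval."""
    return 100.0 * (8 * bytes_observed / interval_s) / capacity_bps


# Hypothetical round-trip times in ms; two of five probes were lost.
probes = [1.2, 1.4, None, 1.3, None]
summary = summarize_probes(probes)

# Hypothetical counter: 1.25 MB seen in 1 s on a 100 Mb/s link.
occupation = utilization_pct(1_250_000, 1.0, 100_000_000)
```

The first function corresponds to the active case (injected probes), the second to the passive case (observed counters).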

Experimental setup
In the selected case study, critical variables of the channel were considered together with the response time of a controller when a control signal is sent to it through the network. In this way, the relationship between channel congestion and response times was observed, and it was found how the behavior of the system altered the operating requirements of a control operation [13].
In this work, a business network was used, in which measurements of different device and channel traffic parameters were considered. In the proposed methodology, four steps are used to examine the availability of the channel. In the first step, the variables are evaluated with Pearson's correlation to identify the redundant system parameters. Multivariate data have a dependency structure between the variables, and these relationships are evaluated pairwise and jointly. The dependence between two variables $x_j$ and $x_k$ is evaluated using the correlation coefficient $r_{jk}$, which has the following properties: $0 \le |r_{jk}| \le 1$; if there is an exact linear relationship between the variables, $x_j = a + b x_k$, then $|r_{jk}| = 1$; and $r_{jk}$ is invariant under linear transformations of the variables. It is expressed as
$$r_{jk} = \frac{s_{jk}}{s_j s_k}$$
where $s_{jk}$ is the covariance between the two variables and $s_j$, $s_k$ are their standard deviations.
Pearson's correlation computes a coefficient that measures the degree of the linear relationship between two variables; the coefficient is an index of the degree of covariation, or the structure of dependence, between linearly related variables. It is represented by $r$ and is obtained by standardizing the average of the products of the differential scores of each case in the two correlated variables:
$$r_{xy} = \frac{\sum_{i=1}^{n} x_i y_i}{n S_x S_y}$$
where $x_i$ and $y_i$ are the differential scores of each pair, $n$ is the number of cases, and $S_x$ and $S_y$ are the standard deviations of each variable. The absolute values of the correlation coefficients range between 0 and 1, while the signed result ranges from $-1$ to $+1$ [59]. It is defined as the ratio between the covariation and the square root of the product of the variation in $X$ and the variation in $Y$ [60]:
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
In the second step, the feature selector sequential forward floating selection (SFFS) is used to find the features that contain the most system information.
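The redundancy check of the first step can be sketched as follows. The code computes Pearson's coefficient as the ratio between the covariation and the square root of the product of the variations, and then keeps a feature only if it is not strongly correlated with a feature already kept. The threshold of 0.95, the feature names, and the sample values are assumptions chosen for illustration; the study does not state the cut-off it used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r: covariation over the root of the product of variations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)  # undefined for a constant variable


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if |r| with every kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical measurements: f2 is almost a scaled copy of f1.
data = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 4.0, 6.2, 7.9, 10.0],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_redundant(data)
```

In this toy data `f2` is discarded because it is nearly a linear function of `f1`, while `f3` is retained. The result depends on the order in which features are visited, which is one reason a relevance analysis is still needed afterwards.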
Among floating search methods, sequential forward floating selection (SFFS) has been commonly used [61]. In these methods, the number of features added or removed can change at each step, and these wrapper routines search a considerably smaller number of subsets.
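A compact sketch of the floating search idea is shown below: after each inclusion step, features are conditionally excluded for as long as the reduced subset beats the best subset of the same size seen so far. This is a generic textbook-style version, not the implementation used in the study; the function name `sffs`, the criterion callback `score`, and the toy criterion table are assumptions for illustration. In a wrapper setting, `score` would be the cross-validated accuracy of the classifier.

```python
def sffs(n_features, score, target_size):
    """Return {size: (score, subset)} with the best subset found per size.

    score(subset) takes a tuple of feature indices; higher is better.
    """
    selected = []
    best = {}
    while len(selected) < target_size:
        # Inclusion: add the feature that maximizes the criterion.
        candidates = [f for f in range(n_features) if f not in selected]
        add = max(candidates, key=lambda f: score(tuple(selected + [f])))
        selected.append(add)
        s = score(tuple(selected))
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Conditional exclusion: drop the least useful feature for as long as
        # the reduced subset beats the best subset of that size seen so far.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score(tuple(x for x in selected if x != f)))
            reduced = [x for x in selected if x != drop]
            s_red = score(tuple(reduced))
            if s_red <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (s_red, list(reduced))
    return best


# Toy criterion (hypothetical values): features 1 and 2 are weak alone
# but complementary, so the best pair does not contain the best single feature.
TOY = {frozenset(s): v for s, v in [
    ((0,), 0.60), ((1,), 0.50), ((2,), 0.50), ((3,), 0.10),
    ((0, 1), 0.70), ((0, 2), 0.70), ((0, 3), 0.65),
    ((1, 2), 0.90), ((1, 3), 0.55), ((2, 3), 0.55),
    ((0, 1, 2), 0.92), ((0, 1, 3), 0.75), ((0, 2, 3), 0.75), ((1, 2, 3), 0.91),
]}


def toy_score(subset):
    return TOY[frozenset(subset)]


result = sffs(4, toy_score, 3)
```

With this toy criterion, plain forward selection would stop at the pair (0, 1), whereas the exclusion step recovers the better pair (1, 2) after the third feature has been added.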
Sequential forward selection (SFS) is a bottom-up search procedure that adds new features to a feature set one at a time until the final feature set is reached. The characteristics of the database are selected beforehand for the design of the classifier, and scalar feature selection makes it possible to determine a sufficient number of characteristics. Likewise, a good selection of information-rich characteristics simplifies the design of the classifier. In addition to removing the redundant variables and choosing the most representative characteristics, evaluating their performance is important in the design of the classification system, for which the probability of classification error is considered [62]. With scalar feature selection, the characteristics are treated individually, and any class separability criterion can be adopted, such as the ROC (Receiver Operating Characteristic), the FDR (Fisher's Discriminant Ratio), or the one-dimensional divergence, among others. The value of the criterion $C(R)$ is calculated for each of the characteristics, $R = 1, 2, \ldots, m$. The characteristics are sorted in descending order of $C(R)$, and the $l$ features corresponding to the $l$ best values of $C(R)$ form the feature vector. The one-dimensional divergence $d_{ij}$ is calculated for each pair of classes, and the corresponding $C(R)$ for each characteristic is the minimum divergence value over all class pairs [63]:
$$C(R) = \min_{i,j} d_{ij}$$
Steps 1 to 3 of the following algorithm detail the cross-validation operation used to select the best feature set with the sequential search procedure:
Step 1. Divide the data into training and test sets.
Step 2. Cross-validate on the training data:
• Divide the training data into 10 equal parts, ensuring that all classes are represented in each part; nine parts are used for training and the remaining part for testing.
• Train the classifier model for each subset of variables, h, on each split, k, of the training data, and test it on the remaining part to obtain the performance, CV(h, k).
• Average the results over the 10 parts to obtain CV(h), and select the smallest subset of features, $S_{h^*}$, for which CV(h) is optimal for the next stage of the search.
Step 3. Evaluate the test data set with the smallest subset of features, $S_{h^*}$, from the search procedure that offers the best performance, training on the whole training set and evaluating the performance on the test set.
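The cross-validation stage of the algorithm above (the 10-part split, CV(h, k), the average CV(h), and the choice of the smallest optimal subset) can be sketched as follows. The helper names, the tolerance `tol`, the toy data, and the nearest-centroid stand-in classifier are all assumptions made to keep the example self-contained; the study uses a Bayes classifier at this point.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds=10):
    """Spread the samples of each class round-robin over the folds."""
    folds = [[] for _ in range(n_folds)]
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    for idxs in by_class.values():
        for j, i in enumerate(idxs):
            folds[j % n_folds].append(i)
    return folds


def cv_score(X, y, subset, fit_predict, n_folds=10):
    """CV(h): mean over the folds k of the accuracy CV(h, k)."""
    folds = stratified_folds(y, n_folds)
    accs = []
    for k, test in enumerate(folds):
        train = [i for j, f in enumerate(folds) if j != k for i in f]
        Xtr = [[X[i][c] for c in subset] for i in train]
        Xte = [[X[i][c] for c in subset] for i in test]
        pred = fit_predict(Xtr, [y[i] for i in train], Xte)
        accs.append(sum(p == y[i] for p, i in zip(pred, test)) / len(test))
    return sum(accs) / n_folds


def smallest_best_subset(X, y, subsets, fit_predict, tol=1e-9):
    """Among subsets whose CV(h) is (near-)optimal, return the smallest."""
    scores = {tuple(s): cv_score(X, y, s, fit_predict) for s in subsets}
    top = max(scores.values())
    best = min((s for s, v in scores.items() if v >= top - tol), key=len)
    return best, scores


def nearest_centroid(Xtr, ytr, Xte):
    """Stand-in classifier: assign each test row to the closest class mean."""
    groups = defaultdict(list)
    for row, label in zip(Xtr, ytr):
        groups[label].append(row)
    cents = {l: [sum(c) / len(c) for c in zip(*rows)] for l, rows in groups.items()}
    dist = lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(cents, key=lambda l: dist(row, cents[l])) for row in Xte]


# Hypothetical data: feature 0 separates the classes, feature 1 does not.
X = [[0.1 * i, i % 2] for i in range(10)] + [[5 + 0.1 * i, i % 2] for i in range(10)]
y = ["low"] * 10 + ["high"] * 10
best, scores = smallest_best_subset(X, y, [(0,), (1,), (0, 1)], nearest_centroid)
```

Here the subsets (0,) and (0, 1) reach the same cross-validated accuracy, so the smaller one is chosen, which mirrors the rule of preferring the smallest subset with optimal CV(h). The held-out test set from Step 1 is not touched during this selection.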
In the next step, the Bayes classifier is trained with the characteristics found in the previous two steps.
This classifier is based on Bayes' theorem and provides a general solution to the problem when the parameters are known [64] [65]. Given the classes $\omega_1, \omega_2, \ldots, \omega_M$ and an unknown pattern represented by a feature vector $x$, the $M$ conditional probabilities $P(\omega_i \mid x)$, $i = 1, 2, \ldots, M$, are formed. These are known as posterior probabilities. For two classes $\omega_1$, $\omega_2$, it is assumed that the prior probabilities $P(\omega_1)$, $P(\omega_2)$ are known. If $N$ is the total number of training patterns and $N_1$ and $N_2$ of them belong to $\omega_1$ and $\omega_2$ respectively, then $P(\omega_1) \approx N_1/N$ and $P(\omega_2) \approx N_2/N$. The class-conditional probability density functions $p(x \mid \omega_i)$, $i = 1, 2$, are also assumed to be known, and the basic Bayes rule is defined as
$$P(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, P(\omega_i)}{p(x)}, \qquad p(x) = \sum_{i=1}^{2} p(x \mid \omega_i)\, P(\omega_i)$$
The classification method is validated using a cross-validation procedure. The cross-validation strategy consists of dividing a set of samples of size $n$ into two data sets. One of these sets is used for training, and the results obtained are applied to the other set, which is used for the classification of samples or the estimation of the error [66]. The validation estimate is the average over all possible training sets of size $n - 1$. In order to determine an appropriate model, cross-validation is calculated for each member of the candidate family of models, $M_k$, $k = 1, \ldots, K$, and the model $M_{\hat{k}}$ is selected, where
$$\hat{k} = \arg\min_{k} \mathrm{CV}(M_k)$$
Cross-validation allows the performance of a classifier on a given problem to be observed and compared. The objective is to minimize the error. The classifier is trained until the minimum validation error is reached, taking care not to include training points in the validation.
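A minimal sketch of a Bayes classifier built from the rule described above is given below. It assumes Gaussian class-conditional densities and independent features (the "naive" assumption), which is a common choice but is not stated in the text; priors are estimated as class frequencies in the training set. The function names, the variance floor, and the two-feature sample data are hypothetical.

```python
from collections import defaultdict
from math import log, pi


def fit_gnb(X, y, var_floor=1e-9):
    """Estimate the prior, per-feature mean and variance for each class."""
    rows = defaultdict(list)
    for row, label in zip(X, y):
        rows[label].append(row)
    model = {}
    for label, rs in rows.items():
        n = len(rs)
        means = [sum(col) / n for col in zip(*rs)]
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(zip(*rs), means)]
        model[label] = (n / len(X), means, variances)  # prior ~ N_i / N
    return model


def log_scores(model, x):
    """log P(w_i) + log p(x | w_i); p(x) is omitted as it is the same for all classes."""
    out = {}
    for label, (prior, means, variances) in model.items():
        log_lik = sum(-0.5 * log(2 * pi * s2) - (v - m) ** 2 / (2 * s2)
                      for v, m, s2 in zip(x, means, variances))
        out[label] = log(prior) + log_lik
    return out


def predict(model, x):
    """Assign x to the class with the largest posterior probability."""
    scores = log_scores(model, x)
    return max(scores, key=scores.get)


# Hypothetical two-feature samples labeled by congestion class.
X_train = [[10, 0.10], [12, 0.20], [11, 0.15], [80, 0.90], [85, 0.80], [90, 0.95]]
y_train = ["low", "low", "low", "high", "high", "high"]
model = fit_gnb(X_train, y_train)
label_a = predict(model, [13, 0.20])
label_b = predict(model, [82, 0.85])
```

Working with logarithms avoids numerical underflow when several densities are multiplied, and dropping $p(x)$ does not change which class attains the maximum posterior.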

Proposed procedure
Based on the above, the most relevant parameters were tested through a practical laboratory implementation. The proposed procedure is shown in Figure 1, alongside a diagram of the web-based control system that uses a DC motor as the plant. Two scenarios were analyzed: the first was the collection of data in the business network, and the second was an emulation in a laboratory network (a P2P connection was implemented using an Ethernet channel built into a corporate network), in which traffic was generated in several states to simulate high and low channel congestion. In both cases, the data were passed through the feature selector, the Bayes classifier, the cross-validation, and the control operation of a motor, as described in the following sections.

Network structure and data acquisition
The data network was built on a heterogeneous architecture based on wireless technology [67][68], which is depicted in Figure 2. Measurements were acquired from a channel and then processed and represented in a specific scenario to validate the methodology used (see Figure 1).
The communication channel availability was characterized using monitoring equipment in the border router (connection to the Internet Service Provider, ISP) and in an internal router that accesses the corporate LAN using wireless techniques. The traffic measurements were acquired using the PRTG Network Monitor software, version 13.01, by Paessler. In order to execute different tests for different congestion states, two opposite levels of congestion were established as labels (i.e., low and high congestion). Channel congestion reduces the quality of service when the network carries more data than it can support; its effects are queueing delay, packet loss, blocking of new connections, and decreased network performance. Congestion was generated using the Traffic Jam tool of Quest Free Network Tools and the Ping application for packet transfer of variable size, together with low and high file transfer between the monitoring and control stations. The applications used for this data analysis were Wireshark 1.10.1 GNU and Capsa 7.72, both freely distributed. For each congestion level, 31 repetitions were performed to average the results and reduce noise effects.

Feature selector
In order to determine the redundant features, the Pearson correlation was calculated for each group. The results are shown in Figure 3, evidencing high correlation among features { 1 , 2 } and { 1 , 2 , 3 }. The whole feature set $x$ from each group (low and high congestion) was studied in order to test the linear dependence among variables and the degree of representation of each feature. The intra-class analysis allowed redundant features to be discarded, and the number of features was reduced to 14 linearly independent features. The variables of the intra-class analysis are listed in Table 2, where the set of variables reduced to 14 can be observed, and the next row represents the dynamics of the system after eliminating the correlated variables. Then, a relevant feature selection was applied to preserve most of the relevant information of the original data according to an optimality criterion that directs the representation context. For this work, the dataset was normalized (using mean and standard deviation estimations), and each group was taken as a class, where the optimality criterion was the classification accuracy of a Bayesian classifier [69] [70]. Weights were assigned to each feature according to its classification capacity. This feature weighting is shown in Figure 4, where $x_2$ has a representation capacity of 51.7% (the highest weight), followed by $x_6$ with 44.14%. Likewise, features $x_{12}$, $x_7$, and $x_{22}$ also contribute, to a lesser degree.
Figure 4. Feature weighting results
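The normalization by mean and standard deviation mentioned in this section can be sketched as a column-wise z-score. The text does not say whether the population or the sample standard deviation was used; the population form (`pstdev`) is assumed here, and the function name and sample rows are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(X):
    """Center each feature on its mean and scale it by its standard deviation."""
    cols = list(zip(*X))
    mus = [mean(c) for c in cols]
    # A constant column has zero deviation; scale by 1.0 to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in X]


# Hypothetical rows: the second feature is constant across samples.
raw = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
Z = zscore_columns(raw)
```

Normalizing in this way puts features measured in different units (for example, a transfer rate and a processor load percentage) on a comparable scale before the classifier-based weighting.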

Training, validation and implementation
In order to determine the most representative features, the accuracy of a Bayesian classifier was used as the optimization criterion. This classifier was validated using a 30-fold cross-validation strategy (70% for training and 30% for validation). Experimental tests were performed in the laboratory using the three most relevant features. The communication channel was characterized, and with the trained classifier, the traffic conditions in terms of the instantaneous channel availability were determined. The response of the DC motor was suitable.

RESULTS AND DISCUSSION
Using the three most relevant features, $x_2$, $x_6$, and $x_{12}$, a classification accuracy of 99.9% was achieved. This indicates a high degree of separability between classes (low between 0% and 50%, and high between 51% and 100% congestion or occupation of the channel). Therefore, as shown in Figure 5, the location of a 3-dimensional point in this characterization space can determine the availability of a communication channel for remote monitoring and control tasks in a real-time system. The estimated time interval for a motor to reach approximately 99% of an assigned velocity is 4τm, where the mechanical time constant, τm, expressed in ms, is the time for the rotor to reach 63% of its final velocity. Here, the effects of friction, load, and load inertia were not considered. In order to reduce operating costs even more, two of the three most relevant features (i.e., $x_2$ and $x_6$) were tested in the implementation, which achieved a classification accuracy of 99.63%.
For an Ethernet connection of 100 Mb/s, in low traffic conditions, the control system via web required a time interval of 14 ms. The data transfer took 9 ms. As shown in Figure 6, the data transfer with the client represents the longest time (the notation 00:00:00:000000 indicates the number of hours, minutes, seconds, and microseconds). Likewise, the connection time, the three-way handshake time, and the server response time represent 58.7% (7 ms) of the time used for control operations, while the server transfer time is 80 µs; therefore, the connection protocol accounts for the larger share of the execution time. Another test was performed with an Ethernet channel of 10 Mb/s and a P2P connection. In Figure 7, it can be seen that as the data quantity increased, the network latency increased proportionally, which implies packet loss. Thus, this form of access through the Ethernet standard caused delay times in the channel, mainly due to the number of packets and simultaneous access. When the channel's congestion is increased to 10%, there is a packet loss of 18%, and the control time for the DC motor goes from 14 ms (without congestion) to 272 ms. In the same way, when congestion increases up to 90%, the control time increases to 707 ms, as summarized in Table 3. Experimentally, different tests (many congestion levels) were performed to validate the 2-dimensional characterization space (using $x_2$ and $x_6$) and to determine a time threshold for control operations via web on the DC motor. A time of 20 ms was the maximum time delay for a successful control process via web associated with the congestion level of the communication channel. This time threshold indirectly defined the class boundary in the characterization space for the classifier training and the validation process.
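The role of the 20 ms threshold can be illustrated with the control times reported above. The sketch below only restates those figures; the names `CONTROL_TIME_LIMIT_MS`, `measured`, and `channel_available` are hypothetical, and mapping "without congestion" to a level of 0% is an assumption made for the example.

```python
# Maximum delay for a successful control operation via web, as reported in the text.
CONTROL_TIME_LIMIT_MS = 20

# Congestion level (%) -> measured control time (ms), as reported in the text.
# "Without congestion" is represented here as 0 % (assumption).
measured = {0: 14, 10: 272, 90: 707}


def channel_available(control_time_ms, limit_ms=CONTROL_TIME_LIMIT_MS):
    """The channel is usable for control if the delay stays within the limit."""
    return control_time_ms <= limit_ms


availability = {level: channel_available(t) for level, t in measured.items()}
```

Of the three reported operating points, only the uncongested one stays within the limit, which is consistent with using this threshold to label training samples for the classifier.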

CONCLUSION
Although the methodology of [13] characterizes channel traffic in a similar way, it is not comparable to the proposed method, which also involves device variables and identifies and eliminates the parameters that do not carry information relevant to the system. To this end, an approach for the characterization of a communication channel was discussed and implemented in a real-time system for remote monitoring and control. In this study, a feature set of 32 attributes extracted from a communication channel was analyzed using multivariate data analysis and machine learning techniques. The redundant features were discarded, and the relevant feature set was selected. The optimization criterion used was the classification accuracy of a Bayesian classifier. The results obtained using multivariate data analysis coincide with expert intuition, which serves as a form of validation of the reduced characterization space. Thus, the proposed approach can be more reliable when the nature of the features is not previously known. Additionally, in this study, a time threshold was determined after several experiments via web on a real-time system (i.e., control and monitoring of a DC motor) with different congestion levels. This time threshold was a key factor in defining the plane between classes in the feature space. The experimental results determined the terms and conditions of use for control functions and variables that require critical response times in a particular process. Although this approach allows automatic decisions about channel availability, establishing predictions about the length of the time window when the availability conditions are favorable requires, as future work, alternatives such as evolutionary algorithms and forecasting routines.