Analysis of OPC Data Using Federated Learning: An Evaluation of Performance and Privacy

Abstract

This study examines the benefits of applying federated learning (FL) technology to OPC (Operational Performance Control) systems within industrial automation and data analysis processes. FL enables each production facility to process its data locally while only transmitting model parameters to a central server, thereby preserving data privacy. This approach provides significant advantages in industrial environments, particularly concerning data privacy and communication costs. The study evaluates FL's potential to ensure data privacy, reduce communication costs, improve efficiency in training time, and deliver high performance in predictive maintenance and quality estimation. Model performance was analyzed using accuracy, F1 score, precision, and loss metrics; the results demonstrated that FL achieved a 90% accuracy rate, offering competitive performance compared to centralized modeling. In predictive maintenance and quality analysis specifically, FL achieved 85-88% accuracy while reducing network data load by 65%. These findings validate that FL provides a secure, cost-effective, and efficient solution for industrial data analysis processes by eliminating the need for centralized data collection. In conclusion, FL and OPC integration supports data privacy, cost savings, and communication efficiency in industrial processes. The study highlights that FL could become a prevalent technology in industrial data analysis, establishing a new standard particularly in digital manufacturing processes.

1. Introduction

Operational Performance Control (OPC) systems are critical tools widely used in industrial production processes to monitor performance, optimize processes, and enhance production efficiency. These systems use data obtained from numerous sensors within the production process to monitor and evaluate machine performance and product quality in real time [5]. With the advent of Internet of Things (IoT) technologies, the volume and diversity of this data have increased rapidly, making data processing increasingly complex. However, as OPC systems are structured to transmit data directly to a central server, they face significant challenges, such as data privacy and security risks [8]. At this point, federated learning (FL) technology offers an effective solution to these issues by providing a decentralized data processing approach. FL preserves data privacy and reduces communication costs by processing data on local devices and transmitting only model parameters to the server [1] [2]. One of the primary advantages of FL is that it allows each device or sensor to conduct local model training without centralizing the data. Consequently, only the updated model parameters are transmitted to the central server. This approach minimizes privacy risks while simultaneously reducing network bandwidth usage and communication costs [18]. The structure of FL not only significantly contributes to ensuring data privacy but also offers a substantial opportunity to enhance cost efficiency and perform rapid analyses in production processes [4]. These advantages are especially important in industrial environments where numerous sensors are aggregated, as the continuous transfer of data to a central server can place a heavy load on the network infrastructure and raise concerns about data privacy [15]. Federated learning addresses these security and efficiency issues, playing a crucial role in industrial data analysis processes.
In industrial IoT (IIoT) environments, security and privacy are essential for protecting sensitive data. Data collected in IIoT environments includes machine operation information, proprietary production process data, and facility-based performance data. As this data is commercially valuable, it requires protection against unauthorized access. However, traditional centralized data collection methods are more susceptible to privacy breaches and data leakage risks due to the concentration of all data in one location [9]. With FL, processing this data locally on each device eliminates the need for centralized data collection, significantly reducing security risks. FL technology provides highly effective solutions with privacy-preserving mechanisms, particularly in sectors such as healthcare, finance, and industry, where

2. Literature Review

Federated learning (FL) is a method initially developed by Google that enables the training of machine learning models on mobile devices while preserving data privacy [1]. The primary aim of FL is to process data locally on devices rather than transferring it to a central server, and then to combine model parameters on a central server [2]. This feature allows FL to play a significant role in enhancing data privacy and security in industrial applications. Additionally, FL is widely used in fields such as healthcare, finance, retail, and education [3]. Research shows that FL reduces data transfer costs by eliminating the need for centralized data collection [4]. However, the distributed nature of FL requires minimizing communication costs and addressing challenges related to device heterogeneity [18]. Operational Performance Control (OPC) encompasses data collection and analysis systems aimed at enhancing efficiency and quality in industrial production processes [5]. OPC systems gather sensor data and device status information to perform performance analyses and integrate with decision support systems [6]. OPC has a data-driven structure designed to increase efficiency in the production process and to detect quality issues proactively [7]. However, the reliance of OPC systems on data collected on a central server raises various concerns regarding data privacy and security [8]. In this context, FL has the potential to enhance the efficiency of OPC systems while preserving data privacy [9]. In industrial applications, predictive maintenance and quality estimation are areas where both OPC and FL methods can be effectively applied. Predictive maintenance aims to monitor the performance of machines and equipment to anticipate failure risks [10]. The use of FL in predictive maintenance applications ensures data security by processing data locally [11].
Research on predictive maintenance and quality estimation focuses on detecting potential faults in the production process through real-time data analysis and enhancing production efficiency [12]. Analyzing sensor data using FL, in particular, contributes to a centralized predictive model while preserving the local data of each device [13]. When OPC systems are combined with FL, they offer a strong synergy in enhancing the performance of industrial production processes and ensuring data privacy. By safeguarding data privacy and security in OPC data through FL, production data can be analyzed without centralization, thus preserving data confidentiality [14]. This integration holds promise, particularly for reducing costs and increasing efficiency in large-scale production facilities [15]. Additionally, this method is regarded as an effective solution for synchronizing across multiple devices and minimizing data transfer costs [4]. Finally, studies in the field of FL and OPC indicate that the combination of these two technologies creates a new paradigm in industrial analysis and

3. Materials and Methods

The client and server codes used in this study were developed based on the federated learning (FL) architecture, enabling data to be processed on local devices without being transferred to a central server. Within the FL application, each client performs model training on its local data and transmits the updated model parameters to the central server. The server aggregates and optimizes the incoming updates to create a general model, which is then sent back to the clients. In this process, communication protocols and data transfer methods are designed to preserve data privacy.

3.1. Data Collection and Preparation

The data collection and preparation process is of critical importance in federated learning applications. For the FL model to operate securely and efficiently,

Table 1: Dataset Column Descriptions and Types

The parameters in Table 1 represent the primary features and target variables used during model training. The steps for handling missing

3.2. Federated Learning Architecture

Federated learning (FL) is a structure that offers a decentralized

• Initialization Phase: The server creates an initial model and sends it to the selected clients.

• Local Training: Clients perform a set number of model training steps on their local

• Model Updates: Clients send the trained model updates (e.g., weight changes) back to the server.

• Model Aggregation: The server combines the received updates using a weighted average or another algorithm and updates the global model.

• Cycle Repetition: The updated model is redistributed to the clients, and the process is repeated.
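The five phases listed above can be illustrated with a minimal sketch. It assumes a one-parameter linear model, plain gradient descent for local training, and aggregation weighted by each client's sample count (the FedAvg scheme of [2]); the function names and the toy client data are hypothetical and are not taken from the study's client and server code.

```python
from typing import List, Tuple

Weights = List[float]
Dataset = List[Tuple[List[float], float]]  # (features, target) pairs


def local_update(global_w: Weights, data: Dataset,
                 lr: float = 0.1, steps: int = 5) -> Weights:
    """Local training: a few gradient steps of linear regression on local data."""
    w = list(global_w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def aggregate(updates: List[Tuple[Weights, int]]) -> Weights:
    """Model aggregation: average of client weights, weighted by sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]


def run_rounds(clients: List[Dataset], dim: int, rounds: int = 20) -> Weights:
    global_w = [0.0] * dim                      # initialization phase
    for _ in range(rounds):                     # cycle repetition
        # local training, then model updates returned to the server
        updates = [(local_update(global_w, d), len(d)) for d in clients]
        global_w = aggregate(updates)           # model aggregation
    return global_w


# Two hypothetical clients whose local data both follow y = 2 * x.
clients = [
    [([1.0], 2.0), ([2.0], 4.0)],
    [([3.0], 6.0)],
]
final_w = run_rounds(clients, dim=1)
```

In this toy setting the global weight approaches 2, although neither client ever shares its raw samples with the server; only the weight vectors and sample counts are exchanged.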

As shown in Figure 1, the Federated Learning architecture is implemented step by step in this manner. There are three main types of this architecture:

3.2.1. Horizontal Federated Learning

Horizontal FL, as shown in Figure 2, is used in scenarios where devices representing different users participate with the same data features. For example, model training conducted on the phones of different users using the same mobile application is an instance of horizontal FL. In this scenario, the dataset on each client contains similar features, but the data is spread across different users.
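To make the horizontal setting concrete, the following sketch splits a small table by owner so that every shard keeps the same feature columns but holds different rows. The facility labels, column names, and values are hypothetical placeholders and do not correspond to the dataset described in Table 1.

```python
from typing import Dict, List

# Hypothetical feature schema shared by every participant.
FEATURES = ["temperature", "vibration", "pressure"]

# Hypothetical records; each row belongs to exactly one facility.
records = [
    {"facility": "A", "temperature": 71.2, "vibration": 0.31, "pressure": 1.8},
    {"facility": "A", "temperature": 69.8, "vibration": 0.28, "pressure": 1.7},
    {"facility": "B", "temperature": 75.1, "vibration": 0.44, "pressure": 2.1},
    {"facility": "C", "temperature": 68.4, "vibration": 0.25, "pressure": 1.6},
]


def partition_horizontally(
    rows: List[Dict[str, object]],
) -> Dict[str, List[Dict[str, object]]]:
    """Group rows by facility; every shard keeps the full feature set."""
    shards: Dict[str, List[Dict[str, object]]] = {}
    for row in rows:
        features_only = {name: row[name] for name in FEATURES}
        shards.setdefault(str(row["facility"]), []).append(features_only)
    return shards


shards = partition_horizontally(records)
# Same columns everywhere, different rows per facility.
same_schema = all(
    set(row) == set(FEATURES) for rows in shards.values() for row in rows
)
```

A vertical split would instead give each party a different subset of the columns for the same set of rows.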

3.2.2. Vertical Federated Learning

Vertical FL, as shown in Figure 3, enables model training among organizations representing the same users but with different

3.2.3. Transfer Federated Learning

Transfer FL, as shown in Figure 4, is an architecture that facilitates the sharing and transfer of knowledge between different tasks. In this method, a model previously trained on one dataset or problem is adapted to another dataset and task. It is particularly useful in situations with limited data or where heterogeneous data exists.

In this study, horizontal FL is adopted. Each facility performs independent model training using local data and only transmits the updated model parameters to the server. The central server aggregates parameter updates from all clients to create a general model. The client and server codes used in the project enable clients to train models on local devices and transmit only the model parameters to the server. This architecture enhances

• Local Model Training: Each client performs independent model training using local data.

• Transmission of Model Parameters to the Server: After training, clients transmit the updated model parameters to the server.
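The two client-side steps above can be sketched as follows. This is an illustrative example only: it assumes a binary logistic-regression model and a JSON payload, neither of which is specified in the text, and the names train_locally and build_update are hypothetical. The point it shows is that the message leaving the client contains parameters and a sample count, not the local rows.

```python
import json
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (features, binary label)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_locally(weights: List[float], bias: float, data: List[Sample],
                  lr: float = 0.5, epochs: int = 200) -> Tuple[List[float], float]:
    """Gradient descent for logistic regression using this client's data only."""
    w, b = list(weights), bias
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj / len(data)
            grad_b += err / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def build_update(w: List[float], b: float, num_samples: int) -> str:
    """Serialize what is sent to the server: parameters and a count, no raw rows."""
    return json.dumps({"weights": w, "bias": b, "num_samples": num_samples})


# Hypothetical local data: low readings are normal (0), high readings are faulty (1).
local_data: List[Sample] = [([0.0], 0), ([1.0], 0), ([3.0], 1), ([4.0], 1)]
w, b = train_locally([0.0], 0.0, local_data)
payload = build_update(w, b, len(local_data))
```

The sample count is included because a server that uses a weighted average needs it to weight each client's contribution.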

3.4. System Architecture

Federated Learning (FL) enables machine learning model training across different

The choice of a centralized federated learning architecture in this study is the most suitable solution for analyzing OPC

4. Results

This study aims to analyze OPC (Operational Performance Control)

Model Performance

It was observed that FL-based models provide lower accuracy compared to centralized learning models. However, advantages such as data privacy and reduced communication costs make this accuracy difference reasonable. Particularly in industrial environments where data privacy is critical, it is important to achieve an optimal balance between data accuracy and privacy. It has been noted that while FL offers privacy advantages over centralized models, it may experience some accuracy losses [2]. Nevertheless, FL provides a secure modeling environment by ensuring data privacy through local analysis without transferring data to a central server. This study provides an in-depth examination of the advantages of applying the federated learning (FL) approach to OPC (Operational Performance Control) systems in industrial environments, specifically in terms of data privacy, cost efficiency, and predictive analytics. FL enables data privacy by allowing each facility to train models locally on its data without sending it directly to a central server. Consequently, data is processed on each facility's own devices, and only model parameters are transmitted to the central server. This approach offered by the FL architecture presents a significant solution for data privacy and security in industrial data analysis processes. The performance of the model developed with FL was evaluated using metrics such as accuracy, F1 score, loss, and precision. The model achieved an accuracy rate above 90%, an F1 score of 87%, and a precision metric of 85%. This success highlights the importance of FL as a high-accuracy solution, particularly in industrial environments that require data privacy. Additionally, during the model training process, the loss value continuously decreased and stabilized, indicating that the model was able to make increasingly accurate predictions.
In terms of predictive maintenance and quality analysis, FL enabled the early detection of potential failures and quality deviations in production processes. In predictive maintenance analyses, the model achieved an 88% accuracy rate, contributing to the optimization of maintenance processes and helping to reduce downtime by anticipating potential failures. In predictive quality analyses, an accuracy rate of 85% allowed for the prediction of quality deviations on the production line. These accuracy rates demonstrate that FL provides comparable performance to centralized modeling while preserving data privacy. The FL architecture also offered a significant advantage in reducing communication costs in industrial data analysis. In this study, the use of the FL model allowed only model parameters to be transmitted to the server, reducing network data load by 65%. Thus, in industrial environments where large datasets are analyzed, cost efficiency was achieved, and communication costs were significantly reduced. In conclusion, the federated learning approach stands out in industrial data analysis for its advantages in security, privacy, cost efficiency, and high accuracy in predictive analytics. Our study emphasizes the need for FL to become a standard in industrial automation processes, indicating that FL technology will likely be widely used in future industrial data analyses. These findings support the potential of FL and OPC integration to enhance industrial data security and efficiency.
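For reference, the accuracy, precision, and F1 metrics named above can be computed from predicted and true labels as in the sketch below. It assumes a binary classification task, which the text does not state explicitly, and the label vectors are hypothetical examples rather than outputs of the study's model.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = failure expected, 0 = normal operation.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Because F1 combines precision and recall, reporting it alongside precision also fixes the implied recall, which is useful when comparing the federated model against a centralized baseline.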
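The text does not state how the reduction in network data load was measured. One plausible way to estimate such a figure is to compare the bytes exchanged during federated training (model download plus upload, per client, per round) with the bytes needed to upload all raw data once. The sketch below follows that assumption; every number in it is a hypothetical placeholder and none is taken from the study.

```python
def federated_traffic_bytes(num_clients: int, rounds: int,
                            num_parameters: int,
                            bytes_per_parameter: int = 4) -> int:
    """Total bytes moved in FL: each client downloads and uploads the model each round."""
    model_bytes = num_parameters * bytes_per_parameter
    return num_clients * rounds * 2 * model_bytes


def centralized_traffic_bytes(num_clients: int, raw_bytes_per_client: int) -> int:
    """Total bytes moved when every client uploads its raw data once."""
    return num_clients * raw_bytes_per_client


def load_reduction(fl_bytes: int, central_bytes: int) -> float:
    """Fraction of network load saved by FL relative to centralized collection."""
    return 1.0 - fl_bytes / central_bytes


# Hypothetical setting: 4 facilities, 50 rounds, a 1M-parameter float32 model,
# and 500 MB of raw sensor data per facility.
fl_bytes = federated_traffic_bytes(num_clients=4, rounds=50,
                                   num_parameters=1_000_000)
central_bytes = centralized_traffic_bytes(num_clients=4,
                                          raw_bytes_per_client=500_000_000)
reduction = load_reduction(fl_bytes, central_bytes)
```

Under this accounting the saving depends on the ratio of model size times round count to raw data volume, so a large model or many rounds can shrink or even reverse the benefit.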

5. Discussion and Conclusion

The results of this study demonstrate that federated learning technology offers numerous advantages when applied to industrial automation processes, particularly on large and sensitive datasets such as OPC data. These advantages include data privacy, reduced communication costs, training time, and the effectiveness of predictive analyses. The ability of FL to ensure

6. Acknowledgments

The authors gratefully acknowledge the Digital Transformation Center at trex Smart Manufacturing Systems for providing access to the OPC

References

[1] Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., ... & Zhao, S. (2021). Advances and open problems in federated learning. Foundations and trends® in machine learning, 14(1-2), 1-210.

[2] McMahan, B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017, April). Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics (pp. 1273-1282). PMLR.

[3] Yang, Q., Liu, Y., Chen, T., & Tong, Y. (2019). Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST), 10(2), 1-19.

[4] Bonawitz, K. (2019). Towards federated learning at scale: System design. arXiv preprint arXiv:1902.01046.

[5] Garcia, J. M., Jeschke, S., Brecher, C., Song, H., & Rawat, D. B. (2017). Industrial Internet of Things: Challenges and Research Roadmap. In S. Jeschke, C. Brecher, H. Song, & D. B. Rawat (Eds.), Industrial Internet of Things: Cybermanufacturing Systems (pp. 70-405). Springer.

[6] Grieves, M., & Vickers, J. (2017). Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. Transdisciplinary perspectives on complex systems: New findings and approaches, 85-113.

[7] Lee, J., Bagheri, B., & Kao, H. A. (2015). A cyber-physical systems architecture for industry 4.0-based manufacturing systems. Manufacturing letters, 3, 18-23.

[8] Xu, H., Yu, W., Griffith, D., & Golmie, N. (2018). A survey on industrial Internet of Things: A cyber-physical systems perspective. IEEE Access, 6, 78238-78259.

[9] Shokri, R., & Shmatikov, V. (2015, October). Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (pp. 1310-1321).

[10] Jardine, A. K., Lin, D., & Banjevic, D. (2006). A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical systems and signal processing, 20(7), 1483-1510.

[11] Nilsson, A., Smith, S., Ulm, G., Gustavsson, E., & Jirstrand, M. (2018, December). A performance evaluation of federated learning algorithms. In Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning (pp. 1-8).

[12] Mobley, R. K. (2002). An Introduction to Predictive Maintenance. Elsevier Science, 2, 485-520.

[13] Sattler, F., Müller, K. R., & Samek, W. (2020). Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints. IEEE transactions on neural networks and learning systems, 32(8), 3710-3722.

[14] Hard, A., Rao, K., Mathews, R., Ramaswamy, S., Beaufays, F., Augenstein, S., ... & Ramage, D. (2018). Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604.

[15] Geyer, R. C., Klein, T., & Nabi, M. (2017). Differentially private federated learning: A client level perspective. arXiv preprint arXiv:1712.07557.

[16] Konecny, J. (2016). Federated Learning: Strategies for Improving Communication Efficiency. arXiv preprint arXiv:1610.05492.

[17] Nguyen, D. C., Ding, M., Pathirana, P. N., Seneviratne, A., Li, J., Niyato, D., & Poor, H. V. (2021). Federated learning for industrial Internet of Things in future industries. IEEE Wireless Communications, 28(6), 192-199.