Successful data analytics strategies for Industry 4.0 require dynamically selecting application data from a subset of devices and routing it to where it is needed. This data must then flow through processing at the edge, or upstream at a core or cloud facility. In addition to managing data flows, Industry 4.0 practitioners need the ability to move data and model artifacts to the infrastructure that hosts their problem-solving applications.
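As an illustration of this routing pattern, the following minimal sketch uses the Kafka Streams API to send readings from a selected subset of devices to edge processing and everything else upstream. The topic names device.readings, edge.analytics, and core.upstream, the device allow-list, and the assumption that messages are keyed by device ID are all hypothetical.

```java
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class DeviceRouter {

    public static void main(String[] args) {
        // Devices whose data the edge application currently needs (hypothetical subset).
        Set<String> edgeDevices = Set.of("pump-017", "press-042");

        StreamsBuilder builder = new StreamsBuilder();

        // Raw readings, keyed by device ID (assumed topic name and keying).
        KStream<String, String> readings = builder.stream("device.readings");

        // Route the selected subset to edge processing ...
        readings.filter((deviceId, reading) -> edgeDevices.contains(deviceId))
                .to("edge.analytics");

        // ... and everything else upstream to the core or cloud facility.
        readings.filterNot((deviceId, reading) -> edgeDevices.contains(deviceId))
                .to("core.upstream");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "device-router");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Because the two predicates are complementary, each reading is routed to exactly one of the two topics; changing the allow-list is enough to redirect a device's data without touching the producers.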
Selecting the most efficient strategies for IIoT data management, processing, and application deployment requires weighing several architectural considerations.
This document has shown how the Confluent Platform can be applied to the challenges of IIoT and Industry 4.0. The Confluent event streaming platform, built on Apache Kafka, empowers organizations in the industrial sector to access IIoT device data easily as real-time streams. These event streams can then feed many traditional big data analytics (BDA) tools, letting staff apply their existing skills to develop operational and corporate IT business intelligence for IIoT environments. If BDA technologies, algorithms, and techniques can be leveraged to develop intelligent cyber-physical IIoT systems, the promise of Industry 4.0 will be fulfilled more quickly.
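As a small illustration of how an event stream can feed downstream tooling, the sketch below reads device data with the standard Kafka Java consumer, the usual integration point for BDA pipelines. The topic name device.readings and the string-encoded values are assumptions.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BdaFeed {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "bda-feed");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Subscribe to the device stream (assumed topic name).
            consumer.subscribe(List.of("device.readings"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Hand each reading to the downstream analytics tool of choice.
                    System.out.printf("device=%s reading=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```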
The convergence of IIoT edge analytics and BDA on the Confluent Platform is already producing measurable results in the automotive industry and elsewhere. The extreme scale required for autonomous driving applications demonstrates that the Kafka engine can handle the frequency, volume, and variety of IIoT environments. This document did not explore the scale limits of Kafka. Instead, Dell EMC developed a data generator capable of producing large numbers of high-frequency data streams, and that design verified that transforming raw streams into messages for the Confluent event streaming platform is robust.
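Dell EMC's generator itself is not reproduced here; the sketch below only illustrates the general shape of such a tool, using the standard Kafka Java producer to emit high-frequency synthetic readings for many simulated devices. The topic name, the timestamp,temperature message format, and the device count are hypothetical.

```java
import java.util.Properties;
import java.util.Random;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SensorStreamGenerator {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        Random random = new Random();
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            while (true) {
                // One reading per simulated device per iteration: "timestamp,temperature".
                for (int device = 0; device < 1000; device++) {
                    double temperature = 20.0 + 10.0 * random.nextGaussian();
                    String value = System.currentTimeMillis() + "," + temperature;
                    producer.send(new ProducerRecord<>("device.readings",
                            "device-" + device, value));
                }
                Thread.sleep(10); // nominally ~100,000 readings/second, ignoring send overhead
            }
        }
    }
}
```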
The Confluent Platform combines open-source Apache Kafka with community-developed and commercial features to produce a complete solution for IIoT, including enterprise support. This document described integrating the event streaming engine, Confluent Replicator, the Confluent Connector for HDFS, and KSQL in an end-to-end anomaly detection use case. Dell EMC used the integration of Confluent Replicator with the Kafka schema registry to coordinate data schema changes between the edge and the core. The document also demonstrated how to combine traditional big data analytics tools with this event streaming pipeline.
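The anomaly detection in this document was expressed in KSQL; the sketch below shows the same kind of windowed threshold check written against the Kafka Streams Java API instead. The topics, the CSV message format (carried over from the generator sketch above), and the thresholds are all assumptions.

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class AnomalyDetector {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Readings keyed by device ID; values are "timestamp,temperature" (assumed format).
        KStream<String, String> readings = builder.stream("device.readings");

        readings
                // Keep only readings that breach a hypothetical temperature threshold.
                .filter((deviceId, csv) -> Double.parseDouble(csv.split(",")[1]) > 100.0)
                // Count breaches per device over one-minute tumbling windows.
                .groupByKey()
                .windowedBy(TimeWindows.of(Duration.ofMinutes(1)))
                .count()
                .toStream()
                // Flag devices with more than five breaches in a window as anomalous.
                .filter((window, breaches) -> breaches > 5)
                .map((window, breaches) -> KeyValue.pair(window.key(), "breaches=" + breaches))
                .to("device.anomalies");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "anomaly-detector");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        new KafkaStreams(builder.build(), props).start();
    }
}
```

In KSQL, the equivalent logic would be a windowed COUNT over a tumbling window with a HAVING clause; the Kafka Streams form is shown here only to keep the examples in one language.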
These designs close the loop, following MLOps best practices, from model development with data collected at the edge to model deployment back to the edge.
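As a minimal sketch of the deployment half of that loop, assuming model artifacts are published to a Kafka topic at the core and replicated to the edge with Confluent Replicator, the following edge-side service writes each artifact where a local scoring application can pick it up. The topic name model.artifacts, the version-as-key convention, and the /models path are all hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class EdgeModelUpdater {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "edge-broker:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "edge-model-updater");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props)) {
            // Model artifacts replicated from the core land on this topic (assumed name).
            consumer.subscribe(List.of("model.artifacts"));
            while (true) {
                for (ConsumerRecord<String, byte[]> record : consumer.poll(Duration.ofSeconds(5))) {
                    // Key carries the model version (assumed convention); value is the artifact bytes.
                    Path target = Path.of("/models/" + record.key() + ".bin");
                    Files.write(target, record.value());
                    // A local scoring application watching /models can then hot-swap the model.
                }
            }
        }
    }
}
```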