The SDP core architecture contains several components that provide live cache, long-term storage, container orchestration, and management, among other functions. This section gives an overview of the components that are relevant to this DVD. See the Streaming Data Platform Documentation Page for more details.
Pravega
Pravega is an open-source streaming storage system that treats the stream as a first-class primitive for storing and serving continuous, unbounded data. The open-source project is driven and designed by Dell Technologies. Pravega automatically scales individual data streams and processes each received event exactly once, which preserves ordering guarantees and data integrity. It offers very low latency and high throughput for large-scale edge data ingest, with a high degree of storage efficiency. Tier 1 storage is an append-only log: new writes are appended to the tail, and the most recently written data is served with high-performance reads. Pravega asynchronously migrates data from Tier 1 to Tier 2, which is high-throughput, cost-effective storage. Every write is acknowledged as soon as the data is on Tier 1 storage, which keeps write latency low. See the Pravega site for more information.
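The two-tier write path described above can be illustrated with a small toy model. This is a minimal sketch only, not Pravega's API or implementation: the class name `TieredLog`, the `tier1_capacity` parameter, and the synchronous `migrate` call are all assumptions made for illustration (a real system performs the migration asynchronously in the background).

```python
from dataclasses import dataclass, field


@dataclass
class TieredLog:
    """Toy model of an append-only log with two storage tiers (hypothetical, not Pravega's API)."""

    tier1_capacity: int = 3                      # assumed: number of recent events kept in Tier 1
    tier1: list = field(default_factory=list)    # most recently written events
    tier2: list = field(default_factory=list)    # older events that have been migrated

    def append(self, event: bytes) -> int:
        """Append to the tail of Tier 1 and return the event's offset.

        Returning models the acknowledgement: it does not wait for Tier 2.
        """
        offset = len(self.tier2) + len(self.tier1)
        self.tier1.append(event)
        return offset

    def migrate(self) -> int:
        """Move the oldest Tier 1 events to Tier 2 and return how many were moved."""
        moved = 0
        while len(self.tier1) > self.tier1_capacity:
            self.tier2.append(self.tier1.pop(0))
            moved += 1
        return moved

    def read(self, offset: int) -> tuple:
        """Return (tier_name, event) for the event at the given offset."""
        if offset < len(self.tier2):
            return "tier2", self.tier2[offset]
        return "tier1", self.tier1[offset - len(self.tier2)]


# Usage: five writes are acknowledged immediately, then older data is migrated.
log = TieredLog(tier1_capacity=2)
offsets = [log.append(f"event-{i}".encode()) for i in range(5)]
log.migrate()
tier, event = log.read(offsets[-1])   # the newest event is still served from Tier 1
```

In this model, offsets stay stable across migration, so a reader does not need to know which tier currently holds an event.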
Kubernetes
Kubernetes (K8s) is an open-source platform for container orchestration. SDP uses K8s for all container deployments in single-node configurations.
Management platform
The management platform is Dell Technologies proprietary software. It integrates the other components and adds security, performance, configuration, and monitoring features. It includes a web-based user interface for administrators, application developers, and end users.
Networking and load balancers
The SDP installation uses a single IP address plus three load balancer IP addresses. It can be deployed with static or dynamic IP addresses. Hosts that access the SDP management UI must have their hosts file or DNS updated so that the SDP names resolve.
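For hosts that rely on a hosts file rather than DNS, the entries follow the usual format of an IP address followed by one or more names. The sketch below only formats such lines. The addresses (taken from the 192.0.2.0/24 documentation range) and the `example.com` names are placeholders, not the names or address assignments that SDP actually uses; take the real values from the installation.

```python
def hosts_entries(names_by_ip: dict) -> list:
    """Format hosts-file lines: one IP address followed by its names."""
    return [f"{ip}\t{' '.join(names)}" for ip, names in names_by_ip.items()]


# Placeholder values for illustration only.
example = {
    "192.0.2.10": ["sdp-ui.example.com"],
    "192.0.2.11": ["sdp-lb1.example.com", "sdp-lb1"],
}

for line in hosts_entries(example):
    print(line)
```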
Containers
The SDP deployment creates a set of containers for Pravega store management, core configuration, monitoring, cataloging, and storage management. Additional containers are created for user-defined projects and streams, and Flink clusters are created for the InfluxDB and TimescaleDB data sinks.
Long-term store
SDP uses NFS to manage the long-term store. If needed, the entire available capacity of the SDP VM can be used for long-term storage. Retention periods are defined for individual scopes and streams to control how much historical data is kept. Persistent data providers such as InfluxDB and TimescaleDB follow a similar configuration. When the retention period expires, the data is purged from Pravega and from these persistent data providers and is no longer available.
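The effect of a retention period can be sketched as a time-based filter. This is an illustrative model only, not how SDP or Pravega implements retention: the function name `purge_expired` and the representation of events as `(timestamp, payload)` tuples are assumptions.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(events, retention, now):
    """Return only the events that are still inside the retention window.

    events:    iterable of (timestamp, payload) tuples (assumed representation)
    retention: timedelta describing how long data is kept
    now:       the current time, passed in so the behavior is reproducible
    """
    cutoff = now - retention
    return [(ts, payload) for ts, payload in events if ts >= cutoff]


# Usage: with a 7-day retention period, only the recent reading survives.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
events = [
    (datetime(2024, 1, 1, tzinfo=timezone.utc), "old reading"),
    (datetime(2024, 1, 9, tzinfo=timezone.utc), "recent reading"),
]
kept = purge_expired(events, timedelta(days=7), now)
```

Once an event falls outside the window it is gone from the result, which mirrors the point above that purged data is no longer available.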
Persistent data providers
SDP can create persistent data provider containers for each incoming scope, and all streams in that scope use these providers to present the data stored in Pravega. Users can deploy these data providers on demand, at which point SDP pulls all relevant data for that scope from Pravega and builds the time-series instances from it. When such containers are deleted, the underlying data sources are also discarded and must be rebuilt if the containers are redeployed.
IoT and edge transport
SDP currently offers a connector for MQTT-based event transport, which is used for data ingest from Telit deviceWISE Edge and Litmus. If Telit deviceWISE or Litmus batches multiple events into a single MQTT message, throughput may improve, but the batch is stored as a single entity within SDP, which may have implications for data analytics. Plan batching accordingly.
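The analytics implication of batching can be seen in a small sketch. It assumes, purely for illustration, that each MQTT payload is JSON and that a batch is a JSON array of readings; the actual payload formats of Telit deviceWISE and Litmus are not specified here, and `ingest` and `unpack` are hypothetical helper names rather than SDP functions.

```python
import json


def ingest(stream: list, mqtt_payload: bytes) -> None:
    """Store one MQTT message as one stream event, whatever it contains."""
    stream.append(mqtt_payload)


def unpack(stream: list):
    """Expand stored events into individual readings.

    Assumes each payload is either a JSON object (one reading)
    or a JSON array (a batch of readings).
    """
    for raw in stream:
        decoded = json.loads(raw)
        if isinstance(decoded, list):
            yield from decoded
        else:
            yield decoded


# Usage: three readings sent as one batched message become one stored event.
stream = []
batch = [{"sensor": "s1", "value": v} for v in (20.5, 20.7, 21.0)]
ingest(stream, json.dumps(batch).encode())

stored_events = len(stream)          # one entity in the stream
readings = list(unpack(stream))      # analytics must unpack to see all three
```

Any downstream query that counts or timestamps events per stored entity would see one event here, so an unpacking step like `unpack` is needed before per-reading analysis.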