Solution components
PowerProtect DD is a storage subsystem that is designed to preserve copies of data for backup, recovery, and long-term compliance requirements. A key feature of PowerProtect DD is its variable-block deduplicating file system that eliminates redundant copies of data blocks stored in the system.
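The idea behind variable-block deduplication can be illustrated with a toy sketch. This is not the PowerProtect DD algorithm, which this document does not describe; the chunking rule, the window and divisor values, and the function names below are all assumptions chosen for illustration.

```python
import hashlib

def split_variable_blocks(data: bytes, window: int = 4, divisor: int = 16,
                          max_len: int = 64) -> list:
    """Cut data into variable-length blocks at content-defined boundaries.

    Toy rule (an assumption, not the product's method): end a block when the
    sum of the last `window` bytes is divisible by `divisor`, or when the
    block reaches `max_len` bytes.
    """
    blocks = []
    start = 0
    for i in range(len(data)):
        length = i - start + 1
        trailing = data[max(start, i - window + 1): i + 1]
        at_boundary = length >= window and sum(trailing) % divisor == 0
        if at_boundary or length >= max_len:
            blocks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        blocks.append(data[start:])  # final partial block
    return blocks

def backup(data: bytes, store: dict) -> list:
    """Store only blocks not seen before; return the list of block digests."""
    recipe = []
    for block in split_variable_blocks(data):
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # a duplicate block is stored once
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its digest list."""
    return b"".join(store[digest] for digest in recipe)

if __name__ == "__main__":
    store = {}
    payload = b"hadoop block " * 50
    first = backup(payload, store)
    blocks_after_first = len(store)
    second = backup(payload, store)  # second copy adds no new blocks
    print(len(first), blocks_after_first, len(store))
    print(restore(second, store) == payload)
```

Because block boundaries depend on content rather than on fixed offsets, a second backup of the same data maps to the same digests and consumes no additional block storage in this sketch.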
DDHCFS is the Hadoop-compatible file system on PowerProtect DD. It provides versioned backup and recovery of the Hadoop Distributed File System (HDFS).
The PowerProtect DD series appliance provides its own DD Boost protocol to efficiently transport data from various sources into PowerProtect DD.
Hadoop provides native copy functionality (DistCp) that allows data to be copied to and from Hadoop clusters at massive scale. The copy relies on Hadoop’s own distributed processing framework to parallelize and distribute the work required to read data from and write data to Hadoop data nodes.
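As a minimal sketch of what a native copy looks like in practice, the helper below assembles a `hadoop distcp` command line. The `-m` option limits the number of simultaneous map tasks. The cluster URIs and the helper name are hypothetical, and the command is only built here, not executed.

```python
def build_distcp_command(source: str, target: str, max_maps: int = 20) -> list:
    """Return the argument list for a parallel copy with `hadoop distcp`.

    `-m` caps the number of map tasks that copy data in parallel.
    """
    if max_maps < 1:
        raise ValueError("max_maps must be at least 1")
    return ["hadoop", "distcp", "-m", str(max_maps), source, target]

if __name__ == "__main__":
    # Hypothetical source and target URIs, for illustration only.
    cmd = build_distcp_command("hdfs://nn1:8020/data", "hdfs://nn2:8020/backup/data")
    print(" ".join(cmd))
```

On a host with a configured Hadoop client, the returned list could be passed to `subprocess.run(cmd, check=True)`.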
HDFS snapshots enable you to capture point-in-time copies of the file system and protect important data against user or application errors. You can take snapshots of the entire file system or of specified subtrees.
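The following sketch shows the standard HDFS commands involved in taking a snapshot: a directory is first made snapshottable with `hdfs dfsadmin -allowSnapshot`, and a named snapshot is then created with `hdfs dfs -createSnapshot`. The directory, the snapshot name, and the helper names are hypothetical; the commands are built as argument lists rather than run.

```python
def allow_snapshot_command(directory: str) -> list:
    """Administrator step: mark a directory as snapshottable."""
    return ["hdfs", "dfsadmin", "-allowSnapshot", directory]

def create_snapshot_command(directory: str, name: str) -> list:
    """Create a named point-in-time snapshot of a snapshottable directory."""
    return ["hdfs", "dfs", "-createSnapshot", directory, name]

def snapshot_path(directory: str, name: str) -> str:
    """Read-only path under which HDFS exposes the snapshot contents."""
    return directory.rstrip("/") + "/.snapshot/" + name

if __name__ == "__main__":
    # Hypothetical directory and snapshot name, for illustration only.
    directory, name = "/data/warehouse", "backup-001"
    for cmd in (allow_snapshot_command(directory), create_snapshot_command(directory, name)):
        print(" ".join(cmd))
    print(snapshot_path(directory, name))
```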
The Dell Hadoop App agent uses snapshot differences (diff) to compare the new and previous HDFS snapshots, and copies only the files that changed in the source directory.
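To illustrate the idea of a snapshot difference, the sketch below parses a report in the style printed by `hdfs snapshotDiff <path> <from> <to>`, where each entry starts with `+` (created), `-` (deleted), `M` (modified), or `R` (renamed). How the agent itself consumes the diff is not described here, so the selection rule in `paths_to_copy` and the sample report are assumptions. The last helper builds the DistCp form `-update -diff`, which copies only the changes between two snapshots.

```python
def parse_snapshot_diff(report: str) -> list:
    """Return (change_type, path) pairs from a snapshotDiff-style report."""
    changes = []
    for line in report.splitlines():
        parts = line.split(None, 1)  # change marker, then the rest of the line
        if len(parts) == 2 and parts[0] in {"+", "-", "M", "R"}:
            changes.append((parts[0], parts[1]))
    return changes

def paths_to_copy(changes: list) -> list:
    """Assumed rule: only created or modified entries need to be copied."""
    return [path for kind, path in changes if kind in {"+", "M"}]

def build_incremental_distcp(source: str, target: str,
                             old_snapshot: str, new_snapshot: str) -> list:
    """DistCp invocation that applies the diff between two snapshots."""
    return ["hadoop", "distcp", "-update", "-diff",
            old_snapshot, new_snapshot, source, target]

if __name__ == "__main__":
    # Hypothetical report text, for illustration only.
    sample = (
        "Difference between snapshot s1 and snapshot s2 under directory /data:\n"
        "M\t.\n"
        "+\t./new.csv\n"
        "-\t./old.csv\n"
        "R\t./a.csv -> ./b.csv\n"
    )
    print(paths_to_copy(parse_snapshot_diff(sample)))
```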
Hadoop provides a plug-in architecture that allows third-party storage providers (such as DDHCFS) to integrate seamlessly with Hadoop and to use data management tools such as DistCp with Hadoop clusters.
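One common way a Hadoop-compatible file system is plugged in is through a `fs.<scheme>.impl` property in the Hadoop configuration, which maps a URI scheme to an implementation class. The sketch below generates such a property entry. The scheme `examplefs` and the class `com.example.ExampleFileSystem` are placeholders, not the actual DDHCFS settings, which this document does not give.

```python
import xml.etree.ElementTree as ET

def filesystem_property(scheme: str, impl_class: str) -> str:
    """Render a Hadoop <property> element mapping a URI scheme to a class."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "fs." + scheme + ".impl"
    ET.SubElement(prop, "value").text = impl_class
    return ET.tostring(prop, encoding="unicode")

if __name__ == "__main__":
    # Placeholder scheme and class name, for illustration only.
    print(filesystem_property("examplefs", "com.example.ExampleFileSystem"))
```

With such a mapping in place, tools that accept file system URIs can address the third-party storage by its scheme.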
HBase is a distributed, column-oriented database that is built on top of the Hadoop file system. A namespace is a logical grouping of multiple HBase tables. You can use the Dell Hadoop App agent to back up HBase tables and namespaces using the ExportSnapshot functionality of HBase.
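For reference, a manual snapshot export with the HBase `ExportSnapshot` tool can be sketched as follows. The snapshot name, the target URI, and the helper name are hypothetical, and the command the agent actually issues may differ; the sketch only builds the argument list.

```python
def build_export_snapshot_command(snapshot: str, copy_to: str, mappers: int = 16) -> list:
    """Return the argument list for exporting an HBase snapshot.

    -snapshot names the snapshot, -copy-to is the destination URI,
    and -mappers sets the number of parallel copy tasks.
    """
    if mappers < 1:
        raise ValueError("mappers must be at least 1")
    return [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", copy_to,
        "-mappers", str(mappers),
    ]

if __name__ == "__main__":
    # Hypothetical snapshot name and destination, for illustration only.
    cmd = build_export_snapshot_command("orders_snap", "hdfs://backup-nn:8020/hbase")
    print(" ".join(cmd))
```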