We implemented our Oracle Big Data SQL solution in phases that mirror how an enterprise would approach a new data virtualization infrastructure project. Our first step was to design and configure a resilient storage layer with the PowerFlex system to demonstrate how software-defined storage can accelerate and protect critical analytics and reporting services. We used a PowerFlex configuration with multiple controller nodes so that the controller is not a single point of failure. Although large-scale performance was not a requirement for this solution, larger PowerFlex configurations can scale out performance by adding more nodes.
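PowerFlex gives you the flexibility to deploy the architecture in three ways: as a hyperconverged infrastructure (HCI) configuration, as a disaggregated two-layer configuration, or as a combination of the two.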
The HCI configuration enables applications and storage services to run on the same servers. We used the HCI configuration option for both Hadoop and Oracle NoSQL Database (ONDB) in our lab work for this solution. Our engineers provisioned Hadoop across three PowerFlex nodes to distribute the workload, and we placed ONDB on a fourth PowerFlex node. Placing Hadoop and ONDB on the PowerFlex nodes in HCI mode allowed us to consolidate workloads and make fuller use of the PowerFlex nodes.
PowerFlex also supports a disaggregated two-layer architecture that separates data management workloads from the storage of datasets. We placed both Oracle and Microsoft SQL Server on dedicated compute nodes. Because these nodes did not manage storage, they could devote substantial compute resources to the data management systems. In this two-layer architecture, PowerFlex provided storage-only services. By combining the HCI and disaggregated two-layer configurations, we optimized the overall data virtualization architecture across all the services.
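The placement described above can be summarized as a small model of node roles. The following Python sketch is illustrative only and does not use any PowerFlex API. The node names, the one-node-per-database placement for Oracle and Microsoft SQL Server, and the count of three storage-only nodes are assumptions made for the example, because the text does not state them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Role(Enum):
    HCI = "hci"                    # application and storage services on one server
    COMPUTE_ONLY = "compute-only"  # runs a workload, consumes PowerFlex storage
    STORAGE_ONLY = "storage-only"  # contributes storage, runs no application


@dataclass(frozen=True)
class Node:
    name: str                      # hypothetical node name
    role: Role
    workload: Optional[str] = None

    @property
    def runs_application(self) -> bool:
        return self.role in (Role.HCI, Role.COMPUTE_ONLY)

    @property
    def serves_storage(self) -> bool:
        return self.role in (Role.HCI, Role.STORAGE_ONLY)


def build_layout() -> List[Node]:
    """Return the lab placement described in the text, with assumed node counts."""
    # From the text: Hadoop on three HCI nodes, ONDB on a fourth.
    nodes = [Node(f"hci-{i}", Role.HCI, "Hadoop") for i in (1, 2, 3)]
    nodes.append(Node("hci-4", Role.HCI, "Oracle NoSQL Database"))
    # Assumption: one dedicated compute node per database.
    nodes.append(Node("compute-1", Role.COMPUTE_ONLY, "Oracle"))
    nodes.append(Node("compute-2", Role.COMPUTE_ONLY, "Microsoft SQL Server"))
    # Assumption: three storage-only nodes back the compute nodes.
    nodes.extend(Node(f"storage-{i}", Role.STORAGE_ONLY) for i in (1, 2, 3))
    return nodes


def validate(nodes: List[Node]) -> None:
    """Check that each node's workload is consistent with its role."""
    for node in nodes:
        if node.runs_application and node.workload is None:
            raise ValueError(f"{node.name} should run a workload")
        if not node.runs_application and node.workload is not None:
            raise ValueError(f"{node.name} is storage-only but has a workload")


def workloads_by_role(nodes: List[Node]) -> Dict[str, List[str]]:
    """Group workload names by node role, preserving node order."""
    grouped: Dict[str, List[str]] = {role.value: [] for role in Role}
    for node in nodes:
        if node.workload is not None:
            grouped[node.role.value].append(node.workload)
    return grouped


if __name__ == "__main__":
    layout = build_layout()
    validate(layout)
    for role, workloads in workloads_by_role(layout).items():
        print(f"{role}: {workloads}")
```

Modeling each node's role explicitly makes the trade-off visible: HCI nodes both run an application and serve storage, while in the two-layer portion the compute-only nodes run the databases and the storage-only nodes serve their data.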