Infrastructure Observability makes it faster and easier to identify and analyze issues accurately and intelligently.
Infrastructure Observability helps you improve system health by providing instant insight into your Dell IT environment without requiring you to maintain installed software. The Home Page summarizes key aspects of the environment so that users can quickly see what needs to be addressed, and it provides hyperlinks to more detailed views. Examples of these summaries include Proactive Health Scores, Capacity Predictions, Performance Anomaly and Impact Detection, and Reclaimable Storage. These features and others are discussed in detail below.
Advanced predictive analytics differentiate Infrastructure Observability from other monitoring and reporting tools.
Using machine learning and analytics, Infrastructure Observability identifies performance anomalies (supported across all storage platforms, networking devices, and PowerEdge servers). It compares current performance metrics with historical values to determine when the current values deviate outside of normal ranges. This feature provides timely information about the risk level of the storage systems with insights into conditions and anomalies affecting performance.
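The comparison of current values against a historical normal range can be illustrated with a minimal sketch. This is not Dell's algorithm, which the text describes only as machine learning and analytics. The sketch assumes a simple band of the historical mean plus or minus `k` standard deviations, and the names `normal_range`, `is_anomaly`, and `k` are hypothetical.

```python
from statistics import mean, stdev


def normal_range(history, k=3.0):
    """Return an assumed (low, high) band: historical mean +/- k standard deviations."""
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def is_anomaly(current, history, k=3.0):
    """Flag the current value when it falls outside the historical band."""
    low, high = normal_range(history, k)
    return current < low or current > high


# Hypothetical latency samples (ms) for the same hour on previous days.
history_latency_ms = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1]

print(is_anomaly(1.2, history_latency_ms))  # inside the band
print(is_anomaly(2.5, history_latency_ms))  # well above the band
```

A production system would typically model seasonality (time of day, day of week) rather than use one global band; the fixed band here only shows the idea of deviation from historical norms.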
Besides detecting performance anomalies, Observability goes one step further and identifies performance impacts (supported for PowerMax or VMAX, PowerStore, VxRail, Unity XT family, PowerScale, and PowerFlex systems). Observability analyzes increases in latency against other metrics such as IOPS and bandwidth to determine whether an increase in latency is caused by a change in workload characteristics or by competing resources. When an impact is identified, Observability also identifies the storage objects most likely causing the workload contention. By differentiating between changes in workload characteristics and workload contention, Observability enables users to narrow troubleshooting to the times when actual impacts to performance may have occurred.
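The distinction between a workload change and contention can be sketched as a simple heuristic. This is an illustrative assumption, not the product's analysis: the function name, the metric keys, and the 25% tolerances are all hypothetical. The idea is that a latency increase accompanied by a rise in the object's own IOPS or bandwidth suggests a workload change, while a latency increase with a flat workload suggests competition for resources.

```python
def classify_latency_increase(baseline, current, latency_tol=0.25, workload_tol=0.25):
    """Classify a latency change using relative changes in IOPS and bandwidth.

    baseline/current are dicts with hypothetical keys:
    'latency_ms', 'iops', 'bandwidth_mbps'.
    """
    def rel_change(key):
        return (current[key] - baseline[key]) / baseline[key]

    # No meaningful latency increase: nothing to explain.
    if rel_change("latency_ms") <= latency_tol:
        return "normal"

    # Latency rose; check whether the object's own workload rose with it.
    workload_changed = (
        rel_change("iops") > workload_tol
        or rel_change("bandwidth_mbps") > workload_tol
    )
    return "workload change" if workload_changed else "possible contention"


baseline = {"latency_ms": 1.0, "iops": 10000, "bandwidth_mbps": 400}
busier = {"latency_ms": 2.0, "iops": 18000, "bandwidth_mbps": 700}
squeezed = {"latency_ms": 2.0, "iops": 10200, "bandwidth_mbps": 410}

print(classify_latency_increase(baseline, busier))
print(classify_latency_increase(baseline, squeezed))
```

Identifying which storage objects cause the contention would require correlating metrics across objects that share resources, which this sketch does not attempt.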
Infrastructure Observability provides historical capacity trending together with short- and longer-term predictions, giving intelligent insight into how capacity is being used and what future needs may arise.
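As an illustration of capacity prediction from historical trending, the sketch below fits a straight line to daily usage samples and extrapolates to the point where usage reaches total capacity. The linear model and the name `days_until_full` are assumptions for illustration; the text does not state which forecasting models the product uses.

```python
def days_until_full(used_tb, capacity_tb):
    """Estimate days from the last sample until usage reaches capacity.

    used_tb: one usage sample per day, oldest first (hypothetical input format).
    Returns None when usage is flat or shrinking under the fitted line.
    """
    n = len(used_tb)
    days = range(n)
    mean_x = sum(days) / n
    mean_y = sum(used_tb) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, used_tb))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x

    # Day index at which the fitted line crosses capacity, relative to the last sample.
    full_day = (capacity_tb - intercept) / slope
    return max(0.0, full_day - (n - 1))


# Hypothetical pool growing by 1 TB per day toward a 64 TB capacity.
print(days_until_full([50, 51, 52, 53, 54], capacity_tb=64))
```

Real usage rarely grows linearly, so short-term and long-term predictions would normally use different windows or models; the single fit here only shows the extrapolation step.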
The Proactive Health Score is another key differentiator for Infrastructure Observability, relative to other monitoring and reporting tools. Observability proactively monitors the critical areas of each system to quickly identify potential issues and provide recommended remediation solutions. The Health Score is a number ranging from 100 to 0, with 100 being a perfect Health Score.
The Health Score is based on the five categories shown in the following table, which lists sample issues that illustrate how Proactive Health mitigates risk:
Category | Sample Health Issues
Components | Physical components with issues: for example, faulty cables and fans
Configuration | Non-HA host connections
Capacity | Pools or clusters that are oversubscribed and reaching full capacity
Performance | Storage groups not meeting their SLO
Data Protection | Native replication and snapshot schedules are not being met
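The following sketch shows one way a score between 0 and 100 could be derived from issues in the five categories. This section does not specify how issues combine into a score, so the rule used here (100 minus the single largest deduction) is an assumption, and the deduction values and the name `health_score` are hypothetical.

```python
CATEGORIES = ("Components", "Configuration", "Capacity", "Performance", "Data Protection")


def health_score(issues):
    """Return (score, driving_category) from a list of (category, deduction) pairs.

    Assumed rule: the score is 100 minus the largest single deduction,
    so the most severe issue determines the score.
    """
    worst_category, worst_deduction = None, 0
    for category, deduction in issues:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        if deduction > worst_deduction:
            worst_category, worst_deduction = category, deduction
    return max(0, 100 - worst_deduction), worst_category


# Hypothetical issues: a nearly full pool and a faulty fan.
issues = [("Capacity", 30), ("Components", 10)]
print(health_score(issues))
print(health_score([]))
```

Under this assumed rule, resolving the most severe issue raises the score to the level set by the next most severe one, which keeps attention on the top-priority problem.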
Cybersecurity is a set of features in Infrastructure Observability that identifies potential security violations. System configurations are continuously monitored and compared to a user-configurable evaluation plan, and a risk level is assigned to each system based on that comparison. Users can quickly get a visual representation of system security risks by viewing the identified misconfigurations, and they can address security violations using the recommended remediations. Dell Security Advisories and associated Common Vulnerabilities and Exposures (CVEs) are reported against any applicable systems. This provides users with a notification of the vulnerability and an in-context link to the associated knowledge base article for remediation. Cybersecurity also reports ransomware incidents, identifying potential ransomware attacks in near real time. By learning the expected behavior of reducible data, Observability can identify anomalies in this behavior that indicate possible encryption attacks.
The multisystem update feature applies to VxRail clusters and PowerEdge servers. Users can initiate VxRail cluster update pre-checks, software downloads, and system updates from the Infrastructure Observability UI. Users can also initiate PowerEdge firmware updates across their server fleet. This feature improves operational efficiency while maintaining security and consistency.