File layout and directory structure
In general, it is more efficient to create a deep directory hierarchy that consolidates files in balanced subdirectories than it is to spread files out over a shallow subdirectory structure. This is particularly true for large clusters.
Although the recommended maximum is one million files per directory, a best practice is to keep any single directory to no more than 100,000 files. A maximum of 100,000 subdirectories per directory is also recommended.
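The limits above can be checked against an existing tree from a client mount. The following Python sketch is one possible way to do so, not a Dell-provided tool: the function name `find_oversized_dirs` and its parameters are hypothetical, and the default thresholds are simply the best-practice figures quoted above.

```python
import os

# Thresholds taken from the best-practice figures quoted in the text.
MAX_FILES_PER_DIR = 100_000
MAX_SUBDIRS_PER_DIR = 100_000


def find_oversized_dirs(root, max_files=MAX_FILES_PER_DIR,
                        max_subdirs=MAX_SUBDIRS_PER_DIR):
    """Return (path, file_count, subdir_count) for every directory under
    `root` whose direct children exceed either threshold.

    Hypothetical helper for illustration; counts only direct children,
    not the recursive total.
    """
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        if len(filenames) > max_files or len(dirnames) > max_subdirs:
            oversized.append((dirpath, len(filenames), len(dirnames)))
    return oversized
```

Calling `find_oversized_dirs` on a mounted path (for example, a hypothetical `/ifs/data` tree) returns an empty list when every directory is within the recommended bounds. A full tree walk is itself a metadata-heavy operation, so on a large cluster it is probably best run against a subtree or during a quiet period.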
The general guidelines are to try to:
Because OneFS is a hierarchical file system, these guidelines help prevent the creation of data hot spots. Directory updates (creates and deletes) are serialized, so only one can happen at a time. As a result, a create/delete-heavy workload can see as much as two orders of magnitude better performance when the data is organized into a hundred directories of ten thousand files each than when it sits in a single directory of a million files.
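One common way for an application to arrive at the "hundred directories of ten thousand files" layout is to derive the subdirectory from a hash of the file name. The sketch below illustrates that idea only; hashing is an assumption of this example rather than a technique prescribed by OneFS, and `sharded_path`, `NUM_BUCKETS`, and the example paths are hypothetical names.

```python
import hashlib
from pathlib import PurePosixPath

NUM_BUCKETS = 100  # matches the "hundred directories" example in the text


def sharded_path(base_dir, filename, num_buckets=NUM_BUCKETS):
    """Map a file name to base_dir/<bucket>/<filename>.

    The bucket is derived from a SHA-256 hash of the name, so the same
    name always lands in the same subdirectory and names spread roughly
    evenly across buckets. Hypothetical helper for illustration.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % num_buckets
    return PurePosixPath(base_dir) / f"{bucket:03d}" / filename
```

With one million file names and the default of 100 buckets, each subdirectory would hold roughly ten thousand files, which keeps creates and deletes spread over many directories instead of queuing behind a single one. Workloads that already have a natural key (date, project, user) may prefer to partition on that key instead, provided the result stays balanced.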
Note: The key to file and directory layout is always balance. The recommended goal is for a directory tree structure and its file contents to be as uniform as possible.
A typical dataset consists of a mix of large and small files stored in a hierarchical directory structure. Usually, around 30 percent of the data is active and 70 percent is inactive. Snapshots typically back up the data for short-term retention, combined with a long-term DR strategy that frequently includes replication to a secondary cluster and disk-to-disk or disk-to-tape NDMP backups.
OneFS uses erasure coding (FEC) to parity-protect files, which yields high levels of storage efficiency. By contrast, files smaller than 128 KB are essentially mirrored and so have a larger on-disk footprint. The efficiency of erasure coding on large files offsets the mirroring penalty on small files.
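The offsetting effect can be illustrated with some simple arithmetic. The Python sketch below is a rough model only: the mirror count and the data/parity stripe shape are parameters the caller supplies, and the values used in the usage note are assumptions chosen for illustration, not figures from this document. The function names are hypothetical.

```python
def fec_on_disk(logical_bytes, data_units, parity_units):
    """On-disk size of erasure-coded data for a stripe of
    `data_units` data blocks plus `parity_units` parity blocks."""
    return logical_bytes * (data_units + parity_units) / data_units


def mirrored_on_disk(logical_bytes, copies):
    """On-disk size of data stored as `copies` full mirrors."""
    return logical_bytes * copies


def blended_efficiency(small_bytes, large_bytes, copies,
                       data_units, parity_units):
    """Fraction of raw capacity holding user data when small files are
    mirrored and large files are erasure coded (simplified model)."""
    raw = (mirrored_on_disk(small_bytes, copies)
           + fec_on_disk(large_bytes, data_units, parity_units))
    if raw == 0:
        raise ValueError("dataset is empty")
    return (small_bytes + large_bytes) / raw
```

As an assumed example, take 1 TB of small files mirrored three times and 99 TB of large files protected with 16 data plus 2 parity units: `blended_efficiency(1, 99, 3, 16, 2)` comes out at about 87 percent, only slightly below the roughly 89 percent of the erasure-coded data alone. The model ignores metadata and partial stripes, so real clusters will differ.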
While OneFS is highly extensible and many of the configuration options are physically unbounded, several recommended limits do apply, especially in the context of large clusters. For example, the recommendation is to configure no more than forty thousand NFS exports on a single cluster, even though it is possible to create more. Often, these ‘limit’ numbers are the maximum to which Dell PowerScale engineering has tested and certified a cluster. While things may work fine beyond these thresholds, such configurations will typically be unsupported.
Further information about OneFS limits and guidelines is available in the OneFS Technical Specifications guide.