SmartPools enables a multi-tier architecture that uses high-performance nodes with SSDs for the performance tiers and high-capacity SATA-only nodes for the archive tier. For example, a file pool policy could move files from the performance tier to a more cost-effective, capacity-biased tier after a specified period of inactivity.
Figure 3. SmartPools tiering
The following figure shows the creation of an ‘archive’ file pool policy for colder data, which moves files that have not been accessed for more than 30 days to a lower storage tier.
Figure 4. Creating a file pool policy
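The rule behind such a policy can be illustrated with a short Python sketch. This is not the OneFS implementation or API. The function name `is_archive_candidate` and the constant `ARCHIVE_AFTER` are hypothetical, chosen only to show the "not accessed for more than 30 days" test.

```python
from datetime import datetime, timedelta

# Hypothetical threshold mirroring the example policy: 30 days without access.
ARCHIVE_AFTER = timedelta(days=30)


def is_archive_candidate(last_access: datetime, now: datetime) -> bool:
    """Return True when the file has not been accessed for more than 30 days."""
    return now - last_access > ARCHIVE_AFTER


now = datetime(2024, 3, 31)
print(is_archive_candidate(datetime(2024, 1, 15), now))  # idle for 76 days -> True
print(is_archive_candidate(datetime(2024, 3, 20), now))  # idle for 11 days -> False
```

A file that has been idle for exactly 30 days is not yet a candidate in this sketch, because the example policy says "more than 30 days".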
For optimal cluster performance, we recommend observing the following OneFS SmartPools best practices:
- Tiering based on modify time (-mtime) is not recommended. Access time is the preferred tiering criterion, with an -atime value of 1 day.
- Ensure that cluster capacity utilization (hard drive and SSD) remains below 90% on each pool.
- If the cluster consists of more than one node type, direct the default file pool policy to write to the higher performing node pool. Data can then be classified and down-tiered as necessary.
- A file pool policy can have three ‘OR’ disjunctions and each term joined by an ‘OR’ can contain at most five ‘AND’s.
- The number of file pool policies should not exceed thirty. More than thirty policies may affect system performance.
- Define a performance and protection profile for each tier and configure it accordingly.
- File pool policy order matters because the policies are applied on a first-match basis (the first file pool policy whose expression matches a file is the policy that is applied).
- When employing a deep archiving strategy, ensure that the performance pool is optimized for all directories and metadata, and that the archive tier is used only to store cold files as they age out. To implement this configuration, add a ‘TYPE=FILE’ statement to the aging file pool policy rule or rules so that only files move to the archive tier.
- By default, the SmartPools job runs only once per day. If you create a file pool policy to be run at a higher frequency, ensure the SmartPools job is configured to run multiple times per day.
- Enable SmartPools Virtual Hot Spares with a minimum of 10% space allocation. This allocation ensures that space is available for data reconstruction and reprotection in case a drive or node fails. It also generally helps guard against file-system-full issues.
- Avoid creating hard links to files if the links will cause the file to match different file pool policies.
- If node pools are combined into tiers, the file pool rules should target the tiers rather than specific node pools within the tiers.
- Avoid creating tiers that combine node pools both with and without SSDs.
- The number of SmartPools tiers should not exceed five. Although you can exceed the guideline of five tiers, doing so is not recommended because it might affect system performance.
- Where possible, ensure that all nodes in a cluster have at least one SSD, including nearline and high-density nodes.
- For performance workloads, SSD metadata read-write acceleration is recommended. Metadata read acceleration helps with getattr, access, and lookup operations, while write acceleration helps reduce latencies on create, delete, setattr, and mkdir operations. Ensure that sufficient SSD capacity (6-10%) is available before turning on metadata-write acceleration.
- Determine if metadata operations for a particular workload are biased towards reads, writes, or an even mix, and select the optimal SmartPools metadata strategy.
- Avoid using OneFS Filesystem Explorer or the ‘isi set’ command to change file attributes, such as protection level, for a group of data. Instead, use SmartPools file pool policies.
- If SmartPools takes more than a day to run on OneFS 8.2 or later, or the cluster is already running the FSAnalyze job, consider scheduling the FilePolicy job (and the corresponding IndexUpdate job) to run daily. Also consider reducing the frequency of the SmartPools job to monthly. The following table provides a suggested job schedule when deploying FilePolicy:
Table 3. Suggested job schedule when deploying FilePolicy
Job | Schedule | Impact | Priority |
FilePolicy | Every day at 22:00 | LOW | 6 |
IndexUpdate | Every six hours, every day | LOW | 5 |
SmartPools | Monthly – Sunday at 23:00 | LOW | 6 |
- If you plan to use atime, enable Access Time Tracking as early as possible. A 24-hour precision is recommended to prevent performance problems.
Figure 5. Access time tracking configuration
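The first-match ordering and the ‘TYPE=FILE’ restriction described in the best practices above can be sketched in a few lines of Python. This is an illustrative model only, not OneFS code. The `FileInfo` and `Policy` classes, the `select_policy` function, and the tier names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class FileInfo:
    path: str
    is_file: bool            # False for directories
    days_since_access: int


@dataclass
class Policy:
    name: str
    target_tier: str
    matches: Callable[[FileInfo], bool]


def select_policy(policies: Sequence[Policy], item: FileInfo) -> Optional[Policy]:
    """Return the first policy, in list order, whose expression matches."""
    for policy in policies:
        if policy.matches(item):
            return policy
    return None  # none of the listed policies matched


policies = [
    # Hypothetical aging rule limited to files, so directories stay on the performance tier.
    Policy("archive", "archive_tier", lambda f: f.is_file and f.days_since_access > 30),
    Policy("catch_all", "performance_tier", lambda f: True),
]

cold_file = FileInfo("/ifs/data/old.log", True, 90)
cold_dir = FileInfo("/ifs/data/old_dir", False, 90)
print(select_policy(policies, cold_file).target_tier)  # archive_tier
print(select_policy(policies, cold_dir).target_tier)   # performance_tier
```

Reversing the list would send every item to `performance_tier`, because the catch-all rule would match first. That is why the order of policies needs the same care as their expressions.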
For more information about OneFS data tiering and file pool policies, see the SmartPools white paper.
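As a quick sanity aid, the numeric guidelines from the list above (at most thirty file pool policies, at most five tiers, pool utilization below 90%, and a Virtual Hot Spare allocation of at least 10%) can be collected into a small checker. The sketch below assumes the values have already been gathered by some other means. The function name, parameter names, and the checker itself are hypothetical and not part of OneFS.

```python
# Limits quoted from the guidelines above. Names are hypothetical, not OneFS APIs.
MAX_POLICIES = 30
MAX_TIERS = 5
MAX_POOL_UTILIZATION = 0.90   # each pool should stay below 90% full
MIN_VHS_RESERVE = 0.10        # Virtual Hot Spare allocation of at least 10%


def check_guidelines(policy_count, tier_count, pool_utilization, vhs_reserve):
    """Return a list of warnings for values outside the recommended limits."""
    warnings = []
    if policy_count > MAX_POLICIES:
        warnings.append(f"{policy_count} file pool policies (guideline: at most {MAX_POLICIES})")
    if tier_count > MAX_TIERS:
        warnings.append(f"{tier_count} tiers (guideline: at most {MAX_TIERS})")
    for pool, used in pool_utilization.items():
        if used >= MAX_POOL_UTILIZATION:
            warnings.append(f"pool {pool} is {used:.0%} full (guideline: below 90%)")
    if vhs_reserve < MIN_VHS_RESERVE:
        warnings.append(f"VHS reserve is {vhs_reserve:.0%} (guideline: at least 10%)")
    return warnings


for warning in check_guidelines(
    policy_count=12,
    tier_count=3,
    pool_utilization={"performance": 0.72, "archive": 0.93},
    vhs_reserve=0.05,
):
    print(warning)
```

With the sample values shown, the checker reports the over-full archive pool and the undersized VHS reserve, and stays silent about the policy and tier counts.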