Home > Storage > PowerScale (Isilon) > Industry Solutions and Verticals > Media and Entertainment > PowerScale OneFS: A Metadata Driven Approach to On Demand Tiering > Overview
As data volumes increase exponentially, it is unrealistic to keep all files on the highest costing tiers of storage. Various automated storage tiering approaches have been around for years, but for many use cases this automated tiering approach falls short. Bringing together rich metadata and an API driven workflow bridges the gap.
The metadata created by workflow management tools (such as Autodesk ShotGrid) provides the best information about what tier of storage datasets should be stored on. Using this business logic to drive tiering choices, combined with the ability to move data in an ad hoc manner, is immensely powerful.
Much of this paper involves the DataIQ. However other API driven tools with suitable visibility into the OneFS file system and a metadata tagging methodology could be substituted for DataIQ.
The OneFS and DataIQ APIs together provide granular movement of data between storage tiers driven by workflow metadata. This paper describes the tools required to make that happen. Those tools include the OneFS SmartPools tiering methodology and DataIQ data tags.
DataIQ has a tagging system for identifying subsets of data within the file systems it tracks. Metadata created elsewhere in an organization can inform DataIQ tags. DataIQ also maintains an index of all files on the OneFS file system. The DataIQ Python API can be used for accessing this index to pull a list of all files and directories associated with particular DataIQ tags.
That list of files and directories can be passed to the OneFS REST API and used to write extended attributes to the data stored in OneFS. Once the data tags from DataIQ have been written as extended attributes in OneFS, the OneFS tiering engine (SmartPools) can be run on those tagged paths. Smartpools tiers data in place with no change in file path.
The methodology outlined in this paper is meant to provide an example of how these APIs can be combined to extend the functionality of both tools beyond their native capabilities. It is assumed that storage administrators and DevOps groups will customize the code for their own unique environments and use cases.