File virtualization abstracts the underlying specifics of the physical file servers and NAS devices and creates a uniform namespace across those physical devices. A namespace is simply a fancy term referring to the hierarchy of directories and files and their corresponding metadata. Typically with a standard file system such as NTFS, a namespace is associated with a single machine or file system. By bringing multiple file systems and devices under a single namespace, file virtualization provides a single view of directories and files and gives administrators a single control point for managing that data.
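To make the idea of a uniform namespace concrete, here is a minimal sketch in Python of a table that maps virtual directories to physical locations. The `Backend` and `Namespace` classes, the host names, and the paths are all hypothetical, invented for illustration; no vendor's product is being described.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Backend:
    """A physical location: a file server or NAS device plus an exported path."""
    host: str
    export: str


class Namespace:
    """Hypothetical unified namespace: virtual directory -> physical backend."""

    def __init__(self):
        self._mounts = {}

    def mount(self, virtual_dir, backend):
        self._mounts[PurePosixPath(virtual_dir)] = backend

    def resolve(self, virtual_path):
        """Return (backend, physical path) for the deepest matching virtual directory."""
        path = PurePosixPath(virtual_path)
        # Walk from the full path up toward the root so the deepest mount wins.
        for candidate in [path, *path.parents]:
            backend = self._mounts.get(candidate)
            if backend is not None:
                relative = path.relative_to(candidate)
                return backend, str(PurePosixPath(backend.export) / relative)
        raise FileNotFoundError(virtual_path)


ns = Namespace()
ns.mount("/corp/engineering", Backend("nas01", "/vol/eng"))
ns.mount("/corp/finance", Backend("filesrv07", "/export/fin"))

backend, physical = ns.resolve("/corp/engineering/specs/design.doc")
print(backend.host, physical)  # nas01 /vol/eng/specs/design.doc
```

Users see only the `/corp/...` hierarchy. Which device actually holds a given file is a detail of the table, and that table is the single control point administrators manage.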
Many of the benefits will sound familiar. Like storage virtualization, file virtualization can enable the nondisruptive movement and migration of file data from one device to another. Storage administrators can perform routine maintenance of NAS devices and retire old equipment without interrupting users and applications.
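The reason migration can be nondisruptive is that clients address files by their virtual path, so moving data only requires updating the mapping behind that path. The following Python sketch models this with in-memory dictionaries. The device names, paths, and the `read` and `migrate` functions are assumptions made for illustration, and the sketch ignores the hard parts real products must handle, such as open files, locks, and writes in flight.

```python
# Hypothetical in-memory model: each device maps physical paths to file contents.
devices = {
    "old_nas": {"/vol/eng/report.txt": b"Q3 results"},
    "new_nas": {},
}

# The namespace table: virtual path -> (device, physical path).
namespace = {"/corp/eng/report.txt": ("old_nas", "/vol/eng/report.txt")}


def read(virtual_path):
    """Clients always read through the virtual path."""
    device, physical = namespace[virtual_path]
    return devices[device][physical]


def migrate(virtual_path, target_device, target_path):
    """Copy the data first, then repoint the namespace entry, then clean up."""
    source_device, source_path = namespace[virtual_path]
    devices[target_device][target_path] = devices[source_device][source_path]
    # Repointing the entry is the only change clients could observe.
    namespace[virtual_path] = (target_device, target_path)
    del devices[source_device][source_path]


before = read("/corp/eng/report.txt")
migrate("/corp/eng/report.txt", "new_nas", "/vol2/eng/report.txt")
after = read("/corp/eng/report.txt")
print(before == after, namespace["/corp/eng/report.txt"][0])  # True new_nas
```

The client-facing path never changes, so once the old device holds no more entries it can be serviced or retired.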
File virtualization, when married with clustering technologies, can also dramatically boost scalability and performance. A NAS cluster can deliver throughput (MBps) and IOPS several orders of magnitude higher than a single NAS device. HPC (high-performance computing) applications, such as seismic processing, video rendering, and scientific research simulations, rely heavily on file virtualization technologies to deliver scalable data access.
Three architectural approaches
File virtualization is still in its infancy. As always, different vendors' approaches are best suited to different usage models, and no one size fits all. Broadly speaking, you'll find three different approaches to file virtualization in the market today: platform-integrated namespaces, clustered-storage-derived namespaces, and network-resident virtualized namespaces.
Platform-integrated namespaces are extensions of the host file system. They provide a means of abstracting file relationships across machines on a specific server platform. These types of namespaces are well suited for multisite collaboration, but they tend to lack rich file controls and, of course, they are bound to a single file system or OS. Examples include Brocade StorageX, NFS v4, and Microsoft Distributed File System (DFS).
Clustered storage systems combine clustering and advanced file system technology to create a modularly expandable system that can serve ever-increasing volumes of NFS and CIFS requests. A natural outgrowth of these clustered systems is a unified, shared namespace across all elements of the cluster. Clustered storage systems are ideally suited for high-performance applications and for consolidating multiple file servers into a single, high-availability system. Vendors here include Exanet, Isilon, Network Appliance (Data ONTAP GX), and HP (PolyServe).
Network-resident virtualized namespaces are created by network-mounted devices, commonly referred to as network file managers (NFMs), that reside between the clients and NAS devices. Essentially serving as routers or switches for file-level protocols, these devices present a virtualized namespace across the file servers on the back end and route all NFS and CIFS traffic between clients and storage. NFM devices can be deployed in band (F5 Networks) or out of band (EMC Rainfinity). Network-resident virtualized namespaces are well suited for tiered storage deployments and other scenarios requiring nondisruptive data migration.
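The difference between in-band and out-of-band deployment comes down to whether the device sits in the data path. The Python sketch below is a deliberately simplified model of that distinction, not a description of how any vendor's product works. The `FileServer`, `InBandManager`, and `OutOfBandManager` classes and their methods are hypothetical names chosen for illustration.

```python
class FileServer:
    """Stand-in for a back-end NAS device or file server."""

    def __init__(self, name, files):
        self.name = name
        self.files = files
        self.requests_served = 0

    def read(self, path):
        self.requests_served += 1
        return self.files[path]


class InBandManager:
    """Sits in the data path: every client request passes through it."""

    def __init__(self, routes):
        self.routes = routes  # virtual path -> (server, physical path)
        self.requests_forwarded = 0

    def read(self, virtual_path):
        server, physical = self.routes[virtual_path]
        self.requests_forwarded += 1
        return server.read(physical)


class OutOfBandManager:
    """Stays out of the data path: it only answers location lookups."""

    def __init__(self, routes):
        self.routes = routes
        self.lookups = 0

    def locate(self, virtual_path):
        self.lookups += 1
        return self.routes[virtual_path]


server = FileServer("nas01", {"/vol/a/x.txt": b"hello"})
routes = {"/shared/x.txt": (server, "/vol/a/x.txt")}

in_band = InBandManager(routes)
data_in_band = in_band.read("/shared/x.txt")

out_of_band = OutOfBandManager(routes)
target, physical = out_of_band.locate("/shared/x.txt")
data_out_of_band = target.read(physical)  # the client talks to the server directly

print(data_in_band == data_out_of_band)  # True
print(in_band.requests_forwarded, out_of_band.lookups, server.requests_served)  # 1 1 2
```

In this model the in-band manager sees, and can apply policy to, every request, at the cost of carrying all the traffic. The out-of-band manager handles only lookups, leaving the data transfer to the client and the server.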
File and block storage virtualization may be IT's best chance of alleviating the pain associated with the ongoing data tsunami. By virtualizing block and file storage environments, IT can gain greater economies of management and implement centralized policies and controls over heterogeneous storage systems. The road to adoption of these solutions has been long and difficult, but these technologies are finally catching up to our needs. You will find the current crop of file and block virtualization solutions to be well worth the wait.