As business users turn to high-performance hardware, IBM is adding features to its parallel file system to help bring supercomputing further into the mainstream.
IBM has released a new version of its General Parallel File System (GPFS) with improved file management capabilities. The file system can search across multiple systems, up to 1,000 nodes in parallel, said Scott Handy, vice president of marketing and strategy for IBM Power Systems.
Handy said that in a test, IBM scanned one billion files using GPFS to demonstrate its capabilities to customers in fields such as financial services and retail that deal with massive numbers of unstructured files. The scan was completed in 2.5 hours; Handy said IBM is now working to shorten that to one hour.
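To put those figures in perspective, the average scan rate they imply can be worked out directly. The sketch below is plain arithmetic on the numbers Handy cited; the function and variable names are illustrative, and the result is an average over the whole run, not a measured peak rate.

```python
FILES_SCANNED = 1_000_000_000  # one billion files, as cited in the test


def files_per_second(total_files: int, hours: float) -> float:
    """Average rate needed to scan total_files in the given number of hours."""
    return total_files / (hours * 3600)


current_rate = files_per_second(FILES_SCANNED, 2.5)  # the reported 2.5-hour scan
target_rate = files_per_second(FILES_SCANNED, 1.0)   # the one-hour goal
speedup = target_rate / current_rate

print(f"2.5-hour scan: {current_rate:,.0f} files/s")
print(f"1-hour target: {target_rate:,.0f} files/s")
print(f"required speedup: {speedup:.1f}x")
```

In other words, the 2.5-hour scan averages roughly 111,000 files per second, and the one-hour goal would require about 278,000 files per second, a 2.5-fold improvement.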
The update to GPFS, now at Version 3.2, includes policy-based file management that lets a user tell the system how to store and search files. For instance, a user can stipulate that files saved in a certain format be stored on a particular kind of disk.
What that will mean, Handy said, is that users can take a tiered approach to how they distribute data. A user can write a policy telling the system to store certain kinds of data on its fastest and most expensive disk, with other types of data going to lower-cost systems where performance isn't as critical.
That capability would help users save money because they could use cheaper storage where appropriate, he said.
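The tiered placement idea can be illustrated with a short sketch. This is not GPFS's own policy syntax; it is a hypothetical Python model in which the rule names, file extensions, and pool names (`fast_disk`, `low_cost_disk`) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Tuple


@dataclass(frozen=True)
class PlacementRule:
    """Hypothetical rule: files with these suffixes go to this storage pool."""
    name: str
    suffixes: Tuple[str, ...]
    pool: str


# Assumed example rules; a real deployment would define its own.
RULES: List[PlacementRule] = [
    PlacementRule("databases", (".db", ".idx"), "fast_disk"),
    PlacementRule("media", (".mp4", ".iso"), "low_cost_disk"),
]
DEFAULT_POOL = "low_cost_disk"  # assumed fallback when no rule matches


def choose_pool(filename: str) -> str:
    """Return the pool for a file, using the first rule whose suffix matches."""
    suffix = PurePosixPath(filename).suffix.lower()
    for rule in RULES:
        if suffix in rule.suffixes:
            return rule.pool
    return DEFAULT_POOL


for name in ("trades.db", "archive/video.mp4", "notes.txt"):
    print(name, "->", choose_pool(name))
```

Evaluating rules in order and falling back to a default pool keeps the policy predictable: performance-critical formats land on the expensive tier, and everything else defaults to cheaper storage.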
Another policy, Handy said, could require that files not accessed for 30 days or so be moved to a cheaper system. The previous release of GPFS treated all such files the same, he said. IBM is also adding clustered management features.
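The age-based policy Handy describes amounts to comparing each file's last access time against a threshold. The following is a minimal sketch under stated assumptions: the 30-day cutoff comes from his example, while the file names, timestamps, and function names are hypothetical, and the code models only the selection step, not how GPFS actually moves data.

```python
from datetime import datetime, timedelta
from typing import Dict, List

MAX_IDLE = timedelta(days=30)  # threshold taken from the 30-day example


def should_migrate(last_access: datetime, now: datetime,
                   max_idle: timedelta = MAX_IDLE) -> bool:
    """True if the file has gone unaccessed for longer than max_idle."""
    return now - last_access > max_idle


def plan_migration(files: Dict[str, datetime], now: datetime) -> List[str]:
    """List the files that should move to the cheaper tier, sorted by name."""
    return sorted(name for name, atime in files.items()
                  if should_migrate(atime, now))


# Hypothetical catalogue of files and their last access times.
now = datetime(2024, 3, 31)
catalogue = {
    "q1_report.pdf": datetime(2024, 3, 25),  # 6 days idle, stays
    "ledger.db": datetime(2024, 3, 1),       # exactly 30 days idle, stays
    "old_scan.tif": datetime(2024, 1, 15),   # 76 days idle, moves
}
to_move = plan_migration(catalogue, now)
print(to_move)  # ['old_scan.tif']
```

Whether a file idle for exactly 30 days should move is a policy choice; this sketch uses a strict comparison, so only files idle for longer than the threshold are selected.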
The file system runs on IBM System p and System x hardware and is supported by AIX as well as some versions of Red Hat and SUSE Linux.