Absent any recognized standard, savvy PC buyers rely on benchmarking software to determine which processor will best suit their overall computing needs. However, charges of bias and manipulation leveled in recent days against one of the most recognized programs have highlighted discrepancies in benchmarking software and raised questions about its usefulness to buyers.
Advanced Micro Devices Inc. (AMD) is launching an assault against the SYSmark 2002 benchmark, charging it has been altered to favor processors from Intel Corp., according to Hal Speed, marketing manager for AMD, based in Sunnyvale, California.
SYSmark is distributed by a consortium called BAPCo, or the Business Application Performance Corporation. Members of the group include Intel, Dell Computer Corp., IBM Corp., Hewlett-Packard Co., Microsoft Corp., and AMD, which recently joined the consortium. The group also includes several industry publications, including InfoWorld, a division of International Data Group Inc., parent company of IDG News Service.
AMD's processors outperformed Intel's on the 2001 version of the SYSmark benchmark. AMD says that in this year's version of the benchmarking software, certain application tests from the 2001 version were removed, while tests that favor Intel's Pentium 4 processors were repeated several times.
The company contends that Intel's presence as, until recently, the only major microprocessor vendor in BAPCo has caused the benchmarking software to drift toward Intel's philosophy of real-world performance without a balancing viewpoint from AMD, Speed said.
Specifically, AMD charges that of 13 filters measuring the performance of Adobe Systems Inc.'s Photoshop software in the 2001 version of the benchmark, eight that favored the Athlon processor were replaced with new filters that favor the Pentium 4 in the 2002 version. The 2002 benchmark also uses multiple instances of three filters that favored Intel and adds three new filters favoring the Pentium 4, Speed said.
The portion of the benchmark that measures Microsoft Excel performance is heavily tilted toward sorting, a procedure that does not represent the most common uses of Excel, Speed said. The Pentium 4 sorts data in Microsoft Excel faster than AMD's Athlon chips, according to the 2001 benchmarking results.
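Why the weighting of a single task matters can be illustrated with a minimal sketch. The workloads, weights, and timing harness below are invented for demonstration and bear no relation to SYSmark's actual tests; the point is only that a composite score shifts toward whichever chip runs the heavily weighted workload fastest:

```python
# Minimal micro-benchmark sketch: time two invented workloads and show how
# a composite score depends on how the workloads are weighted.
# These workloads and weights are hypothetical, not SYSmark's actual tests.
import random
import timeit

def sort_workload():
    # Sort a list of 10,000 random floats (stand-in for a sort-heavy task).
    data = [random.random() for _ in range(10_000)]
    sorted(data)

def lookup_workload():
    # Perform 10,000 dictionary lookups (stand-in for a different task mix).
    data = {i: i * 2 for i in range(10_000)}
    for i in range(10_000):
        _ = data[i]

# Best-of-5 timings, in seconds, for 10 iterations of each workload.
sort_time = min(timeit.repeat(sort_workload, number=10, repeat=5))
lookup_time = min(timeit.repeat(lookup_workload, number=10, repeat=5))

# A composite score is a weighted combination of workload times. Shifting
# weight toward one workload (here, sorting) shifts the overall result
# toward whichever processor happens to run that workload fastest.
sort_heavy_score = 0.8 * sort_time + 0.2 * lookup_time
balanced_score = 0.5 * sort_time + 0.5 * lookup_time
print(f"sort-heavy: {sort_heavy_score:.4f}s  balanced: {balanced_score:.4f}s")
```

The same measured times produce different composite scores under the two weightings, which is the substance of AMD's complaint about the Excel portion of the benchmark.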
Also, tasks that favored AMD for Microsoft's Access database were almost completely removed, according to AMD. The Access results from the 2001 benchmark favored Athlon processors, but the 2002 version draws much less of the overall score from Access results.
Intel would not comment on the specific allegations. Calls to BAPCo were not immediately returned.
"Our whole product strategy is based on having objective benchmarks. We are (now) working with BAPCo to ensure the industry has credible application-based benchmarks," Speed said.
The controversy over the SYSmark benchmark calls into question just how reliable benchmarks are in general.
"A good benchmark does mean something; their purpose in life is to distill the variables that affect real-world system performance. The difficulty comes from the judgment call in what you test to measure real-world performance," said Dean McCarron, principal analyst at Mercury Research Inc. in Cave Creek, Arizona.
Users across the IT landscape have different ideas of what represents real-world performance, and the results often come down to differences of opinion, he said.
According to McCarron, in the business world the main benchmarks are the SYSmark program and Winstone and Winbench, formerly run by Ziff-Davis Media Inc. subsidiary eTesting Labs Inc., which was purchased by Lionbridge Technologies Inc. of Waltham, Massachusetts, in July.
Prior to setting the benchmark parameters, Winstone gathered data on corporate application use, and developed benchmarks around that. The problem with this strategy is that it's always looking backward, and not measuring the performance of cutting-edge applications, McCarron said.
Other benchmarks tend to focus on the gaming market and test the performance of specific games on different processors, looking at game-play frame rates.
"Benchmarks have always had uncertainty. You're trying to put a world's worth of experience into one number," McCarron said. Ultimately, end users are the only ones who know what they use and what they need, and they should evaluate systems themselves based on their requirements, McCarron advised. Application vendors are often specific as to what platform will work best with their software, he said.
McCarron thinks it is unlikely that a future benchmarking standard will come into use. Several attempts have already been made, including the BAPCo consortium, and the objectivity of any such benchmark depends on who is driving it.
"It's not whether anybody can claim to be the fastest, it's that everybody can. You can contrive a benchmark to produce just about anything," he said.