New TACC supercomputer wrangles big data

Raza Retiwala

Wrangler gets its name from the cowboys of the Old West: it’s capable of taming the previously untameable “big data.”

While most supercomputers are meant to create large-scale models and simulations, Wrangler is specifically designed to handle big data, a moniker used to describe large sets of various types of data. 

Wrangler is smaller than other computers at the Texas Advanced Computing Center (TACC), but it is also more flexible. Niall Gaffney, TACC director of data intensive computing, compared regular supercomputers, such as Stampede, to Ferraris, fine-tuned for a specific task. Wrangler, by contrast, is like a derby car, able to handle the roughest terrain.

“On Stampede, the system doesn’t change. It’s well-tuned and well-configured to do what it’s meant to do,” Gaffney said. “But we needed to have a flexible environment that could handle different things.” 

Wrangler’s flexibility comes from its software, which is different from a typical supercomputer’s. It allows the computer to take any task and spread it out among its 3,000 nodes, all of which have access to the file system that contains the data Wrangler needs to analyze.

This flexibility gives Wrangler a number of advantages over regular supercomputers. According to Gaffney, Wrangler adapts based on its tasks. Scientists can configure parts of the computer separately, so that different portions perform different tasks.

“If we need to, we can reconfigure that so that one portion is running something else. And then when we’re done with that, we can take it apart and put it back together in a different way,” Gaffney said. “By doing that, it gives us the flexibility so we can tune the system to what people are doing.” 

Currently, more than 60 researchers are using Wrangler for a variety of tasks. Because of Wrangler’s capabilities, these researchers have been able to get results five to 20 times faster than on other supercomputers. For example, Hans Hofmann, a UT integrative biology professor, used supercomputers to compare the genetic makeup of animals to find the markers for certain traits.

Hofmann started his work on Stampede, another supercomputer at TACC. However, according to Gaffney, Stampede had trouble handling the vast amount of data that it had to process.

“They would try to run something, but it would fail because components would fail in the week they were running a job,” Gaffney said. “We let him loose on Wrangler, and he was able to get things done in less than four hours.”

Gaffney said Wrangler expands the horizons of what supercomputers are capable of doing. It allows researchers to move beyond the constraints of typical supercomputers and expand their research. Since Wrangler is the first of its kind, Gaffney said he and others at TACC are still learning about its capabilities and how to improve it for a larger system.

“It’s sort of a new generation of what we’re doing in computing,” Gaffney said.