Sherpas of Supercomputing
By Lucy Birmingham and Mark Matthews
Alex Feltus and O. Vernon Burton occupy very different research realms at Clemson University, but they share a passion for digging up meaningful gems buried in mountains of data. Feltus, a computational biologist, looks for genetic secrets to improving agricultural production and fighting disease. He and his team recently studied more than 1,000 kidney cancer samples to learn why certain strains of the disease respond to particular drugs while others do not. Burton, a historian, analyzes the 19th- and 20th-century American South in granular detail, drawing from troves of census, tax, and court records to reveal the big picture. Along the way, he tests and debunks popular theories, such as the notion of a black matriarchy.
Both scholars depend heavily on high-performance computing (HPC). “It’s my microscope, my scientific instrument,” Feltus says; problems that would consume months on a laptop can be completed in days. For the historian, powerful computers are key to discerning patterns and making the most of a plethora of digitized sources. “Numbers do matter,” says Burton, who doubles as a professor of history and director of the Clemson CyberInstitute.
Feltus and Burton have plenty of company on campuses worldwide. From cell science to climate and neuroscience, physics, and large-scale historical studies, more and more researchers depend on computing to speed up their work and open new avenues of inquiry. And as computing and big data play a growing role in all manner of science, engineering, and the humanities, researchers increasingly rely on the computer scientists and computer and electrical engineers who understand how HPCs work and how they can promote discovery and save time.
Call them research facilitators, cyberinfrastructure (CI) facilitators, or CI professionals; their titles vary. Dirk Colbry began his career as one of these. Pursuing a Ph.D. in computer science at Michigan State University, he fell into a “side hustle” helping researchers from across campus with computing and became one of the “computer people” on research teams comprising a variety of disciplines. He continued in that role as a postdoc. Researchers “thought I was awesome” in helping them bridge the gap between their science and the technology, he recalls. Now a faculty member and director of high-performance computing studies in MSU’s department of computational mathematics, science, and engineering, he’s a key figure in training others like himself, expanding and professionalizing the ranks of HPC-savvy research facilitators across the country.
“These are the folks who are actually helping, many times, the CI users to understand and effectively use advanced cyberinfrastructure methods and tools, and port their software,” says Sushil Prasad, until recently a program director in NSF’s Office of Advanced Cyberinfrastructure.
Even in labs like Feltus’s that employ computer scientists, facilitators are called on for “thorny, troubleshooting” problems, he says. Until domain scientists themselves become experts at using HPCs, research facilitators will be in demand. Right now there aren’t enough. “There are so many job opportunities for high-performance computing specialists who have the right training to assist researchers. It’s mind-numbing,” says John Towns, deputy chief information officer for research IT at the University of Illinois–Urbana-Champaign, and executive director for science and technology at the National Center for Supercomputing Applications. “At the same time, it’s become increasingly difficult for academic institutions to hire and retain staff of this type, especially if they have data science experience, because companies are snapping them up and doubling their salaries.” Towns says he doesn’t regret the loss to industry. “I celebrate it, despite the challenge it creates.”
Alone or in teams, academic researchers now have access to a level of computing power once found at just a few institutions, such as national laboratories. Researchers can now tap into supercomputers on a number of campuses as well as high-speed networks, commercial cloud services, and deep data repositories.
Waiting to Exascale
Even greater demand is expected when the current generation of superfast computers, with speeds measured in petaflops, gives way to exascale machines. A petaflop is one quadrillion (or million billion) calculations, or floating point operations, per second (FLOPS). Exascale supercomputers will perform a quintillion (or billion billion) calculations per second. Two Department of Energy laboratories, Argonne and Oak Ridge, plan to take delivery of exascale supercomputers in 2021. A joint analysis by DOE, NSF, and the National Institutes of Health found that exascale computing promised “an exciting array of . . . potentially transformative” advances in eight broad fields, including health sciences, materials sciences, and engineering and energy technology, each comprising at least four distinct disciplines. At the same time, “a number of respondents noted the present barriers” encountered by researchers seeking to use HPCs. “There was a clear call for an expert workforce capable of developing and using HPC applications that use and maintain advanced computing frameworks [and] a cadre of experts, with hybrid skills in domain research and computer science knowledge.” The agencies also found a need for “stable career pathways for a national cadre of computational technologists and HPC experts.”
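To put the petascale and exascale figures above in concrete terms, the short Python sketch below compares how long a fixed workload would take at each speed. The workload size (10^21 operations) is a hypothetical number chosen only for illustration, and the calculation assumes a machine sustains its full rated speed, which real applications rarely do.

```python
# Rated speeds, in floating point operations per second (FLOPS).
PETAFLOPS = 10**15  # one quadrillion operations per second
EXAFLOPS = 10**18   # one quintillion operations per second


def seconds_to_finish(total_operations, ops_per_second):
    """Idealized run time: assumes the machine sustains its rated speed."""
    return total_operations / ops_per_second


# Hypothetical workload, not taken from any project in this article.
workload = 10**21

peta_seconds = seconds_to_finish(workload, PETAFLOPS)
exa_seconds = seconds_to_finish(workload, EXAFLOPS)
speedup = EXAFLOPS // PETAFLOPS

print(f"At 1 petaflop: {peta_seconds / 86400:.1f} days")
print(f"At 1 exaflop:  {exa_seconds / 60:.1f} minutes")
print(f"Exascale is {speedup}x faster than petascale")
```

Under these assumptions, a job that occupies a one-petaflop machine for about 11.6 days would finish in under 17 minutes at one exaflop, a thousandfold difference.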
At a 2018 National Academies workshop held to generate momentum behind “convergence”—integration of different fields to tackle challenges beyond the scope of a single discipline—several participants stressed the need for access not only to advanced instrumentation “but also to affiliated, highly skilled technical staff who are able to run the instruments across a broad array of research applications,” according to a report on the workshop. Stanford professor Ann Arvin, one of the workshop’s leaders, noted in a subsequent webinar that in her own field of biomedicine, “almost none of us learned how to incorporate data analytics at the scale that we need to now.”
Jim Bottum, a pioneer in connecting computer experts and domain scientists, recently retired as Clemson’s chief information officer and vice provost for computing and information technology. As a young management intern at NSF, he worked with a number of directorates and also joined a task force that built the foundation for the NSF Supercomputing Centers Program and the NSFnet. He subsequently worked as NCSA’s deputy director at the University of Illinois at Urbana-Champaign. Joining Purdue University in 2001 as chief information officer, he realized, “Wow, I’ve got to form partnerships with the faculty and there are over 2,500 of them.” He worked on building alliances between faculty and IT support staff and integrating computing and research.
Clemson, which hired Bottum in 2006, had “fledgling efforts in high-performance computing,” he recalls. “Typically, faculty had shelved research ideas because they didn’t see the resources to carry them through.” He rapidly built the university’s supercomputing capacity, but eventually recognized more could be done if universities shared facilities. The result was Advanced Cyberinfrastructure – Research and Educational Facilitation (ACI-REF), which linked the research support capabilities of Clemson and five other universities. “By that time, there was a lot of firepower (hardware) on campus, but people, that’s where the shortage was.” Before ACI-REF, “I had two people in research computing, and it was all they could do to keep up with the day-to-day.” With the addition of an ACI-REF facilitator, Clemson’s cyberinfrastructure team went from serving 14 departments with large data needs to serving over 40 within about two years. Even now, “what most campuses don’t have are enough research facilitators” to help faculty navigate the growing complexity of today’s cyberinfrastructure, he says.
ACI-REF, backed by NSF, sought to develop facilitators who could build relationships with researchers whom they consulted on advanced computing solutions. By the time it ended in 2018, ACI-REF boasted of having enabled “significant and previously unimagined scholarship outcomes” in fields ranging from knee biomechanics to botany and high-energy physics. “We started off with the stated goal of creating a national facilitator network, but then quickly realized there was a need to create a network of cyberinfrastructure professionals . . . not just facilitators, but also including systems administrators; people who are writing and deploying software; and people who are supporting data management, data science,” says Lauren Michael, ACI-REF’s lead facilitator. That idea jelled with the Campus Research Computing Consortium (CaRCC), a national forum for cyberinfrastructure professionals, which has expanded on ACI-REF’s efforts. Michael, who is based at the University of Wisconsin–Madison, is now co-lead of CaRCC’s “people network” for cyberinfrastructure professionals.
A Huge Investment
At NCSA in 2011, John Towns launched XSEDE (Extreme Science and Engineering Discovery Environment). NSF’s single largest investment in cross-domain research cyberinfrastructure, XSEDE is due to receive a total of $250 million by 2021. “We primarily are the glue of the national distributed environment, which brings together high-performance computing, advanced research computing, storage, visualization, data analysis, and particularly technical expertise in the use of all of these technologies to enable research teams at primarily U.S.-based institutions,” Towns says.
Last year, nearly 14,000 researchers made use of resources and services available via XSEDE. Towns discovered the need for computing assistance in research in the 1990s, when he was a young astrophysicist studying relativity and black holes. Now he’s at the forefront of increasing and professionalizing the community of research facilitators. In 2018 alone, XSEDE provided about 40,000 hours of free participant training. “These facilitators are so important because first they guide students and faculty onto the right path toward the technology they’ll need to use, and then help them to understand that technology,” he says. More than 600 research facilitators, representing 272 academic institutions and 32 nonprofits, belong to Campus Champions, which grew out of XSEDE.
At Wisconsin, Michael’s team of facilitators tries to meet with every researcher seeking a supercomputer user account to understand what is needed and how computing fits in. Even though research topics vary widely, the actual computational requirements fall into just a handful of categories. At Michigan State, Colbry may assist a frustrated grad student attempting a computational analysis. The laptop freezes. “They’re told to come to me. I walk them through the problem, move them to a supercomputer. After an hour meeting, they walk out with a plan and have it [done] in a week.” Many researchers need training in foundational coding and data science skills. A nonprofit called the Carpentries, begun in 1998 at the University of Toronto, works to fill this void with two-day teaching workshops. Since 2012, the group has held some 2,000 workshops, reaching over 40,000 people in 46 countries, says associate director Erin Becker.
From Days to Hours
Facilitators can come up with inventive ways to save researchers time. An example is electrical engineer John Lusher’s work with neuroscientist Joseph Orr at Texas A&M University. Orr analyzes functional magnetic resonance imaging data, generated in units called voxels. Each voxel represents about a million brain cells. By correlating individual voxels or groups of voxels, Orr can find networks of neural connectivity—the wiring of the brain. The problem is that a typical fMRI data set contains close to a million individual voxels. Doing the correlation using a multicore computer takes about five days, even operating around the clock. Enter Lusher and his Ph.D. adviser, Jim Ji, an associate professor of computer and electrical engineering. Lusher developed what became his thesis topic: the High-Performance Correlation and Mapping Engine. The system, able to handle high-volume correlations, cut those five days to seven hours. Lusher is now an associate professor of practice at TAMU.
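A rough sense of why the voxel problem above is so demanding comes from counting pairs: correlating every voxel with every other means n(n-1)/2 comparisons. The Python sketch below is a brute-force illustration on three made-up time series, not the method used by Lusher's engine, whose internals the article does not describe. The function names and toy data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length series.

    Assumes neither series is constant (zero variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlate_all_pairs(series_by_voxel):
    """Correlate every voxel's time series with every other voxel's."""
    indices = range(len(series_by_voxel))
    return {
        (i, j): pearson(series_by_voxel[i], series_by_voxel[j])
        for i, j in combinations(indices, 2)
    }


def pair_count(n_voxels):
    """Number of distinct voxel pairs: n(n-1)/2."""
    return n_voxels * (n_voxels - 1) // 2


# Toy data: three hypothetical voxels, four time points each.
toy_series = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # rises in step with voxel 0
    [4.0, 3.0, 2.0, 1.0],  # falls as voxel 0 rises
]

correlations = correlate_all_pairs(toy_series)
for pair, r in correlations.items():
    print(pair, round(r, 3))

print("Pairs for a million voxels:", pair_count(1_000_000))
```

Three voxels yield three pairs; a million voxels yield roughly 500 billion, which is why the brute-force approach shown here would be far too slow at full scale and why a purpose-built engine can make such a difference.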
Brain research is an HPC growth area. Powerful computing “opens up other possibilities” and allows researchers to tackle problems not solvable with classical techniques, says psychologist Felix Hoffstaedter of the Research Center Jülich in Germany. He scripted and executed analyses on the center’s HPC system, JURECA, for a recent study showing the effects of alcohol consumption, smoking, and lack of exercise and social integration on the brain health of 549 aging adults. The study, which Hoffstaedter coauthored, found clear evidence of damage when those risk factors were combined. Jülich researchers now hope to use HPCs to explore brain data emerging from the UK Biobank, which collects a variety of health information from 100,000 current participants. Other huge data sets include the NIH-sponsored Human Connectome Project.
Compared with well-established HPC users such as climate scientists and theoretical physicists, neuroscience researchers still “are very small players,” says Hoffstaedter. Yet the arrival of big data has increased the need for HPCs—and people who know how to use them—in any number of fields. “We see biology just blown up with bioinformatics—it’s the Wild West of research,” Colbry says. “Researchers are scrambling.” Likewise, he adds, “digital humanities is just going crazy.”
One of the more ambitious recent humanities ventures is the European Reassembling the Republic of Letters project, a self-described “radically multilateral collaboration” of scholars in 32 nations who gather, digitize, analyze, and visualize the explosion of written communication across Europe made possible by the arrival of postal service after 1500. Letters from this civil-society “republic” reveal the ideas behind the Enlightenment and other intellectual currents of the age. Among the techniques that computing makes possible are mining and modeling the topics buried in paragraphs within large sets of correspondence.
HPC experts don’t always get the recognition they deserve for their contribution to research, says Hoffstaedter. Nor, at least in the United States, do they have a straightforward career path. “They went from soft-money position to soft-money position,” Bottum recalls. Adds Towns: “The field is largely academic, and grant funded, which means you have a job for as long as you’ve got a grant.” It does, though, “make people very entrepreneurial, which makes them interesting characters.” A Ph.D. is not required, although Colbry says it helped him win acceptance among some domain scientists. One of Towns’s aims is to see clear career paths and the establishment of a professional organization for research facilitators. Toward that end, XSEDE currently provides yearlong professional development training to a select group of Campus Champions, who work alongside NCSA technical experts.
Turning research facilitation into a formal profession—with barriers to entry and formal jurisdiction over certain practices—doesn’t make sense at this point, Towns and 10 other experts argue. In a paper presented to this year’s PEARC (Practice and Experience in Advanced Research Computing) conference, held July 28-August 1 in Chicago, they do favor developing standards and credentials for work with systems, software, and research.
Colbry, meanwhile, has an NSF grant to develop a curriculum and train research facilitators in communication, teamwork, and leadership skills so they can be most effective in dealing with researchers from a variety of domains. His CyberAmbassadors program draws both on his own experience (“I had a lot of tricks for talking to people”) and on years of work conducted with his spouse, Katy Luchini-Colbry, to impart so-called soft, or professional, skills to members of Tau Beta Pi, the engineering honor society. Luchini-Colbry, MSU’s assistant dean of engineering for graduate student services, is national director of TBP’s professional skills program, called Engineering Futures, and a coprincipal investigator on the NSF grant.
A Need to Keep Up
Feltus’s ideal research facilitator was a system administrator at the University of Georgia, where he formerly worked. “He would show me how to do it and leave me alone.” Nowadays, every scientist should be trained in how to use supercomputers, Feltus says, but facilitators will be needed to keep up with changing technology. That includes cloud computing, which Feltus says is “really the future.” He is currently leading an NSF-sponsored $4 million project, Scientific Data Analysis at Scale (SciDAS), that connects universities with national facilities, including NSF Cloud, Open Science Grid, and XSEDE, and makes high-performance computing more accessible to more users.
Besides professional development, CI professionals need to stay up-to-date with advances in computation. “Parallel and distributed computing, I think, is one of the essential concepts now and skillsets that every computer scientist needs to have, and by extension, every computer engineer needs to have,” says Sushil Prasad, who is now computer science department chair at the University of Texas–San Antonio.
Eventually, more domain scientists may follow the course of Priya Vashishta. He’s the uncommon researcher with expertise in both high-performance computing and an engineering domain—in his case, materials science. An Indian-educated physicist, he spent 18 years at Argonne National Laboratory, where he used computer simulations for a variety of experiments on materials. In 1990, he and two Argonne colleagues were recruited to create a lab at Louisiana State University, where they launched a joint master’s-Ph.D. program in computer science and physics and pursued research blending condensed matter physics, materials science, and computer science. A decade later, they formed the Collaboratory for Advanced Computing and Simulations at the University of Southern California and started offering a joint Ph.D. in materials science with a master’s in computer science.
Vashishta’s work with the Department of Energy continues—among other things, “studying exotic two-dimensional materials using petascale simulations and ultrafast x-ray and electron diffraction experiments at SLAC” (formerly the Stanford Linear Accelerator Center). He is among seven awardees who will be sharing $32 million over the next four years to accelerate the design of new materials through use of supercomputers. Applying both domain knowledge and computer expertise makes research less cumbersome, Vashishta contends. Otherwise, it’s like “two people cooking, (where) one controls the spice and another the range temperature. It works, but it can be confusing and slow.”
One sign that the need for computing expertise is recognized and growing: Clemson plans to launch a Ph.D. program in digital history. Coming in 2021, it will be the nation’s first.
Lucy Birmingham is a freelance writer based in Los Angeles. Mark Matthews is editor of Prism.
Originally posted on asee-prism.org