ScalaBLAST is a tool that allows hypothesis discovery in real time. Comparing one protein sequence against thousands or millions of others can take minutes. However, when scientists start with millions of sequences and compare each of them against millions of others, the results can take years. To speed up this process, PNNL scientists developed ScalaBLAST, a program that compares protein sequences using multiple processors instead of one and visualizes the results for the user. By running the comparisons in parallel, the supercomputer can compare millions of protein sequences in days rather than years. Middleware for Data Intensive Computing (MeDICi), an integrated data pipeline framework, connects the ScalaBLAST program with PNNL’s supercomputer so that the analysis and visualization can take place. This program changes the way science is done in the proteomics field: instead of starting with a hypothesis and looking for data to test it, scientists use the data they have gathered to form a hypothesis.
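The idea of running comparisons in parallel can be illustrated with a minimal sketch. This is not ScalaBLAST's actual code or scoring method: the function names (`shared_kmers`, `best_hit`, `compare_all`), the toy similarity score, and the use of a thread pool are all assumptions made for illustration. Each query sequence is an independent piece of work, so queries can be handed out to separate workers. A real system would spread them across many processors rather than threads in one Python process.

```python
from concurrent.futures import ThreadPoolExecutor


def shared_kmers(a: str, b: str, k: int = 3) -> int:
    """Toy similarity score: count of distinct length-k substrings both sequences contain."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return len(kmers_a & kmers_b)


def best_hit(query: str, database: list[str]) -> tuple[str, str, int]:
    """Compare one query against every database sequence and keep the top match."""
    best = max(database, key=lambda target: shared_kmers(query, target))
    return query, best, shared_kmers(query, best)


def compare_all(queries: list[str], database: list[str], workers: int = 4):
    """Hand each query to a pool of workers; results come back in query order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: best_hit(q, database), queries))


# Hypothetical sequences, invented for this example.
queries = ["MKTAYIAK", "GGSGGS"]
database = ["MKTAYIAKQR", "GGSGGSGG", "PPPPPP"]
results = compare_all(queries, database)
```

Because no query depends on the result of another, adding workers shortens the total run time roughly in proportion, which is the effect the days-versus-years comparison describes.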

Previously, scientists would go into a dataset (in the form of an Excel spreadsheet) with an idea of what they were looking for. They would disregard the unexpected or irregular data, "which is where all the good stuff is," says Oehmen, and throw away most of the information that didn’t pertain to their protein of interest. With ScalaBLAST, they are able to keep all the data and use the visualizations to navigate and look deeper into their dataset. The visualization and analysis tools run in near-real time, allowing scientists to absorb information faster and make better decisions in work on curing diseases, cleaning up the environment and protecting against biological warfare.

The ScalaBLAST program and visualizations were presented at the Supercomputing ’08 Challenge in November 2008 and won on the strength of their computing capabilities. With the demonstration at the Supercomputing conference, PNNL promoted the philosophy of Data Intensive Computing: “Stop throwing away data,” said Oehmen.



Last Update: 13 July 2011 | Pacific Northwest National Laboratory