Software updates slowing you down?
By Texas A&M University
We’ve all shared the frustration of software updates that are intended to make our applications run faster but inadvertently end up doing just the opposite. Fixing these bugs, known in computer science as performance regressions, is time-consuming, since locating software errors typically requires substantial human intervention.
To overcome this obstacle, researchers at Texas A&M University, working with computer scientists at Intel Labs, have automated the identification of the sources of errors caused by software updates. Their algorithm, based on a specialized form of machine learning called deep learning, is turnkey and quick, finding software update bugs in hours instead of days.
“Updating software can sometimes turn on you when errors creep in and cause slowdowns. This problem is even more exaggerated for companies that use large-scale software systems that are continuously evolving,” said Abdullah Muzahid, assistant professor in A&M’s Department of Computer Science and Engineering. “We have designed a convenient tool for diagnosing performance regressions that is compatible with a whole range of software and programming languages, expanding its usefulness tremendously.”
To pinpoint the source of software errors, debuggers often check the status of performance counters within a computer’s central processing unit. These counters monitor how a program is being executed by the computer’s hardware. So, when the software runs, the counters keep track of the number of times it accesses certain memory locations, how long it stays there and when it exits. When the software’s behavior goes awry, the counters can be used for diagnostics.
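The article doesn’t say how the team collects this counter data; as a minimal illustration only, the sketch below assumes a Linux machine with the standard `perf` tool installed and samples a handful of hardware counters while a program runs, producing the kind of per-run readings a diagnostic tool like this would consume. The chosen events and the program path are placeholders.

```python
# Minimal sketch (not the researchers' tool): sample hardware performance
# counters for one run of a program using Linux's `perf stat`.
import subprocess

EVENTS = "cycles,instructions,cache-misses,branch-misses"  # example counters

def sample_counters(cmd: list[str]) -> dict[str, int]:
    """Run `cmd` under `perf stat` and return a mapping of event -> count."""
    result = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", EVENTS, *cmd],
        capture_output=True, text=True,
    )
    counts = {}
    # perf writes its CSV report to stderr: value,unit,event,...
    for line in result.stderr.splitlines():
        fields = line.split(",")
        if len(fields) >= 3 and fields[0].strip().isdigit():
            counts[fields[2]] = int(fields[0])
    return counts

# Hypothetical program under test:
print(sample_counters(["./my_app", "--workload", "small"]))
```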
Newer desktops and servers have hundreds of performance counters, making it virtually impossible to keep track of all of their statuses manually and to look for aberrant patterns that are indicative of performance errors. That is where Muzahid’s machine learning comes in.
By using deep learning, the researchers are able to monitor data coming from a large number of counters simultaneously by reducing the size of the data, much as a high-resolution image can be compressed to a fraction of its original size by changing its format. In that reduced data, their algorithm looks for patterns that deviate from normal behavior.
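The story doesn’t spell out the network the team uses; one common way to realize this compress-and-compare idea is an autoencoder, sketched below under that assumption. The model learns to squeeze many counter readings through a small bottleneck and reconstruct them; readings from healthy runs reconstruct well, while unfamiliar patterns produce large reconstruction errors. All layer sizes and names here are illustrative.

```python
# Sketch of autoencoder-based anomaly detection over counter readings
# (an assumed architecture, not the paper's). Requires PyTorch.
import torch
import torch.nn as nn

class CounterAutoencoder(nn.Module):
    def __init__(self, n_counters: int, code_size: int = 8):
        super().__init__()
        # Compress the counter vector into a small code, then reconstruct it.
        self.encoder = nn.Sequential(
            nn.Linear(n_counters, 64), nn.ReLU(),
            nn.Linear(64, code_size),
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_size, 64), nn.ReLU(),
            nn.Linear(64, n_counters),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train(model, normal_data, epochs=200, lr=1e-3):
    """Fit the autoencoder on counter readings from a known-good run."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(normal_data), normal_data)
        loss.backward()
        opt.step()
    return model

def anomaly_scores(model, data):
    """Per-sample reconstruction error: high error = unfamiliar pattern."""
    with torch.no_grad():
        return ((model(data) - data) ** 2).mean(dim=1)
```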
The researchers tested whether the algorithm could find and diagnose a performance bug in commercially available data management software used by companies to keep track of their numbers and figures. First, they trained their algorithm to recognize normal counter data by running an older, glitch-free version of the data management software. Next, they ran their algorithm on an updated version of the software with the performance regression. They found that their algorithm located and diagnosed the bug within a few hours. Muzahid said this type of analysis could take a considerable amount of time if done manually.
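Continuing the hypothetical sketch above, that same train-on-the-good-version, test-on-the-update workflow might look like this, with `baseline` and `updated` standing in for float tensors of counter snapshots collected from the two software versions:

```python
# Hypothetical data: each row is one counter snapshot; `baseline` comes from
# the glitch-free release, `updated` from the release under suspicion.
model = CounterAutoencoder(n_counters=baseline.shape[1])
train(model, baseline)

# Pick a cutoff from the baseline's own scores (99th percentile here),
# then flag updated-release snapshots that score above it.
threshold = anomaly_scores(model, baseline).quantile(0.99)
suspects = anomaly_scores(model, updated) > threshold
print(f"{int(suspects.sum())} anomalous snapshots flagged for inspection")
```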
Muzahid noted that their deep learning algorithm has potential uses in other areas of research as well, such as developing the technology needed for autonomous driving.
“The basic idea is … being able to detect an anomalous pattern,” Muzahid said. “Self-driving cars must be able to detect whether a car or a human is in front of it and then act accordingly. So, it’s again a form of anomaly detection and the good news is that is what our algorithm is already designed to do.”
This research is partly funded by a National Science Foundation CAREER grant and by Intel.
Read more about this research at Texas A&M Today.