Data-Mining for Zero-Day Vulnerabilities

A team of Australian researchers has developed a data-mining technique they claim is almost 100 percent effective at identifying potentially destructive flaws in security software.

Zero-day exploits—which attack vulnerabilities in commercial software the developer has not yet found or fixed—are hard to find because anti-virus or intrusion-detection systems don’t have an attack signature they can use to identify the threat.

Zero-day flaws exist in almost every type of software, from the tautest HPC code to the most random smartphone app. Even a small flaw in a casual application can threaten enterprise security by creating an opening through which attackers can reach other systems.

New flaws are valuable to cyber-spies trying to build the next Stuxnet, which leads national intelligence agencies to stand in line to buy undiscovered flaws, according to The New York Times.

Heuristic analysis can catch some unknown threats by running code in a sandbox and looking for attempts to hide newly installed files, attempts to overwrite system files and other functions typical of malicious software.
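To make the idea concrete, here is a minimal sketch of heuristic scoring in Python. The behavior names, weights and threshold are hypothetical, chosen only for illustration; real products use far richer rules and their own instrumentation to observe a sandboxed program.

```python
# Hypothetical behavior labels and weights, not taken from any real product.
SUSPICIOUS_WEIGHTS = {
    "hide_new_file": 3,
    "overwrite_system_file": 5,
    "disable_security_service": 4,
}


def heuristic_score(observed_events):
    """Sum the weights of suspicious behaviors seen during a sandbox run."""
    return sum(SUSPICIOUS_WEIGHTS.get(event, 0) for event in observed_events)


def is_suspicious(observed_events, threshold=5):
    """Flag a sample when its total score reaches the (assumed) threshold."""
    return heuristic_score(observed_events) >= threshold


# Events a sandbox might report for one sample (illustrative only).
events = ["read_config", "hide_new_file", "overwrite_system_file"]
flagged = is_suspicious(events)
```

A fixed rule list like this only catches behaviors someone thought to write down in advance, which is one reason heuristics miss so many new attacks.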

Heuristics don’t come close to solving the problem, however. Successful zero-day exploits live in a compromised system an average of 312 days before being discovered, according to research from anti-virus maker Symantec. When a flaw is made public, the number of attacks designed to exploit it rises by as much as five orders of magnitude, the study found.

The new approach to catching zero-day exploits relies on data-mining algorithms to identify the frequency of specific Windows function calls and rate the potential threat of a new piece of software by comparing the Windows calls it makes to those of similar applications.
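The general idea of comparing call frequencies can be sketched in a few lines of Python. This is not the researchers' algorithm, which the article does not spell out; it is a simplified illustration that assumes each program's behavior has already been recorded as a list of Windows API call names, and it uses plain Euclidean distance as a stand-in for whatever scoring the paper actually applies. The function names are hypothetical.

```python
import math
from collections import Counter


def call_frequencies(calls):
    """Turn a list of API call names into relative frequencies."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def baseline_profile(traces):
    """Average the frequency profiles of comparable applications."""
    profiles = [call_frequencies(t) for t in traces]
    names = set().union(*profiles) if profiles else set()
    return {n: sum(p.get(n, 0.0) for p in profiles) / len(profiles) for n in names}


def anomaly_score(sample_calls, baseline):
    """Euclidean distance between a sample's profile and the baseline."""
    sample = call_frequencies(sample_calls)
    names = set(sample) | set(baseline)
    return math.sqrt(
        sum((sample.get(n, 0.0) - baseline.get(n, 0.0)) ** 2 for n in names)
    )


# Illustrative traces: two ordinary file readers form the baseline.
benign = [
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["CreateFileW", "ReadFile", "ReadFile", "CloseHandle"],
]
base = baseline_profile(benign)
normal = anomaly_score(["CreateFileW", "ReadFile", "CloseHandle"], base)
odd = anomaly_score(
    ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"], base
)
```

In this toy example the second sample, which makes calls none of the comparable applications make, lands much farther from the baseline than the first. A real system would need far more traces and a trained classifier rather than a single distance measure.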

By looking for anomalies in the behavior of applications at the system level, rather than simply comparing their behavior to known threats, researchers Mamoun Alazab and Sitalakshmi Venkatraman were able to identify threats with 98.5 percent accuracy, with a 2.5 percent rate of false positives. Their upcoming paper on the topic will appear in The Journal of Electronic Security and Digital Forensics Analysis.
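Accuracy and false-positive rate are separate measures, so the two figures do not need to add up to 100 percent. The short Python sketch below shows how each is usually computed from a detector's results. The counts are made up for illustration and are not the researchers' data.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all samples, malicious and benign, that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp, tn):
    """Share of benign samples that were wrongly flagged as malicious."""
    return fp / (fp + tn)


# Hypothetical results: 100 malicious and 100 benign samples.
tp, fn = 90, 10  # malicious samples caught / missed
tn, fp = 95, 5   # benign samples passed / wrongly flagged

acc = accuracy(tp, tn, fp, fn)
fpr = false_positive_rate(fp, tn)
```

Accuracy is measured over every sample tested, while the false-positive rate looks only at the benign ones.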

Researchers at Symantec Research reported that a similar approach succeeded in identifying 18 undisclosed vulnerabilities, including 11 that had never been seen in zero-day attacks before. Like Alazab and Venkatraman, the Symantec researchers used pattern matching to identify potentially malicious behavior.

Symantec built its analysis database from the activity logs of 11 million Internet servers, mining them for patterns of behavior common to either benign or malicious software. That body of data was far larger than the one the Australians used, and it relied more heavily on records of software already known to be malicious. The paper was published in the October 16, 2012 issue of the Journal of the Association of Computing Machinery.

Meanwhile, commercial security software developer Cyvera pulled in $11 million in venture funding earlier this week on the strength of its agent-based approach to stopping unknown threats. Rather than focus only on behavior, Cyvera’s TRAPS (Targeted Remote Attack Prevention System) software installs an agent on every client to block “every conceivable path an attack could take.” Each agent passes alerts of suspicious activity up to a management server, which looks for unfamiliar but similar patterns of behavior across machines that may have been compromised.

Wherever the data come from, knowing the difference between good and bad behavior is the key to identifying and stopping zero-day exploits, according to Alazab and Venkatraman. “What is most important is to expand the knowledgebase for security research through anomaly detection by applying innovative pattern recognition techniques with appropriate machine learning algorithms to detect unknown malicious behavior,” they wrote in their paper.

 

