Microsoft: Here’s an “Unprecedented” Dataset – Predict Infection, Win $20k

Microsoft has launched a new competition challenging researchers and programmers to come up with an AI model that predicts the likelihood of malware infection based on a machine’s configuration.

It is providing an “unprecedented malware dataset” to train the AI on. The winner will receive $12,000, with a second prize of $7,000.

The competition was announced December 13 on Kaggle, described as “an AirBnB for Data Scientists”. Kaggle is a community platform for data scientists owned by Google. It has over 536,000 active members.

There are already 261 competitors, with the competition closing in three months.

Microsoft’s aim is to further improve its layered defence system by establishing a predictive approach to system vulnerabilities.

Announcing the competition, the company said: “The malware industry continues to be a well-organized, well-funded market dedicated to evading traditional security measures. Once a computer is infected by malware, criminals can hurt consumers and enterprises in many ways.”

With more than one billion enterprise and consumer customers, Microsoft takes this problem very seriously and is deeply invested in improving security.

Microsoft Malware Prediction AI

Participants are tasked with building the models using 9.4GB of anonymised data collected from 16.8 million devices by Microsoft.

The data is divided into two files, train.csv and test.csv.

Within these data sets each row corresponds to a machine, identified by MachineIdentifier. A second column, HasDetections, indicates whether malware was detected on that machine.

Using the information and labels in train.csv, the participants are tasked with predicting the value for HasDetections for each device in test.csv.

The train.csv file contains a wealth of machine configuration information such as the Operating System, processor type, country location and the current firewall setup.
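To make the task concrete, here is a minimal sketch of the train-then-predict workflow in Python. It is not Microsoft's method or a competitive model: it estimates the chance of HasDetections from how often machines with the same configuration were infected in the training rows. Only MachineIdentifier and HasDetections come from the competition description; the OsVersion and FirewallEnabled columns, and every value in the sample data, are hypothetical stand-ins.

```python
import csv
import io
from collections import defaultdict

# Tiny stand-ins for train.csv and test.csv. Only MachineIdentifier and
# HasDetections are named in the article; OsVersion and FirewallEnabled
# are hypothetical columns, and all values are made up.
TRAIN_CSV = """MachineIdentifier,OsVersion,FirewallEnabled,HasDetections
m1,10.0,1,0
m2,10.0,0,1
m3,6.1,0,1
m4,6.1,0,1
m5,10.0,1,0
m6,6.1,1,0
"""

TEST_CSV = """MachineIdentifier,OsVersion,FirewallEnabled
t1,10.0,1
t2,6.1,0
t3,8.1,1
"""

FEATURES = ["OsVersion", "FirewallEnabled"]


def fit_rate_table(rows, features, label="HasDetections"):
    """Count detections per feature combination, plus the overall rate."""
    counts = defaultdict(lambda: [0, 0])  # key -> [detections, machines]
    total_hits = 0
    total = 0
    for row in rows:
        key = tuple(row[f] for f in features)
        hit = int(row[label])
        counts[key][0] += hit
        counts[key][1] += 1
        total_hits += hit
        total += 1
    return dict(counts), total_hits / total


def predict(rows, features, table, overall_rate):
    """Return {MachineIdentifier: estimated probability of HasDetections}."""
    out = {}
    for row in rows:
        key = tuple(row[f] for f in features)
        if key in table:
            hits, n = table[key]
            out[row["MachineIdentifier"]] = hits / n
        else:
            # Configuration never seen in training: fall back to the
            # overall infection rate.
            out[row["MachineIdentifier"]] = overall_rate
    return out


train_rows = list(csv.DictReader(io.StringIO(TRAIN_CSV)))
test_rows = list(csv.DictReader(io.StringIO(TEST_CSV)))

table, overall_rate = fit_rate_table(train_rows, FEATURES)
predictions = predict(test_rows, FEATURES, table, overall_rate)

for machine_id, p in predictions.items():
    print(f"{machine_id},{p:.2f}")
```

With the real files, the in-memory strings would be replaced by opening train.csv and test.csv, and a lookup table like this would likely give way to a proper classifier, since most configurations in 9.4GB of data will be rare or unseen.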

Chase Thomas and Robert McCann of the Windows Defender Research team commented in a security blog post: “The competition provides academics and researchers with varied backgrounds a fresh opportunity to work on a real-world problem using a fresh set of data from Microsoft.”

“Results from the contest will help us identify opportunities to further improve Microsoft’s layered defenses, focusing on preventative protection. Not all machines are equally likely to get malware; competitors will help build models for identifying devices that have a higher risk of getting malware so that preemptive action can be taken.”


You can enter the Microsoft Malware Prediction competition on Kaggle here. It finishes on March 13, 2019. The maximum team size is eight people and up to five entries can be submitted per day.

Microsoft is also awarding

The post Microsoft: Here’s an “Unprecedented” Dataset – Predict Infection, Win $20k appeared first on Computer Business Review.
