PRISM is an open-source Linux backdoor that has been in the news recently, after AT&T Alien Labs discovered several variants of it. These samples are modified versions of the original PRISM code. They are fairly simple backdoors, able to spawn reverse shells or drop additional payloads.
The interesting fact about these samples is that they remained completely undetected until now, some for over 3 years. Most of these samples now have more than 20 detections on VirusTotal, but these are probably signature-based detections, as they appeared just after the publication of Alien Labs' article.
HLAI is a malware detection module powered by artificial intelligence. It works on executable files (ELF files for Linux, and PE files for Windows), and gives, for each sample, a "score of potential maliciousness".
This module is deployed directly on the endpoints HarfangLab equips, and the score is computed when the binary file is about to be executed. Therefore, depending on the chosen configuration of our EDR, it is possible to stop the execution of the file if its potential maliciousness is significant.
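As a rough illustration of how a configurable policy could turn the score into a decision at execution time, here is a minimal sketch. The `Policy` class, the `block_threshold` field, its default value, and the assumption that scores lie in [0, 1] are all hypothetical and are not taken from the HarfangLab EDR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    # Hypothetical setting: scores at or above this value block execution.
    block_threshold: float = 0.9


def decide(score: float, policy: Policy) -> str:
    """Return "block" or "allow" for a maliciousness score assumed to be in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    return "block" if score >= policy.block_threshold else "allow"
```

A stricter deployment would simply lower `block_threshold`, trading more blocked threats for a higher risk of stopping legitimate binaries.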
Machine learning approaches have the advantage of being complementary to the more classical rule-based and signature-based approaches. Where the latter techniques are rather specific, machine learning approaches benefit from a capacity for generalization, allowing them to detect threats they have never seen before. In addition, HLAI is very light and quick while offering good detection performance, allowing it to work offline on the endpoint and block threats immediately. HLAI has no dependencies, uses less than 1 megabyte of disk space and 10 megabytes of RAM, and performs predictions in milliseconds.
HLAI works in two steps. The first consists of extracting and computing relevant features from the given binary file, which yields a numerical representation of the file. The second consists of feeding this representation to a classification model designed to discriminate between samples according to their features.
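To make the first step more concrete, the sketch below turns raw file bytes into a small fixed-length numeric vector. The features chosen here (file size, byte entropy, printable-byte ratio, ELF and PE magic flags) are illustrative assumptions; the article does not say which features HLAI actually uses.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def extract_features(data: bytes) -> list[float]:
    """Map raw file bytes to a fixed-length numeric vector (hypothetical feature set)."""
    printable = sum(1 for b in data if 32 <= b < 127)
    return [
        float(len(data)),                          # file size in bytes
        byte_entropy(data),                        # high values can hint at packing or encryption
        printable / len(data) if data else 0.0,    # share of printable ASCII bytes
        1.0 if data[:4] == b"\x7fELF" else 0.0,    # ELF magic number
        1.0 if data[:2] == b"MZ" else 0.0,         # DOS/PE magic number
    ]
```

For a file on disk, one would call `extract_features(open(path, "rb").read())` and pass the resulting vector to the classifier.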
The model is an ensemble of decision trees, trained using gradient boosting.
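At prediction time, a gradient-boosted ensemble typically sums the output of every tree and squashes the total into a score. The sketch below shows that inference step only (not training), assuming a logistic link. The tree structure, thresholds, and leaf values are made up for illustration and do not reflect the real HLAI model.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: float  # contribution of this leaf to the raw margin


@dataclass
class Split:
    feature: int      # index into the feature vector
    threshold: float
    left: "Node"      # followed when x[feature] < threshold
    right: "Node"     # followed otherwise


Node = Union[Leaf, Split]


def tree_output(node: Node, x: list[float]) -> float:
    """Walk a single tree from the root down to a leaf and return its value."""
    while isinstance(node, Split):
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value


def ensemble_score(trees: list[Node], x: list[float], bias: float = 0.0) -> float:
    """Sum the tree outputs and map the margin to (0, 1) with a sigmoid."""
    margin = bias + sum(tree_output(t, x) for t in trees)
    return 1.0 / (1.0 + math.exp(-margin))


# Toy ensemble over a hypothetical vector [file_size, byte_entropy].
TOY_TREES = [
    Split(feature=1, threshold=6.5, left=Leaf(-1.0), right=Leaf(1.5)),
    Split(feature=0, threshold=100_000.0, left=Leaf(0.5), right=Leaf(-0.5)),
]
```

Because inference is just a handful of comparisons per tree, this kind of model is consistent with the small footprint and millisecond predictions described above.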
Of the 38 binary samples identified by AT&T Alien Labs, HLAI classifies 34 (about 89%) as CRITICAL threats without ever having seen them. It is a good example of how machine learning approaches can shine when they encounter new, never-before-seen threats.
However, 4 malicious binary samples still go undetected. To detect them, we are pursuing several lines of work:
1) Enrich our training dataset. We have already improved detection performance by providing our model with more diverse binaries, especially goodware (benign software).
2) Perform a residual analysis. Using specific machine learning tools, we assess the importance of each feature in the prediction. In the case of a wrong prediction, namely when a malware sample receives a low maliciousness score, we can hopefully determine where the misleading information lies, and craft new features.
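One simple way to approximate the per-feature analysis in item 2 is occlusion: replace one feature at a time with a baseline value and see how much the score changes. This is a swapped-in technique for illustration, not necessarily the tooling the authors use (SHAP-style attributions are a common choice for tree ensembles). The `toy_score` function and the baseline values are hypothetical.

```python
from typing import Callable, Sequence


def occlusion_attribution(
    score_fn: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> list[float]:
    """For each feature i, return score(x) minus score(x with feature i set to baseline[i])."""
    full = score_fn(x)
    deltas = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline[i]
        deltas.append(full - score_fn(occluded))
    return deltas


def toy_score(x: Sequence[float]) -> float:
    # Hypothetical stand-in for the real model: two threshold rules.
    return 0.6 * (x[0] > 6.5) + 0.3 * (x[1] > 0.5)


# A missed sample: feature 0 sits below the rule's threshold, feature 1 above.
missed_sample = [5.0, 0.9]
typical_malware = [7.0, 0.1]  # hypothetical baseline values
contributions = occlusion_attribution(toy_score, missed_sample, typical_malware)
```

A strongly negative entry in `contributions` marks a feature whose actual value pulled the score down relative to the baseline, which is where one would start looking for misleading information.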