Posted: September 10th, 2024

Machine Learning in Malware Detection

1.0 Background Research

The idea behind malware dates back to 1949, when John von Neumann laid out the theory of self-reproducing automata, and the number of malware samples has grown ever since. Antivirus companies are constantly looking for the most effective method of detecting malware. One of the best-known methods they use is signature-based detection. Over the years, however, the growth of malware has become uncontrollable, and in recent years signature-based detection has proven ineffective against it. In this research, I have chosen another approach: applying machine learning to malware detection. Using the dataset from the Microsoft Malware Classification Challenge (BIG 2015), I will look for an algorithm that detects malware effectively with a low false positive rate.

1.1 Problem Statement

As technology grows, the amount of malware increases day by day. Malware is now designed with mutation characteristics, which causes enormous growth in the number of malware variants (Ahmadi, M. et al., 2016). In addition, automated malware generation tools allow novice malware authors to produce new variants easily (Lanzi, A. et al., 2010). Faced with this growth, traditional signature-based malware detection has proven ineffective against the vast variety of malware (Feng, Z. et al., 2015). Machine learning methods, on the other hand, have proven effective against new malware, but they suffer from a high false positive rate (Feng, Z. et al., 2015).

1.2 Objective

To investigate how machine learning can be applied to malware detection in order to detect unknown malware. To develop malware detection software that uses machine learning to detect unknown malware. To validate that machine-learning-based malware detection can achieve a high accuracy rate with a low false positive rate.

1.3 Theoretical / Conceptual Framework

1.4 Significance

Malware detection based on machine learning, with high accuracy and a low false positive rate, will free end users from the fear of malware damaging their computers. Organizations, in turn, will have more secure systems and files.

2.0 Literature Review

2.1 Overview

Traditional security products use virus scanners to detect malicious code; these scanners rely on signatures created by reverse engineering malware. However, now that malware has become polymorphic or metamorphic, the traditional signature-based detection used by antivirus products is no longer effective (Willems, G., Holz, T. & Freiling, F., 2007). In current anti-malware products, the malware analysis process involves two main tasks: malware detection and malware classification. In this paper, I focus on malware detection, whose main objective is to detect malware present in a system.

There are two types of analysis for malware detection: dynamic analysis and static analysis. For effective and efficient detection, feature extraction is recommended (Ahmadi, M. et al., 2016). Of the various detection methods available, the one we use detects malware through its hex and assembly files. Features are extracted from both the hex view and the assembly view of each malware file. Once the features in each category have been extracted, all categories are combined into a single feature vector for the classifier to work on (Ahmadi, M. et al., 2016). For feature selection, a binary file can be split into blocks so that the similarity between malware binaries is compared block by block. This reduces the analysis overhead and makes the process faster (Kim, T.G., Kang, B. & Im, E.G., 2013).

To build a learning model, the extracted features and their labels are passed to a classification method such as Random Forest, Neural Network, KNN or many others; Support Vector Machine (SVM) is recommended when noise is present in the extracted features and labels (Stewin, P. & Bystrov, I., 2016).

To generate results, the learning model is tested on a labeled dataset to produce a graph showing the detection rate and the false positive rate. To find the best result, the process is repeated with other classification methods, each producing a learning model that is tested on the same dataset. The best result is the graph with the highest detection rate and the lowest false positive rate (Lanzi, A. et al., 2010).
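
The two quantities used to compare learning models can be made concrete with a small sketch. It assumes binary labels where 1 means malware and 0 means benign; the function name `detection_metrics` and the sample labels are hypothetical and only illustrate the arithmetic.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_positive_rate) for binary labels.

    Assumption: 1 = malware, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Made-up labels for eight test samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
dr, fpr = detection_metrics(y_true, y_pred)
print(f"detection rate={dr:.2f}, false positive rate={fpr:.2f}")
```

Running the same function on the predictions of each candidate classifier gives directly comparable numbers for the same test set.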

2.2 Dynamic and Static Analysis

Dynamic analysis runs the malware in a simulated environment, usually a sandbox, where the malware is executed and its behavior is observed. There are two approaches to dynamic analysis: comparing an image of the system taken before the malware executes with one taken afterwards, and monitoring the malware's actions during execution with the help of a debugger. The first approach usually gives a report similar to one that could be obtained through binary observation, while the second is more difficult to implement but gives a more detailed report on the malware's behavior (Willems, G., Holz, T. & Freiling, F., 2007).
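
The first approach, comparing the system before and after execution, can be sketched as a difference between two snapshots. This is only an illustration under a strong simplification: each snapshot is assumed to be a dictionary mapping a file path to a content hash, and the function name `diff_snapshots`, the paths and the hash values are all hypothetical. A real sandbox would also track registry entries, processes and network activity.

```python
def diff_snapshots(before, after):
    """Compare two {path: content_hash} snapshots of a system."""
    before_paths, after_paths = set(before), set(after)
    return {
        "created": sorted(after_paths - before_paths),
        "deleted": sorted(before_paths - after_paths),
        "modified": sorted(
            p for p in before_paths & after_paths if before[p] != after[p]
        ),
    }


# Hypothetical snapshots taken before and after running a sample.
before = {"C:/Windows/hosts": "a1", "C:/Users/doc.txt": "b2"}
after = {"C:/Windows/hosts": "ff", "C:/Users/doc.txt.locked": "c3"}
report = diff_snapshots(before, after)
print(report)
```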

Static analysis studies the malware without executing it, which makes it safer than dynamic analysis. With this method, we disassemble the malware executable into an assembly file and a hex file, then study the opcodes in both files and compare them with a pre-generated opcode profile in order to find malicious code within the executable (Santos, I. et al., 2013).

Every malware detection approach needs either static or dynamic analysis. In this paper, we focus on static analysis (Ahmadi, M. et al., 2016). This is because dynamic analysis has a drawback: it can analyze only one malware sample at a time, so the whole process takes a long time when many samples need to be analyzed (Willems, G., Holz, T. & Freiling, F., 2007). Static analysis is mainly used to analyze hex code and assembly code files. Compared with dynamic analysis, it takes much less time and is more convenient, because it can be scheduled to scan all the files at once, even offline (Tabish, S.M., Shafiq, M.Z. & Farooq, M., 2009).

2.3 Features Extraction

For effective and efficient classification, it is wise to extract features from both the hex view file and the assembly view file, so that the two views provide complementary data (Ahmadi, M. et al., 2016).
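
Before any hex-view feature can be computed, the hex view has to be turned back into byte values. The sketch below assumes a layout in which each line holds an address followed by two-digit hex tokens, with "??" marking a byte that could not be read; this layout and the function name `parse_hex_view` are assumptions made for illustration, so the parser should be adapted to the actual dataset files.

```python
def parse_hex_view(text):
    """Return the byte values (0-255) found in a hex-view dump.

    Assumed layout: "<address> <hex byte> <hex byte> ..." per line,
    where "??" marks an unreadable byte and is skipped.
    """
    values = []
    for line in text.splitlines():
        tokens = line.split()
        for token in tokens[1:]:  # tokens[0] is the address column
            if token == "??":
                continue
            values.append(int(token, 16))
    return values


# Two made-up lines in the assumed layout.
sample = "00401000 56 8D 44 24 08\n00401005 ?? ?? FF 00"
print(parse_hex_view(sample))
```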

Several types of features are extracted from the hex view and assembly view files: N-gram, Entropy, Image Representation, String Length, Symbol, Operation Code, Register, Application Programming Interface, Section, Data Define and Miscellaneous (Ahmadi, M. et al., 2016).

The N-gram feature is commonly used to classify sequences in different areas; during feature extraction, N-grams can capture the sequence of malware execution (Ahmadi, M. et al., 2016).

The Entropy feature measures the uncertainty in a series of bytes in the malware executable, which depends on the amount of information in the file (Lyda, R. & Hamrock, J., 2007).

For the Image Representation feature, the malware binary is read as a vector of 8-bit values and then organized into a 2D array. The 2D array can be visualized as a grayscale image in which each pixel is one byte of the file; this feature looks for common byte arrangements in malware binaries (Nataraj, L. et al., 2011).

For the String Length feature, we open each malware executable in hex view and extract all ASCII strings from it; because it is difficult to extract only the actual strings without also picking up useless elements, the important strings must be chosen from among those extracted (Ahmadi, M. et al., 2016).

For the Operation Code feature, operation codes, also known as opcodes, are the part of a machine-language instruction that specifies the operation to perform. In malware detection, the different opcodes and their frequencies are extracted and compared with those of non-malicious software, since distinct sets of opcodes are identifiable for malware and non-malware (Bilar, D., n.d.).

For the Register feature, the number of times each register is used can assist malware classification, because register renaming is used to make malware harder to analyze and detect (Christodorescu, M., Song, D. & Bryant, R.E., 2005).
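
Three of the hex-view features above (N-gram, Entropy and Image Representation) reduce to short computations over the byte sequence. The following is a minimal sketch, not the extraction code of the cited papers: the function names, the choice of n = 2 and the row width of 4 are assumptions picked to keep the example small.

```python
import math
from collections import Counter


def byte_ngrams(data, n=2):
    """Count every contiguous run of n bytes in data."""
    return Counter(bytes(data[i:i + n]) for i in range(len(data) - n + 1))


def shannon_entropy(data):
    """Shannon entropy of a byte sequence, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def to_image_rows(data, width):
    """Split bytes into rows of `width` gray levels; a partial last row is dropped."""
    usable = len(data) - len(data) % width
    return [list(data[i:i + width]) for i in range(0, usable, width)]


# Made-up bytes standing in for the contents of an executable.
data = bytes([0x56, 0x8D, 0x56, 0x8D, 0x00, 0x00, 0xFF, 0xFF])
print(byte_ngrams(data).most_common(1))
print(shannon_entropy(data))
print(to_image_rows(data, 4))
```

Packed or encrypted regions tend to push the entropy towards 8 bits per byte, which is why the value is informative; real files would be processed in windows rather than as a whole.
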
For the Application Programming Interface feature, API calls are code that invokes functions provided by other software, in our case the Windows API. Both malicious and non-malicious software use a large number of API call types, which makes them hard to tell apart, so we focus on the API calls most frequently used in malware binaries in order to narrow down the result (Top maliciously used apis, 2017). The Data Define feature exists because not all malware contains API calls; samples without API calls consist mainly of data definitions, usually db, dw and dd, and there are sets of Data Define (DP) features that are able to characterize such malware (Ahmadi, M. et al., 2016). For the Miscellaneous feature, we choose a few words that most malware has in common from the disassembled malware files (Ahmadi, M. et al., 2016).
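
The assembly-view features (opcode frequency, API calls and data definitions) can be sketched in the same spirit. The example assumes each disassembly line has already been reduced to "mnemonic operands", which real disassembler output is not; the function name `assembly_features`, the sample lines and the watched API list are hypothetical.

```python
from collections import Counter

DATA_DIRECTIVES = {"db", "dw", "dd"}


def assembly_features(lines, api_names):
    """Count mnemonics, watched API calls, and the share of data definitions.

    Assumption: every line looks like "mnemonic operands". API names are
    matched as substrings of call instructions, which is deliberately naive.
    """
    opcodes = Counter()
    api_calls = Counter()
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        mnemonic = parts[0].lower()
        opcodes[mnemonic] += 1
        if mnemonic == "call":
            for name in api_names:
                if name in line:
                    api_calls[name] += 1
    total = sum(opcodes.values())
    data_share = sum(opcodes[d] for d in DATA_DIRECTIVES) / total if total else 0.0
    return opcodes, api_calls, data_share


# Made-up, simplified disassembly lines.
lines = [
    "push esi",
    "mov eax, [esp+8]",
    "call ds:CreateFileA",
    "call sub_401000",
    "db 90h",
    "dd 0",
]
opcodes, api_calls, data_share = assembly_features(lines, ["CreateFileA", "WriteFile"])
print(opcodes, api_calls, round(data_share, 3))
```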

Among all these features, the most appropriate for our research are N-gram and Opcode. This is because these two features have been shown to give the highest accuracy with low logloss. They appear frequently in malware files, and well-known sets of such features already exist for malware. The drawback of N-gram and Opcode features is that they require a lot of resources and time to process (Ahmadi, M. et al., 2016). We will also try other features and compare them with N-gram and Opcode to verify the result.
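
The text does not define logloss, so the sketch below assumes the usual multi-class logarithmic loss: the average negative log of the probability that the model assigned to the true class. The function name `multiclass_logloss`, the clipping constant and the sample probabilities are illustrative assumptions.

```python
import math


def multiclass_logloss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    true_labels: class indices; predicted_probs: one probability list per sample.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1.0 - eps)  # clip to avoid log(0)
        total += -math.log(p)
    return total / len(true_labels)


# Two made-up samples with two classes each.
labels = [0, 1]
probs = [[0.5, 0.5], [0.25, 0.75]]
print(multiclass_logloss(labels, probs))
```

A lower value is better: confident, correct predictions drive the loss towards zero, while confident, wrong predictions are penalized heavily.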

2.4 Classification

In this section, we do not review the algorithms or mathematical formulas behind each classifier, but rather the properties that give each one an advantage under certain conditions when classifying malware features. The classifiers we review are Nearest Neighbor, Naïve Bayes, Decision Tree, Support Vector Machine and XGBoost (Weiss, S.M. & Kapouleas, I., 1989) (Kotsiantis, S.B., 2007) (Ahmadi, M. et al., 2016).

Because we need a classifier to train on the malware features, we review the candidates to choose the one most likely to give the best result. The Nearest Neighbor classifier is one of the simplest classification methods and is normally implemented in case-based reasoning (Weiss, S.M. & Kapouleas, I., 1989). Naïve Bayes usually generates a simple, constrained model that is not suitable for irregular input data, which makes it unsuitable for malware classification, where the data are not regular (Kotsiantis, S.B., 2007). A Decision Tree classifies samples by sorting them through tree nodes based on their feature values, with each branch representing a value that the node can take. A Decision Tree decides true or false based on node values, which makes it difficult to deal with unknown features that are not stored in a tree node (Kotsiantis, S.B., 2007). A Support Vector Machine has a model complexity that enables it to deal with a large number of features and still obtain good results, which makes it suitable for malware classification, as malware contains a large number of features (Kotsiantis, S.B., 2007). XGBoost is a scalable tree boosting system that has won many machine learning competitions by achieving state-of-the-art results. Its advantages are that it suits almost any scenario and runs faster than most other classification techniques (Chen, T., n.d.).
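
To show why Nearest Neighbor counts as one of the simplest methods, here is a minimal k-nearest-neighbor sketch. It is an illustration rather than a recommended implementation: the function name `knn_predict`, the two-dimensional feature vectors and the choice of k = 3 are assumptions, and real feature vectors would have far more dimensions.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label query by majority vote among its k closest training vectors."""
    distances = sorted(
        (math.dist(vec, query), label)  # Euclidean distance, Python 3.8+
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up two-feature vectors, for example shares of two opcodes.
train_vectors = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.7)]
train_labels = ["malware", "malware", "benign", "benign"]
print(knn_predict(train_vectors, train_labels, (0.85, 0.15)))
print(knn_predict(train_vectors, train_labels, (0.15, 0.8)))
```

There is no training step at all: every prediction scans the stored samples, which is simple but becomes slow as the dataset grows.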

For our malware analysis, we choose XGBoost as the classifier: it is suitable for malware classification and was recommended by the winner of the Microsoft Malware Classification Challenge (Ahmadi, M. et al., 2016). We will also use a Support Vector Machine, since it too is suitable for malware classification, and compare its results with those of XGBoost to obtain a more accurate result.

References 

  1. Ahmadi, M. et al., 2016. Novel Feature Extraction, Selection and Fusion for Effective Malware Family Classification. ACM Conference on Data and Application Security and Privacy, pp.183-194. Available at: http://doi.acm.org/10.1145/2857705.2857713.
  2. Amin, M. & Maitri, 2016. A Survey of Financial Losses Due to Malware. Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies – ICTCS ’16, pp.1-4. Available at: http://dl.acm.org/citation.cfm?doid=2905055.2905362.
  3. Berlin, K., Slater, D. & Saxe, J., 2015. Malicious Behavior Detection Using Windows Audit Logs. Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, pp.35-44. Available at: http://doi.acm.org/10.1145/2808769.2808773.
  4. Feng, Z. et al., 2015. HRS: A Hybrid Framework for Malware Detection. (10), pp.19-26.
  5. Han, K., Lim, J.H. & Im, E.G., 2013. Malware analysis method using visualization of binary files. Proceedings of the 2013 Research in Adaptive and Convergent Systems, pp.317-321.
  6. Kim, T.G., Kang, B. & Im, E.G., 2013. Malware classification method via binary content comparison. Information (Japan), 16(8 A), pp.5773-5788.
  7. Küçüksille, E.U., Yalçınkaya, M.A. & Uçar, O., 2014. Physical Dangers in the Cyber Security and Precautions to be Taken. Proceedings of the 7th International Conference on Security of Information and Networks – SIN ’14, pp.310-317. Available at: http://dl.acm.org.proxy1.athensams.net/citation.cfm?id=2659651.2659731.
  8. Lanzi, A. et al., 2010. AccessMiner: Using System-Centric Models for Malware Protection. Proceedings of the 17th ACM Conference on Computer and Communications Security – CCS ’10, pp.399-412. Available at: http://dl.acm.org/citation.cfm?id=1866353 and http://portal.acm.org/citation.cfm?doid=1866307.1866353.
  9. Nicholas, C. & Brandon, R., 2015. Document Engineering Issues in Document Analysis. Proceedings of the 2015 ACM Symposium on Document Engineering, pp.229-230. Available at: http://doi.acm.org/10.1145/2682571.2801033.
  10. Patanaik, C.K., Barbhuiya, F.A. & Nandi, S., 2012. Obfuscated malware detection using API call dependency. Proceedings of the First International Conference on Security of Internet of Things – SecurIT ’12, pp.185-193. Available at: http://www.scopus.com/inward/record.url?eid=2-s2.0-84879830981&partnerID=tZOtx3y1.
  11. Pluskal, O., 2015. Behavioural Malware Detection Using Efficient SVM Implementation. RACS Proceedings of the 2015 Conference on research in adaptive and convergent systems, pp.296-301.
  12. Santos, I. et al., 2013. Opcode sequences as representation of executables for data-mining-based unknown malware detection. Information Sciences, 231, pp.64-82.
  13. Stewin, P. & Bystrov, I., 2016. Detection of Intrusions and Malware, and Vulnerability Assessment, Available at: http://dblp.uni-trier.de/db/conf/dimva/dimva2012.html#StewinB12.
  14. Willems, G., Holz, T. & Freiling, F., 2007. Toward automated dynamic malware analysis using CWSandbox. IEEE Security and Privacy, 5(2), pp.32-39.
  15. Tabish, S.M., Shafiq, M.Z. & Farooq, M., 2009. Malware detection using statistical analysis of byte-level file content. Proceedings of the ACM SIGKDD Workshop on CyberSecurity and Intelligence Informatics – CSI-KDD ’09, pp.23-31. Available at: http://portal.acm.org/citation.cfm?doid=1599272.1599278.
  16. Lyda, R. & Hamrock, J., 2007. Using Entropy Analysis to Find Encrypted and Packed Malware.
  17. Nataraj, L. et al., 2011. Malware Images: Visualization and Automatic Classification.
  18. Bilar, D., n.d. Statistical Structures: Fingerprinting Malware for Classification and Analysis.
  19. Christodorescu, M., Song, D. & Bryant, R.E., 2005. Semantics-Aware Malware Detection.
  20. Top maliciously used apis, 2017. Available at: https://www.bnxnet.com/top-maliciously-used-apis/.
  21. Weiss, S.M. & Kapouleas, I., 1989. An Empirical Comparison of Pattern Recognition, Neural Nets, and Machine Learning Classification Methods. pp.781-787.
  22. Kotsiantis, S.B., 2007. Supervised Machine Learning: A Review of Classification Techniques. 31, pp.249-268.
  23. Chen, T., n.d. XGBoost: A Scalable Tree Boosting System.
