Case Study Based on Cyber Crime and Confusion Matrix

Lalita Sharma
9 min read · Jun 6, 2021

โ†’ ๐ถ๐‘ฆ๐‘๐‘’๐‘Ÿ ๐ถ๐‘Ÿ๐‘–๐‘š๐‘’ ๐‘–๐‘  ๐‘กโ„Ž๐‘’ ๐‘ค๐‘Ž๐‘ฆ ๐‘ก๐‘œ ๐‘—๐‘Ž๐‘–๐‘™, ๐ถ๐‘ฆ๐‘๐‘’๐‘Ÿ ๐‘†๐‘’๐‘๐‘ข๐‘Ÿ๐‘–๐‘ก๐‘ฆ ๐‘–๐‘  ๐‘กโ„Ž๐‘’ ๐‘ค๐‘Ž๐‘ฆ ๐‘ก๐‘œ ๐‘Ž๐‘ฃ๐‘Ž๐‘–๐‘™ โ†

“Can we secure the world from a Bloodless War?”

I am talking about CYBER SECURITY. Nowadays, it has become exceedingly difficult to ensure the security of our systems and of the corporate and personal data they hold. For every lock, there is someone out there trying to pick it or break in.

CYBER-SECURITY

Cyber security is the application of technologies, processes and controls to protect systems, networks, programs, devices and data from cyber attacks. It aims to reduce the risk of cyber attacks and protect against the unauthorised exploitation of systems, networks and technologies.

CYBER-ATTACK

A cyber attack is an assault launched by cybercriminals using one or more computers against one or more computers or networks. A cyber attack can maliciously disable computers, steal data, or use a breached computer as a launch point for other attacks. Cybercriminals use a variety of methods to launch a cyber attack, including malware, phishing, ransomware, and denial of service.

CYBER-CRIME

A crime committed using a network or a computer is known as cybercrime. Cybercriminals either use the computer as a tool to commit the crime or make the computer itself the target of the crime. Early on, most cybercriminals were based in America, since America’s adoption of computers was faster than that of any other country, but now no place is devoid of cybercriminals.

Examples of the different types of cybercrime:

  • Email and internet fraud.
  • Identity fraud.
  • Theft of financial or card payment data.
  • Theft and sale of corporate data.
  • Cyberextortion (demanding money to prevent a threatened attack).
  • Ransomware attacks (a type of cyberextortion).
  • Cryptojacking (where hackers mine cryptocurrency using resources they do not own).
  • Cyberespionage (where hackers access government or company data).

These examples are enough to raise alarms everywhere. They warn us to be cautious and alert, especially about suspicious emails. Money laundering through email has become very common. Let us not assume that we are safe behind some protective shield.

Don’t trust emails or text messages too easily. First verify the sender, and only then take the step you intend. Also, do not trust anonymous or pretentious emails or phone calls, because you never know in what form cybercriminals are knocking at your door.

๐ถ๐ด๐‘†๐ธ-๐‘†๐‘‡๐‘ˆ๐ท๐‘Œ: ๐ถ๐‘ฆ๐‘๐‘’๐‘Ÿ ๐ด๐‘ก๐‘ก๐‘Ž๐‘๐‘˜ ๐‘œ๐‘› ๐ถ๐‘œ๐‘ ๐‘š๐‘œ๐‘  ๐ต๐‘Ž๐‘›๐‘˜

In August 2018, the Pune branch of Cosmos Bank was drained of Rs 94 crore in an extremely bold cyber attack. By hacking into the main server, the thieves were able to transfer money to a bank in Hong Kong. They also made their way into the ATM server to obtain the details of various VISA and RuPay debit cards.
The switching system, i.e. the link between the centralized system and the payment gateway, was attacked, so neither the bank nor the account holders caught wind of the money being transferred.
According to the cybercrime case study, a total of 14,000 transactions were carried out internationally, spanning 28 countries and using 450 cards. Nationally, 2,800 transactions were carried out using 400 cards.
This attack was the first of its kind: the first malware attack that stopped all communication between the bank and the payment gateway.

Email Classification: spam vs. important

Let’s take the case of the email classification problem. The goal is to classify incoming emails into two classes: spam vs. useful (“normal”) email. For that, we use the Spambase Data Set provided by the UCI Machine Learning Repository. This dataset contains 4601 emails described through 57 features, such as text length and the presence of specific words like “buy”, “subscribe”, and “win”. The “Spam” column provides two possible labels for the emails: “spam” and “normal”.

The figure below shows a workflow that covers the steps to build a classification model: reading and preprocessing the data, partitioning it into a training set and a test set, training the model, making predictions with the model, and evaluating the prediction results.

The workflow shown below can be downloaded from the KNIME Hub page and from the EXAMPLES Server under: 04_Analytics / 10_Scoring / 01_Evaluating_Classification_Model_Performance.

Workflow building, applying and evaluating a supervised classification model: data reading and preprocessing, partitioning, model training, prediction, and model evaluation. This workflow predicts whether emails are “spam” or “normal”.
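As a rough illustration of the same steps outside KNIME, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the eight toy emails, the keyword rule that stands in for a trained model, and the function names are hypothetical and are not part of the KNIME workflow or the Spambase data.

```python
from collections import Counter
import random

# Hypothetical toy data: (email text, actual label). Not taken from Spambase.
EMAILS = [
    ("win a free prize now", "spam"),
    ("buy cheap watches, subscribe today", "spam"),
    ("meeting moved to 3 pm", "normal"),
    ("please review the attached report", "normal"),
    ("subscribe and win big", "spam"),
    ("lunch tomorrow?", "normal"),
    ("buy one get one free", "spam"),
    ("notes from today's call", "normal"),
]
KEYWORDS = ("buy", "subscribe", "win")


def partition(rows, train_fraction=0.5, seed=0):
    """Shuffle the rows and split them into a training set and a test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def train(train_rows):
    """Toy 'training': keep the keywords seen in more spam than normal emails."""
    seen = Counter()
    for text, label in train_rows:
        for word in KEYWORDS:
            if word in text.lower().split():
                seen[word] += 1 if label == "spam" else -1
    return {word for word, score in seen.items() if score > 0}


def predict(model, text):
    """Predict 'spam' if the email contains any keyword the model kept."""
    return "spam" if model & set(text.lower().split()) else "normal"


def evaluate(model, test_rows):
    """Count (actual, predicted) pairs; these counts fill the confusion matrix."""
    return Counter((label, predict(model, text)) for text, label in test_rows)


train_rows, test_rows = partition(EMAILS)
model = train(train_rows)
counts = evaluate(model, test_rows)
for (actual, predicted), n in sorted(counts.items()):
    print(f"actual={actual:<6} predicted={predicted:<6} count={n}")
```

The keyword rule is deliberately simplistic. The point is only the order of the steps: partition, train on one part, predict on the other, then count matches and mismatches.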

The last step in building a classification model is model scoring, which is based on comparing the actual and predicted target column values in the test set. The whole scoring process of a model consists of a match count: how many data rows have been correctly classified and how many data rows have been incorrectly classified by the model. These counts are summarized in the confusion matrix.

In the email classification example we need to answer several different questions:

  • How many of the actual spam emails were predicted as spam?
  • How many of the actual spam emails were predicted as normal?
  • Were some normal emails predicted as spam?
  • How many normal emails were predicted correctly?

These numbers are shown in the confusion matrix, and the class statistics are calculated on top of it. Both are displayed in the interactive view of the Scorer (JavaScript) node, as shown in the figure below.

Confusion matrix and class statistics in the interactive view of the Scorer (JavaScript) node.
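To make the four questions concrete, the sketch below tallies them from two lists of labels. The ten label pairs are invented for illustration and are not the output of the Scorer node.

```python
from collections import Counter

# Hypothetical test-set labels, invented for illustration.
actual = ["spam", "spam", "spam", "spam", "normal",
          "normal", "normal", "normal", "normal", "normal"]
predicted = ["spam", "spam", "spam", "normal", "spam",
             "normal", "normal", "normal", "normal", "normal"]

# Each (actual, predicted) pair lands in exactly one cell of the matrix.
pairs = Counter(zip(actual, predicted))

spam_as_spam = pairs[("spam", "spam")]          # spam predicted as spam
spam_as_normal = pairs[("spam", "normal")]      # spam predicted as normal
normal_as_spam = pairs[("normal", "spam")]      # normal predicted as spam
normal_as_normal = pairs[("normal", "normal")]  # normal predicted correctly

print(spam_as_spam, spam_as_normal, normal_as_spam, normal_as_normal)  # 3 1 1 5
```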

What is the Confusion Matrix?

The CONFUSION MATRIX is a useful tool for classification tasks in machine learning, with the primary objective of visualizing the performance of a machine learning model.

In a binary classification setting where the negative class is 0 and the positive class is 1, the confusion matrix is constructed with a 2x2 grid table where the rows are the actual outputs of the data, and the columns are the predicted values from the model.

Figure 1: A visual representation of the confusion matrix

How to interpret a Confusion Matrix

I will start by describing some terminology. If you can work out where each term goes in the matrix, you are halfway there:

True Positives: The model predicted positive and the label was actually positive.

True Negatives: The model predicted negative and the label was actually negative.

False Positives: The model predicted positive and the label was actually negative. I like to think of this as falsely classified as positive.

False Negatives: The model predicted negative and the label was actually positive. I like to think of this as falsely classified as negative.

Figure 2: Confusion Matrix
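The layout described above (rows are the actual classes, columns are the predicted classes, negative class 0 and positive class 1) can be sketched as a nested list. The two label lists are made up for illustration, and the helper name confusion_matrix is just a convenient choice here.

```python
def confusion_matrix(actual, predicted):
    """Return a 2x2 grid where grid[a][p] counts rows with actual class a
    and predicted class p (0 = negative, 1 = positive)."""
    grid = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        grid[a][p] += 1
    return grid


# Hypothetical labels, invented for illustration.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

grid = confusion_matrix(actual, predicted)
tn, fp = grid[0]  # actual negative row: true negatives, false positives
fn, tp = grid[1]  # actual positive row: false negatives, true positives

print(grid)            # [[3, 1], [1, 3]]
print(tp, tn, fp, fn)  # 3 3 1 1
```

With this layout, reading the grid row by row gives TN, FP, FN, TP. scikit-learn's confusion_matrix uses the same ordering for the labels 0 and 1.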

Confusion Matrix example

Let’s look at how a confusion matrix could be used in a business context. The following matrix represents the results of a model predicting if a customer will purchase an item after receiving a coupon. The training data consists of 1000 observations.

The matrix shows us that 525 customers actually made a purchase, while 475 did not. The model predicted, however, that 600 would make a purchase and 400 would not. So, is this a good model? Let’s figure that out by calculating important classification metrics from the confusion matrix.
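The four cells of the matrix can be recovered from these totals together with the 450 true positives quoted in the accuracy calculation that follows. A small sketch of that arithmetic, with variable names chosen here for illustration:

```python
total = 1000
actual_positive = 525     # customers who actually made a purchase
predicted_positive = 600  # customers the model predicted would purchase
tp = 450                  # true positives, as used in the accuracy calculation

fn = actual_positive - tp     # buyers the model missed
fp = predicted_positive - tp  # non-buyers predicted as buyers
tn = total - tp - fn - fp     # non-buyers predicted correctly

print(f"{'':<18}{'pred. purchase':>16}{'pred. no purchase':>20}")
print(f"{'actual purchase':<18}{tp:>16}{fn:>20}")
print(f"{'actual no purchase':<18}{fp:>16}{tn:>20}")
```

This gives TP = 450, FN = 75, FP = 150 and TN = 325, which matches the row totals (525 and 475) and column totals (600 and 400) above.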

Accuracy

In the simplest terms, accuracy tells you generally how “right” your predictions are. It is the sum of true positives and true negatives divided by the total population. In this case that’s 0.775 ((450+325)/1000), meaning that 77.5% of the model’s predictions were correct.
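As a quick check of that number, here is a minimal sketch. The function name is an illustrative choice, and the false positive and false negative counts (150 and 75) follow from the totals given for the example.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


print(accuracy(tp=450, tn=325, fp=150, fn=75))  # 0.775
```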

Is that good or bad? It depends. One issue with accuracy is that it treats positive and negative errors as equally costly. We know that’s not always the case. Depending on the problem you’re trying to solve, you may care more about minimizing false positives than false negatives, or vice versa.
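A hypothetical extreme case shows why accuracy alone can mislead. The data below is invented for illustration: only 10 of 1000 observations are positive, and the "model" always predicts the negative class.

```python
# Hypothetical imbalanced test set: 10 positives, 990 negatives.
actual = [1] * 10 + [0] * 990
predicted = [0] * 1000  # a "model" that always predicts the negative class

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# How many of the real positives did the model catch?
caught = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
share_caught = caught / actual.count(1)

print(accuracy, share_caught)  # 0.99 0.0
```

The accuracy is 99%, yet every single positive case is missed. All of the errors are false negatives, which accuracy does not distinguish from false positives.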

The next measures help you deal with this issue.

๐‘บ๐’†๐’๐’”๐’Š๐’•๐’Š๐’—๐’Š๐’•๐’š

Also known as recall or the true positive rate, sensitivity tells you how often the model chooses the positive class when the observation is in fact in the positive class. It is calculated by dividing the number of true positives in the matrix by the total number of real positives in the data.

In our example, sensitivity is 0.857 (450/525), meaning that the model correctly identifies 85.7% of the customers who actually made a purchase.
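A minimal sketch of the calculation, using the example's counts (450 true positives and 525 - 450 = 75 false negatives). Returning 0.0 when there are no real positives is a convention assumed here, not something the article prescribes.

```python
def sensitivity(tp, fn):
    """True positive rate (recall): TP / (TP + FN)."""
    real_positives = tp + fn
    return tp / real_positives if real_positives else 0.0


value = sensitivity(tp=450, fn=75)
print(round(value, 3))      # 0.857
print(round(1 - value, 3))  # 0.143, the share of actual buyers the model missed
```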

Precision

Precision measures how often a model is correct when it predicts the positive class. It is calculated by dividing the number of true positives in the matrix by the total number of predicted positives.

In our example, precision is 0.75 (450/600). In other words, when the model predicted the positive class, it was correct 75% of the time.
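The same kind of sketch works for precision. Note that the denominator is now the predicted positives (true positives plus false positives), not the real positives used for sensitivity. The 150 false positives follow from 600 predicted purchases minus 450 true positives; the zero-denominator fallback is an assumed convention.

```python
def precision(tp, fp):
    """Share of positive predictions that were correct: TP / (TP + FP)."""
    predicted_positives = tp + fp
    return tp / predicted_positives if predicted_positives else 0.0


print(precision(tp=450, fp=150))  # 0.75
```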

๐‘บ๐’‘๐’†๐’„๐’Š๐’‡๐’Š๐’„๐’Š๐’•๐’š

Also known as the true negative rate, specificity measures how often the model chooses the negative class when the observation is in fact in the negative class. It is calculated by dividing the number of true negatives by the total number of real negatives in the data.

The true negative rate in our example is 0.684 (325/(150+325)), meaning that the model correctly identifies 68.4% of the customers who did not make a purchase.
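Specificity can be sketched the same way, and its complement is the false positive rate. The function name and the zero-denominator fallback are illustrative assumptions.

```python
def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    real_negatives = tn + fp
    return tn / real_negatives if real_negatives else 0.0


value = specificity(tn=325, fp=150)
print(round(value, 3))      # 0.684
print(round(1 - value, 3))  # 0.316, the false positive rate
```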

What can you do with classification metrics?

So now that you have all of these metrics, how do you make sense of them? As we said earlier, accuracy alone is often not enough to decide on a model’s value. One useful thing that a confusion matrix allows you to see is the prevalence of false positives and false negatives. If you’re familiar with statistics, you may know them as Type I and Type II errors, respectively.

Indeed, a confusion matrix shows the performance of a classification model: how many positive and negative events are predicted correctly or incorrectly. These counts are the basis for the calculation of more general class statistics. The most commonly used are accuracy, sensitivity (recall), specificity, precision, and the F-measure, which combines precision and recall into a single number.
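The F-measure is not worked through in the coupon example above, so here is a minimal sketch that applies it to the same counts. It assumes the balanced F1 variant, which is the harmonic mean of precision and recall; other weightings exist.

```python
def f1_score(tp, fp, fn):
    """Balanced F-measure: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


print(round(f1_score(tp=450, fp=150, fn=75), 3))  # 0.8
```

With precision at 0.75 and recall at 0.857, the F1 score of 0.8 sits between the two and is pulled toward the lower one.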

Conclusion

The above method predicts and detects cyber-attacks by using both machine-learning algorithms and data from previous cyber-crime cases. The model predicts the characteristics of the people who may be attacked and the methods of attack they may be exposed to. Machine-learning methods have been observed to be successful enough for this task, and the linear SVM is the most successful of them. The success rate in predicting the attacker behind a cyber-attack is around 60%, and other artificial intelligence methods could be tried to increase this ratio. In our approach, we conclude that particular attention should be paid to malware and social engineering attacks. It was also found that the higher the victim’s level of education and income, the lower the probability of a cyber-attack.

Machine learning techniques have proven to be beneficial for the whole security industry. However, the application of machine learning is often limited by the lack of standardized datasets, overfitting issues, the architecture cost, and so on. Therefore, it is important to apply and design new approaches to maintain the benefits of machine learning algorithms while addressing the limitations in practice.

Finally, no more confusion 🙆.

Thank you for reading, and stay tuned for more upcoming technical articles 🌻!!
