Confusion Matrix And Its Use In The World Of Cyber Security

Tribhuban Mishra
7 min read · Jun 15, 2021

In this article, we are going to talk about the confusion matrix and the types of error it exposes. We will also see how the confusion matrix plays a vital role in making systems safer in the world of cyber security.

A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

Understanding Confusion Matrix

Once we have the data, and after cleaning, pre-processing, and wrangling it, the first step is to feed it to a model and get its output, often as probabilities. But how can we measure the effectiveness of our model? The more effective the model, the better its performance, and that is what we want. This is where the confusion matrix comes into focus: it is a performance measurement for machine learning classification.

There are multiple ways of measuring error in a machine learning model. The Mean Absolute Error (MAE), used as an error or cost function, helps the model train in the correct direction by pushing the distance between the actual and predicted values towards 0. The error of a single prediction is “y − ŷ”, the actual value minus the predicted value; MAE averages the absolute value of this difference over all data points.

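Mean Squared Error (MSE): The error (actual value minus predicted value) is computed for each point in the data set, each error is squared, and then the mean of the squared errors is taken. Squaring keeps positive and negative errors from cancelling each other out and penalizes large errors more heavily.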
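To make the two error measures concrete, here is a minimal sketch in plain Python. The function names and the four sample values are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def mean_absolute_error(actual, predicted):
    """Average of |y - y_hat| over all samples."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of (y - y_hat)^2 over all samples."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)


# Hypothetical values, not taken from any real data set.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(actual, predicted)  # (1 + 0 + 2 + 1) / 4
mse = mean_squared_error(actual, predicted)   # (1 + 0 + 4 + 1) / 4
print(f"MAE = {mae}, MSE = {mse}")  # MAE = 1.0, MSE = 1.5
```

Note how the single error of size 2 contributes 2 to the MAE sum but 4 to the MSE sum, which is why MSE reacts more strongly to large mistakes.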

In binary classification models, where the output is a class rather than a number, errors are analyzed with the help of a confusion matrix.

A confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. For a binary problem, it is a table with 4 different combinations of predicted and actual values.

It is extremely useful for measuring Recall, Precision, Specificity, Accuracy, and most importantly AUC-ROC curves.

Understanding Confusion Matrix in a simpler manner:

Let’s start with an example confusion matrix for a binary classifier (though it can easily be extended to the case of more than two classes):

What can we learn from this matrix?

● There are two possible predicted classes: “yes” and “no”. If we were predicting the presence of a disease, for example, “yes” would mean they have the disease, and “no” would mean they don’t have the disease.

● The classifier made a total of 165 predictions (e.g., 165 patients were being tested for the presence of that disease).

● Out of those 165 cases, the classifier predicted “yes” 110 times, and “no” 55 times.

● In reality, 105 patients in the sample have the disease, and 60 patients do not.

Let’s now define the most basic terms, which are whole numbers (not rates):

true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.

true negatives (TN): We predicted no, and they don’t have the disease.

false positives (FP): We predicted yes, but they don’t have the disease. (Also known as a “Type I error.”)

false negatives (FN): We predicted no, but they do have the disease. (Also known as a “Type II error.”)

I’ve added these terms to the confusion matrix, and also added the row and column totals:
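As a rough sketch, the four counts and the totals can be reproduced in plain Python. The helper name `confusion_counts` is hypothetical, and the label lists are constructed artificially so that they match the totals quoted in this article (165 cases, 105 actual "yes", 110 predicted "yes"); they are not real patient data.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Tally TP, TN, FP, FN for a binary problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted yes, actually yes
        elif p != positive and a != positive:
            tn += 1  # predicted no, actually no
        elif p == positive:
            fp += 1  # predicted yes, actually no (Type I error)
        else:
            fn += 1  # predicted no, actually yes (Type II error)
    return tp, tn, fp, fn


# Artificial labels built to match the example's totals.
actual = ["yes"] * 100 + ["no"] * 50 + ["no"] * 10 + ["yes"] * 5
predicted = ["yes"] * 100 + ["no"] * 50 + ["yes"] * 10 + ["no"] * 5

tp, tn, fp, fn = confusion_counts(actual, predicted)

# Rows are actual classes, columns are predicted classes.
print(f"{'n=' + str(tp + tn + fp + fn):<12}{'Pred: no':>10}{'Pred: yes':>11}{'Total':>8}")
print(f"{'Actual: no':<12}{'TN=' + str(tn):>10}{'FP=' + str(fp):>11}{tn + fp:>8}")
print(f"{'Actual: yes':<12}{'FN=' + str(fn):>10}{'TP=' + str(tp):>11}{fn + tp:>8}")
print(f"{'Total':<12}{tn + fn:>10}{fp + tp:>11}")
```

The row totals give the actual class sizes (60 "no", 105 "yes") and the column totals give the predicted class sizes (55 "no", 110 "yes").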

This is a list of rates that are often computed from a confusion matrix for a binary classifier:

Accuracy: Overall, how often is the classifier correct?

o (TP+TN)/total = (100+50)/165 = 0.91

Misclassification Rate: Overall, how often is it wrong?

o (FP+FN)/total = (10+5)/165 = 0.09

o equivalent to 1 minus Accuracy

o also known as “Error Rate”

True Positive Rate: When it’s actually yes, how often does it predict yes?

o TP/actual yes = 100/105 = 0.95

o also known as “Sensitivity” or “Recall”

False Positive Rate: When it’s actually no, how often does it predict yes?

o FP/actual no = 10/60 = 0.17

True Negative Rate: When it’s actually no, how often does it predict no?

o TN/actual no = 50/60 = 0.83

o equivalent to 1 minus False Positive Rate

o also known as “Specificity”

Precision: When it predicts yes, how often is it correct?

o TP/predicted yes = 100/110 = 0.91

Prevalence: How often does the yes condition occur in our sample?

o actual yes/total = 105/165 = 0.64
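The rates listed above can be recomputed from the four counts with a few lines of Python. This is only a sketch; the dictionary keys are names picked for illustration.

```python
# Counts from the example confusion matrix.
tp, tn, fp, fn = 100, 50, 10, 5
total = tp + tn + fp + fn  # 165

rates = {
    "accuracy": (tp + tn) / total,
    "misclassification_rate": (fp + fn) / total,
    "true_positive_rate": tp / (tp + fn),   # sensitivity / recall
    "false_positive_rate": fp / (fp + tn),
    "true_negative_rate": tn / (tn + fp),   # specificity
    "precision": tp / (tp + fp),
    "prevalence": (tp + fn) / total,
}

for name, value in rates.items():
    print(f"{name:<24}{value:.2f}")
```

Rounded to two decimals, the values agree with the figures in the list: 0.91, 0.09, 0.95, 0.17, 0.83, 0.91, and 0.64.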

Is it necessary to check for recall (or) precision if you already have a high accuracy?

We cannot rely on accuracy alone in classification when the classes are imbalanced. For example, suppose we have a dataset of 100 patients in which 5 have diabetes and 95 are healthy. If our model only ever predicts the majority class, i.e. that all 100 people are healthy, it reaches a classification accuracy of 95% even though it fails to identify a single diabetic patient.
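The diabetes example can be checked directly. The sketch below builds the 100 labels described in the text and a hypothetical model that always answers "healthy", then compares accuracy with recall for the diabetic class.

```python
# 5 diabetic patients and 95 healthy ones, as in the example.
actual = ["diabetic"] * 5 + ["healthy"] * 95
# A model that always predicts the majority class.
predicted = ["healthy"] * 100

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Treat "diabetic" as the positive class.
tp = sum(a == "diabetic" and p == "diabetic" for a, p in zip(actual, predicted))
fn = sum(a == "diabetic" and p == "healthy" for a, p in zip(actual, predicted))
recall = tp / (tp + fn)

print(f"accuracy = {accuracy:.2f}, recall = {recall:.2f}")  # accuracy = 0.95, recall = 0.00
```

Accuracy looks excellent while recall is zero, which is exactly the failure that accuracy alone hides. Precision is not even defined here, because the model never predicts the positive class.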

When to use Accuracy / Precision / Recall / F1-Score?

● Accuracy is used when the True Positives and True Negatives are more important. Accuracy is a better metric for Balanced Data.

● Whenever False Positives are the more costly error, use Precision.

● Whenever False Negatives are the more costly error, use Recall.

● F1-Score is used when both False Negatives and False Positives are important. F1-Score is a better metric for Imbalanced Data.
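The article does not spell out how F1-Score is calculated. The sketch below assumes the standard definition, the harmonic mean of precision and recall, and applies it to the counts from the example matrix. The function name `f1_score` and the choice to return 0.0 when there are no true positives are assumptions made for this illustration.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    if tp == 0:
        # Assumed convention: with no true positives, report 0.0
        # instead of dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts from the example confusion matrix.
f1 = f1_score(tp=100, fp=10, fn=5)
print(f"F1 = {f1:.2f}")  # F1 = 0.93
```

Because the harmonic mean is pulled towards the smaller of the two values, a model cannot get a high F1-Score by doing well on precision alone or on recall alone.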

Why do we need a Confusion matrix?

Here are the pros/benefits of using a confusion matrix.

● It shows how any classification model is confused when it makes predictions.

● The confusion matrix not only gives you insight into the errors being made by your classifier but also the types of errors that are being made.

● This breakdown helps you to overcome the limitation of using classification accuracy alone.

● Every column of the confusion matrix represents the instances of that predicted class.

● Each row of the confusion matrix represents the instances of the actual class.

● It provides insight not only into how many errors a classifier makes but also into which kinds of errors they are.
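The row and column convention described above carries over to more than two classes. Here is a minimal sketch with three hypothetical classes "a", "b", and "c"; the helper name `confusion_matrix` and the labels are made up for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels for eight samples.
labels = ["a", "b", "c"]
actual = ["a", "a", "a", "b", "b", "c", "c", "c"]
predicted = ["a", "a", "b", "b", "c", "c", "c", "a"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
# a [2, 1, 0]
# b [0, 1, 1]
# c [1, 0, 2]
```

The diagonal holds the correct predictions (5 of 8 here), and each off-diagonal cell shows which class was mistaken for which. For example, the 1 in row "a", column "b" is an "a" sample that was predicted as "b".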

Cyber Security and Cyber Crime

Cyber security refers to the body of technologies, processes, and practices designed to protect networks, devices, programs, and data from attack, damage, or unauthorized access. Cyber security may also be referred to as information technology security.

Cybercrime is criminal activity that either targets or uses a computer, a computer network or a networked device. Most, but not all, cybercrime is committed by cybercriminals or hackers who want to make money. Cybercrime is carried out by individuals or organizations. Some cybercriminals are organized, use advanced techniques and are highly technically skilled. Others are novice hackers. Rarely, cybercrime aims to damage computers for reasons other than profit. These could be political or personal.

Cyber Attack Detection and Classification Using Parallel Support Vector Machine

Cyber-attacks are becoming a critical issue for organizational information systems. Several cyber-attack detection and classification methods have been introduced, with different levels of success, as countermeasures to preserve data integrity and system availability against attacks. The classification of attacks against computer networks is becoming a harder problem to solve in the field of network security.

The rapid increase in connectivity and accessibility of computer systems has resulted in more frequent opportunities for cyber-attacks. Attacks on computer infrastructures are becoming an increasingly serious problem. Cyber-attack detection is a classification problem, in which we distinguish the normal pattern of the system from the abnormal pattern (attack). The subset selection decision fusion method plays a key role in cyber-attack detection. It has been shown that redundant and/or irrelevant features may severely affect the accuracy of learning algorithms. The SDF is a very powerful and popular data mining algorithm for decision-making and classification problems. It has been used in many real-life applications such as medical diagnosis, radar signal classification, weather prediction, credit approval, and fraud detection.
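Since attack detection is a classification problem, a detector can be evaluated with the same confusion matrix. The sketch below is purely illustrative: the ten connection labels are invented, and it does not implement the SDF or any support vector machine. With "attack" as the positive class, a false negative is a missed attack and a false positive is a false alarm.

```python
# Hypothetical ground truth and detector output for ten network connections.
actual = ["attack", "normal", "normal", "attack", "normal",
          "attack", "normal", "normal", "attack", "normal"]
predicted = ["attack", "normal", "attack", "normal", "normal",
             "attack", "normal", "normal", "attack", "normal"]

pairs = list(zip(actual, predicted))
tp = pairs.count(("attack", "attack"))  # attacks correctly flagged
fn = pairs.count(("attack", "normal"))  # missed attacks
fp = pairs.count(("normal", "attack"))  # false alarms
tn = pairs.count(("normal", "normal"))  # normal traffic correctly passed

detection_rate = tp / (tp + fn)    # recall on the attack class
false_alarm_rate = fp / (fp + tn)  # false positive rate

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"detection rate = {detection_rate:.2f}, false alarm rate = {false_alarm_rate:.2f}")
```

In a security setting the two errors have very different costs: a missed attack can mean a breach, while too many false alarms waste analyst time. Looking at FN and FP separately is what makes that trade-off visible.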

Conclusion

In the present world, cybercrime offenses are happening at an alarming rate. As the use of the Internet increases, many offenders use it as a means of communication in order to commit crimes. The framework developed in our work is essential to the creation of a model that can support analytics regarding the identification, detection, and classification of integrated cybercrime offenses (structured and unstructured). The main focus of our work is to find the attacks that take advantage of security vulnerabilities and to analyze these attacks using machine learning techniques. The aim is that the developed framework will provide essential broad knowledge of cybercrime offenses in society, enable people to consider the threat landscape of such attacks, and help prevent cybercrime offenses from occurring.

From the results, it is evident that the developed framework reduces the time consumed and the manual reporting effort. It helps to identify the number of cases filed, both incident-wise and area-wise. This report is useful for predicting cases and for taking precautionary steps against cybercrime in the hot-spot places identified.

Thanks for giving your valuable time.
