vulnerable-and-non-vulnerable-websites (Deprecated)

Classification task using KNN classifier

The repo includes the following:

data.csv - Hyperlinks of various trusted and non-trusted websites

knn.ipynb - A Python notebook that implements the approach

APPROACH

The approach first converts the hyperlinks into vectors using the Word2Vec implementation from gensim. Because the hyperlinks differ in length, the resulting feature matrix is padded with zeros so that every row has the same length. The feature matrix is then fed into the KNN classifier provided by the sklearn package.

ACCURACY

Accuracy is calculated as the percentage of correctly classified instances out of the total number of instances.
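As a small illustration of that definition, the calculation could look like the sketch below. The function name `accuracy_percent` and the sample labels are hypothetical and are not taken from the notebook.

```python
def accuracy_percent(y_true, y_pred):
    """Return the percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one instance is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


# Hypothetical labels: 4 of the 5 predictions are correct.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy_percent(y_true, y_pred))  # prints 80.0
```

sklearn's `accuracy_score` computes the same ratio but returns it as a fraction between 0 and 1 by default, so it would need to be multiplied by 100 to match this definition.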