Day 12 KNN & Clustering



Today I explored two of the simplest, yet surprisingly powerful, machine learning techniques:

🔹 K-Nearest Neighbors (KNN)

🔹 Clustering algorithms like K-Means


🔸 KNN: Like asking your 3 closest neighbors for restaurant recommendations and going with the majority.

  • Doesn’t scale well with large data (lazy learning: nothing is trained up front, so every prediction scans the stored training set)

  • Suffers from the curse of dimensionality

  • Use Case: Real-time classification, stock market forecasting, data pre-processing
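To make the neighbor-vote idea concrete, here is a minimal sketch in plain Python. The toy 2-D points, the labels, and the `knn_predict` helper are my own assumptions for illustration, not a production implementation.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs (hypothetical toy data).
    """
    # Lazy learning: nothing is fitted, so each prediction sorts all of `train`.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up samples: three "benign" points and two "malicious" ones.
train = [
    ((1.0, 1.0), "benign"),
    ((1.2, 0.8), "benign"),
    ((0.9, 1.1), "benign"),
    ((5.0, 5.0), "malicious"),
    ((5.2, 4.9), "malicious"),
]

near_benign = knn_predict(train, (1.1, 1.0))
near_malicious = knn_predict(train, (4.8, 5.1))
print(near_benign, near_malicious)
```

Note that the whole training set is touched on every call, which is where the scaling problem in the first bullet comes from.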


🔸 Clustering: Like sorting socks by color. No names, just similarity.

  • Sensitive to initial conditions and number of clusters

  • K-Means can't handle categorical data directly, because it averages numeric features to find cluster centers

  • Use Case: Grouping similar logs across distributed DBs, customer segmentation, threat pattern discovery
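The sensitivity to initial conditions can be seen with a bare-bones K-Means (Lloyd's algorithm) sketch. The four points and both sets of starting centroids below are assumptions I picked so the effect is easy to follow; real libraries add smarter seeding and multiple restarts.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm: assign points to the nearest centroid, then recompute means."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids

def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)

# Four corners of a wide rectangle (toy data).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

good = kmeans(points, [(0.0, 0.0), (4.0, 1.0)])  # splits left vs right
bad = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])   # gets stuck on bottom vs top

print(inertia(points, good), inertia(points, bad))
```

Same data, same number of clusters, but the second start settles into a much worse grouping, which is why the choice of starting centroids matters.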


🧠 Security Relevance

Both are intuitive, interpretable, and widely used in cybersecurity for anomaly detection, threat grouping, and log clustering. But when nearness = trust, it opens the door to subtle and dangerous manipulations 👇


🔍 Security Lens

⚠️ Evasion via Distance Manipulation (KNN)

Attackers can subtly modify malicious inputs to appear close to benign ones, bypassing detection.

💡 Example: Slightly altered malware that lives in the "neighborhood" of clean files.
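A rough sketch of how such an evasion could look against a toy KNN classifier. Everything here is assumed for illustration: the 2-D feature space, the sample points, and the `evade` helper, which simply walks a flagged sample toward the nearest benign point until the vote flips. Real attacks also have to keep the malware functional, which this ignores.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def evade(train, sample, k=3, step=0.05, max_steps=200):
    """Nudge `sample` toward the nearest benign point until KNN labels it benign.

    Returns (adversarial_point, steps_taken), or (None, max_steps) on failure.
    """
    target = min((x for x, y in train if y == "benign"),
                 key=lambda x: math.dist(x, sample))
    current = tuple(sample)
    for n in range(max_steps + 1):
        if knn_predict(train, current, k) == "benign":
            return current, n
        gap = math.dist(current, target)
        if gap == 0:
            break
        scale = min(step, gap) / gap  # never overshoot the target
        current = tuple(c + scale * (t - c) for c, t in zip(current, target))
    return None, max_steps

# Hypothetical feature vectors for clean and malicious files.
train = [
    ((1.0, 1.0), "benign"), ((1.2, 0.9), "benign"), ((0.9, 1.2), "benign"),
    ((3.0, 3.0), "malicious"), ((3.1, 2.8), "malicious"), ((2.8, 3.1), "malicious"),
]
sample = (2.9, 2.9)

adversarial, steps = evade(train, sample)
print(knn_predict(train, sample), "->", knn_predict(train, adversarial), "after", steps, "steps")
```

The attacker only needs the classifier's answers to steer, which is what makes "nearness = trust" fragile.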


⚠️ Cluster Poisoning Attacks

In unsupervised setups, adversaries inject crafted data to shift cluster centers or distort groupings.

💡 Example: Fake logs or reviews injected to confuse anomaly detectors.
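A simplified sketch of the centroid-shifting idea, assuming a one-cluster anomaly detector that flags anything farther than a fixed threshold from the mean of the "normal" logs. The feature values, the threshold, and the helper names are made up for illustration.

```python
import math

def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def is_anomaly(point, center, threshold):
    """Flag points farther than `threshold` from the cluster center."""
    return math.dist(point, center) > threshold

THRESHOLD = 6.0  # assumed detection radius

# Hypothetical log features, e.g. (requests per minute, error count).
clean_logs = [(9.0, 1.0), (10.0, 1.0), (11.0, 1.0), (10.0, 2.0), (10.0, 0.0)]
attack = (18.0, 1.0)

center_before = centroid(clean_logs)
flagged_before = is_anomaly(attack, center_before, THRESHOLD)

# Crafted entries that each look normal on their own but drag the center over.
poison = [(15.0, 1.0)] * 5
center_after = centroid(clean_logs + poison)
flagged_after = is_anomaly(attack, center_after, THRESHOLD)

print(center_before, flagged_before)
print(center_after, flagged_after)
```

Each injected entry sits inside the threshold, so it passes the same check it is quietly weakening.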


⚠️ Model Extraction Risks

KNN-based systems are query-heavy and memory-based: the stored training data is the model. Attackers who know the distance metric can use query responses to reconstruct that training data.

💡 Example: API misuse to reverse-engineer sensitive training sets.


📚 Key References

  • Jagielski et al. (2018): Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning

  • Tramèr et al. (2016): Stealing Machine Learning Models via Prediction APIs


💬 Discussion Prompt

Have you ever used clustering for log analysis or threat detection? What was your biggest challenge?


📅 Coming Up

Naive Bayes, and how its “strong independence” assumption becomes an adversary’s playground 🎯


🔗 Missed Day 11?

Catch up here: https://lnkd.in/g3EwkEQA


#100DaysOfAISec - Day 12 Post #AISecurity #MLSecurity #MachineLearningSecurity #KNN #Clustering #CyberSecurity #AIPrivacy #AdversarialML #LearningInPublic #100DaysChallenge #ArifLearnsAI #LinkedInTech
