Phishing Website Detection with Machine Learning
A browser extension and API that classify a URL or website as phishing or legitimate using lexical, host, and content features, and warn users in real time.
How to build it — step by step
- 1. Feature engineering: Extract URL lexical features, domain age and WHOIS data, SSL certificate details, and page-content signals.
- 2. Modelling: Train classifiers on a labelled dataset of phishing and legitimate sites; optimise for high recall on the phishing class.
- 3. Serving: Expose a low-latency API and a browser extension that scores each page on navigation.
- 4. Feedback: Let users report misclassifications, and feed those reports into a retraining loop.
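The feature-engineering step can be sketched with the Python standard library alone. This is a minimal illustration, not a complete feature set: the function names, the chosen features, and the `SUSPICIOUS_WORDS` list are all assumptions made for the example, and host features (WHOIS, domain age) and page-content signals are left out.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Illustrative keyword list (an assumption, not taken from any dataset).
SUSPICIOUS_WORDS = ("login", "verify", "secure", "account", "update")


def shannon_entropy(text: str) -> float:
    """Character-level entropy in bits; random-looking hosts score higher."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def extract_lexical_features(url: str) -> dict:
    """Turn a URL string into a flat dict of numeric lexical features."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        # Raw IPv4 address used in place of a domain name.
        "host_is_ip": int(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
        "uses_https": int(parsed.scheme == "https"),
        "suspicious_words": sum(word in lowered for word in SUSPICIOUS_WORDS),
        "host_entropy": round(shannon_entropy(host), 3),
    }


if __name__ == "__main__":
    print(extract_lexical_features("http://192.168.0.1/login@verify"))
```

Returning a flat dict of numbers keeps the output easy to feed into most tabular classifiers, and lets the same function run both at training time and inside the scoring API.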
Key features to implement
- ✓ URL + content feature extraction
- ✓ Real-time phishing classification
- ✓ Browser-extension warnings
- ✓ High-recall tuning
- ✓ User feedback loop
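One way to approach high-recall tuning is to keep the trained model fixed and move its decision threshold. The sketch below assumes the classifier outputs a phishing probability per URL and that a labelled validation set is available; `pick_threshold`, its parameters, and the sample numbers are hypothetical and used only for illustration.

```python
def pick_threshold(scores, labels, target_recall=0.95):
    """Return (threshold, recall, precision) for the highest threshold
    whose recall on the phishing class (label 1) meets the target.

    scores: predicted phishing probabilities on a validation set.
    labels: ground truth, 1 for phishing and 0 for legitimate.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("validation set needs at least one phishing example")

    # Recall can only grow as the threshold drops, so the first threshold
    # that meets the target while scanning downward is the highest one.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        true_positives = sum(flagged)
        recall = true_positives / positives
        if recall >= target_recall:
            precision = true_positives / len(flagged)
            return threshold, recall, precision

    # Not reached: at the lowest score every example is flagged, so recall is 1.
    raise RuntimeError("no threshold met the target recall")


if __name__ == "__main__":
    val_scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
    val_labels = [1, 1, 0, 1, 0, 0]
    print(pick_threshold(val_scores, val_labels, target_recall=1.0))
```

Returning precision alongside recall makes the cost of the trade-off visible: a lower threshold catches more phishing pages but also warns on more legitimate ones.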
💡 Unique twist to stand out
Add an explainability popup listing the top signals that made a site look like phishing (e.g. lookalike domain, no SSL).
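For a linear model, the popup's "top signals" could be derived from per-feature contributions (weight times feature value). The sketch below assumes that setup; the feature names, weights, and display labels are hypothetical, and a non-linear model would need a different attribution method.

```python
def top_signals(features, weights, display_names, k=3):
    """Return up to k (label, contribution) pairs that pushed the score
    toward 'phishing', largest first.

    features: feature name -> numeric value for the page being scored.
    weights: feature name -> coefficient from a linear model (assumed).
    display_names: feature name -> human-readable text for the popup.
    """
    contributions = {
        name: weights.get(name, 0.0) * value for name, value in features.items()
    }
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    # Keep only signals that raised the phishing score.
    return [
        (display_names.get(name, name), round(score, 3))
        for name, score in ranked[:k]
        if score > 0
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    page_features = {"lookalike_domain": 1, "no_ssl": 1, "domain_age_days": 12, "num_dots": 2}
    model_weights = {"lookalike_domain": 2.0, "no_ssl": 1.5, "domain_age_days": -0.01, "num_dots": 0.2}
    names = {"lookalike_domain": "Lookalike domain", "no_ssl": "No SSL"}
    for label, score in top_signals(page_features, model_weights, names):
        print(f"{label}: +{score}")
```

Filtering out non-positive contributions keeps the popup focused on reasons for the warning rather than on signals that made the site look safer.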
🎓 What you'll learn
Security feature engineering, classification, browser extensions, and explainable warnings.