This is an interactive series on machine learning where every algorithm is built from scratch, explained with math, and visualized so you can see exactly what is happening under the hood. The series covers linear and logistic regression; gradient descent with optimizers like SGD, Momentum, and Adam; polynomial regression and the bias-variance tradeoff; regularization with Ridge, Lasso, and Elastic Net; and neural networks, from a single perceptron to multi-layer networks with backpropagation and activation functions. It also includes chapters on loss functions, decision trees, clustering algorithms like K-Means, and more.
Each post starts with the intuition, walks through the math step by step, and then lets you experiment with interactive demos right in your browser. You can drag data points, tune hyperparameters, and watch algorithms train in real time. No prerequisites beyond basic algebra are required. By the end of the series you will have a deep understanding of how machine learning algorithms work from the ground up, and the interactive visualizations will help you build intuition along the way.
See It in Action
Below is a simple example of the kind of interactive visualization you can expect in every post. Switch between algorithms to see how each one tackles the same dataset differently.
Part 1
Foundations of Supervised Learning
Linear Regression
The starting point. Learn how a line fits data and how gradient descent finds the best weights.
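If you want a taste of what "from scratch" means here, below is a minimal NumPy sketch of the idea, with toy data invented for illustration (the post itself builds this up interactively):

```python
import numpy as np

# Toy data: points scattered around y = 2x + 1 (made up for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 50)
y = 2 * x + 1 + rng.normal(0, 0.5, 50)

w, b = 0.0, 0.0   # start with a flat line
lr = 0.01         # learning rate

for _ in range(2000):
    y_hat = w * x + b                       # current predictions
    grad_w = 2 * np.mean((y_hat - y) * x)   # dMSE/dw
    grad_b = 2 * np.mean(y_hat - y)         # dMSE/db
    w -= lr * grad_w                        # step downhill
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")  # close to 2 and 1
```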
Linear Regression - Multivariate
More features, a plane instead of a line. See how each input contributes to the prediction.
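As a quick sketch (the numbers are made up, purely illustrative), the multivariate prediction is just a weighted sum of the features plus a bias:

```python
import numpy as np

# Hypothetical example: predict a house price from three features.
features = np.array([120.0, 3.0, 15.0])  # size, bedrooms, age (made up)
weights  = np.array([2.5, 10.0, -0.8])   # one weight per feature
bias     = 40.0

# Each input contributes weight * value to the final prediction.
prediction = weights @ features + bias
print(prediction)
```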
Logistic Regression
Use the sigmoid to turn a score into a probability. Train a binary classifier and find your first decision boundary.
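For a preview, here is the sigmoid in a few lines (a minimal sketch, not the chapter's full classifier):

```python
import numpy as np

def sigmoid(z):
    """Squash any real-valued score into (0, 1) so it reads as a probability."""
    return 1.0 / (1.0 + np.exp(-z))

# A raw score of 0 sits exactly on the decision boundary:
print(sigmoid(0.0))   # 0.5
print(sigmoid(3.0))   # ~0.95, confidently positive
print(sigmoid(-3.0))  # ~0.05, confidently negative
```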
Logistic Regression - Multivariate
Two features, one boundary. See why cross-entropy works better than MSE for classification.
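One tiny comparison hints at why: cross-entropy punishes a confident wrong answer far harder than MSE does. The numbers below are illustrative:

```python
import numpy as np

def mse(p, y):
    return (p - y) ** 2

def cross_entropy(p, y):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# True label is 1, but the model is confidently wrong (p = 0.01):
print(mse(0.01, 1))            # 0.98  -- a mild penalty
print(cross_entropy(0.01, 1))  # ~4.6  -- a much louder alarm
```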
Optimization & Regularization
Gradient Descent Deep Dive
The engine behind every ML model. Race SGD, Momentum, and Adam on loss surfaces and see which one wins.
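For a rough preview, here are the three update rules sketched on a one-dimensional loss f(x) = x², with illustrative hyperparameters rather than tuned ones:

```python
import numpy as np

def grad(x):
    return 2 * x  # gradient of f(x) = x^2

x_sgd = x_mom = x_adam = 5.0
v = 0.0                  # Momentum's velocity
m, s, t = 0.0, 0.0, 0    # Adam's moment estimates and step counter
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8

for _ in range(100):
    # SGD: step straight down the gradient.
    x_sgd -= lr * grad(x_sgd)

    # Momentum: accumulate velocity so repeated steps build up speed.
    v = beta1 * v + grad(x_mom)
    x_mom -= lr * v

    # Adam: scale each step by running estimates of the gradient's
    # mean and (uncentered) variance.
    t += 1
    g = grad(x_adam)
    m = beta1 * m + (1 - beta1) * g
    s = beta2 * s + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    x_adam -= lr * m_hat / (np.sqrt(s_hat) + eps)

print(x_sgd, x_mom, x_adam)  # all three end up near the minimum at 0
```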
Polynomial Regression & Bias-Variance
Fit curves, not just lines. See what happens when a model is too simple or too complex.
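A quick sketch of the trick (toy data invented for illustration): fitting a polynomial is still linear regression, just on expanded features.

```python
import numpy as np

# Noisy points along a curve (made up for illustration).
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(0, 0.1, 40)

degree = 3
X = np.vander(x, degree + 1)  # feature columns: x^3, x^2, x, 1
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
print(coeffs)  # raise `degree` toward 15 and watch the fit go wiggly
```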
Regularization: Ridge, Lasso & Elastic Net
Penalize large weights to stop overfitting. See why Lasso pushes coefficients to zero but Ridge only shrinks them.
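One way to see the difference in miniature (illustrative numbers, with a soft-thresholding update standing in for Lasso's full solver):

```python
import numpy as np

lam = 0.5
w = np.array([3.0, 0.04, -0.02])

# Ridge adds lam * w^2: its gradient 2 * lam * w shrinks every weight
# proportionally, so weights approach zero but never land exactly on it.
ridge_step = w - 0.1 * (2 * lam * w)

# Lasso adds lam * |w|: its proximal update subtracts a *constant*
# amount, so small weights get clipped exactly to zero.
lasso_step = np.sign(w) * np.maximum(np.abs(w) - 0.1 * lam, 0.0)

print(ridge_step)  # every weight shrunk, none exactly zero
print(lasso_step)  # the two small weights are now exactly 0.0
```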
Neural Networks from Scratch
The Perceptron & MLP
One neuron cannot solve XOR. Add a hidden layer and unlock nonlinear decision boundaries.
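As a preview, here is a hand-wired two-neuron hidden layer solving XOR. The weights are hand-picked for illustration; the chapter shows how to learn them instead:

```python
import numpy as np

def step(z):
    return (z > 0).astype(int)  # a hard-threshold activation

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: two neurons computing OR and AND (hand-picked weights).
h = np.column_stack([
    step(X @ np.array([1, 1]) - 0.5),  # OR:  fires if either input is 1
    step(X @ np.array([1, 1]) - 1.5),  # AND: fires only if both are 1
])

# Output neuron: "OR but not AND" is exactly XOR.
y = step(h @ np.array([1, -2]) - 0.5)
print(y)  # [0, 1, 1, 0] -- a pattern no single neuron can produce
```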
Backpropagation Visualized
The chain rule applied layer by layer. Watch gradients weaken as they travel back through the network.
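A tiny 1-1-1 network makes the mechanics concrete (a minimal sketch with made-up inputs, not the chapter's full derivation):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# A tiny network: x -> sigmoid(w1 * x) -> sigmoid(w2 * h) -> squared loss.
x, target = 1.0, 0.0
w1, w2 = 0.5, 0.5

# Forward pass, saving intermediates for the backward pass.
h = sigmoid(w1 * x)
y = sigmoid(w2 * h)
loss = (y - target) ** 2

# Backward pass: the chain rule, applied one layer at a time.
dloss_dy = 2 * (y - target)
dy_dw2 = y * (1 - y) * h          # sigmoid' times its input
dloss_dw2 = dloss_dy * dy_dw2

dy_dh = y * (1 - y) * w2          # gradient flows back through w2...
dh_dw1 = h * (1 - h) * x          # ...then through the first sigmoid
dloss_dw1 = dloss_dy * dy_dh * dh_dw1

# Each sigmoid derivative is at most 0.25, so the gradient shrinks at
# every layer -- the "weakening" this chapter visualizes.
print(dloss_dw2, dloss_dw1)
```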
Activation Functions
What sits between layers shapes what a network can learn. Compare sigmoid, ReLU, tanh, and GELU side by side.
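For reference, here are the four activations in a few lines each (GELU is shown via its common tanh approximation):

```python
import numpy as np

def sigmoid(z): return 1 / (1 + np.exp(-z))
def relu(z):    return np.maximum(0, z)
def tanh(z):    return np.tanh(z)
def gelu(z):    return 0.5 * z * (1 + np.tanh(np.sqrt(2 / np.pi) * (z + 0.044715 * z**3)))

# Evaluate all four side by side on the same inputs.
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for name, f in [("sigmoid", sigmoid), ("relu", relu), ("tanh", tanh), ("gelu", gelu)]:
    print(f"{name:>8}: {np.round(f(z), 3)}")
```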
Coming Soon