- Given a set of data points of the form $(x_i, y_i)$, this method works by minimizing the expression $J = \sum_i (y_i - (m x_i + c))^2$, known as the least squares error.
- Taking partial derivatives with respect to $m$ and $c$ and setting them to 0, we get
  $$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - (\sum x_i)^2}, \qquad c = \frac{\sum y_i - m \sum x_i}{n}$$
- The program consists of a single file with three functions: `main()`, `cost()`, and `plot()`, which perform the computation above, calculate the least squares error, and plot the graph, respectively.
- For more details on how this project works, please take a look at the file `main.py`.
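The closed-form solution and the error $J$ above can be sketched in plain Python. Note this is an illustrative sketch, not the contents of `main.py`: the helper `fit_line` is a hypothetical name, and only `cost` corresponds to a function mentioned above.

```python
def fit_line(xs, ys):
    """Closed-form least squares fit: returns slope m and intercept c."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Formulas from setting dJ/dm = 0 and dJ/dc = 0
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - m * sx) / n
    return m, c

def cost(xs, ys, m, c):
    """Least squares error J = sum over i of (y_i - (m*x_i + c))^2."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))

# Points lying exactly on y = 2x + 1 recover m = 2, c = 1 with zero error
m, c = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, c, cost([1, 2, 3, 4], [3, 5, 7, 9], m, c))
```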
Linear regression is one of the most fundamental and widely used techniques in data science and machine learning. Even though this project is a basic implementation, it demonstrates the core idea behind many real-world applications, such as:
- Predicting trends — e.g., stock prices, sales forecasts, or temperature changes
- Analyzing relationships — e.g., how study time affects test scores
- Making data-driven decisions — by modeling how one variable influences another
- Serving as a foundation — for more advanced models like logistic regression, regularized regression, or neural networks
Understanding linear regression from scratch helps build an intuition for how machine learning models learn patterns in data. This simple Python program is a step toward that understanding.