Given a new sample, we denote it by $\mathbf{x} = (1, x_1, \dots, x_n)^T$,
where the first element is the bias term and the others are the feature values.
Binary problem
Consider a binary classification task with a positive class and a negative class.
Denote the weight vector by $\mathbf{w} = (w_0, w_1, \dots, w_n)^T$. Then the score of the new sample is $z = \mathbf{w}^T \mathbf{x}$,
and the probability that the new sample is positive is
$P(y = +1 \mid \mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^T \mathbf{x}}}$.
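The binary prediction above can be sketched in a few lines of plain Python; the weights and sample values below are illustrative placeholders, not fitted values.

```python
import math

def predict_proba_positive(w, x):
    """Probability that sample x is positive under logistic regression.

    w: weight vector (bias weight first);
    x: feature vector with a leading 1 for the bias term.
    """
    z = sum(wi * xi for wi, xi in zip(w, x))  # z = w^T x
    return 1.0 / (1.0 + math.exp(-z))         # sigmoid(z)

w = [0.5, -1.2, 2.0]   # hypothetical fitted weights (bias first)
x = [1.0, 0.8, 0.3]    # new sample: leading 1 multiplies the bias weight
p = predict_proba_positive(w, x)
```

Note that a score of $z = 0$ maps to a probability of exactly 0.5, the usual decision threshold.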
Multiclass problem
Consider a multiclass classification task with classes $1, 2, \dots, K$.
Denote the weight vector of class $k$ by $\mathbf{w}_k$. Then the probability that the new sample belongs to class $k$ is
$P(y = k \mid \mathbf{x}) = \frac{e^{\mathbf{w}_k^T \mathbf{x}}}{\sum_{j=1}^{K} e^{\mathbf{w}_j^T \mathbf{x}}}$.
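The softmax computation above can be sketched as follows; the per-class weight vectors are illustrative placeholders, not fitted values.

```python
import math

def predict_proba_multiclass(W, x):
    """Class probabilities under multinomial logistic regression.

    W: list of per-class weight vectors w_k (bias weight first);
    x: feature vector with a leading 1 for the bias term.
    """
    scores = [sum(wk_i * xi for wk_i, xi in zip(wk, x)) for wk in W]
    m = max(scores)                         # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]        # softmax over the scores w_k^T x

W = [[0.1, 1.0], [0.4, -0.5], [0.0, 0.2]]  # hypothetical weights for K = 3 classes
probs = predict_proba_multiclass(W, [1.0, 2.0])
```

Subtracting the maximum score before exponentiating leaves the probabilities unchanged but avoids overflow for large scores.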
In both cases, $\mathbf{w}$ is a vector containing all weights,
and $C$ is a constant
that determines the strength of regularization.
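Concretely, the regularized objective has roughly the following form (a sketch following sklearn's convention, in which the data term is scaled by the penalty parameter $C$; the exact scaling of the norm term may differ):

$$\min_{\mathbf{w}} \; \tfrac{1}{2}\|\mathbf{w}\|_2^2 \;+\; C \sum_{i=1}^{m} \log\!\left(1 + e^{-y_i \mathbf{w}^T \mathbf{x}_i}\right)$$

with labels $y_i \in \{-1, +1\}$; with an L1 penalty the norm term becomes $\|\mathbf{w}\|_1$. Because $C$ multiplies the data term rather than the penalty, larger $C$ means weaker regularization.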
- penalty_type: the norm used in the regularization term (L1 or L2)
- penalty: inverse of regularization strength (i.e., larger values lead to weaker regularization)
- fit_intercept: whether to use a bias term
- intercept_scaling: scale of the bias term
- solver: learning algorithm used to optimize the loss function
- multi_class: mode for multiclass problems
- ovr: one-vs-rest (a separate binary classifier for each class)
- multinomial: a single classifier trained jointly over all classes
- class_weight: weights associated with the classes
- uniform: every class receives the same weight.
- balanced: class weights are inversely proportional to class frequencies.
Stopping criteria:
- tol: minimum reduction in loss required for optimization to continue.
- max_iter: maximum number of iterations allowed for the learning algorithm to converge.
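The two stopping criteria can be illustrated with a simplified gradient-descent sketch on the unregularized logistic loss; the actual solvers listed above (and their exact stopping rules) differ, and the learning rate and data below are illustrative.

```python
import math

def fit_logreg_gd(X, y, lr=0.1, tol=1e-6, max_iter=1000):
    """Gradient descent on the logistic loss, illustrating tol and max_iter.

    X: samples with a leading 1 for the bias term; y: labels in {0, 1}.
    Returns the weight vector and the number of iterations performed.
    """
    w = [0.0] * len(X[0])
    prev_loss = float("inf")
    for it in range(max_iter):                 # max_iter: hard cap on iterations
        grad = [0.0] * len(w)
        loss = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            loss += -(yi * math.log(p) + (1 - yi) * math.log(1 - p))
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
        if prev_loss - loss < tol:             # tol: minimum loss reduction to continue
            break
        prev_loss = loss
    return w, it + 1
```

Training stops as soon as an iteration reduces the loss by less than tol, or after max_iter iterations, whichever comes first.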
See the documentation listed below for attributes that are available in sklearn but not exposed to the user in this software.
- sklearn tutorial on linear models (including Logistic Regression).
- sklearn LogisticRegression documentation