Logistic regression is a popular machine learning algorithm for classification tasks. Despite its name, it is used to predict a binary outcome, such as whether a customer will churn or not, by modeling the probability of that outcome given a set of features.
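To make the idea of predicting a binary outcome concrete, here is a minimal sketch of the logistic (sigmoid) function that the model applies to a weighted sum of the features. The weights, bias, and feature values are made-up illustrative numbers, not fitted to any data, and the helper names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def positive_class_probability(features, weights, bias):
    """Probability of the positive class for a single example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients for two features (illustrative only)
weights = [0.8, -1.2]
bias = 0.1

p = positive_class_probability([1.5, 0.5], weights, bias)
label = int(p >= 0.5)  # predict the positive class when p is at least 0.5

print(f"probability={p:.3f}, predicted label={label}")
```

Training a logistic regression model amounts to finding the weights and bias that make these probabilities match the observed labels as closely as possible.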
In Python, we can use the LogisticRegression class from the sklearn library to train and test a logistic regression model. Here’s a step-by-step guide on how to do this:
1. Import the necessary libraries:
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
2. Load and prepare the data:
    # Load the data into a Pandas DataFrame
    import pandas as pd
    data = pd.read_csv('data.csv')

    # Split the data into features and target
    X = data.drop('target', axis=1)
    y = data['target']

    # Split the data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
3. Create and fit the model:
    # Create the model
    model = LogisticRegression()

    # Fit the model to the training data
    model.fit(X_train, y_train)
4. Make predictions and evaluate the model:
    # Make predictions on the test data
    y_pred = model.predict(X_test)

    # Evaluate the model using a classification report
    from sklearn.metrics import classification_report
    print(classification_report(y_test, y_pred))
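The four steps above can be run end to end without a data.csv file. The sketch below swaps in synthetic data from make_classification as a stand-in (an assumption, not part of the original walkthrough), raises max_iter as a precaution against convergence warnings, and also shows predict_proba, which returns class probabilities rather than hard labels.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for data.csv (assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create and fit the model; max_iter is raised from its default as a precaution
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Hard class predictions and positive-class probabilities
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

print(classification_report(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))
```

With real data, replace the make_classification call with the pd.read_csv loading code from step 2.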
There are also several hyperparameters that you can tune to improve the performance of the model. Some common ones include the C parameter, which sets the inverse of the regularization strength (smaller values mean stronger regularization), and the solver parameter, which determines the algorithm used to solve the optimization problem.
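One common way to tune these hyperparameters is a cross-validated grid search. The sketch below uses GridSearchCV over a small, arbitrary grid of C and solver values; the grid, the synthetic data, and the choice of 5 folds are illustrative assumptions rather than recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data used only for illustration (assumption)
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Candidate values to try; both solvers support the default L2 penalty
param_grid = {
    "C": [0.01, 0.1, 1.0, 10.0],
    "solver": ["lbfgs", "liblinear"],
}

# Evaluate every combination with 5-fold cross-validation
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

After fitting, search.best_estimator_ holds a model refit on all of the supplied data with the best parameter combination, so it can be used directly for prediction.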