A Machine Learning Framework for Training Models from CSV Files
This project is a machine learning framework for loading datasets, preprocessing data, training models, and making predictions. It uses the Random Forest Classifier from scikit-learn for classification tasks.
To set up the project, ensure you have Python installed, then install the required packages using pip:
pip install pandas scikit-learn
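To confirm that both packages installed correctly, a quick check along these lines may help. It is only a sketch: it imports the two libraries and prints their versions, and it makes no assumption about which versions the framework requires, since none are stated here.

```python
# Sanity check: both imports should succeed after the pip install above.
import pandas
import sklearn  # scikit-learn is imported under the name "sklearn"

print("pandas", pandas.__version__)
print("scikit-learn", sklearn.__version__)
```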
-
Import the framework:
from primera import MLFramework
-
Create an instance of the framework:
ml = MLFramework('path_to_your_dataset.csv')
-
Load the data:
data = ml.load_data()
-
Preprocess the data:
processed_data = ml.preprocess_data()
-
Select features and target:
X, y = ml.feature_selection('target_column_name')
-
Train the model:
ml.train_model(X, y)
-
Save the model:
ml.save_model('model_filename.pkl')
-
Load the model:
ml.load_model('model_filename.pkl')
-
Make predictions:
predictions = ml.predict(X)
-
Evaluate the model:
accuracy = ml.evaluate_model(X, y)
print(f'Accuracy: {accuracy}')
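The steps above train and evaluate on the same X and y, which tends to overstate accuracy. As a rough sketch, the equivalent workflow in plain pandas and scikit-learn with a held-out test split could look like the following. This is not MLFramework's actual implementation, which is not shown here; the synthetic data, the column name target, and the hyperparameters are all assumptions for illustration.

```python
import pickle

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a CSV dataset; "target" is a hypothetical column name.
features, labels = make_classification(n_samples=200, n_features=5, random_state=0)
df = pd.DataFrame(features, columns=[f"f{i}" for i in range(5)])
df["target"] = labels

# Separate the feature columns from the target column.
X = df.drop(columns=["target"])
y = df["target"]

# Hold out 25% of the rows so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Train a Random Forest on the training split only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predict and score on the held-out split.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Held-out accuracy: {accuracy:.3f}")

# Round-trip the fitted model through pickle, as a save/load step might do.
restored = pickle.loads(pickle.dumps(model))
print("Restored model agrees:", bool((restored.predict(X_test) == predictions).all()))
```

With the framework itself, the same idea would amount to splitting X and y before calling train_model and passing only the held-out part to evaluate_model.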
This project uses datasets in CSV format. Ensure that your dataset has a header row and that the target column name you pass to feature_selection appears in it.
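One way to check a CSV before handing it to the framework is sketched below. The helper find_dataset_problems is hypothetical and not part of this project, and the sample columns are made up. The checks reflect what a scikit-learn Random Forest typically needs (a target column and numeric features, with missing values handled); whether preprocess_data already covers some of these cases is not stated here.

```python
import io

import pandas as pd


def find_dataset_problems(df, target_column):
    """Return a list of human-readable problems found in the dataset."""
    problems = []
    # The target column must exist for feature/target separation to work.
    if target_column not in df.columns:
        problems.append(f"missing target column: {target_column!r}")
    # Count missing cells across the whole table.
    missing = int(df.isna().sum().sum())
    if missing:
        problems.append(f"{missing} missing value(s)")
    # Text columns usually need encoding before tree-based training.
    features = df.drop(columns=[target_column], errors="ignore")
    non_numeric = list(features.select_dtypes(exclude="number").columns)
    if non_numeric:
        problems.append(f"non-numeric feature columns: {non_numeric}")
    return problems


# Made-up sample data standing in for a real CSV file.
sample_csv = io.StringIO(
    "age,income,city,label\n"
    "34,52000,Lyon,1\n"
    "29,,Paris,0\n"
    "41,61000,Nice,1\n"
)
sample = pd.read_csv(sample_csv)

for problem in find_dataset_problems(sample, "label"):
    print(problem)
```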
Contributions are welcome! Please submit a pull request or open an issue for any suggestions or improvements.
This project is licensed under the MIT License.