API Reference

Core Classes

UniversalMLPipeline

class universal_ml_framework.UniversalMLPipeline(problem_type='classification', random_state=42, verbose=False, fast_mode=False, tuning_method='random', n_jobs=-1)

Bases: object

Universal ML pipeline for classification and regression

auto_detect_features(df, exclude_columns=None)

Automatically detect feature types

create_preprocessor()

Create the preprocessing pipeline

cross_validate_models()

Cross-validate all models

define_models()

Define models based on problem type

hyperparameter_tuning()

Tune hyperparameters for the best model

load_data(train_path, test_path=None, target_column=None)

Load data from CSV files

make_predictions(save_predictions=True, id_column=None)

Make predictions on test set

prepare_data(custom_features=None)

Prepare data for training

run_pipeline(train_path, target_column, test_path=None, problem_type='classification', exclude_columns=None, custom_features=None, feature_engineering_func=None, verbose=None, fast_mode=None, tuning_method=None, n_jobs=None, id_column=None)

Run complete pipeline

save_model(filename='best_model.pkl')

Save trained model

Helper Functions

Quick Setup Functions

universal_ml_framework.quick_classification_pipeline(train_path, target_column, test_path=None, exclude_columns=None)

Quick setup for a classification problem

universal_ml_framework.quick_regression_pipeline(train_path, target_column, test_path=None, exclude_columns=None)

Quick setup for a regression problem

universal_ml_framework.run_pipeline_with_config(config_name)

Run the pipeline with the selected configuration

universal_ml_framework.list_available_configs()

List all available configurations

Data Generation

DataGenerator

class universal_ml_framework.DataGenerator

Bases: object

Generate synthetic datasets for various ML problems

static generate_all_datasets()

Generate all synthetic datasets

static generate_customer_churn(n_samples=800, save_to_csv=True)

Generate synthetic customer churn dataset

static generate_house_prices(n_samples=1000, save_to_csv=True)

Generate synthetic house prices dataset

static generate_sales_forecasting(n_samples=600, save_to_csv=True)

Generate synthetic sales forecasting dataset

Method Details

Main Pipeline Methods

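UniversalMLPipeline.run_pipeline(train_path, target_column, test_path=None, problem_type='classification', exclude_columns=None, custom_features=None, feature_engineering_func=None, verbose=None, fast_mode=None, tuning_method=None, n_jobs=None, id_column=None)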

Main method to execute the complete ML pipeline.

Parameters:
  • train_path (str) – Path to training CSV file

  • target_column (str) – Name of target column

  • test_path (str) – Path to test CSV file (optional)

  • problem_type (str) – 'classification' or 'regression'

  • exclude_columns (list) – Columns to exclude from features (optional)

  • custom_features (list) – Custom feature list (optional)
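
A typical call might look like the sketch below. The file names (train.csv, test.csv) and column names (churned, customer_id) are hypothetical placeholders, not part of the framework. The import sits inside the function so the snippet can be loaded without the package installed, and because the return value of run_pipeline is not documented here, the sketch simply passes it through.

```python
# Keyword arguments follow the documented run_pipeline signature.
# File and column names below are hypothetical placeholders.
RUN_KWARGS = {
    "train_path": "train.csv",
    "target_column": "churned",
    "test_path": "test.csv",
    "problem_type": "classification",
    "exclude_columns": ["customer_id"],
}


def run_example(kwargs=None):
    """Build a pipeline and run it end to end with the given arguments."""
    # Imported lazily so this module loads even without the package installed.
    from universal_ml_framework import UniversalMLPipeline

    kwargs = dict(RUN_KWARGS if kwargs is None else kwargs)
    pipeline = UniversalMLPipeline(
        problem_type=kwargs["problem_type"],
        random_state=42,
    )
    # The return value is not documented, so it is passed through unchanged.
    return pipeline.run_pipeline(**kwargs)
```

For a regression task, the same call should work with problem_type set to 'regression' and a numeric target column.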

UniversalMLPipeline.load_data(train_path, test_path=None, target_column=None)

Load training and test data from CSV files.

UniversalMLPipeline.auto_detect_features(df, exclude_columns=None)

Automatically detect feature types (numeric, categorical, binary).
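
The exact detection rules are not documented in this reference. As an illustration only, the sketch below assumes one plausible rule set: a column with exactly two distinct values is binary, a column holding only numbers is numeric, and everything else is categorical. To stay dependency-free it takes a plain dict of column name to list of values rather than a DataFrame, and the function name detect_feature_types is hypothetical.

```python
def detect_feature_types(columns, exclude_columns=None):
    """Group column names into numeric, categorical, and binary features.

    Assumed rules (not taken from the framework): two distinct non-missing
    values -> binary; only int/float values -> numeric; otherwise categorical.
    """
    excluded = set(exclude_columns or [])
    detected = {"numeric": [], "categorical": [], "binary": []}
    for name, values in columns.items():
        if name in excluded:
            continue
        present = [v for v in values if v is not None]  # ignore missing values
        if len(set(present)) == 2:
            detected["binary"].append(name)
        elif present and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        ):
            detected["numeric"].append(name)
        else:
            detected["categorical"].append(name)
    return detected


# Hypothetical sample data for illustration.
sample = {
    "age": [34, 51, 27, 45],
    "plan": ["basic", "pro", "free", "pro"],
    "is_active": [1, 0, 1, 1],
    "customer_id": [1, 2, 3, 4],
}
detected = detect_feature_types(sample, exclude_columns=["customer_id"])
# detected == {'numeric': ['age'], 'categorical': ['plan'], 'binary': ['is_active']}
```

Excluding identifier columns such as customer_id matters here, because an ID would otherwise be treated as a numeric feature.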

UniversalMLPipeline.cross_validate_models()

Compare multiple models using cross-validation.
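
To show what comparing models by cross-validation involves, here is a minimal, dependency-free sketch of k-fold scoring. It is not the framework's implementation: the helper names, the two toy models, and the use of contiguous, unshuffled folds are all assumptions made for illustration. Real pipelines usually shuffle or stratify the folds.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) lists for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def cross_validate(models, X, y, n_splits=5):
    """Return the mean accuracy per model name across all folds."""
    results = {}
    for name, (fit, predict) in models.items():
        fold_scores = []
        for train_idx, test_idx in k_fold_indices(len(X), n_splits):
            state = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            correct = sum(predict(state, X[i]) == y[i] for i in test_idx)
            fold_scores.append(correct / len(test_idx))
        results[name] = sum(fold_scores) / len(fold_scores)
    return results


# Two toy "models" on a single numeric feature, purely for illustration.
def fit_majority(X, y):
    return max(sorted(set(y)), key=y.count)  # most frequent training label


def predict_majority(label, x):
    return label


def fit_threshold(X, y):
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2  # midpoint of class means


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
models = {
    "majority": (fit_majority, predict_majority),
    "threshold": (fit_threshold, predict_threshold),
}
scores = cross_validate(models, X, y, n_splits=5)
best_name = max(scores, key=scores.get)  # model with the highest mean accuracy
```

The model with the highest mean score is the natural candidate to pass on to hyperparameter tuning.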

UniversalMLPipeline.hyperparameter_tuning()

Optimize hyperparameters for the best model.
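
The constructor's default tuning_method='random' suggests a random search, although this reference does not describe how the search is carried out. The sketch below shows the general idea under that assumption: sample configurations at random from a search space and keep the best-scoring one. The function name, search space, and scoring function are hypothetical stand-ins.

```python
import random


def random_search(score_fn, param_space, n_iter=20, random_state=42):
    """Sample n_iter random configurations and keep the best-scoring one."""
    rng = random.Random(random_state)  # local RNG keeps runs reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in param_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space and a stand-in scoring function. In practice the
# score would be a cross-validated metric for the model being tuned.
param_space = {
    "max_depth": [2, 4, 6, 8, 10],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
}


def toy_score(params):
    return -abs(params["max_depth"] - 6) - abs(params["learning_rate"] - 0.1)


best_params, best_score = random_search(toy_score, param_space)
```

Random search evaluates a fixed number of configurations regardless of how large the search space is, which is why it is often preferred over an exhaustive grid when tuning time is limited.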

UniversalMLPipeline.make_predictions(save_predictions=True, id_column=None)

Generate predictions on test data.

UniversalMLPipeline.save_model(filename='best_model.pkl')

Save trained model and metadata to files.
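
The file formats used for the model and its metadata are not specified here. One common approach, sketched below as an assumption rather than the framework's actual behavior, is to pickle the model and write the metadata to a JSON file with the same base name. The helper names and the .json naming choice are hypothetical.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_model_with_metadata(model, metadata, filename="best_model.pkl"):
    """Pickle the model and write metadata to a JSON file next to it."""
    model_path = Path(filename)
    meta_path = model_path.with_suffix(".json")  # hypothetical naming choice
    with model_path.open("wb") as fh:
        pickle.dump(model, fh)
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return model_path, meta_path


def load_model_with_metadata(filename="best_model.pkl"):
    """Load a model and its metadata written by save_model_with_metadata."""
    model_path = Path(filename)
    with model_path.open("rb") as fh:
        model = pickle.load(fh)
    meta_text = model_path.with_suffix(".json").read_text(encoding="utf-8")
    return model, json.loads(meta_text)


# Round trip in a temporary directory, using a plain dict as a stand-in model.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "best_model.pkl"
    save_model_with_metadata(
        {"weights": [0.2, 0.8]},
        {"problem_type": "classification"},
        target,
    )
    restored_model, restored_meta = load_model_with_metadata(target)
```

Pickle files can execute arbitrary code when loaded, so only load model files from sources you trust.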