Models
Collection of wrappers for machine learning models
LogExpModel
- class ml_investment.models.LogExpModel(base_model)
Bases: object
Model wrapper that fits the base model on the logarithm of the target and applies the exponential to the produced predictions. May be useful for some target distributions, such as positive, heavily right-skewed targets.
- Parameters
base_model – class implementing the fit(X, y) and predict(X) / predict_proba(X) interfaces
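The log/exp idea described above can be illustrated with a minimal standalone sketch. It does not use ml_investment itself: `MeanModel` and `LogExpSketch` are hypothetical names introduced only for illustration, the plain natural logarithm is an assumption (the source does not say which log variant the library applies), and the base model is assumed to be passed as an instance.

```python
import math


class MeanModel:
    """Hypothetical base model: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class LogExpSketch:
    """Illustrative wrapper: train on log(y), return exp(prediction)."""

    def __init__(self, base_model):
        self.base_model = base_model

    def fit(self, X, y):
        # Targets must be strictly positive for the logarithm to exist.
        self.base_model.fit(X, [math.log(v) for v in y])
        return self

    def predict(self, X):
        # Map predictions back from log space to the original scale.
        return [math.exp(p) for p in self.base_model.predict(X)]


X = [[0], [1], [2]]
y = [1.0, 10.0, 100.0]
model = LogExpSketch(MeanModel()).fit(X, y)
pred = model.predict([[3]])
```

With this toy base model the wrapper returns the geometric mean of the targets rather than the arithmetic mean, which shows how training in log space reduces the influence of very large target values.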
EnsembleModel
- class ml_investment.models.EnsembleModel(base_models: List, bagging_fraction: float = 0.8, model_cnt: int = 20)
Bases: object
Class for training an ensemble of base models.
- Parameters
base_models – list of classes implementing the fit(X, y) and predict(X) / predict_proba(X) interfaces
bagging_fraction – fraction of the data randomly subsampled for training each model
model_cnt – total number of models in the resulting ensemble
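As a rough illustration of how the three parameters interact, here is a minimal bagging sketch in plain Python. It is not the library's implementation: `BaggingSketch`, `MeanModel` and `seed` are hypothetical names, and several details are assumptions that the source does not state, namely that base models are passed as instances and copied, that they are picked in round-robin order, that subsampling is done without replacement, and that predictions are averaged.

```python
import copy
import random


class MeanModel:
    """Hypothetical base model: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class BaggingSketch:
    """Illustrative ensemble: each model sees a random subsample of the data."""

    def __init__(self, base_models, bagging_fraction=0.8, model_cnt=20, seed=0):
        self.base_models = base_models
        self.bagging_fraction = bagging_fraction
        self.model_cnt = model_cnt
        self.seed = seed  # hypothetical parameter, only for reproducibility

    def fit(self, X, y):
        rng = random.Random(self.seed)
        n_sub = max(1, int(len(X) * self.bagging_fraction))
        self.models_ = []
        for i in range(self.model_cnt):
            # Draw a random subsample of row indices (without replacement).
            idxs = rng.sample(range(len(X)), n_sub)
            # Copy a base model so every ensemble member is trained separately.
            model = copy.deepcopy(self.base_models[i % len(self.base_models)])
            model.fit([X[j] for j in idxs], [y[j] for j in idxs])
            self.models_.append(model)
        return self

    def predict(self, X):
        # Average the predictions of all ensemble members (an assumption).
        per_model = [m.predict(X) for m in self.models_]
        return [sum(col) / len(col) for col in zip(*per_model)]


X = [[i] for i in range(10)]
y = [float(i) for i in range(10)]
ensemble = BaggingSketch([MeanModel()], bagging_fraction=0.8, model_cnt=20).fit(X, y)
pred = ensemble.predict([[0], [1]])
```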
GroupedOOFModel
- class ml_investment.models.GroupedOOFModel(base_model, group_column: str, fold_cnt: int = 5)
Bases: object
Model wrapper that encapsulates out-of-fold separation within data groups. Samples from the same group cannot be in the training and validation folds at the same time.
- Parameters
base_model – model implementing the fit(X, y) and predict(X) / predict_proba(X) interfaces
group_column – name of the column used for grouping the training data. X in fit(X, y) and predict(X) should contain this column. Samples sharing a group value are placed in only one fold.
fold_cnt – number of folds for training
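The grouping constraint can be sketched as a small fold-assignment function. This is only an illustration of the idea: `grouped_folds` is a hypothetical helper, the ticker symbols are made-up group values, and the round-robin assignment of groups to folds is an assumption, since the source does not say how the library distributes groups.

```python
def grouped_folds(groups, fold_cnt=5):
    """Return (train_idx, val_idx) pairs where no group spans both sides."""
    unique = sorted(set(groups))
    # Assign every distinct group value to exactly one fold (round-robin).
    fold_of = {g: i % fold_cnt for i, g in enumerate(unique)}
    folds = []
    for k in range(fold_cnt):
        val_idx = [i for i, g in enumerate(groups) if fold_of[g] == k]
        train_idx = [i for i, g in enumerate(groups) if fold_of[g] != k]
        folds.append((train_idx, val_idx))
    return folds


# Hypothetical values of the grouping column, one per sample.
groups = ["AAPL", "AAPL", "MSFT", "TSLA", "MSFT", "NVDA", "TSLA", "AMZN"]
folds = grouped_folds(groups, fold_cnt=3)
```

In an out-of-fold setup, one model would be trained per `train_idx` and used only for samples in the matching `val_idx`, so no prediction comes from a model that saw the same group during training.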
TimeSeriesOOFModel
- class ml_investment.models.TimeSeriesOOFModel(base_model, time_column: str, fold_cnt: int = 5)
Bases: object
Model wrapper that encapsulates out-of-fold time-series separation.
- Parameters
base_model – model implementing the fit(X, y) and predict(X) / predict_proba(X) interfaces
time_column – name of the column used for separating the training data. X in fit(X, y) and predict(X) should contain this column. Samples from the future are not used for training or for predicting the past.
fold_cnt – number of folds for training
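The time-based separation can be sketched in the same spirit. The helper below is hypothetical (`time_series_folds` is not part of ml_investment) and the expanding-window scheme is an assumption: distinct time values are cut into contiguous blocks, and each block is validated only by a model trained on strictly earlier blocks. The source does not specify the library's exact splitting scheme.

```python
def time_series_folds(times, fold_cnt=5):
    """Return (train_idx, val_idx) pairs where training precedes validation."""
    unique = sorted(set(times))
    # Cut the distinct time values into contiguous, nearly equal blocks.
    bounds = [round(k * len(unique) / fold_cnt) for k in range(fold_cnt + 1)]
    blocks = [set(unique[bounds[k]:bounds[k + 1]]) for k in range(fold_cnt)]
    folds = []
    # The first block has no past data, so it is only ever used for training.
    for k in range(1, fold_cnt):
        earlier = set().union(*blocks[:k])
        train_idx = [i for i, t in enumerate(times) if t in earlier]
        val_idx = [i for i, t in enumerate(times) if t in blocks[k]]
        folds.append((train_idx, val_idx))
    return folds


# Hypothetical values of the time column (years), one per sample.
times = [2019, 2020, 2021, 2019, 2022, 2023, 2020, 2021, 2022, 2023]
folds = time_series_folds(times, fold_cnt=5)
```

Splitting on distinct time values rather than on row positions keeps samples with the same timestamp on the same side of every split.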