Models#

Classes#

LibraryModel(**data)

SklearnModel(**data)

A class used to instantiate and manage a Scikit-learn model.

XGBoostModel(**data)

A class used to instantiate and manage an XGBoost model.

ModelFactory(params_list)

Takes in a list of dictionaries and constructs model classes based on the library keyword provided for each.

LibraryModel#

class mlcompare.models.LibraryModel(**data)[source]#

Bases: ABC, BaseModel

Parameters:
  • name (str)

  • module (str | None)

  • params (dict | None)

instantiate_model()[source]#
Return type:

None

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).

model_fields: ClassVar[Dict[str, FieldInfo]] = {'module': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'name': FieldInfo(annotation=str, required=True), 'params': FieldInfo(annotation=Union[dict, NoneType], required=False, default=None)}#

Metadata about the fields defined on the model: a mapping of field names to FieldInfo (pydantic.fields.FieldInfo) objects.

This replaces Model.__fields__ from Pydantic V1.

model_post_init(context, /)#

We need to both initialize private attributes and call the user-defined model_post_init method.

Return type:

None

Parameters:
  • self (BaseModel)

  • context (Any)

module: str | None#
name: str#
params: dict | None#
abstract predict(X_test)[source]#
Parameters:

X_test (DataFrame)

resolve_model_submodule()[source]#
Return type:

Any | None

save(save_directory)[source]#
Parameters:

save_directory (Path)

abstract train(X_train, y_train)[source]#
Return type:

None
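
The abstract train and predict methods mean that LibraryModel cannot be instantiated directly; a subclass has to supply both. The following is a minimal sketch of that contract, not the mlcompare source. It uses a plain ABC rather than a Pydantic BaseModel so that it runs with the standard library alone, and the names LibraryModelSketch and MeanModel are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class LibraryModelSketch(ABC):
    """Hypothetical stand-in for LibraryModel holding the three documented fields."""

    def __init__(self, name: str, module: Optional[str] = None,
                 params: Optional[dict] = None) -> None:
        self.name = name
        self.module = module
        self.params = params

    @abstractmethod
    def train(self, X_train: Any, y_train: Any) -> None:
        """Fit the wrapped model; subclasses must implement this."""

    @abstractmethod
    def predict(self, X_test: Any) -> Any:
        """Return predictions for X_test; subclasses must implement this."""


class MeanModel(LibraryModelSketch):
    """Toy subclass that always predicts the mean of the training targets."""

    def train(self, X_train: Any, y_train: Any) -> None:
        self._mean = sum(y_train) / len(y_train)

    def predict(self, X_test: Any) -> List[float]:
        return [self._mean for _ in X_test]


model = MeanModel(name="MeanModel")
model.train([[0], [1], [2]], [1.0, 2.0, 3.0])
predictions = model.predict([[5], [6]])
```

Instantiating LibraryModelSketch itself raises TypeError, which is the behaviour an abstract base class is chosen for.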


SklearnModel#

class mlcompare.models.SklearnModel(**data)[source]#

Bases: LibraryModel

A class used to instantiate and manage a Scikit-learn model.

Attributes:#

  • name (str): Class name of the model. Ex: RandomForestRegressor.

  • module (str | None): Module containing the model class if it’s not imported at the library level.

  • params (dict | None): Parameters to pass to the model class constructor, if any.
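
To illustrate how the three attributes could fit together, the sketch below resolves a class from a package, an optional submodule, and a class name, then passes params to its constructor. This is an assumption about the general technique, not mlcompare's implementation: build_from_spec is a hypothetical helper, and standard-library packages stand in for scikit-learn so the example runs without extra dependencies.

```python
import importlib
from typing import Any, Optional


def build_from_spec(library: str, name: str, module: Optional[str] = None,
                    params: Optional[dict] = None) -> Any:
    """Import `library` (or `library.module`), look up `name`, and instantiate it."""
    # Assumption: the module is resolved relative to the library package.
    module_path = f"{library}.{module}" if module else library
    cls = getattr(importlib.import_module(module_path), name)
    # A missing params dict means "call the constructor with its defaults".
    return cls(**(params or {}))


# Class available at the top level of its package: no module needed.
fraction = build_from_spec("fractions", "Fraction",
                           params={"numerator": 3, "denominator": 4})

# Class looked up in a submodule of the package.
decoder = build_from_spec("json", "JSONDecoder", module="decoder",
                          params={"strict": False})
```

Under the same assumption, a scikit-learn entry would pair a name such as RandomForestRegressor with the submodule that defines it.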

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).

model_fields: ClassVar[Dict[str, FieldInfo]] = {'module': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'name': FieldInfo(annotation=str, required=True), 'params': FieldInfo(annotation=Union[dict, NoneType], required=False, default=None)}#

Metadata about the fields defined on the model: a mapping of field names to FieldInfo (pydantic.fields.FieldInfo) objects.

This replaces Model.__fields__ from Pydantic V1.

model_post_init(context, /)#

We need to both initialize private attributes and call the user-defined model_post_init method.

Return type:

None

Parameters:
  • self (BaseModel)

  • context (Any)

predict(X_test)[source]#
train(X_train, y_train)[source]#
Return type:

None

Parameters:
  • name (str)

  • module (str | None)

  • params (dict | None)

XGBoostModel#

class mlcompare.models.XGBoostModel(**data)[source]#

Bases: LibraryModel

A class used to instantiate and manage an XGBoost model.

Attributes:#

  • name (str): Class name of the model. Ex: XGBRegressor.

  • module (str | None): Module containing the model class if it’s not imported at the library level.

  • params (dict | None): Parameters to pass to the model class constructor, if any.

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}#

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model; should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).

model_fields: ClassVar[Dict[str, FieldInfo]] = {'module': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'name': FieldInfo(annotation=str, required=True), 'params': FieldInfo(annotation=Union[dict, NoneType], required=False, default=None)}#

Metadata about the fields defined on the model: a mapping of field names to FieldInfo (pydantic.fields.FieldInfo) objects.

This replaces Model.__fields__ from Pydantic V1.

model_post_init(context, /)#

We need to both initialize private attributes and call the user-defined model_post_init method.

Return type:

None

Parameters:
  • self (BaseModel)

  • context (Any)

predict(X_test)[source]#
train(X_train, y_train)[source]#
Return type:

None

Parameters:
  • name (str)

  • module (str | None)

  • params (dict | None)

ModelFactory#

class mlcompare.ModelFactory(params_list)[source]#

Bases: object

Takes in a list of dictionaries and constructs model classes based on the library keyword provided for each. The class is designed to be iterated over.

Attributes:#

params_list (list[dict[str, Any]] | str | Path): List of dictionaries containing model parameters, or a path to a .json file containing such a list. The keys for each dictionary are listed below.

Required keys:
  • library (Literal["sklearn", "xgboost", "pytorch", "tensorflow", "custom"]): The library to use.

  • module (str): Module containing the model class.

  • name (str): Name of the model class.

Optional keys:
  • params (dict | None): The parameters to pass to the model class constructor.
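
As a rough illustration of the two accepted input forms, the sketch below builds a params_list in memory, writes it to a .json file, reads it back, and checks each entry for the required keys. The file name models.json, the helper missing_keys, and the exact module strings ("ensemble", "linear_model") are assumptions made for the example; the check is not mlcompare's own validation.

```python
import json
import tempfile
from pathlib import Path

# One dictionary per model; "params" is optional and omitted in the second entry.
params_list = [
    {
        "library": "sklearn",
        "module": "ensemble",  # assumed module format
        "name": "RandomForestRegressor",
        "params": {"n_estimators": 50},
    },
    {
        "library": "sklearn",
        "module": "linear_model",  # assumed module format
        "name": "LinearRegression",
    },
]

REQUIRED_KEYS = {"library", "module", "name"}


def missing_keys(entry: dict) -> list:
    """Return the required keys that an entry lacks, sorted for stable output."""
    return sorted(REQUIRED_KEYS - entry.keys())


# The same list stored as JSON, as it could be passed by path instead.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "models.json"
    json_path.write_text(json.dumps(params_list, indent=2))
    loaded = json.loads(json_path.read_text())

problems = {entry["name"]: missing_keys(entry) for entry in loaded}
```

An empty list for every name in problems means each entry carries all required keys.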

Raises:#

AssertionError: If params_list is not a list of dictionaries or a path to a .json file containing one.

static create(library, **kwargs)[source]#

Factory method to create a model instance based on the library name.

Return type:

SklearnModel | XGBoostModel | PyTorchModel

Parameters:

library (Literal['sklearn', 'scikit-learn', 'skl', 'xgboost', 'xgb', 'pytorch', 'torch', 'tensorflow', 'tf'])

Args:#

  • library (LibraryNames): Library of the model to create.

  • **kwargs: Arbitrary keyword arguments passed to the model class constructor.

Returns:#

SklearnModel | XGBoostModel | PyTorchModel: An instance of the model class for the given library.

Raises:#

ValueError: If an unknown library is provided.
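
The dispatch described above can be sketched as a lookup from library alias to a canonical name, followed by construction of the matching model. This is a hedged illustration rather than the mlcompare source: SketchModel and SketchFactory are hypothetical, a single stand-in class replaces the per-library model classes, and the grouping of aliases is inferred from the Literal in the create signature (which does not list "custom").

```python
from typing import Any, Dict, Iterator, List, Optional

# Aliases copied from the create() signature; the grouping is an assumption.
ALIASES: Dict[str, str] = {
    "sklearn": "sklearn", "scikit-learn": "sklearn", "skl": "sklearn",
    "xgboost": "xgboost", "xgb": "xgboost",
    "pytorch": "pytorch", "torch": "pytorch",
    "tensorflow": "tensorflow", "tf": "tensorflow",
}


class SketchModel:
    """Stand-in for SklearnModel, XGBoostModel, and the other model classes."""

    def __init__(self, library: str, name: str, module: Optional[str] = None,
                 params: Optional[dict] = None) -> None:
        self.library = library
        self.name = name
        self.module = module
        self.params = params or {}


class SketchFactory:
    """Iterable factory: yields one model per dictionary in params_list."""

    def __init__(self, params_list: List[Dict[str, Any]]) -> None:
        if not (isinstance(params_list, list)
                and all(isinstance(p, dict) for p in params_list)):
            raise AssertionError("params_list must be a list of dictionaries")
        self.params_list = params_list

    @staticmethod
    def create(library: str, **kwargs: Any) -> SketchModel:
        if library not in ALIASES:
            raise ValueError(f"Unknown library: {library!r}")
        return SketchModel(library=ALIASES[library], **kwargs)

    def __iter__(self) -> Iterator[SketchModel]:
        for params in self.params_list:
            yield self.create(**params)


factory = SketchFactory([
    {"library": "skl", "module": "ensemble", "name": "RandomForestRegressor"},
    {"library": "xgb", "name": "XGBRegressor", "params": {"max_depth": 3}},
])
built = [(model.library, model.name) for model in factory]
```

Keeping the alias table separate from create means a new spelling of a library name only needs one extra dictionary entry.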

Parameters:

params_list (ParamsInput)