Serialzy is a library for serializing Python objects into portable and interoperable data formats where possible.

Suppose you have a CatBoost model:

    from catboost import CatBoostClassifier
    model = CatBoostClassifier()
    model.fit(...)

First, find a suitable serializer for the CatBoost model type or for the corresponding data format:


    from serialzy.registry import DefaultSerializerRegistry
    registry = DefaultSerializerRegistry()
    serializer = registry.find_serializer_by_type(type(model))  # or: registry.find_serializer_by_data_format("cbm")

Serializers have several properties:


    serializer.available()     # can be used in the current environment
    serializer.requirements()  # libraries that must be installed to use this serializer
    serializer.stable()        # has a portable data format

Serializers can provide the data format and schema for a type:


    serializer.data_format()
    serializer.schema(type(model))

Serialization:


    with open('model.cbm', 'wb') as file:
        serializer.serialize(model, file)

Deserialization:

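
    # read back the file written during serialization
    with open('model.cbm', 'rb') as file:
        deserialized_obj = serializer.deserialize(file)

Supported libraries, types, and data formats:

| Library | Types | Data format |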
|---|---|---|
| Python std lib | int, str, float, bool, None | string representation |
| Python std lib | List, Tuple | custom format |
| CatBoost | CatBoostRegressor, CatBoostClassifier, CatBoostRanker | cbm |
| CatBoost | Pool | quantized pool |
| TensorFlow.Keras | Sequential, Model with subclasses | tf_keras |
| TensorFlow | Checkpoint, Module with subclasses | tf_pure |
| LightGBM | LGBMClassifier, LGBMRegressor, LGBMRanker | lgbm |
| XGBoost | XGBClassifier, XGBRegressor, XGBRanker | xgb |
| Torch | Module with subclasses | pt |
| ONNX | ModelProto | onnx |
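
The lookup-then-serialize workflow described above can be pictured with a small, self-contained sketch. This is not Serialzy's implementation: `ToyRegistry` and `ToyIntSerializer` are hypothetical stand-ins written only for illustration, and the `toy_int` format name and the `supported_type` and `register` methods are assumptions. Only the method names `find_serializer_by_type`, `find_serializer_by_data_format`, `data_format`, `serialize`, and `deserialize` mirror the calls shown earlier. The sketch stores an integer as its string representation and round-trips it through an in-memory binary stream.

```python
import io
from typing import BinaryIO, Dict, Optional


class ToyIntSerializer:
    """Hypothetical stand-in for a serializer; not a Serialzy class."""

    def data_format(self) -> str:
        # Assumed format name, used only in this sketch.
        return "toy_int"

    def supported_type(self) -> type:
        # Assumed helper so the toy registry knows which type to index.
        return int

    def serialize(self, obj: int, dest: BinaryIO) -> None:
        # Write the string representation of the integer as UTF-8 bytes.
        dest.write(str(obj).encode("utf-8"))

    def deserialize(self, source: BinaryIO) -> int:
        # Read all bytes and parse them back into an integer.
        return int(source.read().decode("utf-8"))


class ToyRegistry:
    """Hypothetical registry that indexes serializers by type and by format."""

    def __init__(self) -> None:
        self._by_type: Dict[type, ToyIntSerializer] = {}
        self._by_format: Dict[str, ToyIntSerializer] = {}

    def register(self, serializer: ToyIntSerializer) -> None:
        self._by_type[serializer.supported_type()] = serializer
        self._by_format[serializer.data_format()] = serializer

    def find_serializer_by_type(self, typ: type) -> Optional[ToyIntSerializer]:
        # Exact-type lookup; returns None when nothing is registered.
        return self._by_type.get(typ)

    def find_serializer_by_data_format(self, data_format: str) -> Optional[ToyIntSerializer]:
        return self._by_format.get(data_format)


registry = ToyRegistry()
registry.register(ToyIntSerializer())

serializer = registry.find_serializer_by_type(int)

# Round trip through an in-memory binary stream instead of a file on disk.
buffer = io.BytesIO()
serializer.serialize(42, buffer)
buffer.seek(0)
restored = serializer.deserialize(buffer)
print(restored)  # prints 42
```

Both lookups return the same serializer object in this sketch, so choosing between a lookup by type and a lookup by data format depends only on which piece of information is at hand. How the real registry resolves subclasses or unavailable serializers is not covered here.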