Machine Learning with sklearn (10): Data Processing (5), Custom Transformers

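In machine learning you will often want to turn an existing Python function into a transformer to help with data cleaning or processing. FunctionTransformer builds a transformer from an arbitrary callable. For example, to build a transformer that applies a log transformation for use in a pipeline: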

>>> import numpy as np
>>> from sklearn.preprocessing import FunctionTransformer
>>> transformer = FunctionTransformer(np.log1p, validate=True)
>>> X = np.array([[0, 1], [2, 3]])
>>> transformer.transform(X)
array([[0.        , 0.69314718],
       [1.09861229, 1.38629436]])
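
To show how such a transformer sits inside a pipeline, here is a minimal sketch. The StandardScaler step and the sample data are illustrative assumptions, not part of the original example.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Illustrative data; any non-negative 2-D array works with log1p.
X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])

# Step 1 applies log(1 + x) element-wise.
# Step 2 (an assumed follow-up step) standardizes each column.
pipe = make_pipeline(
    FunctionTransformer(np.log1p, validate=True),
    StandardScaler(),
)
X_scaled = pipe.fit_transform(X)
print(X_scaled.shape)
```

Because the log step is an ordinary pipeline stage, it is applied consistently during both fitting and prediction.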

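By setting check_inverse=True and calling fit before transform, you can verify that func and inverse_func are inverses of each other. Note that a warning is raised when they are not, and that filterwarnings can be used to turn this warning into an error.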
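
The inverse check can be sketched as follows. The helper `inverse_warnings` is a hypothetical name introduced here for illustration, and pairing np.log1p with np.sqrt is a deliberately wrong inverse chosen to trigger the warning.

```python
import warnings
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X = np.array([[0.0, 1.0], [2.0, 3.0]])

def inverse_warnings(transformer, X):
    """Hypothetical helper: fit and return warnings that mention 'inverse'."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        transformer.fit(X)
    return [w for w in caught if "inverse" in str(w.message)]

# np.expm1 undoes np.log1p, so the check should stay silent.
matched = FunctionTransformer(np.log1p, inverse_func=np.expm1, check_inverse=True)
# np.sqrt does not undo np.log1p, so fit should warn.
mismatched = FunctionTransformer(np.log1p, inverse_func=np.sqrt, check_inverse=True)

matched_warnings = inverse_warnings(matched, X)
mismatched_warnings = inverse_warnings(mismatched, X)

# Escalate the warning to an error with filterwarnings.
with warnings.catch_warnings():
    warnings.filterwarnings("error", message=".*inverse.*", category=UserWarning)
    try:
        mismatched.fit(X)
        escalated = False
    except UserWarning:
        escalated = True
```

Turning the warning into an error is a reasonable choice in test suites, where a silently mismatched inverse would otherwise go unnoticed.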

For an example that uses a FunctionTransformer for custom feature selection, see Using FunctionTransformer to select columns.
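
As a rough idea of what column selection with FunctionTransformer can look like, here is a small sketch. It is not the referenced example itself; the function `keep_last_two_columns` and the data are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

def keep_last_two_columns(X):
    """Hypothetical selector: keep only the last two columns of X."""
    return X[:, -2:]

# A 3x4 array whose columns are easy to tell apart.
X = np.arange(12).reshape(3, 4)

selector = FunctionTransformer(keep_last_two_columns)
X_selected = selector.fit_transform(X)
print(X_selected)
```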

class sklearn.preprocessing.FunctionTransformer(func=None, inverse_func=None, *, validate=False, accept_sparse=False, check_inverse=True, kw_args=None, inv_kw_args=None)

Constructs a transformer from an arbitrary callable.

A FunctionTransformer forwards its X (and optionally y) arguments to a user-defined function or function object and returns the result of this function. This is useful for stateless transformations such as taking the log of frequencies, doing custom scaling, etc.

Note: If a lambda is used as the function, then the resulting transformer will not be pickleable.
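
The pickling caveat can be checked with the standard pickle module. This is a minimal sketch; the exact exception type raised for a lambda may vary between Python versions, so several are caught.

```python
import pickle
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X = np.array([[0.0, 1.0], [2.0, 3.0]])

# A transformer built from a named, importable function can be pickled.
named = FunctionTransformer(np.log1p)
restored = pickle.loads(pickle.dumps(named))
restored_output = restored.transform(X)

# A transformer built from a lambda cannot, because pickle stores
# functions by name and a lambda has no importable name.
lambda_based = FunctionTransformer(lambda X: np.log1p(X))
try:
    pickle.dumps(lambda_based)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_pickled = False
```

If a transformer has to be saved to disk, defining the function with def at module level avoids the problem.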

New in version 0.17.

Read more in the User Guide.

Parameters
func : callable, default=None

The callable to use for the transformation. This will be passed the same arguments as transform, with args and kwargs forwarded. If func is None, then func will be the identity function.

inverse_func : callable, default=None

The callable to use for the inverse transformation. This will be passed the same arguments as inverse transform, with args and kwargs forwarded. If inverse_func is None, then inverse_func will be the identity function.
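
A short round-trip sketch, assuming np.expm1 as the inverse of np.log1p (the sample data is illustrative):

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X = np.array([[0.0, 1.0], [2.0, 3.0]])

# Forward: log(1 + x). Inverse: exp(x) - 1.
transformer = FunctionTransformer(np.log1p, inverse_func=np.expm1)

X_log = transformer.transform(X)
X_back = transformer.inverse_transform(X_log)
```

Here X_back should match X up to floating-point error.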

validate : bool, default=False

Indicate that the input X array should be checked before calling func. The possibilities are:

  • If False, there is no input validation.

  • If True, then X will be converted to a 2-dimensional NumPy array or sparse matrix. If the conversion is not possible an exception is raised.

Changed in version 0.22: The default of validate changed from True to False.
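
The difference between the two settings can be sketched with a 1-dimensional input, which cannot pass the 2-dimensional check. This assumes the validation failure surfaces as a ValueError.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X_1d = np.array([0.0, 1.0, 2.0])

# validate=True: the input is checked first, and a 1-D array is rejected.
strict = FunctionTransformer(np.log1p, validate=True)
try:
    strict.transform(X_1d)
    rejected = False
except ValueError:
    rejected = True

# validate=False (the default): the input is handed to func unchanged.
lenient = FunctionTransformer(np.log1p)
lenient_out = lenient.transform(X_1d)
```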

accept_sparse : bool, default=False

Indicate that func accepts a sparse matrix as input. If validate is False, this has no effect. Otherwise, if accept_sparse is false, sparse matrix inputs will cause an exception to be raised.
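
A sketch of the sparse behaviour, assuming SciPy is available. The function `sparse_log1p` is a hypothetical helper that relies on the log1p method of SciPy sparse matrices; the exception type for rejected sparse input is assumed to be TypeError or ValueError.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import FunctionTransformer

def sparse_log1p(X):
    """Hypothetical helper: element-wise log(1 + x) that keeps X sparse."""
    return X.log1p()

X_sparse = sparse.csr_matrix([[0.0, 1.0], [2.0, 0.0]])

# validate=True with accept_sparse=False (the default): sparse input is rejected.
dense_only = FunctionTransformer(sparse_log1p, validate=True)
try:
    dense_only.transform(X_sparse)
    sparse_rejected = False
except (TypeError, ValueError):
    sparse_rejected = True

# validate=True with accept_sparse=True: sparse input passes validation.
sparse_ok = FunctionTransformer(sparse_log1p, validate=True, accept_sparse=True)
X_out = sparse_ok.transform(X_sparse)
```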

check_inverse : bool, default=True

Whether to check that func followed by inverse_func leads to the original inputs. It can be used for a sanity check, raising a warning when the condition is not fulfilled.

New in version 0.20.

kw_args : dict, default=None

Dictionary of additional keyword arguments to pass to func.

New in version 0.18.

inv_kw_args : dict, default=None

Dictionary of additional keyword arguments to pass to inverse_func.

New in version 0.18.
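
The two keyword-argument dictionaries can be illustrated with a pair of hypothetical functions, `scale` and `unscale`, that share a `factor` parameter. Both functions and the value 10.0 are assumptions made for this sketch.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

def scale(X, factor=1.0):
    """Hypothetical forward function: multiply by factor."""
    return X * factor

def unscale(X, factor=1.0):
    """Hypothetical inverse function: divide by factor."""
    return X / factor

scaler = FunctionTransformer(
    scale,
    inverse_func=unscale,
    kw_args={"factor": 10.0},      # forwarded to scale
    inv_kw_args={"factor": 10.0},  # forwarded to unscale
)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_scaled = scaler.transform(X)
X_restored = scaler.inverse_transform(X_scaled)
```

Passing parameters this way keeps the functions reusable and avoids wrapping them in lambdas, which would make the transformer unpicklable.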

Methods

fit(X[, y])

Fit transformer by checking X.

fit_transform(X[, y])

Fit to data, then transform it.

get_params([deep])

Get parameters for this estimator.

inverse_transform(X)

Transform X using the inverse function.

set_params(**params)

Set the parameters of this estimator.

transform(X)

Transform X using the forward function.
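
A brief sketch of how these methods fit together. Swapping func to np.sqrt through set_params is only an illustration of the mechanism.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

transformer = FunctionTransformer(np.log1p)

# get_params returns the constructor arguments as a dict.
params = transformer.get_params()

# set_params replaces constructor arguments in place.
transformer.set_params(func=np.sqrt)

X = np.array([[0.0, 1.0], [4.0, 9.0]])
# fit_transform fits (here only input checking) and then applies func.
X_sqrt = transformer.fit_transform(X)
```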

Examples

>>> import numpy as np
>>> from sklearn.preprocessing import FunctionTransformer
>>> transformer = FunctionTransformer(np.log1p)
>>> X = np.array([[0, 1], [2, 3]])
>>> transformer.transform(X)
array([[0.       , 0.6931...],
       [1.0986..., 1.3862...]])
Original article: https://www.cnblogs.com/qiu-hua/p/14903451.html