sklearn MultinomialNB Naive Bayes Classifier

Prototype

class sklearn.naive_bayes.MultinomialNB(alpha=1.0, fit_prior=True, class_prior=None)

Parameters


alpha : float, optional (default=1.0)

Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).

fit_prior : boolean, optional (default=True)

Whether to learn class prior probabilities or not. If false, a uniform prior is used.

class_prior : array-like, size (n_classes,), optional (default=None)

Prior probabilities of the classes. If specified the priors are not adjusted according to the data.
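The interaction between fit_prior and class_prior can be seen directly on the fitted class_log_prior_ attribute. The following sketch (not from the original post; the toy data is made up) shows the three cases: priors learned from class frequencies, a forced uniform prior, and fixed user-supplied priors.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy word-count data: 4 samples, 3 features, imbalanced classes (three 0s, one 1)
X = np.array([[2, 1, 0],
              [3, 0, 1],
              [1, 2, 0],
              [0, 1, 3]])
y = np.array([0, 0, 0, 1])

# Default: priors learned from the data -> P(y=0)=3/4, P(y=1)=1/4
clf = MultinomialNB().fit(X, y)
print(np.exp(clf.class_log_prior_))          # [0.75 0.25]

# fit_prior=False: uniform prior regardless of class frequencies
clf_uniform = MultinomialNB(fit_prior=False).fit(X, y)
print(np.exp(clf_uniform.class_log_prior_))  # [0.5 0.5]

# class_prior overrides both: priors fixed to the given values
clf_fixed = MultinomialNB(class_prior=[0.9, 0.1]).fit(X, y)
print(np.exp(clf_fixed.class_log_prior_))    # [0.9 0.1]
```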

Notes on alpha

The parameter \theta_{yi} is estimated by a smoothed version of maximum likelihood, i.e. relative frequency counting:

\hat{\theta}_{yi} = \frac{N_{yi} + \alpha}{N_y + \alpha n}

where N_{yi} = \sum_{x \in T} x_i is the number of times feature i appears in a sample of class y in the training set T, and N_y = \sum_{i=1}^{n} N_{yi} is the total count of all features for class y.

The smoothing prior \alpha \ge 0 accounts for features not present in the learning samples and prevents zero probabilities in further computations. Setting \alpha = 1 is called Laplace smoothing, while \alpha < 1 is called Lidstone smoothing.
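The formula above can be verified by hand against the fitted model's feature_log_prob_ attribute. This sketch (the toy data is made up) computes \hat{\theta}_{0i} for class 0, including a feature that never appears in that class, and checks that smoothing keeps its probability nonzero:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# 2 classes, 3 features; feature 2 never appears in class 0
X = np.array([[3, 1, 0],
              [2, 2, 0],
              [0, 1, 4]])
y = np.array([0, 0, 1])

alpha, n = 1.0, X.shape[1]

# Hand computation for class 0: N_0i are the column sums over class-0 rows
N_0i = X[y == 0].sum(axis=0)                  # [5, 3, 0]
N_0 = N_0i.sum()                              # 8
theta_0 = (N_0i + alpha) / (N_0 + alpha * n)  # [6/11, 4/11, 1/11]

# The unseen feature still gets probability 1/11 instead of 0
clf = MultinomialNB(alpha=alpha).fit(X, y)
print(np.allclose(np.exp(clf.feature_log_prob_[0]), theta_0))  # True
```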

Example

>>> import numpy as np
>>> X = np.random.randint(5, size=(6, 100))
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> from sklearn.naive_bayes import MultinomialNB
>>> clf = MultinomialNB()
>>> clf.fit(X, y)
MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
>>> print(clf.predict(X[2:3]))
[3]

Original post: https://www.cnblogs.com/eniac1946/p/7406438.html