Comparing the classification performance of KNN, logistic regression, and SVM

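Using the same raw fruit-classification data as before, this time we build a fruit classifier with each of three algorithms (KNN, logistic regression, and SVM) and see which one performs best.
The output is as follows: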
KNN model accuracy: 75.00%
Logistic regression coefficients: [[-0.05274036 4.80089662 -0.2919612 9.34272797]
 [-0.32977103 6.31580761 -1.35272117 1.14952952]
 [-0.23650438 -8.17278107 11.71949993 -1.45948241]
 [ 0.02063462 0.29756545 -0.29966445 2.01418258]]; intercepts: [-31.55768938 1.34960096 -0.68908458 -5.76087243]
LogicRe model accuracy: 58.33%
SVM model accuracy: 50.00%

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

data_path = './data/fruit_data.csv'
output_dir = './output/'

# Map each fruit name to an integer class label
label_dict = {
    'apple': 0,
    'mandarin': 1,
    'lemon': 2,
    'orange': 3,
}
# Feature columns used as model input
feat_cols = ['mass', 'width', 'height', 'color_score']

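if __name__ == '__main__':
    # Load the data and convert fruit names to integer labels
    data_df = pd.read_csv(data_path)
    data_df['label'] = data_df['fruit_name'].map(label_dict)

    X = data_df[feat_cols]
    y = data_df['label']
    # Hold out 20% of the samples as the test set; fix the seed for reproducibility
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=19)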

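    model_dict = {
        'KNN': KNeighborsClassifier(n_neighbors=3),
        # A smaller C means stronger regularization, i.e. a less complex model; C defaults to 1.0.
        # Omitting the last two arguments triggers warnings.
        # Note: multi_class is deprecated in scikit-learn 1.5 and later.
        'LogicRe': LogisticRegression(C=1e3, solver='liblinear', multi_class='auto'),
        # As above: a smaller C means stronger regularization; C defaults to 1.0.
        'SVM': SVC(C=1e3, gamma='auto')
    }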


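    for model_name, model in model_dict.items():
        model.fit(X_train, y_train)
        accuracy = model.score(X_test, y_test)
        if model_name == 'LogicRe':
            # coef_ has 4 rows, one per fruit class; each row holds the weights
            # of the 4 features (these are weights, not probabilities)
            print('Logistic regression coefficients: {}; intercepts: {}'.format(model.coef_, model.intercept_))
        print('{} model accuracy: {:.2f}%'.format(model_name, accuracy * 100))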
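
To make the printed coefficient matrix easier to read, here is a small sketch of how `coef_` and `intercept_` turn a sample into per-class scores. It is only an illustration under assumptions: the data is synthetic (generated with `make_classification`, standing in for the fruit CSV) and the model uses the default solver rather than the `liblinear` setup above, so the fitted numbers will not match the output shown earlier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the fruit data (an assumption): 4 features, 4 classes
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Default solver, not the liblinear configuration used in the script above
clf = LogisticRegression(max_iter=1000).fit(X, y)

print(clf.coef_.shape)       # (4, 4): one row per class, one column per feature
print(clf.intercept_.shape)  # (4,): one intercept per class

# Each class gets a linear score: its row of coef_ dotted with the sample, plus its intercept
scores = X @ clf.coef_.T + clf.intercept_
assert np.allclose(scores, clf.decision_function(X))

# The predicted label is the class with the largest score
pred = clf.classes_[np.argmax(scores, axis=1)]
assert np.array_equal(pred, clf.predict(X))
```

Read this way, each row of the matrix in the output belongs to one fruit class, and each of its four numbers weights one of `mass`, `width`, `height`, and `color_score`.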
Original article: https://www.cnblogs.com/djlbolgs/p/12631860.html