Follow the WeChat official account: 小程在线

Follow the CSDN blog: 程志伟的博客
R version: 3.6.1
setwd('G:\\R语言\\大三下半年\\数据挖掘:R语言实战\\')
> library("e1071", lib.loc="H:/Program Files/R/R-3.6.1/library")
Warning message:
package ‘e1071’ was built under R version 3.6.2
############# Simulation: SVM in the linearly separable case
> set.seed(12345)
> x<-matrix(rnorm(n=40*2,mean=0,sd=1),ncol=2,byrow=TRUE)
> y<-c(rep(-1,20),rep(1,20))
> x[y==1,]<-x[y==1,]+1.5
> data_train<-data.frame(Fx1=x[,1],Fx2=x[,2],Fy=as.factor(y))   # generate the training set
> x<-matrix(rnorm(n=20,mean=0,sd=1),ncol=2,byrow=TRUE)
> y<-sample(x=c(-1,1),size=10,replace=TRUE)
> x[y==1,]<-x[y==1,]+1.5
> data_test<-data.frame(Fx1=x[,1],Fx2=x[,2],Fy=as.factor(y))   # generate the test set
> plot(data_train[,2:1],col=as.integer(as.vector(data_train[,3]))+2,pch=8,cex=0.7,main="Scatter plot of classes -1 and +1 in the training set")
> SvmFit<-svm(Fy~.,data=data_train,type="C-classification",kernel="linear",cost=10,scale=FALSE)
> summary(SvmFit)

Call:
svm(formula = Fy ~ ., data = data_train, type = "C-classification", kernel = "linear", cost = 10, scale = FALSE)

Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  linear
       cost:  10

Number of Support Vectors:  16

 ( 8 8 )

Number of Classes:  2

Levels:
 -1 1

> SvmFit$index
 [1]  1  6  7 10 11 16 17 20 22 24 28 31 33 35 36 37
> plot(x=SvmFit,data=data_train,formula=Fx1~Fx2,svSymbol="#",dataSymbol="*",grid=100)
> SvmFit<-svm(Fy~.,data=data_train,type="C-classification",kernel="linear",cost=0.1,scale=FALSE)
> summary(SvmFit)

Call:
svm(formula = Fy ~ ., data = data_train, type = "C-classification", kernel = "linear", cost = 0.1, scale = FALSE)

Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  linear
       cost:  0.1

Number of Support Vectors:  25

 ( 12 13 )

Number of Classes:  2

Levels:
 -1 1
############## Choose the penalty parameter C by 10-fold cross-validation
> set.seed(12345)
> tObj<-tune.svm(Fy~.,data=data_train,type="C-classification",kernel="linear",
+   cost=c(0.001,0.01,0.1,1,5,10,100,1000),scale=FALSE)
> summary(tObj)

Parameter tuning of ‘svm’:

- sampling method: 10-fold cross validation

- best parameters:
 cost
    5

- best performance: 0.175

- Detailed performance results:
   cost error dispersion
1 1e-03 0.675  0.3129164
2 1e-02 0.375  0.3584302
3 1e-01 0.225  0.2486072
4 1e+00 0.200  0.2297341
5 5e+00 0.175  0.2371708
6 1e+01 0.175  0.2371708
7 1e+02 0.175  0.2371708
8 1e+03 0.175  0.2371708
> BestSvm<-tObj$best.model
> summary(BestSvm)

Call:
best.svm(x = Fy ~ ., data = data_train, cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100, 1000), type = "C-classification", kernel = "linear", scale = FALSE)

Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  linear
       cost:  5

Number of Support Vectors:  16

 ( 8 8 )

Number of Classes:  2

Levels:
 -1 1

> yPred<-predict(BestSvm,data_test)
> (ConfM<-table(yPred,data_test$Fy))

yPred -1 1
   -1  6 0
   1   1 3
> (Err<-(sum(ConfM)-sum(diag(ConfM)))/sum(ConfM))
[1] 0.1
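
The test error of 0.1 is the share of off-diagonal counts in the confusion matrix. As a minimal base-R sketch, the matrix below is re-entered by hand from the printed output (the names `conf_m`, `n_total`, `n_correct` and `err` are illustrative and not from the original session), so the arithmetic can be checked without refitting the model:

```r
# Confusion matrix copied from the printed output:
# rows = predicted class, columns = true class.
# matrix() fills column by column, so c(6, 1, 0, 3) gives rows (6, 0) and (1, 3).
conf_m <- matrix(c(6, 1, 0, 3), nrow = 2,
                 dimnames = list(yPred = c("-1", "1"), truth = c("-1", "1")))

# Correct predictions sit on the diagonal; everything else is an error.
n_total   <- sum(conf_m)
n_correct <- sum(diag(conf_m))
err <- (n_total - n_correct) / n_total
print(err)
```

Only one of the ten test observations is misclassified: a true -1 that was predicted as 1.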
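The training set has 40 observations; the colors distinguish the two classes.
With penalty parameter C = 10 the fit has 16 support vectors in total (8 per class); with C = 0.1 the count rises to 25.
The tune.svm function tries a range of penalty parameters by 10-fold cross-validation; here it selects cost = 5 with a cross-validated error of 0.175.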
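
Four cost values tie at the lowest cross-validated error of 0.175, yet cost = 5 is the one reported as best. The sketch below re-enters the printed performance table by hand and picks the row with `which.min`, which returns the first minimum. It works on the printed numbers only; the data frame name `perf` is illustrative, and it is an assumption here, consistent with the printed result, that the tie goes to the first (smallest) tied cost.

```r
# Cross-validation results copied from the tuning summary.
perf <- data.frame(
  cost  = c(0.001, 0.01, 0.1, 1, 5, 10, 100, 1000),
  error = c(0.675, 0.375, 0.225, 0.200, 0.175, 0.175, 0.175, 0.175)
)

# which.min() returns the index of the first minimum, so ties go to the
# earliest row, i.e. the smallest cost among the tied candidates.
best_row  <- which.min(perf$error)
best_cost <- perf$cost[best_row]

# All cost values that reach the same minimum error.
tied_costs <- perf$cost[perf$error == min(perf$error)]

print(best_cost)
print(tied_costs)
```

Since the error is flat from cost = 5 upward, any of the tied values would give the same cross-validated error on this grid.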
> ############## Simulation: SVM in the linearly non-separable case
> set.seed(12345)
> x<-matrix(rnorm(n=400,mean=0,sd=1),ncol=2,byrow=TRUE)
> x[1:100,]<-x[1:100,]+2
> x[101:150,]<-x[101:150,]-2