Problems encountered on the company's three test servers while verifying that Spark cluster mode runs correctly:
1. Error when running a Spark job:
SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
Reference blog:
https://www.cnblogs.com/huanongying/archive/2017/10/12/7655598.html
There was a problem with the submit script (the options must use `--`, and comments cannot follow the line-continuation backslash). Working version, run as the hdfs user from Spark's bin directory:

sudo -u hdfs /usr/hdp/2.6.5.0-292/spark2/bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.amhy.test.Sprk01 \
  --num-executors 3 \
  --driver-memory 512m \
  --executor-memory 1g \
  --executor-cores 1 \
  /bigdata/jars/scala-yarn.jar

Option notes:
  --master yarn           submit to the YARN cluster
  --deploy-mode cluster   run the driver inside the cluster (yarn-cluster mode)
  --class                 fully qualified name of the main class
  --num-executors         number of executors
  --driver-memory         driver memory
  --executor-memory       memory per executor
  --executor-cores        cores per executor
  (last argument)         path to the application jar
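The job class submitted above is not shown in these notes; below is a minimal sketch of what `com.amhy.test.Sprk01` might look like (the app name and the computation are made-up placeholders). One detail worth stating: the code must not hard-code the master, since a hard-coded `setMaster("local[*]")` conflicting with `--deploy-mode cluster` is a common cause of the "SparkContext did not initialize" error above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Sprk01 {
  def main(args: Array[String]): Unit = {
    // Do NOT call setMaster here; spark-submit already supplies
    // --master yarn --deploy-mode cluster. Hard-coding a master in
    // code can leave the yarn-cluster SparkContext uninitialized
    // until the 100000 ms wait expires.
    val conf = new SparkConf().setAppName("scala-yarn-test")
    val sc = new SparkContext(conf)

    val sum = sc.parallelize(1 to 100).reduce(_ + _) // 1 + 2 + ... + 100
    println(s"sum = $sum")

    sc.stop()
  }
}
```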
Make the submit.py file executable:
chmod +x <filename>

To run it: ./<filename>
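The chmod-then-run step can be checked end to end; `submit.sh` below is a stand-in name (the notes use submit.py), and the echo stands in for the real spark-submit call:

```shell
# Create a stand-in wrapper script (the real one would call spark-submit)
cat > submit.sh <<'EOF'
#!/bin/sh
echo "submitting job"
EOF

chmod +x submit.sh   # make it executable
./submit.sh          # run it; prints: submitting job
rm submit.sh
```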
Exception thrown at run time:
Exception in thread "main" java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction2$mcIII$sp

Possible causes:
1. The Scala version on the cluster does not match the Scala version used in IDEA.
   Fix: change the Scala SDK in IDEA to the version installed on the cluster.
2. A packaging problem: the jar was submitted to the cluster without its dependencies bundled in, and the cluster may not have the required dependency on its classpath.
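One way to keep the compiler's Scala version aligned with the cluster is to pin it in the pom rather than relying on the IDE setting. A sketch, assuming a Scala 2.11 cluster (the version numbers here are examples, not taken from these notes):

```xml
<properties>
  <!-- Must match the Scala version installed on the cluster -->
  <scala.version>2.11.8</scala.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
  </dependency>
</dependencies>
```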
==================================== Packaging problems:
1. Bundling third-party dependencies (fat jar):
Reference blog: https://blog.csdn.net/qq_26597927/article/details/80170073
A general-purpose plugin (maven-shade-plugin):
<plugins>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.1</version>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
        <configuration>
          <transformers>
            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
              <mainClass>com.xxg.Main</mainClass>
            </transformer>
          </transformers>
        </configuration>
      </execution>
    </executions>
  </plugin>
</plugins>
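The shade plugin only repackages compiled classes; for a Scala project the build also needs a Scala compiler plugin, otherwise `mvn package` produces a jar with no classes in it. A sketch using scala-maven-plugin (the version number is an example, not from these notes):

```xml
<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <version>3.2.2</version>
  <executions>
    <execution>
      <goals>
        <!-- Compile Scala sources during the normal build lifecycle -->
        <goal>compile</goal>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```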
2. Packaging with IDEA's built-in artifact builder:
Reference blog:
https://blog.csdn.net/Venry_/article/details/80400282