HBase Troubleshooting Notes
1. ERROR: org.apache.hadoop.hbase.MasterNotRunningException
Problem:
Running HBase produced the following error:
- ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 times
Cause:
The logs contained a large number of entries like:
2012-04-26 08:13:39,600 INFO org.apache.hadoop.hbase.util.FSUtils: Waiting for dfs to exit safe mode...
HDFS was still in safe mode.
- ./hadoop fsck /
- /hbase/.logs/slave1,60020,1333159627316/slave1%2C60020%2C1333159627316.1333159637444: Under replicated blk_-4160280099734447327_1626. Target Replicas is 3 but found 2 replica(s).
- ....
- /home/hadoop/tmp/mapred/staging/hadoop/.staging/job_201203211238_0002/job.jar: Under replicated blk_-7807519084475423360_1012. Target Replicas is 10 but found 2 replica(s).
- ......................................................................Status: HEALTHY
- Corrupt blocks: 0
- Missing replicas: 9 (3.0612245 %)
- Number of data-nodes: 2

No corrupt blocks and 9 missing replicas, with an overall HEALTHY status, so it was safe to force HDFS out of safe mode.
Solution:
- hadoop dfsadmin -safemode leave
- Safe mode is OFF
HBase commands then ran successfully.
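The decision above can be sketched as a small check over the fsck summary: forcing safe mode off is only reasonable when the report shows an overall HEALTHY status and zero corrupt blocks. The field names follow the fsck output shown above; the helper itself is purely illustrative:

```python
def safe_to_leave_safemode(fsck_summary: str) -> bool:
    """Illustrative check: leaving safe mode is reasonable only when
    fsck reports Status: HEALTHY and zero corrupt blocks. Missing or
    under-replicated replicas alone are tolerable; HDFS re-replicates
    them in the background."""
    corrupt_blocks = None
    healthy = False
    for line in fsck_summary.splitlines():
        line = line.strip()
        if line.startswith("Corrupt blocks:"):
            corrupt_blocks = int(line.split(":", 1)[1])
        elif line.endswith("Status: HEALTHY"):
            healthy = True
    return healthy and corrupt_blocks == 0

report = """Status: HEALTHY
Corrupt blocks: 0
Missing replicas: 9 (3.0612245 %)"""
print(safe_to_leave_safemode(report))  # prints True
```

If fsck instead reports corrupt or missing blocks, those should be investigated before leaving safe mode, since forcing the NameNode out does nothing to repair them.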
2. HMaster shuts down on its own shortly after startup
Problem:
On startup, the previously working distributed setup reported an error:
- Zookeeper available but no active master location found
Cause:
This pointed to HMaster: jps showed that the HMaster process was gone, and the hbase-master log contained the following error:
- Could not obtain block: blk_number... ...file=/hbase/hbase.version
There are essentially two reasons a block cannot be accessed: either the block does not exist, or the caller lacks permission on it. Inspecting HDFS showed that the /hbase directory and the hbase.version file both existed, although the file was 0 KB. Permissions seemed the likely cause, so the first attempt was to open up /hbase:
- %hadoop fs -chmod 777 /hbase
- %hadoop fs -chmod -R 777 /hbase (recursively change permissions on the directory)
The result was unchanged, which ruled out a permissions problem on the directory. So what was it? HMaster had shut down after startup once before, and that time reformatting the NameNode had fixed it (note that this wipes all HDFS data):
- %hadoop namenode -format
This time the format did not help either, which raised the question of whether inconsistent leftover HDFS state across the different nodes was preventing HMaster from starting normally.
Solution:
Acting on that hunch, deleting the HDFS data on the master and on every node, then restarting HBase from the master, resolved it: HMaster no longer shut itself down.
The data then has to be copied back into the fresh, clean HDFS:
- %rm -Rf reports/data
- %hadoop fs -copyFromLocal reports /texaspete/templates/reports
- %hadoop fs -put match/src/main/resources/regexes /texaspete/regexes
3. Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: te
Problem:
- Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: te
Cause:
With HBase on Hadoop 2, the configured HDFS path is an HA nameservice, which is not in ip:port form. When resolving the host, HBase does not recognize the nameservice and treats the first path component (here `te`) as a hostname, which cannot be resolved.
Solution:
Copy Hadoop's hdfs-site.xml and core-site.xml into hbase/conf so that HBase can resolve the HA nameservice.
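For reference, the relevant configuration looks roughly like this. The nameservice ID `te` is taken from the error above; the NameNode IDs `nn1,nn2` are illustrative placeholders, not values from the source:

```xml
<!-- hbase-site.xml: the root dir names the HA nameservice, not host:port -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://te/hbase</value>
</property>

<!-- hdfs-site.xml (must be visible to HBase, e.g. copied into hbase/conf):
     these entries are what let a client resolve the logical name "te" -->
<property>
  <name>dfs.nameservices</name>
  <value>te</value>
</property>
<property>
  <name>dfs.ha.namenodes.te</name>
  <value>nn1,nn2</value>
</property>
```

Without the hdfs-site.xml entries on HBase's classpath, the client falls back to DNS for `te`, which is exactly the UnknownHostException seen here.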
4. Hive over HBase returns no results
Problem:
An external table was created in Hive linked to an HBase table (Hive 0.9 throughout); complex queries against it returned no results.
The remote client reported:
- Caused by: java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
- at org.apache.hadoop.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:177)
- at org.apache.hadoop.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:140)
- at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
- at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
- at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
- at java.lang.reflect.Method.invoke(Unknown Source)
- at org.hibernate.engine.jdbc.internal.proxy.AbstractStatementProxyHandler.continueInvocation(AbstractStatementProxyHandler.java:122)
- ... 88 more
- Caused by: HiveServerException(message:Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask, errorCode:9, SQLState:08S01)
- at org.apache.hadoop.hive.service.ThriftHive$execute_result.read(ThriftHive.java:1318)
- at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
- at org.apache.hadoop.hive.service.ThriftHive$Client.recv_execute(ThriftHive.java:105)
- at org.apache.hadoop.hive.service.ThriftHive$Client.execute(ThriftHive.java:92)
- at org.apache.hadoop.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:175)
- ... 94 more
Cause:
The Hadoop task logs showed that the HBase jars were missing:
- java.io.IOException: Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
- at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:145)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
- at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:348)
- at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:364)
- at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
- at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
- at java.security.AccessController.doPrivileged(Native Method)
- at javax.security.auth.Subject.doAs(Subject.java:415)
- at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
- at org.apache.hadoop.mapred.Child.main(Child.java:262)
Solution:
Make the relevant jars available to Hive by adding the following to hive-site.xml:
- <property>
- <name>hive.aux.jars.path</name>
- <value>file:///opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hbase/hbase.jar,file:///opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.4.0.jar,file:///opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/zookeeper/zookeeper.jar</value>
- </property>
- <property>
- <name>hbase.zookeeper.quorum</name>
- <value>ZooKeeper quorum hostnames</value>
- </property>
5. HBaseSplit not found when running statistics queries over JDBC against Hive+HBase
Problem: running a statistics query over JDBC against an HBase-backed Hive table fails with "HBaseSplit not found".
Cause: with Hive integrated with HBase, statistics queries issued via JDBC against the HBase-mapped Hive table fail with the error below:
- java.io.IOException: Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
- at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:146)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
- at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
- at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:396)
- at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
- at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
- at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
- at java.security.AccessController.doPrivileged(Native Method)
- at javax.security.auth.Subject.doAs(Subject.java:396)
- at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
- at org.apache.hadoop.mapred.Child.main(Child.java:249)
- Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HBaseSplit
- at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
- at java.security.AccessController.doPrivileged(Native Method)
- at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
- at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
- at java.lang.Class.forName0(Native Method)
- at java.lang.Class.forName(Class.java:249)
- at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
- at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:143)
The message says the HBaseSplit class cannot be found, even though it is on the local classpath; the jars must also be supplied via the aux-path so that the MapReduce tasks can see them.
Solution:
Add the following to the hive-site.xml configuration file; this resolves the problem.
- <property>
- <name>hive.aux.jars.path</name>
- <value>file:///home/&lt;user&gt;/hive-0.10.0/lib/hive-hbase-handler-0.10.0.jar,file:///home/&lt;user&gt;/hive-0.10.0/lib/hbase-0.92.0.jar,file:///home/&lt;user&gt;/hive-0.10.0/lib/zookeeper-3.4.3.jar</value>
- </property>
6. org.apache.hadoop.hbase.ClockOutOfSyncException: Reported time is too far out of sync with master
During HBase startup, some RegionServers reported the following error:
- org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hadoop2,16020,1470107202883 has been rejected; Reported time is too far out of sync with master. Time difference of 53999521ms > max allowed of 30000ms
- at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:407)
- at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:273)
- at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:360)
- at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
- at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2180)
- at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
- at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
- at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
- at java.lang.Thread.run(Thread.java:744)
- at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
- at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
- at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
- at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
- at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
- at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
- at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:329)
- at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2288)
- at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:907)
- at java.lang.Thread.run(Thread.java:744)
- Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ClockOutOfSyncException): org.apache.hadoop.hbase.ClockOutOfSyncException: Server hadoop2,16020,1470107202883 has been rejected; Reported time is too far out of sync with master. Time difference of 53999521ms > max allowed of 30000ms
- at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:407)
- at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:273)
Explanation:
The master throws this exception when a RegionServer's clock is skewed too far from its own.
Solution:
1. Check that all machines are in the same time zone and have their clocks synchronized (e.g. via NTP); fix any drift.
2. Alternatively, increase hbase.master.maxclockskew:
- <property>
- <name>hbase.master.maxclockskew</name>
- <value>180000</value>
- </property>
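The check that produces this exception simply compares the RegionServer's reported startup time with the master's clock. A simplified sketch of that logic (30000 ms is the default for hbase.master.maxclockskew; the exception class here is a stand-in for the real one):

```python
DEFAULT_MAX_SKEW_MS = 30_000  # hbase.master.maxclockskew default


class ClockOutOfSyncError(Exception):
    """Stand-in for org.apache.hadoop.hbase.ClockOutOfSyncException."""


def check_clock_skew(master_time_ms, reported_time_ms,
                     max_skew_ms=DEFAULT_MAX_SKEW_MS):
    """Simplified version of the master-side check: reject a
    RegionServer whose clock differs from the master's by more
    than the configured maximum skew."""
    skew = abs(master_time_ms - reported_time_ms)
    if skew > max_skew_ms:
        raise ClockOutOfSyncError(
            f"Time difference of {skew}ms > max allowed of {max_skew_ms}ms")


# The skew from the log above (53999521 ms, roughly 15 hours) is rejected:
try:
    check_clock_skew(0, 53_999_521)
except ClockOutOfSyncError as e:
    print(e)
```

Raising hbase.master.maxclockskew only papers over the symptom; keeping the clocks synchronized with NTP is the proper fix, since timestamps also drive HBase cell versioning.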
[This article is an original contribution by 51CTO columnist "王森豐"; please credit the source when republishing.]