
Testing Hive by Importing 10G of Data
This series has shown how to combine virtualization with Hadoop, running a Hadoop cluster on VPS virtual hosts and offering storage and compute to users as cloud services.
Hardware keeps getting cheaper: a no-name server with two 24-core CPUs, 48G of RAM and a 2T disk now costs less than 20,000 RMB. Using such a machine to host a few web applications is obviously a luxurious waste, and even running a single-node Hadoop on it squanders most of its compute. For hardware this powerful, making effective use of the resources becomes an important cost-control question.
With virtualization we can split one server into 12 VPS instances, each with 2 CPU cores, 4G of RAM and a 40G disk, and reallocate resources as needed. What a great technology! We now have a 12-node Hadoop cluster: run Hadoop in the cloud and make the world faster.
About the author:
Zhang Dan (Conan), programmer: Java, R, PHP, Javascript
weibo: @Conan_Z
blog: http://blog.fens.me
Please credit the source when reprinting:
The Hadoop and Hive environments are up and running, so let's import some data and start testing. In my data set, 1G corresponds to roughly 5 million rows; a query over 5 million rows takes about 3.29 seconds in MySQL and about 30 seconds in Hive. Let's grow the data to 10G and then 100G and see how Hive holds up.
Export the MySQL data
Import into Hive
Optimize the import with Hive buckets
1. Export the MySQL data
Below are my tables. A new table is generated each day and named by date; today's table is cb_hft, with about 6.46 million records.
+-----------------+
| Tables_in_CB
+-----------------+
| NSpremium
| cb_hft_ |
| cb_hft_ |
| cb_hft_ |
| cb_hft_ |
+-----------------+
6 rows in set (0.00 sec)
mysql> select count(1) from cb_hft;
+----------+
| count(1) |
+----------+
+----------+
1 row in set (3.29 sec)
Quickly copy the table:
Since this table belongs to an offline system and no online application uses it, I rename cb_hft to cb_hft_ (suffixed with the date) and then copy the table structure.
mysql> RENAME TABLE cb_hft TO cb_hft_;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE cb_hft like cb_hft_;
Query OK, 0 rows affected (0.02 sec)
+-----------------+
| Tables_in_CB
+-----------------+
| NSpremium
| cb_hft_ |
| cb_hft_ |
| cb_hft_ |
| cb_hft_ |
| cb_hft_ |
+-----------------+
7 rows in set (0.00 sec)
Export the table to CSV
Taking the hft_ table as an example:
mysql> SELECT
SecurityID,TradeTime,PreClosePx,OpenPx,HighPx,LowPx,LastPx,
BidSize1,BidPx1,BidSize2,BidPx2,BidSize3,BidPx3,BidSize4,BidPx4,BidSize5,BidPx5,
OfferSize1,OfferPx1,OfferSize2,OfferPx2,OfferSize3,OfferPx3,OfferSize4,OfferPx4,OfferSize5,OfferPx5,
NumTrades,TotalVolumeTrade,TotalValueTrade,PE,PE1,PriceChange1,PriceChange2,Positions
FROM cb_hft_
INTO OUTFILE '/tmp/export_cb_hft_.csv'
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
Query OK, 6127080 rows affected (2 min 55.04 sec)
Check the data file:
~ ls -l /tmp
-rw-rw-rw- 1 mysql mysql            Jul 19 15:59 export_cb_hft_.csv
2. Import into Hive
Log in to the Hadoop machine and fetch the data file:
~ ssh cos@
~ cd /home/cos/hadoop/sqldb
~ scp -P 10003 cos@:/tmp/export_cb_hft_.csv .
export_cb_hft_.csv
100% 1019MB
Create the table in Hive:
~ bin/hive shell
# drop the table if it already exists
hive> DROP TABLE IF EXISTS t_hft_tmp;
Time taken: 4.898 seconds
# create the t_hft_tmp table
hive> CREATE TABLE t_hft_tmp(
SecurityID STRING,TradeTime STRING,
PreClosePx DOUBLE,OpenPx DOUBLE,HighPx DOUBLE,LowPx DOUBLE,LastPx DOUBLE,
BidSize1 DOUBLE,BidPx1 DOUBLE,BidSize2 DOUBLE,BidPx2 DOUBLE,BidSize3 DOUBLE,BidPx3 DOUBLE,BidSize4 DOUBLE,BidPx4 DOUBLE,BidSize5 DOUBLE,BidPx5 DOUBLE,
OfferSize1 DOUBLE,OfferPx1 DOUBLE,OfferSize2 DOUBLE,OfferPx2 DOUBLE,OfferSize3 DOUBLE,OfferPx3 DOUBLE,OfferSize4 DOUBLE,OfferPx4 DOUBLE,OfferSize5 DOUBLE,OfferPx5 DOUBLE,
NumTrades INT,TotalVolumeTrade DOUBLE,TotalValueTrade DOUBLE,PE DOUBLE,PE1 DOUBLE,PriceChange1 DOUBLE,PriceChange2 DOUBLE,Positions DOUBLE
) PARTITIONED BY (tradeDate INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Time taken: 0.189 seconds
hive> LOAD DATA LOCAL INPATH '/home/cos/hadoop/sqldb/export_cb_hft_.csv' OVERWRITE INTO TABLE t_hft_tmp PARTITION (tradedate=);
Copying data from file:/home/cos/hadoop/sqldb/export_cb_hft_.csv
Copying file: file:/home/cos/hadoop/sqldb/export_cb_hft_.csv
Loading data to table default.t_hft_tmp partition (tradedate=)
Time taken: 16.535 seconds
When data is loaded into a table this way, Hive does not transform it at all: the LOAD operation simply copies the file to the location backing the Hive table. The table therefore consists of a single file that has not been split into multiple parts.
hive> dfs -ls /user/hive/warehouse/t_hft_tmp/tradedate=;
Found 1 items
-rw-r--r--   1 cos supergroup            2013-07-19 16:07 /user/hive/warehouse/t_hft_tmp/tradedate=/export_cb_hft_.csv
3. Optimize the import with Hive buckets
In this second import step we want to split the single large file from before into several smaller files, roughly one per 64M block; we therefore configure 16 buckets.
Create a new table t_hft_day, defined with CLUSTERED BY, SORTED BY and INTO 16 BUCKETS.
hive> CREATE TABLE t_hft_day(
SecurityID STRING,TradeTime STRING,
PreClosePx DOUBLE,OpenPx DOUBLE,HighPx DOUBLE,LowPx DOUBLE,LastPx DOUBLE,
BidSize1 DOUBLE,BidPx1 DOUBLE,BidSize2 DOUBLE,BidPx2 DOUBLE,BidSize3 DOUBLE,BidPx3 DOUBLE,BidSize4 DOUBLE,BidPx4 DOUBLE,BidSize5 DOUBLE,BidPx5 DOUBLE,
OfferSize1 DOUBLE,OfferPx1 DOUBLE,OfferSize2 DOUBLE,OfferPx2 DOUBLE,OfferSize3 DOUBLE,OfferPx3 DOUBLE,OfferSize4 DOUBLE,OfferPx4 DOUBLE,OfferSize5 DOUBLE,OfferPx5 DOUBLE,
NumTrades INT,TotalVolumeTrade DOUBLE,TotalValueTrade DOUBLE,PE DOUBLE,PE1 DOUBLE,PriceChange1 DOUBLE,PriceChange2 DOUBLE,Positions DOUBLE
) PARTITIONED BY (tradeDate INT)
CLUSTERED BY(SecurityID) SORTED BY(TradeTime) INTO 16 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Import from the temporary table t_hft_tmp into the t_hft_day table.
# force bucketing during the insert
hive> set hive.enforce.bucketing = true;
hive> FROM t_hft_tmp
INSERT OVERWRITE TABLE t_hft_day
PARTITION (tradedate=)
SELECT SecurityID , TradeTime ,
PreClosePx ,OpenPx ,HighPx ,LowPx ,LastPx ,
BidSize1 ,BidPx1 ,BidSize2 ,BidPx2 ,BidSize3 ,BidPx3 ,BidSize4 ,BidPx4 ,BidSize5 ,BidPx5 ,
OfferSize1 ,OfferPx1 ,OfferSize2 ,OfferPx2 ,OfferSize3 ,OfferPx3 ,OfferSize4 ,OfferPx4 ,OfferSize5 ,OfferPx5 ,
NumTrades,TotalVolumeTrade ,TotalValueTrade ,PE ,PE1 ,PriceChange1 ,PriceChange2 ,Positions
WHERE tradedate=;
MapReduce Total cumulative CPU time: 8 minutes 5 seconds 810 msec
Ended Job = job__0016
Loading data to table default.t_hft_day partition (tradedate=)
Partition default.t_hft_day{tradedate=} stats: [num_files: 16, num_rows: 0, total_size: , raw_data_size: 0]
Table default.t_hft_day stats: [num_partitions: 11, num_files: 176, num_rows: 0, total_size: , raw_data_size: 0]
6127080 Rows loaded to t_hft_day
MapReduce Jobs Launched:
Job 0: Map: 4
Reduce: 16
Cumulative CPU: 485.81 sec
HDFS Read:
HDFS Write:
Total MapReduce CPU Time Spent: 8 minutes 5 seconds 810 msec
Time taken: 172.617 seconds
The cumulative CPU time of the import is 8 minutes 5 seconds, i.e. 8*60+5=485 seconds. Since 4 map tasks and 16 reduce tasks ran in parallel, the actual elapsed time was only 172 seconds.
Let's check whether the new table's data has been split into multiple files:
hive> dfs -ls /user/hive/warehouse/t_hft_day/tradedate=;
Found 16 items
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=000_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=001_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=002_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=003_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=004_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=005_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=006_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=007_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=008_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=009_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=010_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=011_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=012_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:19 /user/hive/warehouse/t_hft_day/tradedate=013_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:18 /user/hive/warehouse/t_hft_day/tradedate=014_0
-rw-r--r--   1 cos supergroup            2013-07-19 16:19 /user/hive/warehouse/t_hft_day/tradedate=015_0
16 split files in total.
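To check that each bucket file stays close to the 64M block size targeted above, the file sizes can be listed with dfs -du (a quick sanity check; <date> below is a placeholder for the partition value):
hive> dfs -du /user/hive/warehouse/t_hft_day/tradedate=<date>;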
4. Run queries
On the current ~1G of data, a simple query in Hive takes 34.974 seconds:
hive> select count(1) from t_hft_day where tradedate=;
MapReduce Total cumulative CPU time: 34 seconds 670 msec
Ended Job = job__0017
MapReduce Jobs Launched:
Job 0: Map: 7
Cumulative CPU: 34.67 sec
HDFS Read:
HDFS Write: 8 SUCCESS
Total MapReduce CPU Time Spent: 34 seconds 670 msec
Time taken: 34.974 seconds
MySQL runs the same query in 3.29 seconds, as measured at the beginning.
That is a 10x gap, but with only 1G of data Hadoop cannot show its strengths yet.
Next, following the procedure above, we import a dozen or so days of data into Hive and compare again.
Check the data sets already imported into Hive:
hive> SHOW PARTITIONS t_hft_day;
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
tradedate=
Time taken: 0.099 seconds
In MySQL, query across 5 of the tables (about 5G of data).
# single table: PreClosePx is not an indexed column, first query
mysql> select SecurityID, as tradedate,count(1) as count from cb_hft_ where PreClosePx>8.17 group by SecurityID limit 10;
+------------+-----------+-------+
| SecurityID | tradedate | count |
+------------+-----------+-------+
+------------+-----------+-------+
10 rows in set (24.60 sec)
select t.SecurityID,t.tradedate,t.count from (
select SecurityID, as tradedate,count(1) as count from cb_hft_ where PreClosePx>8.17 group by SecurityID
union all
select SecurityID, as tradedate,count(1) as count from cb_hft_ group by SecurityID
union all
select SecurityID, as tradedate,count(1) as count from cb_hft_ where PreClosePx>8.17 group by SecurityID
union all
select SecurityID, as tradedate,count(1) as count from cb_hft_ where PreClosePx>8.17 group by SecurityID
union all
select SecurityID, as tradedate,count(1) as count from cb_hft_ where PreClosePx>8.17 group by SecurityID ) as t
# more than 3 minutes, no result returned.
In Hive, run the same query over the 5 tables (about 5G of data).
select SecurityID,tradedate,count(1) from t_hft_day where tradedate in (15,30719) and PreClosePx>8.17 group by SecurityID,tradedate limit 10;
MapReduce Total cumulative CPU time: 3 minutes 56 seconds 540 msec
Ended Job = job__0023
MapReduce Jobs Launched:
Job 0: Map: 25
Cumulative CPU: 236.54 sec
HDFS Read:
HDFS Write: 1470 SUCCESS
Total MapReduce CPU Time Spent: 3 minutes 56 seconds 540 msec
Time taken: 66.32 seconds
# the same query over all 14 of the partitions listed above
MapReduce Total cumulative CPU time: 8 minutes 40 seconds 380 msec
Ended Job = job__0022
MapReduce Jobs Launched:
Job 0: Map: 53
Reduce: 15
Cumulative CPU: 520.38 sec
HDFS Read:
HDFS Write: 3146 SUCCESS
Total MapReduce CPU Time Spent: 8 minutes 40 seconds 380 msec
Time taken: 119.161 seconds
We can see that Hadoop is insensitive to data growth at the gigabyte scale: with 3 times more data (15G), the query takes only about twice as long as with 5G. With MySQL, by the time the data reaches 5G the query time is already close to unbearable.
Data under 1G can be handled on a single machine, and MySQL completes such queries very well. Hadoop only shows its advantage when the data volume is large: at around 10G, MySQL single-table queries start to fall short on performance, and at 100G MySQL can no longer cope without all kinds of extra optimization work.
The test procedure has now been described in full; our next task is to automate the process, for example with a small script like the one below.
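A minimal automation sketch run on the Hadoop client node (assumptions: the host name mysql-host and the credentials are placeholders, and the column list and the bucketed INSERT are the ones already shown in sections 1 and 3):

#!/bin/bash
# usage: ./load_day.sh 20130719
DATE=$1
CSV=export_cb_hft_${DATE}.csv

# 1. on the MySQL host, export the daily table to CSV first (see section 1):
#      SELECT <column list> FROM cb_hft_${DATE}
#      INTO OUTFILE '/tmp/${CSV}'
#      FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';

# 2. pull the CSV file over, as in section 2
cd /home/cos/hadoop/sqldb
scp -P 10003 cos@mysql-host:/tmp/${CSV} .

# 3. load it into the staging table
hive -e "LOAD DATA LOCAL INPATH '/home/cos/hadoop/sqldb/${CSV}'
  OVERWRITE INTO TABLE t_hft_tmp PARTITION (tradedate=${DATE});"
# ... then run: SET hive.enforce.bucketing = true; followed by the
#     FROM t_hft_tmp INSERT OVERWRITE TABLE t_hft_day statement from
#     section 3, with tradedate=${DATE}.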
Please credit the source when reprinting:
Introduction: Hive provides a powerful, SQL-like data warehouse query language; this article describes how to set up a Hive development and test environment.
1. What is Hive?
Hive is a data warehouse tool built on Hadoop. It maps structured data files onto database tables and offers simple SQL-style querying, translating SQL statements into MapReduce jobs for execution. Its advantage is the low learning curve: simple MapReduce statistics can be produced quickly with SQL-like statements, without writing dedicated MapReduce applications, which makes it well suited to statistical analysis of a data warehouse.
2. Prerequisites for installing Hive
2.1 A Hadoop cluster environment is already installed
2.2 This article uses Ubuntu 14.04 as the development environment
3. Installation steps
3.1 Download the Hive package: apache-hive-0.13.1-bin.tar.gz
3.2 Extract it into the /opt directory
   tar xzvf apache-hive-0.13.1-bin.tar.gz
3.3 Set the environment variables
   export HIVE_HOME=/opt/apache-hive-0.13
   export PATH=$PATH:$HIVE_HOME/bin
   export CLASSPATH=$CLASSPATH:$HIVE_HOME/bin
3.4 Edit hive-env.sh (copied from hive-env.sh.template):
   # Set HADOOP_HOME to point to a specific hadoop install directory
   HADOOP_HOME=/opt/hadoop-1.2.1
   # Hive Configuration Directory can be controlled by:
   export HIVE_CONF_DIR=/opt/apache-hive-0.13/conf
3.5 Edit hive-site.xml; the main changes are the database connection settings:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://127.0.0.1:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://BladeStone-Laptop:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
3.6 Install the MySQL database (Ubuntu)
   sudo apt-get install mysql-server
3.7 创建mysql用户hive
& 3.8 在mysql中创建hive数据库
& 3.9& 下载mysql驱动,并将驱动复制到hive_home/lib类库
  mysql-connector-java-5.1.31-bin.jar
& 3.10 启动Hive
&3.11 在Hive中创建表
& 3.12 登录mysql,访问hive数据库
&&&& 3.13 删除Hive中的表
&&& 3.14 登录mysql,查询TBLS中的数据
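A minimal sketch of the commands for steps 3.7 through 3.14 (assumptions: the user name and password must match the hive-site.xml above, and demo_tbl is just an example table name):

# 3.7 / 3.8 create the metastore user and database in MySQL
mysql> CREATE USER 'hive'@'%' IDENTIFIED BY '123456';
mysql> CREATE DATABASE hive;
mysql> GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
mysql> FLUSH PRIVILEGES;

# 3.10 / 3.11 start the metastore service, open the Hive shell and create a table
~ hive --service metastore &
~ hive
hive> CREATE TABLE demo_tbl (id INT, name STRING);

# 3.12 - 3.14 look at the metadata Hive wrote into MySQL, before and after dropping the table
mysql> USE hive;
mysql> SELECT TBL_ID, TBL_NAME, TBL_TYPE FROM TBLS;
hive> DROP TABLE demo_tbl;
mysql> SELECT TBL_ID, TBL_NAME, TBL_TYPE FROM TBLS;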
Through the steps above we have a complete Hive installation and, by creating and then dropping a table, have seen how Hive's operations map to the metadata stored in the MySQL metastore.
5. Common problems
5.1 Starting Hive directly with the hive command produces the following error:
Logging initialized using configuration in jar:file:/opt/apache-hive-0.13/lib/hive-common-0.13.1.jar!/hive-log4j.properties
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
... 7 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
... 12 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:336)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:214)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 19 more
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:382)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:214)
... 17 more
Solution: start the metastore with the command hive --service metastore before launching Hive.
5.2 Configuring hive.metastore.uris
a. hive.metastore.uris is configured, but neither the metastore nor the hiveserver service is started.
   Entering the hive shell directly with the hive command and running show databases then fails with:
   ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
b. Second case:
   1. configure hive.metastore.uris
   2. start the metastore service with hive --service metastore
   Then enter the hive shell directly and run show databases; the query succeeds.
c. Third case:
   1. comment out the hive.metastore.uris property
   2. do not start the metastore service
   Then enter the hive shell directly and run show databases; Hive falls back to a local metastore and connects to the metastore database directly over JDBC, as sketched below.
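For case c a hive-site.xml along the following lines is enough (a sketch; the values are the ones from step 3.5, with hive.metastore.uris commented out):

<!--
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://127.0.0.1:9083</value>
</property>
-->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://BladeStone-Laptop:3306/hive?createDatabaseIfNotExist=true</value>
</property>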
ElasticSearch is an open-source, distributed, RESTful search engine built on Lucene. It is designed for the cloud and offers real-time search, stability, reliability, speed, and easy installation and use.
Hive is a data warehouse on top of HDFS that lets users access the large volumes of data on HDFS through an SQL-like language (HiveQL). By combining Elasticsearch with Hive we get near-real-time access to the data sitting on HDFS.
The architecture diagram (not reproduced here) shows logs flowing through a Flume Collector to a Sink and then into both HDFS and Elasticsearch; through the ES interface, trends such as the current number of users or the request count can then be charted in real time.
For the integration we need two tables in Hive: one is the raw data table, the other acts like a view built on top of it but does not store the data itself. Below is the elasticsearch-hadoop author's description from the mailing list, at http://elasticsearch-users.115913./Elasticsearch-Hadoop-td4047293.html
There is no duplication per-se in HDFS. Hive tables are just 'views' of data - one sits unindexed, in raw format in HDFS
the other one is indexed and analyzed in Elasticsearch.
You can't combine the two since they are completely different things - one is a file-system, the other one is a search
and analytics engine.
First we need the elasticsearch-hadoop jar, which can be obtained through Maven:
<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop</artifactId>
  <version>2.0.1</version>
</dependency>
The GitHub page for elasticsearch-hadoop is /elasticsearch/elasticsearch-hadoop#readme
The latest version at the time of writing is 2.0.1, which supports all current Hadoop derivatives.
Once we have the jar, copy it into Hive's lib directory and open the Hive CLI as follows:
bin/hive -hiveconf hive.aux.jars.path=/home/hadoop/hive/lib/elasticsearch-hadoop-2.0.1.jar
The same setting can also be put into Hive's configuration file, as shown below.
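The equivalent hive-site.xml entry would look roughly like this (a sketch; the jar path is the one used above):

<property>
  <name>hive.aux.jars.path</name>
  <value>/home/hadoop/hive/lib/elasticsearch-hadoop-2.0.1.jar</value>
</property>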
Create the 'view' table:
CREATE EXTERNAL TABLE user
(id INT, name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'radiott/artiststt','es.index.auto.create' = 'true');
In es.resource, radiott and artiststt are the index name and index type; Elasticsearch uses them when the data is accessed.
Then create the source data table:
CREATE TABLE user_source
(id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Sample data:
Load the data into the user_source table:
LOAD DATA LOCAL INPATH '/home/hadoop/files1.txt' OVERWRITE INTO TABLE user_source;
hive> select * from user_source;
Time taken: 3.4 seconds, Fetched: 4 row(s)
Load the data into the user table:
INSERT OVERWRITE TABLE user
SELECT s.id, s.name FROM user_source s;
hive> INSERT OVERWRITE TABLE user
SELECT s.id, s.name FROM user_source s;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_5_0007, Tracking URL = N/A
Kill Command = /home/hadoop/hadoop/bin/hadoop job
-kill job_5_0007
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
17:44:04,121 Stage-0 map = 0%,  reduce = 0%
17:45:04,360 Stage-0 map = 0%,  reduce = 0%, Cumulative CPU 1.21 sec
17:45:05,505 Stage-0 map = 0%,  reduce = 0%, Cumulative CPU 1.21 sec
17:45:06,707 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.29 sec
17:45:07,728 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.29 sec
17:45:08,757 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.29 sec
17:45:09,778 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.29 sec
17:45:10,800 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.29 sec
17:45:11,915 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.29 sec
17:45:12,969 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:14,231 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:15,258 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:16,300 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:17,326 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:18,352 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:19,374 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:20,396 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:21,423 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:22,447 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
17:45:23,475 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
MapReduce Total cumulative CPU time: 1 seconds 420 msec
Ended Job = job_5_0007
MapReduce Jobs Launched:
Job 0: Map: 1
Cumulative CPU: 1.42 sec
HDFS Read: 253 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 420 msec
Time taken: 113.778 seconds
At this point a radiott index directory appears under the Elasticsearch data directory:
hadoop@caozw:~/elasticsearch-1.3.3/data/elasticsearch/nodes/0/indices$ ls
index1demo
The data can be seen through Elasticsearch's head plugin:
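Besides the head plugin, the index can also be checked from the command line through the Elasticsearch REST API (a quick alternative check; it assumes ES is listening on the default port 9200 of the local machine):

~ curl 'http://127.0.0.1:9200/radiott/artiststt/_search?pretty'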
The data can then be accessed from a Java program such as the following:
// The package declaration and project-local imports were truncated in the original
// post; they referred to the author's own helper classes, roughly
//   ...bhh.example.analysis.elasticsearch.local.DataFactory / ElasticSearchHandler / Medicine
// of which Medicine (a simple id/name/function POJO) is used below.
import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.hadoop.hive.EsStorageHandler;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.QueryStringQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;

import java.util.ArrayList;
import java.util.List;

/**
 * Created by caozw on 10/8/14.
 */
public class Test {

    private Client client;

    public Test() {
        // use the local machine as the node
        this("127.0.0.1");
    }

    public Test(String ipAddress) {
        // cluster connection timeout settings
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("client.transport.ping_timeout", "10s").build();
        client = new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(ipAddress, 9300));
    }

    public List<Medicine> searcher(QueryBuilder queryBuilder, String indexname, String type) {
        SearchRequestBuilder builder = client.prepareSearch(indexname).setTypes(type)
                .setSearchType(SearchType.DEFAULT).setFrom(0).setSize(100);
        builder.setQuery(queryBuilder);
        SearchResponse response = builder.execute().actionGet();
        System.out.println("  " + response);
        //System.out.println(response.getHits().getTotalHits());

        List<Medicine> list = new ArrayList<Medicine>();
        SearchHits hits = response.getHits();
        SearchHit[] searchHists = hits.getHits();
        if (searchHists.length > 0) {
            for (SearchHit hit : searchHists) {
                Integer id = (Integer) hit.getSource().get("id");
                String name = (String) hit.getSource().get("name");
                //String function = (String) hit.getSource().get("funciton");
                String function = "";
                list.add(new Medicine(id, name, function));
            }
        }
        return list;
    }

    public static void main(String[] args) {
        Test esHandler = new Test();
        //List<String> jsondata = DataFactory.getInitJsonData();
        String indexname = "radiott";
        String type = "artiststt";
        //esHandler.createIndexResponse(indexname, type, jsondata);
        // query condition
        /*QueryBuilder queryBuilder = QueryBuilders.fuzzyQuery("name", "银花 感冒 颗粒");*/
        BoolQueryBuilder qb = QueryBuilders.boolQuery()
                .must(new QueryStringQueryBuilder("lcdem").field("name"));
        //.should(new QueryStringQueryBuilder("解表").field("function"));
        /*QueryBuilder queryBuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.termQuery("id", 1));*/
        List<Medicine> result = esHandler.searcher(qb, indexname, type);
        for (int i = 0; i < result.size(); i++) {
            Medicine medicine = result.get(i);
            System.out.println("(" + medicine.getId() + ")姓名:" + medicine.getName()
                    + "\t\t" + medicine.getFunction());
        }
    }
}
Run result:
14/10/08 18:02:24 INFO elasticsearch.plugins: [Termagaira] loaded [], sites []
{
  "took" : 90,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.4054651,
    "hits" : [ {
      "_index" : "radiott",
      "_type" : "artiststt",
      "_id" : "Zc0L0HXxQ2m69Oif0hAwGQ",
      "_score" : 1.4054651,
      "_source":{"id":2,"name":"lcdem"}
    }, {
      "_index" : "radiott",
      "_type" : "artiststt",
      "_id" : "5bZnD4BRTjmdmCPmVM6cBw",
      "_score" : 1.0,
      "_source":{"id":2,"name":"lcdem"}
    } ]
  }
}
(2)姓名:lcdem
(2)姓名:lcdem
Another way to create the table:
CREATE EXTERNAL TABLE artiststt (
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'radiott/artiststt', 'es.query' = '?q=me*');
Query result after importing the data from user_source:
hive> select *
Time taken: 0.585 seconds, Fetched: 1 row(s)
With the first approach, querying the table with HiveQL raises an error:
hive> select *
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
Time taken: 0.472 seconds
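The cast error suggests that the id field comes back from Elasticsearch as a long. A likely workaround (an assumption on my part, not from the original post) is to declare the column as BIGINT in the ES-backed external table, for example:

hive> CREATE EXTERNAL TABLE user2 (id BIGINT, name STRING)
      STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
      TBLPROPERTIES('es.resource' = 'radiott/artiststt');

Here user2 is just a hypothetical table name.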
hive> CREATE EXTERNAL TABLE artiststt1 (
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'radiott1/artiststt1', 'es.query' = '?q=*');
Time taken: 0.986 seconds
hive> INSERT OVERWRITE TABLE artiststt1
SELECT s.id, s.name FROM user_source s;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_5_0010, Tracking URL = http://caozw:8088/proxy/application_5_0010/
Kill Command = /home/hadoop/hadoop/bin/hadoop job
-kill job_5_0010
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
18:07:21,587 Stage-0 map = 0%,  reduce = 0%
18:07:48,337 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.45 sec
18:07:49,579 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.45 sec
18:07:50,605 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.45 sec
18:07:54,561 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.45 sec
18:07:55,580 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.45 sec
18:07:56,600 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 1.45 sec
MapReduce Total cumulative CPU time: 1 seconds 450 msec
Ended Job = job_5_0010
MapReduce Jobs Launched:
Job 0: Map: 1
Cumulative CPU: 1.45 sec
HDFS Read: 253 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 450 msec
Time taken: 58.285 seconds
hive> select * from artiststt1;
Time taken: 0.609 seconds, Fetched: 4 row(s)