What does "cost–data driven" mean

Source: spider crawl (WebSpider)  Time: 2016-10-23 15:11  Tags: data driven meaning

“Cost-free” joins – 2
I have previously demonstrated an unexpected Nested Loop Join caused by an extreme data distribution. Although unexpected at first sight, the performance of the execution plan selected by the optimizer is decent, provided the estimates are in the right ballpark.
Here is another case of an unexpected execution plan, this time about Merge Joins.
Merge Joins
In order to appreciate why the execution plan encountered is unexpected, first a quick summary of how Merge Joins work:
A Merge Join is essentially a Nested Loop operation from one sorted row source into another sorted row source. In contrast to a Nested Loop, the join condition is not used for a possible index-driven lookup from the driving, outer row source into the inner row source, simply because Oracle usually first needs to run separate operations on each row source for sorting.
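The merge step described above can be illustrated with a minimal Python sketch. It is a conceptual model only, not Oracle's implementation: the `Row` shape, the function name `merge_join` and the sample data are assumptions made for illustration, and both inputs are sorted up front the way a SORT JOIN operation would.

```python
from typing import Iterator, List, Tuple

Row = Tuple[int, str]  # (join key, payload): a hypothetical row shape


def merge_join(outer: List[Row], inner: List[Row]) -> Iterator[Tuple[Row, Row]]:
    """Equi-join two inputs that are both already sorted on the join key."""
    i = 0
    for o in outer:
        # Advance past inner rows whose key is smaller than the outer key.
        while i < len(inner) and inner[i][0] < o[0]:
            i += 1
        # Emit every inner row that matches the current outer key.
        j = i
        while j < len(inner) and inner[j][0] == o[0]:
            yield (o, inner[j])
            j += 1


# Both inputs have to be sorted first; this is the start-up cost of the join.
t1_rows = sorted([(2, "b"), (1, "a"), (3, "c")])
t2_rows = sorted([(3, "z"), (1, "x"), (1, "y")])
result = list(merge_join(t1_rows, t2_rows))
```

Because the position in the inner input only moves forward, each input is walked once per distinct key group, but only after both have been sorted.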
This means that in most cases the Merge Join has to sort both row sources, and therefore a Hash Join is usually preferred where possible (Hash Joins are only suitable for Equi-Joins, whereas a Merge Join also supports non-Equi Joins), because it only needs to "prepare" one row source for building the hash table, and can then process the second row source as it is without any further start-up cost / preparation steps.
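For comparison, here is an equally simplified Python sketch of the Hash Join idea: only one input (the build side) is prepared, and the other is streamed through unchanged. Again this is a conceptual model under assumed names (`hash_join`, `small`, `large`), not Oracle's actual algorithm.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Row = Tuple[int, str]  # (join key, payload): a hypothetical row shape


def hash_join(build: List[Row], probe: List[Row]) -> Iterator[Tuple[Row, Row]]:
    """Equi-join: hash the build input once, then stream the probe input."""
    table: Dict[int, List[Row]] = defaultdict(list)
    for b in build:
        table[b[0]].append(b)          # preparation happens on one input only
    for p in probe:                    # the probe input is processed as it is
        for b in table.get(p[0], []):
            yield (b, p)


small = [(2, "b"), (1, "a"), (3, "c")]              # build side, unsorted
large = [(3, "z"), (1, "x"), (1, "y"), (4, "w")]    # probe side, unsorted
joined = list(hash_join(small, large))
```

Neither input needs to be ordered, which is why the smaller row source is the natural choice for the build side.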
Let's have a look at some common execution plans using Merge Joins. Consider this simple setup:
create table t1
as
select
        rownum as id
      , rpad('x', 100) as filler
from
        dual
connect by
        level <= 1e3
;

exec dbms_stats.gather_table_stats(null, 't1')

create unique index t1_idx on t1 (id);

create table t2
as
select
        rownum as id
      , ... as fk
      , rpad('x', 100) as filler
from
        dual
connect by
        level <= 1e6
;

exec dbms_stats.gather_table_stats(null, 't2')
So this is what a Merge Join usually looks like:
select  /*+ use_merge(t1 t2) */
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id (+) = t2.fk;

-----------------------------------------------------------------------
| Id  | Operation            | Name | TempSpc | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |         |         (1)| 00:05:37 |
|   1 |  MERGE JOIN OUTER    |      |         |         (1)| 00:05:37 |
|   2 |   SORT JOIN          |      |    217M | 28067   (1)| 00:05:37 |
|   3 |    TABLE ACCESS FULL | T2   |         |         (1)| 00:00:52 |
|*  4 |   SORT JOIN          |      |         |        (15)| 00:00:01 |
|   5 |    TABLE ACCESS FULL | T1   |         |         (0)| 00:00:01 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."ID"(+)="T2"."FK")
       filter("T1"."ID"(+)="T2"."FK")
As usual I had to force the Merge Join via a hint, since in my (default 11.2.0.1) setup a Hash Join would be preferred. Notice the two SORT JOIN operations that first create two (ideally in-memory) sorted/indexed tables from the two row sources to be joined, and how the SORT JOIN on the larger row source basically determines the overall cost of this MERGE JOIN.
A corresponding Hash Join could use the smaller row source as hash table and therefore very likely would be much more efficient.
Since the MERGE JOIN usually needs to SORT both row sources, it doesn't make such a big difference which of the two row sources comes first. It is interesting to note, however, that the MERGE JOIN is not able to "swap" the join inputs as the HASH JOIN is, which makes the MERGE JOIN less flexible, in particular for outer joins.
Here is a variation of a MERGE JOIN that avoids a SORT JOIN operation. This is only supported for the "driving" row source:
select  /*+ use_merge(t1 t2) */
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id = t2.fk
and     t1.id between 1 and 10;

----------------------------------------------------------------------------------
| Id  | Operation                     | Name   | TempSpc | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |         |         (1)| 00:05:37 |
|   1 |  MERGE JOIN                   |        |         |         (1)| 00:05:37 |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1     |         |         (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | T1_IDX |         |         (0)| 00:00:01 |
|*  4 |   SORT JOIN                   |        |    217M | 28078   (1)| 00:05:37 |
|*  5 |    TABLE ACCESS FULL          | T2     |         |         (2)| 00:00:53 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."ID">=1 AND "T1"."ID"<=10)
   4 - access("T1"."ID"="T2"."FK")
       filter("T1"."ID"="T2"."FK")
   5 - filter("T2"."FK">=1 AND "T2"."FK"<=10)
The MERGE JOIN knows that the driving row source will be accessed in sorted order due to the suitable INDEX RANGE SCAN operation and therefore doesn't add a SORT operation on top.
If we now run the same statement using Parallel Execution (note that the statement-level PARALLEL hint used in the example is only supported from 11g Release 2 on), we'll see the following:
select  /*+ use_merge(t1 t2) parallel */
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id = t2.fk
and     t1.id between 1 and 10;

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | TempSpc | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |         |         (1)| 00:03:08 |        |      |            |
|   1 |  PX COORDINATOR                     |          |         |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)               | :TQ10001 |         |         (1)| 00:03:08 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    MERGE JOIN                       |          |         |         (1)| 00:03:08 |  Q1,01 | PCWP |            |
|   4 |     SORT JOIN                       |          |         |         (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      BUFFER SORT                    |          |         |            |          |  Q1,01 | PCWC |            |
|   6 |       PX RECEIVE                    |          |         |         (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   7 |        PX SEND BROADCAST            | :TQ10000 |         |         (0)| 00:00:01 |        | S->P | BROADCAST  |
|   8 |         TABLE ACCESS BY INDEX ROWID | T1       |         |         (0)| 00:00:01 |        |      |            |
|*  9 |          INDEX RANGE SCAN           | T1_IDX   |         |         (0)| 00:00:01 |        |      |            |
|* 10 |     SORT JOIN                       |          |    217M | 15591   (1)| 00:03:08 |  Q1,01 | PCWP |            |
|  11 |      PX BLOCK ITERATOR              |          |         |         (1)| 00:00:29 |  Q1,01 | PCWC |            |
|* 12 |       TABLE ACCESS FULL             | T2       |         |         (1)| 00:00:29 |  Q1,01 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   9 - access("T1"."ID">=1 AND "T1"."ID"<=10)
  10 - access("T1"."ID"="T2"."FK")
       filter("T1"."ID"="T2"."FK")
  12 - filter("T2"."FK">=1 AND "T2"."FK"<=10)
So usually, due to the way things run in parallel, Oracle assumes it cannot guarantee the order of the row source and includes a SORT operation for both row sources joined.
Although there are special cases where this could be avoided even for Parallel Execution, it looks like the code adds this SORT operation unconditionally in case of Parallel Execution. We'll see how this can become a threat in a moment.
The Special Case
Now back to the special case I want to demonstrate here. Let's have a look at the following query:
select
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id (+) = t2.fk
and     t2.fk = 1;

-------------------------------------------------------------------------
| Id  | Operation                      | Name   | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |         (1)| 00:00:53 |
|   1 |  MERGE JOIN OUTER              |        |         (1)| 00:00:53 |
|*  2 |   TABLE ACCESS FULL            | T2     |         (1)| 00:00:53 |
|*  3 |   SORT JOIN                    |        |        (34)| 00:00:01 |
|   4 |    TABLE ACCESS BY INDEX ROWID | T1     |         (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN          | T1_IDX |         (0)| 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("T2"."FK"=1)
   3 - access("T1"."ID"(+)="T2"."FK")
       filter("T1"."ID"(+)="T2"."FK")
   5 - access("T1"."ID"(+)=1)
Notice that I now got a MERGE JOIN although I haven't provided any hints to do so, so this execution plan was automatically favored by the optimizer. Why?
This is a special case, because the optimizer understands that the join key is actually a single value, due to the predicate on T2.FK. So for a serial execution it doesn't bother to SORT the large row source (since it knows there will only be the value "1") and hence the MERGE JOIN comes out with a (slightly) lower cost estimate than a corresponding HASH JOIN.
It's interesting to note that in this particular case the SORT of the second row source could be avoided as well, since it, too, can only return a single value. But obviously the MERGE JOIN always runs a SORT JOIN operation on the second row source, as already outlined above.
Due to the way the data is designed and the direction of the outer join, a NESTED LOOP join isn't a reasonable alternative here either.
Note that at runtime a HASH JOIN seems to be slightly more efficient in this particular case, so this is already an indication that the cost estimates do not reflect the efficiency at runtime very well. In particular, the CPU overhead of the actual join operation seems to be underestimated for the MERGE JOIN.
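The reasoning behind the skipped SORT on the large row source can be checked with a tiny Python sketch: once a filter reduces the join key to a single value, the rows are trivially ordered on that key. The helper `is_sorted` and the sample `fk_values` are hypothetical and only illustrate the argument; they say nothing about how the optimizer detects this internally.

```python
from typing import List


def is_sorted(keys: List[int]) -> bool:
    """True if the keys are in non-descending order."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


fk_values = [1, 7, 1, 3, 1]                      # hypothetical, unordered join keys
filtered = [fk for fk in fk_values if fk == 1]   # models the predicate t2.fk = 1

unfiltered_ordered = is_sorted(fk_values)        # a sort would be required here
filtered_ordered = is_sorted(filtered)           # nothing left to sort
```

Any input filtered down to one key value passes the ordering check regardless of its physical order, which is why the sort contributes nothing in the serial plan.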
Now let's see what happens if we run this query using Parallel Execution:

select  /*+ parallel */
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id (+) = t2.fk
and     t2.fk = 1;

-------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |         (1)| 00:00:29 |        |      |            |
|   1 |  MERGE JOIN OUTER              |          |         (1)| 00:00:29 |        |      |            |
|   2 |   SORT JOIN                    |          |         (1)| 00:00:29 |        |      |            |
|   3 |    PX COORDINATOR              |          |            |          |        |      |            |
|   4 |     PX SEND QC (RANDOM)        | :TQ10000 |         (1)| 00:00:29 |  Q1,00 | P->S | QC (RAND)  |
|   5 |      PX BLOCK ITERATOR         |          |         (1)| 00:00:29 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL        | T2       |         (1)| 00:00:29 |  Q1,00 | PCWP |            |
|*  7 |   SORT JOIN                    |          |        (34)| 00:00:01 |        |      |            |
|   8 |    TABLE ACCESS BY INDEX ROWID | T1       |         (0)| 00:00:01 |        |      |            |
|*  9 |     INDEX UNIQUE SCAN          | T1_IDX   |         (0)| 00:00:01 |        |      |            |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("T2"."FK"=1)
   7 - access("T1"."ID"(+)="T2"."FK")
       filter("T1"."ID"(+)="T2"."FK")
   9 - access("T1"."ID"(+)=1)
Look very carefully at the order of the operations, and at what part of the execution plan runs in parallel and what is executed serially.
This is where things become pretty weird and threatening: the TABLE ACCESS to the large row source T2 runs in parallel (with the corresponding lower cost), but the data is then handed over to the Query Coordinator for a SORT JOIN operation, which wasn't there in serial execution and is in fact unnecessary since we still have a single value in the join key.
After sorting the large row source, the MERGE JOIN operation itself is performed by the Query Coordinator, so no Parallel Execution is involved here either.
Both the serial SORT JOIN of the large row source and the MERGE JOIN operation itself are literally free of cost here, which is clearly unreasonable, in particular if the row source is very large.
Although the SORT JOIN will basically turn into a simple "BUFFER SORT" operation, since there is effectively nothing to sort, a potentially very big volume of data still has to be handed over from the Parallel Worker processes scanning the row source to the Query Coordinator. In this particular case that is by definition an inefficient operation, because a large data volume has to be passed from the Parallel Processes to the single Query Coordinator. This volume then has to be SORTED by the Query Coordinator, which very likely means that the operation won't fit into the PGA memory of that single process and will hence spill to TEMP, causing potentially significant additional (and unnecessary) read and write I/O, all to be done by the Query Coordinator.
This is a textbook example of a Parallel Execution plan that is deemed to take longer than the corresponding serial execution plan, and it is the execution plan that is preferred by the optimizer when left unhinted.
Let's have a look at the Parallel Execution plan when using a HASH JOIN:

select  /*+ parallel use_hash(t1 t2) */
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id (+) = t2.fk
and     t2.fk = 1;

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |         (1)| 00:00:29 |        |      |            |
|   1 |  PX COORDINATOR                    |          |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)              | :TQ10001 |         (1)| 00:00:29 |  Q1,01 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN RIGHT OUTER           |          |         (1)| 00:00:29 |  Q1,01 | PCWP |            |
|   4 |     BUFFER SORT                    |          |            |          |  Q1,01 | PCWC |            |
|   5 |      PX RECEIVE                    |          |         (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   6 |       PX SEND BROADCAST            | :TQ10000 |         (0)| 00:00:01 |        | S->P | BROADCAST  |
|   7 |        TABLE ACCESS BY INDEX ROWID | T1       |         (0)| 00:00:01 |        |      |            |
|*  8 |         INDEX UNIQUE SCAN          | T1_IDX   |         (0)| 00:00:01 |        |      |            |
|   9 |     PX BLOCK ITERATOR              |          |         (1)| 00:00:29 |  Q1,01 | PCWC |            |
|* 10 |      TABLE ACCESS FULL             | T2       |         (1)| 00:00:29 |  Q1,01 | PCWP |            |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."ID"(+)="T2"."FK")
   8 - access("T1"."ID"(+)=1)
  10 - filter("T2"."FK"=1)
Looking at the child operations' cost estimates of the HASH JOIN it becomes obvious that it is the costing of the HASH JOIN itself that makes the whole operation more costly than the MERGE JOIN, which is clearly questionable.
So the strange thing about the MERGE JOIN Parallel Execution plan is that the join operation itself is done serially, whereas the HASH JOIN execution plan, although it uses the same access to the row sources (INDEX UNIQUE SCAN and FULL TABLE SCAN), happily runs in parallel.
What causes this strange execution plan shape? Obviously it is the UNIQUE index on the other, smaller row source. Somehow the MERGE JOIN code is misled by the UNIQUE index scan, which causes the join operation to run serially.
Replacing the UNIQUE index with a NON-UNIQUE index (and using a UNIQUE constraint on top to achieve the same uniqueness) gives this execution plan:
drop index t1_idx;

create index t1_idx on t1 (id);

alter table t1 add constraint t1_uq unique (id) using index t1_idx;

select  /*+ parallel */
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id (+) = t2.fk
and     t2.fk = 1;

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |         (1)| 00:00:29 |        |      |            |
|   1 |  PX COORDINATOR                     |          |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)               | :TQ10001 |         (1)| 00:00:29 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    MERGE JOIN OUTER                 |          |         (1)| 00:00:29 |  Q1,01 | PCWP |            |
|   4 |     SORT JOIN                       |          |         (1)| 00:00:29 |  Q1,01 | PCWP |            |
|   5 |      PX BLOCK ITERATOR              |          |         (1)| 00:00:29 |  Q1,01 | PCWC |            |
|*  6 |       TABLE ACCESS FULL             | T2       |         (1)| 00:00:29 |  Q1,01 | PCWP |            |
|*  7 |     SORT JOIN                       |          |        (34)| 00:00:01 |  Q1,01 | PCWP |            |
|   8 |      BUFFER SORT                    |          |            |          |  Q1,01 | PCWC |            |
|   9 |       PX RECEIVE                    |          |         (0)| 00:00:01 |  Q1,01 | PCWP |            |
|  10 |        PX SEND BROADCAST            | :TQ10000 |         (0)| 00:00:01 |        | S->P | BROADCAST  |
|  11 |         TABLE ACCESS BY INDEX ROWID | T1       |         (0)| 00:00:01 |        |      |            |
|* 12 |          INDEX RANGE SCAN           | T1_IDX   |         (0)| 00:00:01 |        |      |            |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("T2"."FK"=1)
   7 - access("T1"."ID"(+)="T2"."FK")
       filter("T1"."ID"(+)="T2"."FK")
  12 - access("T1"."ID"(+)=1)
So now we still have the unnecessary SORT JOIN operation of the large row source, but at least the SORT JOIN and MERGE JOIN operations are now executed in parallel, which should make it far less threatening.
Of course, a corresponding HASH JOIN will still be much more efficient for larger row sources, but needs to be hinted in this special case.
For MERGE JOINs there are some special cases where the current costing model doesn't properly reflect the actual work. Together with some strange behaviour of the MERGE JOIN code when using Parallel Execution, this can lead to questionable execution plans being preferred by the optimizer.
Carefully check the resulting execution plans when Parallel Execution is used and MERGE JOINs get preferred by the optimizer.