Can a .cpp file call a C function defined in a .cu file (for example, from code used with MATLAB)?

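CUDA Study Notes (4): Configuring .cpp and .cu Files in VS2010
  CUDA C makes it much easier to use GPU parallelism to speed up a program. In most cases, though, the program is a very large project and CUDA is only needed to accelerate one particular algorithm in it, so the usual arrangement is for a .cpp file to call a function in a .cu file that in turn launches the kernel. Since CUDA 5.0, Visual Studio can generate a ready-made CUDA project directly, but a project in which .cpp files call into .cu files still has to be configured by hand.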
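I. File setup
  1. Assume the .cu file that runs the computation on the GPU is already written. In that file, first define a host-side entry function that launches the __global__ device function:
   testghmain();
    and put extern "C" in front of its definition. The file defines two arrays, a and b, and uses the GPU to compute c = a + b.

  2. The .cpp file in the project that needs the parallel speed-up calls testghmain(), so the function must be declared near the top of that file first:
  extern "C" int testghmain();

  3. After that, call testghmain() wherever it is needed in the .cpp file.

   Here an MFC control handler, OnBnClickedbutton, calls testghmain(). When the user clicks the button, the program calls testghmain in the .cu file; testghmain launches the __global__ device function, computes c = a + b on the GPU, and returns the result to the user.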
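II. Project property configuration
  Once the files are set up, the project itself needs a few settings:
  1. Right-click the project -> Properties -> Additional Dependencies (under Linker -> Input), and add cudart.lib;
  2. Right-click the project -> Build Customizations -> select CUDA 6.5;
  3. Right-click the added .cu file, Properties -> General -> Item Type, and select CUDA C/C++.
  That completes the configuration, and the program can be run to check the result. The test program adds the arrays a = { 1, 2, 3, 4, 5 } and b = { 10, 20, 30, 40, 50 } to get the array c and then sums all elements of c; the correct return value is 165.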
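As a quick check of the expected value 165, the same computation can be done on the CPU. This is a host-only sketch, not the GPU code from the project; the function name sumOfElementwiseAdd is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only reference: forms c[i] = a[i] + b[i] and returns the sum of c.
// Hypothetical helper, not part of the original project.
int sumOfElementwiseAdd(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(a.size() == b.size());
    int total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int c = a[i] + b[i];  // the element the GPU kernel would write
        total += c;
    }
    return total;
}
```

Comparing the GPU result against a reference like this is a cheap way to confirm that the project configuration really produced working device code.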
         
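  The above is the structure of an MFC project in which a .cpp file calls a .cu file. The same applies to other kinds of VS projects: wrap the kernel in the .cu file, add extern "C" to the entry function, and change the project property configuration.

  1. A kernel has to be wrapped in a host function before it can be declared with extern "C" and called from elsewhere. A .cpp file cannot launch a __global__ function directly, because the host C++ compiler cannot parse the <<<...>>> launch syntax or built-ins such as blockIdx and threadIdx. __global__ functions can therefore only be defined and launched inside .cu files.
  2. extern "C" gives the wrapper C linkage, so its symbol name is not C++-mangled and the declaration in the .cpp file resolves to the definition in the .cu file at link time. Separately, the .cu file itself has to be compiled by nvcc (NVCC.exe) rather than by the host compiler, which is why its item type is changed in the project configuration.
  3. A .cpp file should not #include a .cu file, just as it should not #include another .cpp file: the included text would be compiled by the host compiler as part of the .cpp translation unit.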
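The declaration pattern from the notes above can be sketched as follows. The function name addArrays and the idea of a shared header are assumptions for illustration, and the body is a host-only stand-in so the sketch compiles without the CUDA toolkit; in a real project that definition would live in the .cu file and launch a __global__ kernel.

```cpp
#include <cassert>

// --- what a shared header might contain (hypothetical, e.g. gpu_api.h) ---
#ifdef __cplusplus
extern "C" {
#endif
// C linkage: the symbol is exported without C++ name mangling, so the
// declaration seen by the .cpp file matches the definition in the .cu file.
int addArrays(const int* a, const int* b, int* c, int n);
#ifdef __cplusplus
}
#endif

// --- host-only stand-in for the definition that would be in the .cu file ---
extern "C" int addArrays(const int* a, const int* b, int* c, int n)
{
    assert(a != nullptr && b != nullptr && c != nullptr && n >= 0);
    for (int i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];  // the work a kernel thread would do for index i
    }
    return 0;  // 0 signals success in this sketch
}
```

Keeping the declaration in one header that both files include avoids the two declarations drifting apart, which is the usual cause of unresolved-symbol errors at link time.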
[Repost] Calling a CUDA .cu file from a cpp file for GPU acceleration - lz亢龙有悔 - 博客园
Reposted from: http://m.blog.csdn.net/blog/oHanTanYanYing/
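This post is about how a cpp file can call a CUDA .cu file to get GPU acceleration. It assumes CUDA is already configured; if you have questions about configuring CUDA, see my earlier post. CUDA now has a release with Visual Studio support, so I recommend using the latest one: VS2013 is much nicer to use, and the configuration is basically the same. That configuration post did not solve the problem of the editor occasionally flagging CUDA functions as errors. This does not affect compilation, but it is annoying if you are picky about such things; I will update the post once I have looked into it.
    There are two ways for a cpp file to call a CUDA .cu file. This post first covers creating a CUDA project from the VS2013 template (available after installing CUDA 6.5) and then adding a cpp file to it. The other way, adding a .cu file to an MFC or Win32 project and calling it from there, is essentially the same but more tedious; I will write it up when I have time.
    Before getting to the main topic, here is how GPU acceleration with CUDA works in outline. The overall flow is simple:
    allocate GPU memory -> copy the host data to be processed into GPU memory -> process the data on the GPU -> copy the processed data from GPU memory back to host memory
    OK, on to the main topic.
    First create a CUDA project. The new project contains a .cu file; replace its content with the following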
#include <stdio.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include "main.h"

inline void checkCudaErrors(cudaError err) // error handler
{
    if (cudaSuccess != err)
    {
        fprintf(stderr, "CUDA Runtime API error: %s.\n", cudaGetErrorString(err));
    }
}

__global__ void add(int *a, int *b, int *c) // kernel
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    for (size_t k = 0; k < 50000; k++) // repeat the addition so the GPU has measurable work
    {
        c[tid] = a[tid] + b[tid];
    }
}

extern "C" int runtest(int *host_a, int *host_b, int *host_c)
{
    int *dev_a, *dev_b, *dev_c;
    checkCudaErrors(cudaMalloc((void**)&dev_a, sizeof(int) * datasize)); // allocate GPU memory
    checkCudaErrors(cudaMalloc((void**)&dev_b, sizeof(int) * datasize));
    checkCudaErrors(cudaMalloc((void**)&dev_c, sizeof(int) * datasize));
    checkCudaErrors(cudaMemcpy(dev_a, host_a, sizeof(int) * datasize, cudaMemcpyHostToDevice)); // copy the host input data to GPU memory
    checkCudaErrors(cudaMemcpy(dev_b, host_b, sizeof(int) * datasize, cudaMemcpyHostToDevice));
    add<<<datasize / 100, 100>>>(dev_a, dev_b, dev_c); // process the data on the GPU
    checkCudaErrors(cudaMemcpy(host_c, dev_c, sizeof(int) * datasize, cudaMemcpyDeviceToHost)); // copy the results back to the host
    cudaFree(dev_a); // release GPU memory
    cudaFree(dev_b);
    cudaFree(dev_c);
    return 0;
}
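The launch above uses datasize / 100 blocks of 100 threads, which covers every element only because 50000 is an exact multiple of 100. A minimal host-side sketch of the more general calculation follows; the helper names blocksFor and threadHasWork are hypothetical and are not part of the original code.

```cpp
#include <cassert>

// Number of blocks needed so that blocks * threadsPerBlock >= n
// (ceiling division). Hypothetical helper for illustration.
int blocksFor(int n, int threadsPerBlock)
{
    assert(n >= 0 && threadsPerBlock > 0);
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Host-side model of the guard a kernel needs once the grid is rounded up:
// a thread with global index tid does work only if tid addresses a valid element.
bool threadHasWork(int tid, int n)
{
    return tid >= 0 && tid < n;
}
```

With a rounded-up grid, the kernel body would be wrapped in a check equivalent to threadHasWork(tid, datasize) so that surplus threads in the last block do not write past the end of the arrays.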
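Then add a main.h file to the project with the following content
#include <time.h> // timing functions, used to measure processing time
#include <stdio.h> // printf, getchar
#include <iostream>
#define datasize 50000
Next add the cpp file that implements main and calls into the CUDA .cu file. Its content is as follows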
#include "main.h"
extern "C" int runtest(int *host_a, int *host_b, int *host_c); // GPU processing function
int main()
{
    int a[datasize], b[datasize], c[datasize];
    for (size_t i = 0; i < datasize; i++)
    {
        a[i] = i; // the initialisation of a is missing from the source; this value is an assumption
        b[i] = i*i;
    }
    long now1 = clock(); // record the start time
    runtest(a,b,c); // run on the GPU
    printf("GPU time: %dms\n", int(((double)(clock() - now1)) / CLOCKS_PER_SEC * 1000)); // print the GPU processing time
    long now2 = clock(); // record the start time
    for (size_t i = 0; i < datasize; i++)
    {
        for (size_t k = 0; k < 50000; k++)
        {
            c[i] = (a[i] + b[i]);
        }
    }
    printf("CPU time: %dms\n", int(((double)(clock() - now2)) / CLOCKS_PER_SEC * 1000)); // print the CPU processing time
    /*for (size_t i = 0; i < 100; i++) // inspect the results
    {
        printf("%d+%d=%d\n", a[i], b[i], c[i]);
    }*/
    getchar();
    return 0;
}
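Note that the CUDA-side function to be called must be defined with extern "C", and must be declared in the cpp file (extern "C" int runtest(int *host_a, int *host_b, int *host_c);) before it is called.
    That completes the first part of this post. Build and run, and you can see that the GPU is indeed much faster than the CPU at this kind of heavy parallel computation. The other method mentioned earlier will have to wait until next time; the holiday is almost over.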
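The comparison above relies on clock(), which on some platforms reports CPU time rather than elapsed time. A small wall-clock timer is one alternative; this is a sketch using std::chrono from the C++ standard library, and the helper name elapsedMs is made up for illustration.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
// steady_clock is monotonic, so the result is never negative.
template <typename F>
double elapsedMs(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be used as elapsedMs([&] { runtest(a, b, c); }) for the GPU path and with the nested CPU loop for the other. Note that the GPU figure then includes memory allocation and the host/device copies, and that an optimizing compiler may simplify the repeated CPU loop, so the two numbers are only a rough comparison.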
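    OK, it has been half a year since the text above was written, so it is time to deliver the promised second method.
[Blogger's note] I have tested this and it works in my setup: Visual Studio 2010 + CUDA 6.0
HM-10.0rc1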
/* The copyright in this software is being made available under the BSD
* License, included below. This software may be subject to other third party
* and contributor rights, including patent rights, and no such rights are
* granted under this license.
* Copyright (c) , ITU/ISO/IEC
* All rights reserved.
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the ITU/ISO/IEC nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS
* BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
* THE POSSIBILITY OF SUCH DAMAGE.
*/
/** \file     TEncSlice.cpp
    \brief    slice encoder class
*/
#include "TEncTop.h"
#include "TEncSlice.h"
#include <math.h>
//! \ingroup TLibEncoder
// ====================================================================================================================
// Constructor / destructor / create / destroy
// ====================================================================================================================
TEncSlice::TEncSlice()
{
  m_apcPicYuvPred = NULL;
  m_apcPicYuvResi = NULL;
  m_pdRdPicLambda = NULL;
  m_pdRdPicQp     = NULL;
  m_piRdPicQp     = NULL;
  m_pcBufferSbacCoders           = NULL;
  m_pcBufferBinCoderCABACs       = NULL;
  m_pcBufferLowLatSbacCoders     = NULL;
  m_pcBufferLowLatBinCoderCABACs = NULL;
}
TEncSlice::~TEncSlice()
{
  for (std::vector<TEncSbac*>::iterator i = CTXMem.begin(); i != CTXMem.end(); i++)
  {
    delete (*i);
  }
}
Void TEncSlice::initCtxMem( UInt i )
{
  for (std::vector<TEncSbac*>::iterator j = CTXMem.begin(); j != CTXMem.end(); j++)
  {
    delete (*j);
  }
  CTXMem.clear();
  CTXMem.resize(i);
}
Void TEncSlice::create( Int iWidth, Int iHeight, UInt iMaxCUWidth, UInt iMaxCUHeight, UChar uhTotalDepth )
{
  // create prediction picture
  if ( m_apcPicYuvPred == NULL )
  {
    m_apcPicYuvPred = new TComPicYuv;
    m_apcPicYuvPred->create( iWidth, iHeight, iMaxCUWidth, iMaxCUHeight, uhTotalDepth );
  }
  // create residual picture
  if( m_apcPicYuvResi == NULL )
  {
    m_apcPicYuvResi = new TComPicYuv;
    m_apcPicYuvResi->create( iWidth, iHeight, iMaxCUWidth, iMaxCUHeight, uhTotalDepth );
  }
}
Void TEncSlice::destroy()
{
  // destroy prediction picture
  if ( m_apcPicYuvPred )
  {
    m_apcPicYuvPred->destroy();
    delete m_apcPicYuvPred;
    m_apcPicYuvPred = NULL;
  }
  // destroy residual picture
  if ( m_apcPicYuvResi )
  {
    m_apcPicYuvResi->destroy();
    delete m_apcPicYuvResi;
    m_apcPicYuvResi = NULL;
  }
  // free lambda and QP arrays
  if ( m_pdRdPicLambda ) { xFree( m_pdRdPicLambda ); m_pdRdPicLambda = NULL; }
  if ( m_pdRdPicQp     ) { xFree( m_pdRdPicQp     ); m_pdRdPicQp     = NULL; }
  if ( m_piRdPicQp     ) { xFree( m_piRdPicQp     ); m_piRdPicQp     = NULL; }
  if ( m_pcBufferSbacCoders )
  {
    delete[] m_pcBufferSbacCoders;
  }
  if ( m_pcBufferBinCoderCABACs )
  {
    delete[] m_pcBufferBinCoderCABACs;
  }
  if ( m_pcBufferLowLatSbacCoders )
  {
    delete[] m_pcBufferLowLatSbacCoders;
  }
  if ( m_pcBufferLowLatBinCoderCABACs )
  {
    delete[] m_pcBufferLowLatBinCoderCABACs;
  }
}
Void TEncSlice::init( TEncTop* pcEncTop )
{
  m_pcCfg             = pcEncTop;
  m_pcListPic         = pcEncTop->getListPic();
  m_pcGOPEncoder      = pcEncTop->getGOPEncoder();
  m_pcCuEncoder       = pcEncTop->getCuEncoder();
  m_pcPredSearch      = pcEncTop->getPredSearch();
  m_pcEntropyCoder    = pcEncTop->getEntropyCoder();
  m_pcCavlcCoder      = pcEncTop->getCavlcCoder();
  m_pcSbacCoder       = pcEncTop->getSbacCoder();
  m_pcBinCABAC        = pcEncTop->getBinCABAC();
  m_pcTrQuant         = pcEncTop->getTrQuant();
  m_pcBitCounter      = pcEncTop->getBitCounter();
  m_pcRdCost          = pcEncTop->getRdCost();
  m_pppcRDSbacCoder   = pcEncTop->getRDSbacCoder();
  m_pcRDGoOnSbacCoder = pcEncTop->getRDGoOnSbacCoder();
  // create lambda and QP arrays
  m_pdRdPicLambda     = (Double*)xMalloc( Double, m_pcCfg->getDeltaQpRD() * 2 + 1 );
  m_pdRdPicQp         = (Double*)xMalloc( Double, m_pcCfg->getDeltaQpRD() * 2 + 1 );
  m_piRdPicQp         = (Int*   )xMalloc( Int,    m_pcCfg->getDeltaQpRD() * 2 + 1 );
  m_pcRateCtrl        = pcEncTop->getRateCtrl();
}
/**
 - non-referenced frame marking
 - QP computation based on temporal structure
 - lambda computation based on QP
 - set temporal layer ID and the parameter sets
 .
 \param pcPic         picture class
 \param pocLast       POC of last picture
 \param pocCurr       current POC
 \param iNumPicRcvd   number of received pictures
 \param iTimeOffset   POC offset for hierarchical structure
 \param iDepth        temporal layer depth
 \param rpcSlice      slice header class
 \param pSPS          SPS associated with the slice
 \param pPPS          PPS associated with the slice
 */
Void TEncSlice::initEncSlice( TComPic* pcPic, Int pocLast, Int pocCurr, Int iNumPicRcvd, Int iGOPid, TComSlice*& rpcSlice, TComSPS* pSPS, TComPPS *pPPS )
{
  Double dQP;
  Double dLambda;
  rpcSlice = pcPic->getSlice(0);
  rpcSlice->setSPS( pSPS );
  rpcSlice->setPPS( pPPS );
  rpcSlice->setSliceBits(0);
  rpcSlice->setPic( pcPic );
  rpcSlice->initSlice();
  rpcSlice->setPicOutputFlag( true );
  rpcSlice->setPOC( pocCurr );
  // depth computation based on GOP size
  Int depth;
  {
    Int poc = rpcSlice->getPOC()%m_pcCfg->getGOPSize();
    if ( poc == 0 )
    {
      depth = 0;
    }
    else
    {
      Int step = m_pcCfg->getGOPSize();
      depth    = 0;
      for( Int i=step>>1; i>=1; i>>=1 )
      {
        for ( Int j=i; j<m_pcCfg->getGOPSize(); j+=step )
        {
          if ( j == poc )
          {
            i=0;
            break;
          }
        }
        step >>= 1;
        depth++;
      }
    }
  }
  // slice type
  SliceType eSliceType;
  eSliceType=B_SLICE;
  eSliceType = (pocLast == 0 || pocCurr % m_pcCfg->getIntraPeriod() == 0 || m_pcGOPEncoder->getGOPSize() == 0) ? I_SLICE : eSliceType;
  rpcSlice->setSliceType    ( eSliceType );
  // ------------------------------------------------------------------------------------------------------------------
  // Non-referenced frame marking
  // ------------------------------------------------------------------------------------------------------------------
  if(pocLast == 0)
  {
    rpcSlice->setTemporalLayerNonReferenceFlag(false);
  }
  else
  {
    rpcSlice->setTemporalLayerNonReferenceFlag(!m_pcCfg->getGOPEntry(iGOPid).m_refPic);
  }
  rpcSlice->setReferenced(true);
  // ------------------------------------------------------------------------------------------------------------------
  // QP setting
  // ------------------------------------------------------------------------------------------------------------------
  dQP = m_pcCfg->getQP();
  if(eSliceType!=I_SLICE)
  {
    if (!(( m_pcCfg->getMaxDeltaQP() == 0 ) && (dQP == -rpcSlice->getSPS()->getQpBDOffsetY() ) && (rpcSlice->getSPS()->getUseLossless())))
    {
      dQP += m_pcCfg->getGOPEntry(iGOPid).m_QPOffset;
    }
  }
  // modify QP
  Int* pdQPs = m_pcCfg->getdQPs();
  if ( pdQPs )
  {
    dQP += pdQPs[ rpcSlice->getPOC() ];
  }
#if !RATE_CONTROL_LAMBDA_DOMAIN
  if ( m_pcCfg->getUseRateCtrl())
  {
    dQP = m_pcRateCtrl->getFrameQP(rpcSlice->isReferenced(), rpcSlice->getPOC());
  }
#endif
  // ------------------------------------------------------------------------------------------------------------------
  // Lambda computation
  // ------------------------------------------------------------------------------------------------------------------
  Int iQP;
  Double dOrigQP = dQP;
  // pre-compute lambda and QP values for all possible QP candidates
  for ( Int iDQpIdx = 0; iDQpIdx < 2 * m_pcCfg->getDeltaQpRD() + 1; iDQpIdx++ )
  {
    // compute QP value
    dQP = dOrigQP + ((iDQpIdx+1)>>1)*(iDQpIdx%2 ? -1 : 1);
    // compute lambda value
    Int    NumberBFrames = ( m_pcCfg->getGOPSize() - 1 );
    Int    SHIFT_QP = 12;
    Double dLambda_scale = 1.0 - Clip3( 0.0, 0.5, 0.05*(Double)NumberBFrames );
#if FULL_NBIT
    Int    bitdepth_luma_qp_scale = 6 * (g_bitDepth - 8);
#else
    Int    bitdepth_luma_qp_scale = 0;
#endif
    Double qp_temp = (Double) dQP + bitdepth_luma_qp_scale - SHIFT_QP;
#if FULL_NBIT
    Double qp_temp_orig = (Double) dQP - SHIFT_QP;
#endif
    // Case #1: I or P-slices (key-frame)
    Double dQPFactor = m_pcCfg->getGOPEntry(iGOPid).m_QPFactor;
    if ( eSliceType==I_SLICE )
    {
      dQPFactor=0.57*dLambda_scale;
    }
    dLambda = dQPFactor*pow( 2.0, qp_temp/3.0 );
    if ( depth>0 )
    {
#if FULL_NBIT
      dLambda *= Clip3( 2.00, 4.00, (qp_temp_orig / 6.0) ); // (j == B_SLICE && p_cur_frm->layer != 0 )
#else
      dLambda *= Clip3( 2.00, 4.00, (qp_temp / 6.0) ); // (j == B_SLICE && p_cur_frm->layer != 0 )
#endif
    }
    // if hadamard is used in ME process
    if ( !m_pcCfg->getUseHADME() && rpcSlice->getSliceType( ) != I_SLICE )
    {
      dLambda *= 0.95;
    }
    iQP = max( -pSPS->getQpBDOffsetY(), min( MAX_QP, (Int) floor( dQP + 0.5 ) ) );
    m_pdRdPicLambda[iDQpIdx] = dLambda;
    m_pdRdPicQp    [iDQpIdx] = dQP;
    m_piRdPicQp    [iDQpIdx] = iQP;
  }
  // obtain dQP = 0 case
  dLambda = m_pdRdPicLambda[0];
  dQP     = m_pdRdPicQp    [0];
  iQP     = m_piRdPicQp    [0];
  if( rpcSlice->getSliceType( ) != I_SLICE )
  {
    dLambda *= m_pcCfg->getLambdaModifier( m_pcCfg->getGOPEntry(iGOPid).m_temporalId );
  }
  // store lambda
  m_pcRdCost ->setLambda( dLambda );
#if WEIGHTED_CHROMA_DISTORTION
  // for RDO
  // in RdCost there is only one lambda because the luma and chroma bits are not separated, instead we weight the distortion of chroma.
  Double weight = 1.0;
  Int qpc;
  Int chromaQPOffset;
  chromaQPOffset = rpcSlice->getPPS()->getChromaCbQpOffset() + rpcSlice->getSliceQpDeltaCb();
  qpc = Clip3( 0, 57, iQP + chromaQPOffset);
  weight = pow( 2.0, (iQP-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCbDistortionWeight(weight);
  chromaQPOffset = rpcSlice->getPPS()->getChromaCrQpOffset() + rpcSlice->getSliceQpDeltaCr();
  qpc = Clip3( 0, 57, iQP + chromaQPOffset);
  weight = pow( 2.0, (iQP-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCrDistortionWeight(weight);
#endif
#if RDOQ_CHROMA_LAMBDA
  // for RDOQ
  m_pcTrQuant->setLambda( dLambda, dLambda / weight );
#else
  m_pcTrQuant->setLambda( dLambda );
#endif
#if SAO_CHROMA_LAMBDA
  // For SAO
  rpcSlice->setLambda( dLambda, dLambda / weight );
#else
  rpcSlice->setLambda( dLambda );
#endif
#if HB_LAMBDA_FOR_LDC
  // restore original slice type
  eSliceType = (pocLast == 0 || pocCurr % m_pcCfg->getIntraPeriod() == 0 || m_pcGOPEncoder->getGOPSize() == 0) ? I_SLICE : eSliceType;
  rpcSlice->setSliceType        ( eSliceType );
#endif
  if (m_pcCfg->getUseRecalculateQPAccordingToLambda())
  {
    dQP = xGetQPValueAccordingToLambda( dLambda );
    iQP = max( -pSPS->getQpBDOffsetY(), min( MAX_QP, (Int) floor( dQP + 0.5 ) ) );
  }
  rpcSlice->setSliceQp          ( iQP );
#if ADAPTIVE_QP_SELECTION
  rpcSlice->setSliceQpBase      ( iQP );
#endif
  rpcSlice->setSliceQpDelta     ( 0 );
  rpcSlice->setSliceQpDeltaCb   ( 0 );
  rpcSlice->setSliceQpDeltaCr   ( 0 );
  rpcSlice->setNumRefIdx(REF_PIC_LIST_0,m_pcCfg->getGOPEntry(iGOPid).m_numRefPicsActive);
  rpcSlice->setNumRefIdx(REF_PIC_LIST_1,m_pcCfg->getGOPEntry(iGOPid).m_numRefPicsActive);
  if (rpcSlice->getPPS()->getDeblockingFilterControlPresentFlag())
  {
    rpcSlice->getPPS()->setDeblockingFilterOverrideEnabledFlag( !m_pcCfg->getLoopFilterOffsetInPPS() );
    rpcSlice->setDeblockingFilterOverrideFlag( !m_pcCfg->getLoopFilterOffsetInPPS() );
    rpcSlice->getPPS()->setPicDisableDeblockingFilterFlag( m_pcCfg->getLoopFilterDisable() );
    rpcSlice->setDeblockingFilterDisable( m_pcCfg->getLoopFilterDisable() );
    if ( !rpcSlice->getDeblockingFilterDisable())
    {
      if ( !m_pcCfg->getLoopFilterOffsetInPPS() && eSliceType!=I_SLICE)
      {
        rpcSlice->getPPS()->setDeblockingFilterBetaOffsetDiv2( m_pcCfg->getGOPEntry(iGOPid).m_betaOffsetDiv2 + m_pcCfg->getLoopFilterBetaOffset() );
        rpcSlice->getPPS()->setDeblockingFilterTcOffsetDiv2( m_pcCfg->getGOPEntry(iGOPid).m_tcOffsetDiv2 + m_pcCfg->getLoopFilterTcOffset() );
        rpcSlice->setDeblockingFilterBetaOffsetDiv2( m_pcCfg->getGOPEntry(iGOPid).m_betaOffsetDiv2 + m_pcCfg->getLoopFilterBetaOffset() );
        rpcSlice->setDeblockingFilterTcOffsetDiv2( m_pcCfg->getGOPEntry(iGOPid).m_tcOffsetDiv2 + m_pcCfg->getLoopFilterTcOffset() );
      }
      else
      {
        rpcSlice->getPPS()->setDeblockingFilterBetaOffsetDiv2( m_pcCfg->getLoopFilterBetaOffset() );
        rpcSlice->getPPS()->setDeblockingFilterTcOffsetDiv2( m_pcCfg->getLoopFilterTcOffset() );
        rpcSlice->setDeblockingFilterBetaOffsetDiv2( m_pcCfg->getLoopFilterBetaOffset() );
        rpcSlice->setDeblockingFilterTcOffsetDiv2( m_pcCfg->getLoopFilterTcOffset() );
      }
    }
  }
  else
  {
    rpcSlice->setDeblockingFilterOverrideFlag( false );
    rpcSlice->setDeblockingFilterDisable( false );
    rpcSlice->setDeblockingFilterBetaOffsetDiv2( 0 );
    rpcSlice->setDeblockingFilterTcOffsetDiv2( 0 );
  }
  rpcSlice->setDepth            ( depth );
  pcPic->setTLayer( m_pcCfg->getGOPEntry(iGOPid).m_temporalId );
  if(eSliceType==I_SLICE)
  {
    pcPic->setTLayer(0);
  }
  rpcSlice->setTLayer( pcPic->getTLayer() );
  assert( m_apcPicYuvPred );
  assert( m_apcPicYuvResi );
  pcPic->setPicYuvPred( m_apcPicYuvPred );
  pcPic->setPicYuvResi( m_apcPicYuvResi );
  rpcSlice->setSliceMode            ( m_pcCfg->getSliceMode()            );
  rpcSlice->setSliceArgument        ( m_pcCfg->getSliceArgument()        );
  rpcSlice->setSliceSegmentMode     ( m_pcCfg->getSliceSegmentMode()     );
  rpcSlice->setSliceSegmentArgument ( m_pcCfg->getSliceSegmentArgument() );
  rpcSlice->setMaxNumMergeCand      ( m_pcCfg->getMaxNumMergeCand()      );
  xStoreWPparam( pPPS->getUseWP(), pPPS->getWPBiPred() );
}
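Two computations inside initEncSlice are easier to follow in isolation: the temporal depth derived from the POC within the GOP, and the lambda derived from the QP. The sketch below restates them as free functions under simplifying assumptions: 8-bit video (no bit-depth QP scaling), the QP factor passed in by the caller rather than read from the GOP entry, and no lambda modifier. The names temporalDepth and sliceLambda are made up for illustration and do not exist in HM.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Temporal depth of a picture from its POC and the GOP size, following the
// halving search in initEncSlice (depth 0 for the first picture of a GOP).
int temporalDepth(int poc, int gopSize)
{
    assert(gopSize >= 1);
    const int pocInGop = poc % gopSize;
    if (pocInGop == 0) return 0;
    int depth = 0;
    int step = gopSize;
    for (int i = step >> 1; i >= 1; i >>= 1) {
        bool found = false;
        for (int j = i; j < gopSize; j += step) {
            if (j == pocInGop) { found = true; break; }
        }
        step >>= 1;
        ++depth;
        if (found) break;  // same effect as setting i = 0 in the original loop
    }
    return depth;
}

// lambda = qpFactor * 2^((QP - 12) / 3), multiplied by a factor clipped to
// [2, 4] for pictures at depth > 0 and by 0.95 for non-intra slices when
// Hadamard motion estimation is off.
double sliceLambda(double qp, double qpFactor, int depth, bool useHadamardME, bool isIntra)
{
    const double shiftQp = 12.0;        // SHIFT_QP in the source
    const double qpTemp = qp - shiftQp; // 8-bit case assumed
    double lambda = qpFactor * std::pow(2.0, qpTemp / 3.0);
    if (depth > 0) {
        lambda *= std::min(4.0, std::max(2.0, qpTemp / 6.0));  // Clip3(2, 4, qpTemp / 6)
    }
    if (!useHadamardME && !isIntra) {
        lambda *= 0.95;
    }
    return lambda;
}
```

With a GOP size of 8, POC 4 lands at depth 1, POCs 2 and 6 at depth 2, and odd POCs at depth 3, which is the usual hierarchical-B layering.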
#if RATE_CONTROL_LAMBDA_DOMAIN
Void TEncSlice::resetQP( TComPic* pic, Int sliceQP, Double lambda )
{
  TComSlice* slice = pic->getSlice(0);
  // store lambda
  slice->setSliceQp( sliceQP );
#if L0033_RC_BUGFIX
  slice->setSliceQpBase ( sliceQP );
#endif
  m_pcRdCost ->setLambda( lambda );
#if WEIGHTED_CHROMA_DISTORTION
  // for RDO
  // in RdCost there is only one lambda because the luma and chroma bits are not separated, instead we weight the distortion of chroma.
  Double weight;
  Int qpc;
  Int chromaQPOffset;
  chromaQPOffset = slice->getPPS()->getChromaCbQpOffset() + slice->getSliceQpDeltaCb();
  qpc = Clip3( 0, 57, sliceQP + chromaQPOffset);
  weight = pow( 2.0, (sliceQP-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCbDistortionWeight(weight);
  chromaQPOffset = slice->getPPS()->getChromaCrQpOffset() + slice->getSliceQpDeltaCr();
  qpc = Clip3( 0, 57, sliceQP + chromaQPOffset);
  weight = pow( 2.0, (sliceQP-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCrDistortionWeight(weight);
#endif
#if RDOQ_CHROMA_LAMBDA
  // for RDOQ
  m_pcTrQuant->setLambda( lambda, lambda / weight );
#else
  m_pcTrQuant->setLambda( lambda );
#endif
#if SAO_CHROMA_LAMBDA
  // For SAO
  slice->setLambda( lambda, lambda / weight );
#else
  slice->setLambda( lambda );
#endif
}
#else
/**
 - lambda re-computation based on rate control QP
 */
Void TEncSlice::xLamdaRecalculation(Int changeQP, Int idGOP, Int depth, SliceType eSliceType, TComSPS* pcSPS, TComSlice* pcSlice)
{
  Int qp;
  Double recalQP= (Double)changeQP;
  Double origQP = (Double)recalQP;
  Double lambda;
  // pre-compute lambda and QP values for all possible QP candidates
  for ( Int deltqQpIdx = 0; deltqQpIdx < 2 * m_pcCfg->getDeltaQpRD() + 1; deltqQpIdx++ )
  {
    // compute QP value
    recalQP = origQP + ((deltqQpIdx+1)>>1)*(deltqQpIdx%2 ? -1 : 1);
    // compute lambda value
    Int    NumberBFrames = ( m_pcCfg->getGOPSize() - 1 );
    Int    SHIFT_QP = 12;
    Double dLambda_scale = 1.0 - Clip3( 0.0, 0.5, 0.05*(Double)NumberBFrames );
#if FULL_NBIT
    Int    bitdepth_luma_qp_scale = 6 * (g_bitDepth - 8);
#else
    Int    bitdepth_luma_qp_scale = 0;
#endif
    Double qp_temp = (Double) recalQP + bitdepth_luma_qp_scale - SHIFT_QP;
#if FULL_NBIT
    Double qp_temp_orig = (Double) recalQP - SHIFT_QP;
#endif
    // Case #1: I or P-slices (key-frame)
    Double dQPFactor = m_pcCfg->getGOPEntry(idGOP).m_QPFactor;
    if ( eSliceType==I_SLICE )
    {
      dQPFactor=0.57*dLambda_scale;
    }
    lambda = dQPFactor*pow( 2.0, qp_temp/3.0 );
    if ( depth>0 )
    {
#if FULL_NBIT
      lambda *= Clip3( 2.00, 4.00, (qp_temp_orig / 6.0) ); // (j == B_SLICE && p_cur_frm->layer != 0 )
#else
      lambda *= Clip3( 2.00, 4.00, (qp_temp / 6.0) ); // (j == B_SLICE && p_cur_frm->layer != 0 )
#endif
    }
    // if hadamard is used in ME process
    if ( !m_pcCfg->getUseHADME() )
    {
      lambda *= 0.95;
    }
    qp = max( -pcSPS->getQpBDOffsetY(), min( MAX_QP, (Int) floor( recalQP + 0.5 ) ) );
    m_pdRdPicLambda[deltqQpIdx] = lambda;
    m_pdRdPicQp    [deltqQpIdx] = recalQP;
    m_piRdPicQp    [deltqQpIdx] = qp;
  }
  // obtain dQP = 0 case
  lambda  = m_pdRdPicLambda[0];
  recalQP = m_pdRdPicQp    [0];
  qp      = m_piRdPicQp    [0];
  if( pcSlice->getSliceType( ) != I_SLICE )
  {
    lambda *= m_pcCfg->getLambdaModifier( depth );
  }
  // store lambda
  m_pcRdCost ->setLambda( lambda );
#if WEIGHTED_CHROMA_DISTORTION
  // for RDO
  // in RdCost there is only one lambda because the luma and chroma bits are not separated, instead we weight the distortion of chroma.
  Double weight = 1.0;
  Int qpc;
  Int chromaQPOffset;
  chromaQPOffset = pcSlice->getPPS()->getChromaCbQpOffset() + pcSlice->getSliceQpDeltaCb();
  qpc = Clip3( 0, 57, qp + chromaQPOffset);
  weight = pow( 2.0, (qp-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCbDistortionWeight(weight);
  chromaQPOffset = pcSlice->getPPS()->getChromaCrQpOffset() + pcSlice->getSliceQpDeltaCr();
  qpc = Clip3( 0, 57, qp + chromaQPOffset);
  weight = pow( 2.0, (qp-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCrDistortionWeight(weight);
#endif
#if RDOQ_CHROMA_LAMBDA
  // for RDOQ
  m_pcTrQuant->setLambda( lambda, lambda / weight );
#else
  m_pcTrQuant->setLambda( lambda );
#endif
#if SAO_CHROMA_LAMBDA
  // For SAO
  pcSlice->setLambda( lambda, lambda / weight );
#else
  pcSlice->setLambda( lambda );
#endif
}
#endif
// ====================================================================================================================
// Public member functions
// ====================================================================================================================
Void TEncSlice::setSearchRange( TComSlice* pcSlice )
{
  Int iCurrPOC = pcSlice->getPOC();
  Int iRefPOC;
  Int iGOPSize = m_pcCfg->getGOPSize();
  Int iOffset = (iGOPSize >> 1);
  Int iMaxSR = m_pcCfg->getSearchRange();
  Int iNumPredDir = pcSlice->isInterP() ? 1 : 2;
  for (Int iDir = 0; iDir <= iNumPredDir; iDir++)
  {
    //RefPicList e = (RefPicList)iDir;
    RefPicList e = ( iDir ? REF_PIC_LIST_1 : REF_PIC_LIST_0 );
    for (Int iRefIdx = 0; iRefIdx < pcSlice->getNumRefIdx(e); iRefIdx++)
    {
      iRefPOC = pcSlice->getRefPic(e, iRefIdx)->getPOC();
      Int iNewSR = Clip3(8, iMaxSR, (iMaxSR*ADAPT_SR_SCALE*abs(iCurrPOC - iRefPOC)+iOffset)/iGOPSize);
      m_pcPredSearch->setAdaptiveSearchRange(iDir, iRefIdx, iNewSR);
    }
  }
}
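The search-range formula in setSearchRange scales the configured maximum by the POC distance to the reference picture, relative to the GOP size, and clips the result to the interval from 8 to the maximum. A standalone sketch follows; the function name adaptiveSearchRange is hypothetical, and the scale parameter stands in for ADAPT_SR_SCALE, whose value is not shown in this file (the default of 1 here is an assumption).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Search range for one reference picture: proportional to the POC distance,
// rounded via the gopSize/2 offset, then clipped to [8, maxSearchRange].
int adaptiveSearchRange(int maxSearchRange, int currPoc, int refPoc, int gopSize, int scale = 1)
{
    assert(gopSize > 0);
    const int offset = gopSize >> 1;
    const int raw = (maxSearchRange * scale * std::abs(currPoc - refPoc) + offset) / gopSize;
    // Same ordering as Clip3(8, maxSearchRange, raw): lower bound first, then upper bound.
    return std::min(maxSearchRange, std::max(8, raw));
}
```

References close to the current picture thus get a small window, and a reference a full GOP away gets the configured maximum.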
/**
 - multi-loop slice encoding for different slice QP
 .
 \param rpcPic    picture class
 */
Void TEncSlice::precompressSlice( TComPic*& rpcPic )
{
  // if deltaQP RD is not used, simply return
  if ( m_pcCfg->getDeltaQpRD() == 0 )
  {
    return;
  }
#if RATE_CONTROL_LAMBDA_DOMAIN
  if ( m_pcCfg->getUseRateCtrl() )
  {
    printf( "\nMultiple QP optimization is not allowed when rate control is enabled." );
    assert(0);
  }
#endif
  TComSlice* pcSlice        = rpcPic->getSlice(getSliceIdx());
  Double     dPicRdCostBest = MAX_DOUBLE;
  UInt       uiQpIdxBest = 0;
  Double dFrameLambda;
#if FULL_NBIT
  Int    SHIFT_QP = 12 + 6 * (g_bitDepth - 8);
#else
  Int    SHIFT_QP = 12;
#endif
  // set frame lambda
  if (m_pcCfg->getGOPSize() > 1)
  {
    dFrameLambda = 0.68 * pow (2, (m_piRdPicQp[0] - SHIFT_QP) / 3.0) * (pcSlice->isInterB()? 2 : 1);
  }
  else
  {
    dFrameLambda = 0.68 * pow (2, (m_piRdPicQp[0] - SHIFT_QP) / 3.0);
  }
  m_pcRdCost->setFrameLambda(dFrameLambda);
  // for each QP candidate
  for ( UInt uiQpIdx = 0; uiQpIdx < 2 * m_pcCfg->getDeltaQpRD() + 1; uiQpIdx++ )
  {
    pcSlice->setSliceQp( m_piRdPicQp[uiQpIdx] );
#if ADAPTIVE_QP_SELECTION
    pcSlice->setSliceQpBase( m_piRdPicQp[uiQpIdx] );
#endif
    m_pcRdCost->setLambda( m_pdRdPicLambda[uiQpIdx] );
#if WEIGHTED_CHROMA_DISTORTION
    // for RDO
    // in RdCost there is only one lambda because the luma and chroma bits are not separated, instead we weight the distortion of chroma.
    Int iQP = m_piRdPicQp[uiQpIdx];
    Double weight = 1.0;
    Int qpc;
    Int chromaQPOffset;
    chromaQPOffset = pcSlice->getPPS()->getChromaCbQpOffset() + pcSlice->getSliceQpDeltaCb();
    qpc = Clip3( 0, 57, iQP + chromaQPOffset);
    weight = pow( 2.0, (iQP-g_aucChromaScale[qpc])/3.0 );
    // takes into account of the chroma qp mapping and chroma qp Offset
    m_pcRdCost->setCbDistortionWeight(weight);
    chromaQPOffset = pcSlice->getPPS()->getChromaCrQpOffset() + pcSlice->getSliceQpDeltaCr();
    qpc = Clip3( 0, 57, iQP + chromaQPOffset);
    weight = pow( 2.0, (iQP-g_aucChromaScale[qpc])/3.0 );
    // takes into account of the chroma qp mapping and chroma qp Offset
    m_pcRdCost->setCrDistortionWeight(weight);
#endif
#if RDOQ_CHROMA_LAMBDA
    // for RDOQ
    m_pcTrQuant->setLambda( m_pdRdPicLambda[uiQpIdx], m_pdRdPicLambda[uiQpIdx] / weight );
#else
    m_pcTrQuant->setLambda( m_pdRdPicLambda[uiQpIdx] );
#endif
#if SAO_CHROMA_LAMBDA
    // For SAO
    pcSlice->setLambda( m_pdRdPicLambda[uiQpIdx], m_pdRdPicLambda[uiQpIdx] / weight );
#else
    pcSlice->setLambda( m_pdRdPicLambda[uiQpIdx] );
#endif
    // try compress
    compressSlice( rpcPic );
    Double dPicRdCost;
    UInt64 uiPicDist = m_uiPicDist;
    UInt64 uiALFBits = 0;
    m_pcGOPEncoder->preLoopFilterPicAll( rpcPic, uiPicDist, uiALFBits );
    // compute RD cost and choose the best
    dPicRdCost = m_pcRdCost->calcRdCost64( m_uiPicTotalBits + uiALFBits, uiPicDist, true, DF_SSE_FRAME);
    if ( dPicRdCost < dPicRdCostBest )
    {
      uiQpIdxBest    = uiQpIdx;
      dPicRdCostBest = dPicRdCost;
    }
  }
  // set best values
  pcSlice->setSliceQp( m_piRdPicQp[uiQpIdxBest] );
#if ADAPTIVE_QP_SELECTION
  pcSlice->setSliceQpBase( m_piRdPicQp[uiQpIdxBest] );
#endif
  m_pcRdCost->setLambda( m_pdRdPicLambda[uiQpIdxBest] );
#if WEIGHTED_CHROMA_DISTORTION
  // in RdCost there is only one lambda because the luma and chroma bits are not separated, instead we weight the distortion of chroma.
  Int iQP = m_piRdPicQp[uiQpIdxBest];
  Double weight = 1.0;
  Int qpc;
  Int chromaQPOffset;
  chromaQPOffset = pcSlice->getPPS()->getChromaCbQpOffset() + pcSlice->getSliceQpDeltaCb();
  qpc = Clip3( 0, 57, iQP + chromaQPOffset);
  weight = pow( 2.0, (iQP-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCbDistortionWeight(weight);
  chromaQPOffset = pcSlice->getPPS()->getChromaCrQpOffset() + pcSlice->getSliceQpDeltaCr();
  qpc = Clip3( 0, 57, iQP + chromaQPOffset);
  weight = pow( 2.0, (iQP-g_aucChromaScale[qpc])/3.0 );
  // takes into account of the chroma qp mapping and chroma qp Offset
  m_pcRdCost->setCrDistortionWeight(weight);
#endif
#if RDOQ_CHROMA_LAMBDA
  // for RDOQ
  m_pcTrQuant->setLambda( m_pdRdPicLambda[uiQpIdxBest], m_pdRdPicLambda[uiQpIdxBest] / weight );
#else
  m_pcTrQuant->setLambda( m_pdRdPicLambda[uiQpIdxBest] );
#endif
#if SAO_CHROMA_LAMBDA
  // For SAO
  pcSlice->setLambda( m_pdRdPicLambda[uiQpIdxBest], m_pdRdPicLambda[uiQpIdxBest] / weight );
#else
  pcSlice->setLambda( m_pdRdPicLambda[uiQpIdxBest] );
#endif
}
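precompressSlice tries each pre-computed QP candidate and keeps the one with the lowest rate-distortion cost. The candidates themselves come from the index expression ((idx+1)>>1)*(idx%2 ? -1 : 1), which walks outward from the base QP in the order 0, -1, +1, -2, +2, and so on. The sketch below isolates that ordering and the selection step; the names qpCandidates and bestCandidateIndex are made up for illustration, and the costs are supplied by the caller instead of being produced by an actual encode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// QP candidates in the order the encoder tries them:
// base, base-1, base+1, base-2, base+2, ... (2 * deltaQpRd + 1 entries).
std::vector<double> qpCandidates(double baseQp, int deltaQpRd)
{
    assert(deltaQpRd >= 0);
    std::vector<double> qps;
    for (int idx = 0; idx < 2 * deltaQpRd + 1; ++idx) {
        const int magnitude = (idx + 1) >> 1;
        const int sign = (idx % 2) ? -1 : 1;
        qps.push_back(baseQp + magnitude * sign);
    }
    return qps;
}

// Index of the lowest cost; on ties the earliest candidate wins, which matches
// the strict '<' comparison used in the encoder loop.
std::size_t bestCandidateIndex(const std::vector<double>& rdCosts)
{
    assert(!rdCosts.empty());
    return static_cast<std::size_t>(
        std::min_element(rdCosts.begin(), rdCosts.end()) - rdCosts.begin());
}
```

Because index 0 is always the unmodified QP, a tie in cost keeps the configured QP rather than a perturbed one.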
/** \param rpcPic   picture class
 */
Void TEncSlice::compressSlice( TComPic*& rpcPic )
{
  UInt uiCUAddr;
  UInt uiStartCUAddr;
  UInt uiBoundingCUAddr;
  rpcPic->getSlice(getSliceIdx())->setSliceSegmentBits(0);
  TEncBinCABAC* pppcRDSbacCoder = NULL;
  TComSlice* pcSlice            = rpcPic->getSlice(getSliceIdx());
  xDetermineStartAndBoundingCUAddr ( uiStartCUAddr, uiBoundingCUAddr, rpcPic, false );
  // initialize cost values
  m_uiPicTotalBits  = 0;
  m_dPicRdCost      = 0;
  m_uiPicDist       = 0;
  // set entropy coder
  if( m_pcCfg->getUseSBACRD() )
  {
    m_pcSbacCoder->init( m_pcBinCABAC );
    m_pcEntropyCoder->setEntropyCoder ( m_pcSbacCoder, pcSlice );
    m_pcEntropyCoder->resetEntropy    ();
    m_pppcRDSbacCoder[0][CI_CURR_BEST]->load(m_pcSbacCoder);
    pppcRDSbacCoder = (TEncBinCABAC *) m_pppcRDSbacCoder[0][CI_CURR_BEST]->getEncBinIf();
    pppcRDSbacCoder->setBinCountingEnableFlag( false );
    pppcRDSbacCoder->setBinsCoded( 0 );
  }
  else
  {
    m_pcEntropyCoder->setEntropyCoder ( m_pcCavlcCoder, pcSlice );
    m_pcEntropyCoder->resetEntropy    ();
    m_pcEntropyCoder->setBitstream    ( m_pcBitCounter );
  }
  //------------------------------------------------------------------------------
  //  Weighted Prediction parameters estimation.
  //------------------------------------------------------------------------------
  // calculate AC/DC values for current picture
  if( pcSlice->getPPS()->getUseWP() || pcSlice->getPPS()->getWPBiPred() )
  {
    xCalcACDCParamSlice(pcSlice);
  }
  Bool bWp_explicit = (pcSlice->getSliceType()==P_SLICE && pcSlice->getPPS()->getUseWP()) || (pcSlice->getSliceType()==B_SLICE && pcSlice->getPPS()->getWPBiPred());
  if ( bWp_explicit )
  {
    //------------------------------------------------------------------------------
    //  Weighted Prediction implemented at Slice level. SliceMode=2 is not supported yet.
    //------------------------------------------------------------------------------
    if ( pcSlice->getSliceMode()==2 || pcSlice->getSliceSegmentMode()==2 )
    {
      printf("Weighted Prediction is not supported with slice mode determined by max number of bins.\n"); exit(0);
    }
    xEstimateWPParamSlice( pcSlice );
    pcSlice->initWpScaling();
    // check WP on/off
    xCheckWPEnable( pcSlice );
  }
#if ADAPTIVE_QP_SELECTION
  if( m_pcCfg->getUseAdaptQpSelect() )
  {
    m_pcTrQuant->clearSliceARLCnt();
    if(pcSlice->getSliceType()!=I_SLICE)
    {
      Int qpBase = pcSlice->getSliceQpBase();
      pcSlice->setSliceQp(qpBase + m_pcTrQuant->getQpDelta(qpBase));
    }
  }
#endif
  TEncTop* pcEncTop = (TEncTop*) m_pcCfg;
  TEncSbac**** ppppcRDSbacCoders = pcEncTop->getRDSbacCoders();
  TComBitCounter* pcBitCounters  = pcEncTop->getBitCounters();
  Int  iNumSubstreams = 1;
  UInt uiTilesAcross  = 0;
  if( m_pcCfg->getUseSBACRD() )
  {
    iNumSubstreams = pcSlice->getPPS()->getNumSubstreams();
    uiTilesAcross = rpcPic->getPicSym()->getNumColumnsMinus1()+1;
    delete[] m_pcBufferSbacCoders;
    delete[] m_pcBufferBinCoderCABACs;
    m_pcBufferSbacCoders     = new TEncSbac    [uiTilesAcross];
    m_pcBufferBinCoderCABACs = new TEncBinCABAC[uiTilesAcross];
    for (Int ui = 0; ui < uiTilesAcross; ui++)
    {
      m_pcBufferSbacCoders[ui].init( &m_pcBufferBinCoderCABACs[ui] );
    }
    for (UInt ui = 0; ui < uiTilesAcross; ui++)
    {
      m_pcBufferSbacCoders[ui].load(m_pppcRDSbacCoder[0][CI_CURR_BEST]);  //init. state
    }
    for ( UInt ui = 0 ; ui < iNumSubstreams ; ui++ ) //init all sbac coders for RD optimization
    {
      ppppcRDSbacCoders[ui][0][CI_CURR_BEST]->load(m_pppcRDSbacCoder[0][CI_CURR_BEST]);
    }
  }
  //if( m_pcCfg->getUseSBACRD() )
  {
    delete[] m_pcBufferLowLatSbacCoders;
    delete[] m_pcBufferLowLatBinCoderCABACs;
    m_pcBufferLowLatSbacCoders     = new TEncSbac    [uiTilesAcross];
    m_pcBufferLowLatBinCoderCABACs = new TEncBinCABAC[uiTilesAcross];
    for (Int ui = 0; ui < uiTilesAcross; ui++)
    {
      m_pcBufferLowLatSbacCoders[ui].init( &m_pcBufferLowLatBinCoderCABACs[ui] );
    }
    for (UInt ui = 0; ui < uiTilesAcross; ui++)
    {
      m_pcBufferLowLatSbacCoders[ui].load(m_pppcRDSbacCoder[0][CI_CURR_BEST]);  //init. state
    }
  }
  UInt uiWidthInLCUs = rpcPic->getPicSym()->getFrameWidthInCU();
  //UInt uiHeightInLCUs = rpcPic->getPicSym()->getFrameHeightInCU();
  UInt uiCol=0, uiLin=0, uiSubStrm=0;
  UInt uiTileCol      = 0;
  UInt uiTileStartLCU = 0;
  UInt uiTileLCUX     = 0;
  Bool depSliceSegmentsEnabled = pcSlice->getPPS()->getDependentSliceSegmentsEnabledFlag();
  uiCUAddr = rpcPic->getPicSym()->getCUOrderMap( uiStartCUAddr /rpcPic->getNumPartInCU());
  uiTileStartLCU = rpcPic->getPicSym()->getTComTile(rpcPic->getPicSym()->getTileIdxMap(uiCUAddr))->getFirstCUAddr();
  if( depSliceSegmentsEnabled )
  {
    if((pcSlice->getSliceSegmentCurStartCUAddr()!= pcSlice->getSliceCurStartCUAddr())&&(uiCUAddr != uiTileStartLCU))
    {
      if( m_pcCfg->getWaveFrontsynchro() )
      {
        uiTileCol = rpcPic->getPicSym()->getTileIdxMap(uiCUAddr) % (rpcPic->getPicSym()->getNumColumnsMinus1()+1);
        m_pcBufferSbacCoders[uiTileCol].loadContexts( CTXMem[1] );
        Int iNumSubstreamsPerTile = iNumSubstreams/rpcPic->getPicSym()->getNumTiles();
        uiCUAddr = rpcPic->getPicSym()->getCUOrderMap( uiStartCUAddr /rpcPic->getNumPartInCU());
        uiLin = uiCUAddr / uiWidthInLCUs;
uiSubStrm = rpcPic-&getPicSym()-&getTileIdxMap(rpcPic-&getPicSym()-&getCUOrderMap(uiCUAddr))*iNumSubstreamsPerTile
+ uiLin%iNumSubstreamsPerT
if ( (uiCUAddr%uiWidthInLCUs+1) &= uiWidthInLCUs
uiTileLCUX = uiTileStartLCU % uiWidthInLCUs;
= uiCUAddr % uiWidthInLCUs;
if(uiCol==uiTileStartLCU)
CTXMem[0]-&loadContexts(m_pcSbacCoder);
m_pppcRDSbacCoder[0][CI_CURR_BEST]-&loadContexts( CTXMem[0] );
ppppcRDSbacCoders[uiSubStrm][0][CI_CURR_BEST]-&loadContexts( CTXMem[0] );
if(m_pcCfg-&getWaveFrontsynchro())
CTXMem[1]-&loadContexts(m_pcSbacCoder);
CTXMem[0]-&loadContexts(m_pcSbacCoder);
// for every CU in slice
UInt uiEncCUO
for( uiEncCUOrder = uiStartCUAddr/rpcPic-&getNumPartInCU();
uiEncCUOrder & (uiBoundingCUAddr+(rpcPic-&getNumPartInCU()-1))/rpcPic-&getNumPartInCU();
uiCUAddr = rpcPic-&getPicSym()-&getCUOrderMap(++uiEncCUOrder) )
// initialize CU encoder
TComDataCU*& pcCU = rpcPic-&getCU( uiCUAddr );
pcCU-&initCU( rpcPic, uiCUAddr );
#if !RATE_CONTROL_LAMBDA_DOMAIN
if(m_pcCfg-&getUseRateCtrl())
if(m_pcRateCtrl-&calculateUnitQP())
xLamdaRecalculation(m_pcRateCtrl-&getUnitQP(), m_pcRateCtrl-&getGOPId(), pcSlice-&getDepth(), pcSlice-&getSliceType(), pcSlice-&getSPS(), pcSlice );
// inherit from TR if necessary, select substream to use.
if( m_pcCfg-&getUseSBACRD() )
uiTileCol = rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr) % (rpcPic-&getPicSym()-&getNumColumnsMinus1()+1); // what column of tiles are we in?
uiTileStartLCU = rpcPic-&getPicSym()-&getTComTile(rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr))-&getFirstCUAddr();
uiTileLCUX = uiTileStartLCU % uiWidthInLCUs;
//UInt uiSliceStartLCU = pcSlice-&getSliceCurStartCUAddr();
= uiCUAddr % uiWidthInLCUs;
= uiCUAddr / uiWidthInLCUs;
if (pcSlice-&getPPS()-&getNumSubstreams() & 1)
// independent tiles =& substreams are &per tile&.
iNumSubstreams has already been multiplied.
Int iNumSubstreamsPerTile = iNumSubstreams/rpcPic-&getPicSym()-&getNumTiles();
uiSubStrm = rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr)*iNumSubstreamsPerTile
+ uiLin%iNumSubstreamsPerT
// dependent tiles =& substreams are &per frame&.
uiSubStrm = uiLin % iNumS
if ( ((pcSlice-&getPPS()-&getNumSubstreams() & 1) || depSliceSegmentsEnabled ) && (uiCol == uiTileLCUX) && m_pcCfg-&getWaveFrontsynchro())
// We'll sync if the TR is available.
TComDataCU *pcCUUp = pcCU-&getCUAbove();
UInt uiWidthInCU = rpcPic-&getFrameWidthInCU();
UInt uiMaxParts = 1&&(pcSlice-&getSPS()-&getMaxCUDepth()&&1);
TComDataCU *pcCUTR = NULL;
if ( pcCUUp && ((uiCUAddr%uiWidthInCU+1) & uiWidthInCU)
pcCUTR = rpcPic-&getCU( uiCUAddr - uiWidthInCU + 1 );
if ( ((pcCUTR==NULL) || (pcCUTR-&getSlice()==NULL) ||
(pcCUTR-&getSCUAddr()+uiMaxParts-1 & pcSlice-&getSliceCurStartCUAddr()) ||
((rpcPic-&getPicSym()-&getTileIdxMap( pcCUTR-&getAddr() ) != rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr)))
// TR not available.
// TR is available, we use it.
ppppcRDSbacCoders[uiSubStrm][0][CI_CURR_BEST]-&loadContexts( &m_pcBufferSbacCoders[uiTileCol] );
m_pppcRDSbacCoder[0][CI_CURR_BEST]-&load( ppppcRDSbacCoders[uiSubStrm][0][CI_CURR_BEST] ); //this load is used to simplify the code
// reset the entropy coder
if( uiCUAddr == rpcPic-&getPicSym()-&getTComTile(rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr))-&getFirstCUAddr() &&
// must be first CU of tile
uiCUAddr!=0 &&
// cannot be first CU of picture
uiCUAddr!=rpcPic-&getPicSym()-&getPicSCUAddr(rpcPic-&getSlice(rpcPic-&getCurrSliceIdx())-&getSliceSegmentCurStartCUAddr())/rpcPic-&getNumPartInCU() &&
uiCUAddr!=rpcPic-&getPicSym()-&getPicSCUAddr(rpcPic-&getSlice(rpcPic-&getCurrSliceIdx())-&getSliceCurStartCUAddr())/rpcPic-&getNumPartInCU())
// cannot be first CU of slice
SliceType sliceType = pcSlice-&getSliceType();
if (!pcSlice-&isIntra() && pcSlice-&getPPS()-&getCabacInitPresentFlag() && pcSlice-&getPPS()-&getEncCABACTableIdx()!=I_SLICE)
sliceType = (SliceType) pcSlice-&getPPS()-&getEncCABACTableIdx();
m_pcEntropyCoder-&updateContextTables ( sliceType, pcSlice-&getSliceQp(), false );
m_pcEntropyCoder-&setEntropyCoder
( m_pppcRDSbacCoder[0][CI_CURR_BEST], pcSlice );
m_pcEntropyCoder-&updateContextTables ( sliceType, pcSlice-&getSliceQp() );
m_pcEntropyCoder-&setEntropyCoder
( m_pcSbacCoder, pcSlice );
// if RD based on SBAC is used
if( m_pcCfg-&getUseSBACRD() )
// set go-on entropy coder
m_pcEntropyCoder-&setEntropyCoder ( m_pcRDGoOnSbacCoder, pcSlice );
m_pcEntropyCoder-&setBitstream( &pcBitCounters[uiSubStrm] );
((TEncBinCABAC*)m_pcRDGoOnSbacCoder-&getEncBinIf())-&setBinCountingEnableFlag(true);
#if RATE_CONTROL_LAMBDA_DOMAIN
Double oldLambda = m_pcRdCost-&getLambda();
if ( m_pcCfg-&getUseRateCtrl() )
= pcSlice-&getSliceQp();
Double estLambda = -1.0;
Double bpp
if ( rpcPic-&getSlice( 0 )-&getSliceType() == I_SLICE || !m_pcCfg-&getLCULevelRC() )
estQP = pcSlice-&getSliceQp();
= m_pcRateCtrl-&getRCPic()-&getLCUTargetBpp();
estLambda = m_pcRateCtrl-&getRCPic()-&getLCUEstLambda( bpp );
= m_pcRateCtrl-&getRCPic()-&getLCUEstQP
( estLambda, pcSlice-&getSliceQp() );
= Clip3( -pcSlice-&getSPS()-&getQpBDOffsetY(), MAX_QP, estQP );
m_pcRdCost-&setLambda(estLambda);
m_pcRateCtrl-&setRCQP( estQP );
#if L0033_RC_BUGFIX
pcCU-&getSlice()-&setSliceQpBase( estQP );
// run CU encoder
m_pcCuEncoder-&compressCU( pcCU );
#if RATE_CONTROL_LAMBDA_DOMAIN
if ( m_pcCfg-&getUseRateCtrl() )
= m_pcCuEncoder-&getLCUPredictionSAD();
Int height
= min( pcSlice-&getSPS()-&getMaxCUHeight(),pcSlice-&getSPS()-&getPicHeightInLumaSamples() - uiCUAddr / rpcPic-&getFrameWidthInCU() * pcSlice-&getSPS()-&getMaxCUHeight() );
= min( pcSlice-&getSPS()-&getMaxCUWidth(),pcSlice-&getSPS()-&getPicWidthInLumaSamples() - uiCUAddr % rpcPic-&getFrameWidthInCU() * pcSlice-&getSPS()-&getMaxCUWidth() );
Double MAD = (Double)SAD / (Double)(height * width);
MAD = MAD * MAD;
( m_pcRateCtrl-&getRCPic()-&getLCU(uiCUAddr) ).m_MAD = MAD;
Int actualQP
= g_RCInvalidQPV
Double actualLambda = m_pcRdCost-&getLambda();
Int actualBits
= pcCU-&getTotalBits();
Int numberOfEffectivePixels
for ( Int idx = 0; idx & rpcPic-&getNumPartInCU(); idx++ )
if ( pcCU-&getPredictionMode( idx ) != MODE_NONE && ( !pcCU-&isSkipped( idx ) ) )
numberOfEffectivePixels = numberOfEffectivePixels + 16;
if ( numberOfEffectivePixels == 0 )
actualQP = g_RCInvalidQPV
actualQP = pcCU-&getQP( 0 );
m_pcRdCost-&setLambda(oldLambda);
m_pcRateCtrl-&getRCPic()-&updateAfterLCU( m_pcRateCtrl-&getRCPic()-&getLCUCoded(), actualBits, actualQP, actualLambda, m_pcCfg-&getLCULevelRC() );
// restore entropy coder to an initial stage
m_pcEntropyCoder-&setEntropyCoder ( m_pppcRDSbacCoder[0][CI_CURR_BEST], pcSlice );
m_pcEntropyCoder-&setBitstream( &pcBitCounters[uiSubStrm] );
m_pcCuEncoder-&setBitCounter( &pcBitCounters[uiSubStrm] );
m_pcBitCounter = &pcBitCounters[uiSubStrm];
pppcRDSbacCoder-&setBinCountingEnableFlag( true );
m_pcBitCounter-&resetBits();
pppcRDSbacCoder-&setBinsCoded( 0 );
m_pcCuEncoder-&encodeCU( pcCU );
pppcRDSbacCoder-&setBinCountingEnableFlag( false );
if (m_pcCfg-&getSliceMode()==FIXED_NUMBER_OF_BYTES && ( ( pcSlice-&getSliceBits() + m_pcEntropyCoder-&getNumberOfWrittenBits() ) ) & m_pcCfg-&getSliceArgument()&&3)
pcSlice-&setNextSlice( true );
if (m_pcCfg-&getSliceSegmentMode()==FIXED_NUMBER_OF_BYTES && pcSlice-&getSliceSegmentBits()+m_pcEntropyCoder-&getNumberOfWrittenBits() & (m_pcCfg-&getSliceSegmentArgument() && 3) &&pcSlice-&getSliceCurEndCUAddr()!=pcSlice-&getSliceSegmentCurEndCUAddr())
pcSlice-&setNextSliceSegment( true );
if( m_pcCfg-&getUseSBACRD() )
ppppcRDSbacCoders[uiSubStrm][0][CI_CURR_BEST]-&load( m_pppcRDSbacCoder[0][CI_CURR_BEST] );
//Store probabilties of second LCU in line into buffer
if ( ( uiCol == uiTileLCUX+1) && (depSliceSegmentsEnabled || (pcSlice-&getPPS()-&getNumSubstreams() & 1)) && m_pcCfg-&getWaveFrontsynchro())
m_pcBufferSbacCoders[uiTileCol].loadContexts(ppppcRDSbacCoders[uiSubStrm][0][CI_CURR_BEST]);
// other case: encodeCU is not called
m_pcCuEncoder-&compressCU( pcCU );
m_pcCuEncoder-&encodeCU( pcCU );
if (m_pcCfg-&getSliceMode()==FIXED_NUMBER_OF_BYTES && ( ( pcSlice-&getSliceBits()+ m_pcEntropyCoder-&getNumberOfWrittenBits() ) ) & m_pcCfg-&getSliceArgument()&&3)
pcSlice-&setNextSlice( true );
if (m_pcCfg-&getSliceSegmentMode()==FIXED_NUMBER_OF_BYTES && pcSlice-&getSliceSegmentBits()+ m_pcEntropyCoder-&getNumberOfWrittenBits()& m_pcCfg-&getSliceSegmentArgument()&&3 &&pcSlice-&getSliceCurEndCUAddr()!=pcSlice-&getSliceSegmentCurEndCUAddr())
pcSlice-&setNextSliceSegment( true );
m_uiPicTotalBits += pcCU-&getTotalBits();
m_dPicRdCost
+= pcCU-&getTotalCost();
m_uiPicDist
+= pcCU-&getTotalDistortion();
#if !RATE_CONTROL_LAMBDA_DOMAIN
if(m_pcCfg-&getUseRateCtrl())
m_pcRateCtrl-&updateLCUData(pcCU, pcCU-&getTotalBits(), pcCU-&getQP(0));
m_pcRateCtrl-&updataRCUnitStatus();
if ((pcSlice-&getPPS()-&getNumSubstreams() & 1) && !depSliceSegmentsEnabled)
pcSlice-&setNextSlice( true );
if( depSliceSegmentsEnabled )
if (m_pcCfg-&getWaveFrontsynchro())
CTXMem[1]-&loadContexts( &m_pcBufferSbacCoders[uiTileCol] );//ctx 2.LCU
CTXMem[0]-&loadContexts( m_pppcRDSbacCoder[0][CI_CURR_BEST] );//ctx end of dep.slice
xRestoreWPparam( pcSlice );
#if !RATE_CONTROL_LAMBDA_DOMAIN
if(m_pcCfg-&getUseRateCtrl())
m_pcRateCtrl-&updateFrameData(m_uiPicTotalBits);
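The substream selection inside the CU loop above can be looked at in isolation. The following is a minimal sketch, not HM code: `selectSubstream` and its parameter names are hypothetical, and plain integers stand in for the values HM reads from `rpcPic->getPicSym()`. It assumes the two cases named in the source comments: substreams counted "per tile" or "per frame".

```cpp
#include <cassert>

// Hypothetical helper, not part of HM: picks the substream index for an LCU,
// mirroring the two branches used when computing uiSubStrm in the loop above.
unsigned selectSubstream(unsigned lcuRow,        // uiLin: LCU row in raster scan
                         unsigned tileIdx,       // index of the tile holding the LCU
                         unsigned numSubstreams, // iNumSubstreams
                         unsigned numTiles,      // number of tiles in the picture
                         bool     perTile)       // true: substreams are "per tile"
{
  assert(numSubstreams > 0 && numTiles > 0);
  if (perTile)
  {
    // Each tile owns an equal share of the substreams.
    const unsigned perTileCount = numSubstreams / numTiles;
    assert(perTileCount > 0);
    return tileIdx * perTileCount + lcuRow % perTileCount;
  }
  // Substreams are shared across the whole frame, one per LCU row (modulo).
  return lcuRow % numSubstreams;
}
```

With 8 substreams over 2 tiles, an LCU in row 5 of tile 1 maps to substream `1 * 4 + 5 % 4`; with 4 frame-level substreams the same row maps to `5 % 4`.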
/**
 \param  rpcPic        picture class
 \retval rpcBitstream  bitstream class
 */
Void TEncSlice::encodeSlice( TComPic*& rpcPic, TComOutputBitstream* pcBitstream, TComOutputBitstream* pcSubstreams )
UInt uiCUAddr;
UInt uiStartCUAddr;
UInt uiBoundingCUAddr;
TComSlice* pcSlice = rpcPic->getSlice(getSliceIdx());
uiStartCUAddr=pcSlice->getSliceSegmentCurStartCUAddr();
uiBoundingCUAddr=pcSlice->getSliceSegmentCurEndCUAddr();
// choose entropy coder
m_pcSbacCoder-&init( (TEncBinIf*)m_pcBinCABAC );
m_pcEntropyCoder-&setEntropyCoder ( m_pcSbacCoder, pcSlice );
m_pcCuEncoder-&setBitCounter( NULL );
m_pcBitCounter = NULL;
// Appropriate substream bitstream is switched later.
// for every CU
#if ENC_DEC_TRACE
g_bJustDoIt = g_bEncDecTraceE
DTRACE_CABAC_VL( g_nSymbolCounter++ );
DTRACE_CABAC_T( &\tPOC: & );
DTRACE_CABAC_V( rpcPic-&getPOC() );
DTRACE_CABAC_T( &\n& );
#if ENC_DEC_TRACE
g_bJustDoIt = g_bEncDecTraceD
TEncTop* pcEncTop = (TEncTop*) m_pcC
TEncSbac* pcSbacCoders = pcEncTop-&getSbacCoders(); //coder for each substream
Int iNumSubstreams = pcSlice-&getPPS()-&getNumSubstreams();
UInt uiBitsOriginallyInSubstreams = 0;
UInt uiTilesAcross = rpcPic-&getPicSym()-&getNumColumnsMinus1()+1;
for (UInt ui = 0; ui & uiTilesA ui++)
m_pcBufferSbacCoders[ui].load(m_pcSbacCoder); //init. state
for (Int iSubstrmIdx=0; iSubstrmIdx & iNumS iSubstrmIdx++)
uiBitsOriginallyInSubstreams += pcSubstreams[iSubstrmIdx].getNumberOfWrittenBits();
for (UInt ui = 0; ui & uiTilesA ui++)
m_pcBufferLowLatSbacCoders[ui].load(m_pcSbacCoder);
//init. state
UInt uiWidthInLCUs
= rpcPic-&getPicSym()-&getFrameWidthInCU();
UInt uiCol=0, uiLin=0, uiSubStrm=0;
UInt uiTileCol
UInt uiTileStartLCU = 0;
UInt uiTileLCUX
Bool depSliceSegmentsEnabled = pcSlice-&getPPS()-&getDependentSliceSegmentsEnabledFlag();
uiCUAddr = rpcPic-&getPicSym()-&getCUOrderMap( uiStartCUAddr /rpcPic-&getNumPartInCU());
/* for tiles, uiStartCUAddr is NOT the real raster scan address, it is actually
an encoding order index, so we need to convert the index (uiStartCUAddr)
into the real raster scan address (uiCUAddr) via the CUOrderMap */
uiTileStartLCU = rpcPic-&getPicSym()-&getTComTile(rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr))-&getFirstCUAddr();
if( depSliceSegmentsEnabled )
if( pcSlice-&isNextSlice()||
uiCUAddr == rpcPic-&getPicSym()-&getTComTile(rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr))-&getFirstCUAddr())
if(m_pcCfg-&getWaveFrontsynchro())
CTXMem[1]-&loadContexts(m_pcSbacCoder);
CTXMem[0]-&loadContexts(m_pcSbacCoder);
if(m_pcCfg-&getWaveFrontsynchro())
uiTileCol = rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr) % (rpcPic-&getPicSym()-&getNumColumnsMinus1()+1);
m_pcBufferSbacCoders[uiTileCol].loadContexts( CTXMem[1] );
Int iNumSubstreamsPerTile = iNumSubstreams/rpcPic-&getPicSym()-&getNumTiles();
= uiCUAddr / uiWidthInLCUs;
uiSubStrm = rpcPic-&getPicSym()-&getTileIdxMap(rpcPic-&getPicSym()-&getCUOrderMap( uiCUAddr))*iNumSubstreamsPerTile
+ uiLin%iNumSubstreamsPerT
if ( (uiCUAddr%uiWidthInLCUs+1) &= uiWidthInLCUs
= uiCUAddr % uiWidthInLCUs;
uiTileLCUX = uiTileStartLCU % uiWidthInLCUs;
if(uiCol==uiTileLCUX)
CTXMem[0]-&loadContexts(m_pcSbacCoder);
pcSbacCoders[uiSubStrm].loadContexts( CTXMem[0] );
UInt uiEncCUO
for( uiEncCUOrder = uiStartCUAddr /rpcPic-&getNumPartInCU();
uiEncCUOrder & (uiBoundingCUAddr+rpcPic-&getNumPartInCU()-1)/rpcPic-&getNumPartInCU();
uiCUAddr = rpcPic-&getPicSym()-&getCUOrderMap(++uiEncCUOrder) )
if( m_pcCfg-&getUseSBACRD() )
uiTileCol = rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr) % (rpcPic-&getPicSym()-&getNumColumnsMinus1()+1); // what column of tiles are we in?
uiTileStartLCU = rpcPic-&getPicSym()-&getTComTile(rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr))-&getFirstCUAddr();
uiTileLCUX = uiTileStartLCU % uiWidthInLCUs;
//UInt uiSliceStartLCU = pcSlice-&getSliceCurStartCUAddr();
= uiCUAddr % uiWidthInLCUs;
= uiCUAddr / uiWidthInLCUs;
if (pcSlice-&getPPS()-&getNumSubstreams() & 1)
// independent tiles =& substreams are &per tile&.
iNumSubstreams has already been multiplied.
Int iNumSubstreamsPerTile = iNumSubstreams/rpcPic-&getPicSym()-&getNumTiles();
uiSubStrm = rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr)*iNumSubstreamsPerTile
+ uiLin%iNumSubstreamsPerT
// dependent tiles =& substreams are &per frame&.
uiSubStrm = uiLin % iNumS
m_pcEntropyCoder-&setBitstream( &pcSubstreams[uiSubStrm] );
// Synchronize cabac probabilities with upper-right LCU if it's available and we're at the start of a line.
if (((pcSlice-&getPPS()-&getNumSubstreams() & 1) || depSliceSegmentsEnabled) && (uiCol == uiTileLCUX) && m_pcCfg-&getWaveFrontsynchro())
// We'll sync if the TR is available.
TComDataCU *pcCUUp = rpcPic-&getCU( uiCUAddr )-&getCUAbove();
UInt uiWidthInCU = rpcPic-&getFrameWidthInCU();
UInt uiMaxParts = 1&&(pcSlice-&getSPS()-&getMaxCUDepth()&&1);
TComDataCU *pcCUTR = NULL;
if ( pcCUUp && ((uiCUAddr%uiWidthInCU+1) & uiWidthInCU)
pcCUTR = rpcPic-&getCU( uiCUAddr - uiWidthInCU + 1 );
if ( (true/*bEnforceSliceRestriction*/ &&
((pcCUTR==NULL) || (pcCUTR-&getSlice()==NULL) ||
(pcCUTR-&getSCUAddr()+uiMaxParts-1 & pcSlice-&getSliceCurStartCUAddr()) ||
((rpcPic-&getPicSym()-&getTileIdxMap( pcCUTR-&getAddr() ) != rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr)))
// TR not available.
// TR is available, we use it.
pcSbacCoders[uiSubStrm].loadContexts( &m_pcBufferSbacCoders[uiTileCol] );
m_pcSbacCoder-&load(&pcSbacCoders[uiSubStrm]);
//this load is used to simplify the code (avoid to change all the call to m_pcSbacCoder)
// reset the entropy coder
if( uiCUAddr == rpcPic-&getPicSym()-&getTComTile(rpcPic-&getPicSym()-&getTileIdxMap(uiCUAddr))-&getFirstCUAddr() &&
// must be first CU of tile
uiCUAddr!=0 &&
// cannot be first CU of picture
uiCUAddr!=rpcPic-&getPicSym()-&getPicSCUAddr(rpcPic-&getSlice(rpcPic-&getCurrSliceIdx())-&getSliceSegmentCurStartCUAddr())/rpcPic-&getNumPartInCU() &&
uiCUAddr!=rpcPic-&getPicSym()-&getPicSCUAddr(rpcPic-&getSlice(rpcPic-&getCurrSliceIdx())-&getSliceCurStartCUAddr())/rpcPic-&getNumPartInCU())
// cannot be first CU of slice
// We're crossing into another tile, tiles are independent.
// When tiles are independent, we have &substreams per tile&.
Each substream has already been terminated, and we no longer
// have to perform it here.
if (pcSlice-&getPPS()-&getNumSubstreams() & 1)
// do nothing.
SliceType sliceType
= pcSlice-&getSliceType();
if (!pcSlice-&isIntra() && pcSlice-&getPPS()-&getCabacInitPresentFlag() && pcSlice-&getPPS()-&getEncCABACTableIdx()!=I_SLICE)
sliceType = (SliceType) pcSlice-&getPPS()-&getEncCABACTableIdx();
m_pcEntropyCoder-&updateContextTables( sliceType, pcSlice-&getSliceQp() );
// Byte-alignment in slice_data() when new tile
pcSubstreams[uiSubStrm].writeByteAlignment();
UInt uiCounter = 0;
vector<uint8_t>& rbsp = pcSubstreams[uiSubStrm].getFIFO();
for (vector<uint8_t>::iterator it = rbsp.begin(); it != rbsp.end();)
{
  /* 1) find the next emulated 00 00 {00,01,02,03}
   * 2a) if not found, write all remaining bytes out, stop.
   * 2b) otherwise, write all non-emulated bytes out
   * 3) insert emulation_prevention_three_byte
   */
  vector<uint8_t>::iterator found = it;
  do
  {
    /* NB, end()-1, prevents finding a trailing two byte sequence */
    found = search_n(found, rbsp.end()-1, 2, 0);
    found++;
    /* if not found, found == end, otherwise found = second zero byte */
    if (found == rbsp.end())
    {
      break;
    }
    if (*(++found) <= 3)
    {
      break;
    }
  } while (true);
  it = found;
  if (found != rbsp.end())
  {
    it++;
    uiCounter++;
  }
}
UInt uiAccumulatedSubstreamLength = 0;
for (Int iSubstrmIdx=0; iSubstrmIdx & iNumS iSubstrmIdx++)
uiAccumulatedSubstreamLength += pcSubstreams[iSubstrmIdx].getNumberOfWrittenBits();
// add bits coded in previous dependent slices + bits coded so far
// add number of emulation prevention byte count in the tile
pcSlice-&addTileLocation( ((pcSlice-&getTileOffstForMultES() + uiAccumulatedSubstreamLength - uiBitsOriginallyInSubstreams) && 3) + uiCounter );
TComDataCU*& pcCU = rpcPic->getCU( uiCUAddr );
if ( pcSlice->getSPS()->getUseSAO() && (pcSlice->getSaoEnabledFlag()||pcSlice->getSaoEnabledFlagChroma()) )
{
  SAOParam *saoParam = pcSlice->getPic()->getPicSym()->getSaoParam();
  Int iNumCuInWidth     = saoParam->numCuInWidth;
  Int iCUAddrInSlice    = uiCUAddr - rpcPic->getPicSym()->getCUOrderMap(pcSlice->getSliceCurStartCUAddr()/rpcPic->getNumPartInCU());
  Int iCUAddrUpInSlice  = iCUAddrInSlice - iNumCuInWidth;
  Int rx = uiCUAddr % iNumCuInWidth;
  Int ry = uiCUAddr / iNumCuInWidth;
  Int allowMergeLeft = 1;
  Int allowMergeUp   = 1;
  if (rx!=0)
  {
    if (rpcPic->getPicSym()->getTileIdxMap(uiCUAddr-1) != rpcPic->getPicSym()->getTileIdxMap(uiCUAddr))
    {
      allowMergeLeft = 0;
    }
  }
  if (ry!=0)
  {
    if (rpcPic->getPicSym()->getTileIdxMap(uiCUAddr-iNumCuInWidth) != rpcPic->getPicSym()->getTileIdxMap(uiCUAddr))
    {
      allowMergeUp = 0;
    }
  }
  Int addr = pcCU->getAddr();
  allowMergeLeft = allowMergeLeft && (rx>0) && (iCUAddrInSlice!=0);
  allowMergeUp = allowMergeUp && (ry>0) && (iCUAddrUpInSlice>=0);
  if( saoParam->bSaoFlag[0] || saoParam->bSaoFlag[1] )
  {
    Int mergeLeft = saoParam->saoLcuParam[0][addr].mergeLeftFlag;
    Int mergeUp = saoParam->saoLcuParam[0][addr].mergeUpFlag;
    if (allowMergeLeft)
    {
      m_pcEntropyCoder->m_pcEntropyCoderIf->codeSaoMerge(mergeLeft);
    }
    else
    {
      mergeLeft = 0;
    }
    if(mergeLeft == 0)
    {
      if (allowMergeUp)
      {
        m_pcEntropyCoder->m_pcEntropyCoderIf->codeSaoMerge(mergeUp);
      }
      else
      {
        mergeUp = 0;
      }
      if(mergeUp == 0)
      {
        for (Int compIdx=0;compIdx<3;compIdx++)
        {
          if( (compIdx == 0 && saoParam->bSaoFlag[0]) || (compIdx > 0 && saoParam->bSaoFlag[1]))
          {
            m_pcEntropyCoder->encodeSaoOffset(&saoParam->saoLcuParam[compIdx][addr], compIdx);
          }
        }
      }
    }
  }
}
else if (pcSlice-&getSPS()-&getUseSAO())
Int addr = pcCU-&getAddr();
SAOParam *saoParam = pcSlice-&getPic()-&getPicSym()-&getSaoParam();
for (Int cIdx=0; cIdx&3; cIdx++)
SaoLcuParam *saoLcuParam = &(saoParam-&saoLcuParam[cIdx][addr]);
if ( ((cIdx == 0) && !pcSlice-&getSaoEnabledFlag()) || ((cIdx == 1 || cIdx == 2) && !pcSlice-&getSaoEnabledFlagChroma()))
saoLcuParam-&mergeUpFlag
saoLcuParam-&mergeLeftFlag = 0;
saoLcuParam-&subTypeIdx
saoLcuParam-&typeIdx
saoLcuParam-&offset[0]
saoLcuParam-&offset[1]
saoLcuParam-&offset[2]
saoLcuParam-&offset[3]
#if ENC_DEC_TRACE
g_bJustDoIt = g_bEncDecTraceE
if ( (m_pcCfg-&getSliceMode()!=0 || m_pcCfg-&getSliceSegmentMode()!=0) &&
uiCUAddr == rpcPic-&getPicSym()-&getCUOrderMap((uiBoundingCUAddr+rpcPic-&getNumPartInCU()-1)/rpcPic-&getNumPartInCU()-1) )
m_pcCuEncoder-&encodeCU( pcCU );
m_pcCuEncoder-&encodeCU( pcCU );
#if ENC_DEC_TRACE
g_bJustDoIt = g_bEncDecTraceD
if( m_pcCfg-&getUseSBACRD() )
pcSbacCoders[uiSubStrm].load(m_pcSbacCoder);
//load back status of the entropy coder after encoding the LCU into relevant bitstream entropy coder
//Store probabilties of second LCU in line into buffer
if ( (depSliceSegmentsEnabled || (pcSlice-&getPPS()-&getNumSubstreams() & 1)) && (uiCol == uiTileLCUX+1) && m_pcCfg-&getWaveFrontsynchro())
m_pcBufferSbacCoders[uiTileCol].loadContexts( &pcSbacCoders[uiSubStrm] );
if( depSliceSegmentsEnabled )
if (m_pcCfg-&getWaveFrontsynchro())
CTXMem[1]-&loadContexts( &m_pcBufferSbacCoders[uiTileCol] );//ctx 2.LCU
CTXMem[0]-&loadContexts( m_pcSbacCoder );//ctx end of dep.slice
#if ADAPTIVE_QP_SELECTION
if( m_pcCfg->getUseAdaptQpSelect() )
{
  m_pcTrQuant->storeSliceQpNext(pcSlice);
}
#endif
if (pcSlice->getPPS()->getCabacInitPresentFlag())
{
  if (pcSlice->getPPS()->getDependentSliceSegmentsEnabledFlag())
  {
    pcSlice->getPPS()->setEncCABACTableIdx( pcSlice->getSliceType() );
  }
  else
  {
    m_pcEntropyCoder->determineCabacInitIdx();
  }
}
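The tile-location bookkeeping in `encodeSlice` adds the number of emulation prevention bytes that a substream will need. A minimal sketch of that count follows. It is not a port of the `search_n` loop above: it uses a running counter of consecutive zero bytes instead, and it applies the general rule that a `0x03` byte is inserted whenever two zero bytes are followed by a byte less than or equal to 3. The function name `countEmulationPreventionBytes` is hypothetical, and its result may differ from the HM loop in edge cases such as long runs of zeros.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of HM: counts how many
// emulation_prevention_three_byte (0x03) insertions a payload would need.
std::size_t countEmulationPreventionBytes(const std::vector<std::uint8_t>& rbsp)
{
  std::size_t count = 0;
  std::size_t zeros = 0; // consecutive zero bytes seen since the last reset
  for (std::uint8_t b : rbsp)
  {
    if (zeros >= 2 && b <= 3)
    {
      ++count;   // a 0x03 byte would be written before b
      zeros = 0; // the inserted 0x03 breaks the run of zeros
    }
    zeros = (b == 0) ? zeros + 1 : 0;
  }
  return count;
}
```

Counting without modifying the buffer matches how the value is used here: only the extra length is needed to correct the tile offset, while the actual insertion happens when the NAL unit is written.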
/** Determines the starting and bounding LCU address of current slice / dependent slice
 * \param bEncodeSlice Identifies if the calling function is compressSlice() [false] or encodeSlice() [true]
 * \returns Updates uiStartCUAddr, uiBoundingCUAddr with appropriate LCU address
 */
Void TEncSlice::xDetermineStartAndBoundingCUAddr( UInt& startCUAddr, UInt& boundingCUAddr, TComPic*& rpcPic, Bool bEncodeSlice )
TComSlice* pcSlice = rpcPic->getSlice(getSliceIdx());
UInt uiStartCUAddrSlice, uiBoundingCUAddrSlice;
UInt tileIdxIncrement;
UInt tileIdx;
UInt tileWidthInLcu;
UInt tileHeightInLcu;
UInt tileTotalCount;
uiStartCUAddrSlice = pcSlice->getSliceCurStartCUAddr();
UInt uiNumberOfCUsInFrame = rpcPic->getNumCUsInFrame();
uiBoundingCUAddrSlice = uiNumberOfCUsInFrame;
if (bEncodeSlice)
{
  UInt uiCUAddrIncrement;
  switch (m_pcCfg->getSliceMode())
  {
  case FIXED_NUMBER_OF_LCU:
    uiCUAddrIncrement     = m_pcCfg->getSliceArgument();
    uiBoundingCUAddrSlice = ((uiStartCUAddrSlice + uiCUAddrIncrement) < uiNumberOfCUsInFrame*rpcPic->getNumPartInCU()) ? (uiStartCUAddrSlice + uiCUAddrIncrement) : uiNumberOfCUsInFrame*rpcPic->getNumPartInCU();
    break;
  case FIXED_NUMBER_OF_BYTES:
    uiCUAddrIncrement     = rpcPic->getNumCUsInFrame();
    uiBoundingCUAddrSlice = pcSlice->getSliceCurEndCUAddr();
    break;
  case FIXED_NUMBER_OF_TILES:
    tileIdx = rpcPic->getPicSym()->getTileIdxMap(
      rpcPic->getPicSym()->getCUOrderMap(uiStartCUAddrSlice/rpcPic->getNumPartInCU())
      );
    uiCUAddrIncrement = 0;
    tileTotalCount    = (rpcPic->getPicSym()->getNumColumnsMinus1()+1) * (rpcPic->getPicSym()->getNumRowsMinus1()+1);
    for(tileIdxIncrement = 0; tileIdxIncrement < m_pcCfg->getSliceArgument(); tileIdxIncrement++)
    {
      if((tileIdx + tileIdxIncrement) < tileTotalCount)
      {
        tileWidthInLcu  = rpcPic->getPicSym()->getTComTile(tileIdx + tileIdxIncrement)->getTileWidth();
        tileHeightInLcu = rpcPic->getPicSym()->getTComTile(tileIdx + tileIdxIncrement)->getTileHeight();
        uiCUAddrIncrement += (tileWidthInLcu * tileHeightInLcu * rpcPic->getNumPartInCU());
      }
    }
    uiBoundingCUAddrSlice = ((uiStartCUAddrSlice + uiCUAddrIncrement) < uiNumberOfCUsInFrame*rpcPic->getNumPartInCU()) ? (uiStartCUAddrSlice + uiCUAddrIncrement) : uiNumberOfCUsInFrame*rpcPic->getNumPartInCU();
    break;
  default:
    uiCUAddrIncrement     = rpcPic->getNumCUsInFrame();
    uiBoundingCUAddrSlice = uiNumberOfCUsInFrame*rpcPic->getNumPartInCU();
    break;
  }
  // WPP: if a slice does not start at the beginning of a CTB row, it must end within the same CTB row
  if (pcSlice->getPPS()->getNumSubstreams() > 1 && (uiStartCUAddrSlice % (rpcPic->getFrameWidthInCU()*rpcPic->getNumPartInCU()) != 0))
  {
    uiBoundingCUAddrSlice = min(uiBoundingCUAddrSlice, uiStartCUAddrSlice - (uiStartCUAddrSlice % (rpcPic->getFrameWidthInCU()*rpcPic->getNumPartInCU())) + (rpcPic->getFrameWidthInCU()*rpcPic->getNumPartInCU()));
  }
  pcSlice->setSliceCurEndCUAddr( uiBoundingCUAddrSlice );
}
else
{
  UInt uiCUAddrIncrement;
  switch (m_pcCfg->getSliceMode())
  {
  case FIXED_NUMBER_OF_LCU:
    uiCUAddrIncrement     = m_pcCfg->getSliceArgument();
    uiBoundingCUAddrSlice = ((uiStartCUAddrSlice + uiCUAddrIncrement) < uiNumberOfCUsInFrame*rpcPic->getNumPartInCU()) ? (uiStartCUAddrSlice + uiCUAddrIncrement) : uiNumberOfCUsInFrame*rpcPic->getNumPartInCU();
    break;
  case FIXED_NUMBER_OF_TILES:
    tileIdx = rpcPic->getPicSym()->getTileIdxMap(
      rpcPic->getPicSym()->getCUOrderMap(uiStartCUAddrSlice/rpcPic->getNumPartInCU())
      );
    uiCUAddrIncrement = 0;
    tileTotalCount    = (rpcPic->getPicSym()->getNumColumnsMinus1()+1) * (rpcPic->getPicSym()->getNumRowsMinus1()+1);
    for(tileIdxIncrement = 0; tileIdxIncrement < m_pcCfg->getSliceArgument(); tileIdxIncrement++)
    {
      if((tileIdx + tileIdxIncrement) < tileTotalCount)
      {
        tileWidthInLcu  = rpcPic->getPicSym()->getTComTile(tileIdx + tileIdxIncrement)->getTileWidth();
        tileHeightInLcu = rpcPic->getPicSym()->getTComTile(tileIdx + tileIdxIncrement)->getTileHeight();
        uiCUAddrIncrement += (tileWidthInLcu * tileHeightInLcu * rpcPic->getNumPartInCU());
      }
    }
    uiBoundingCUAddrSlice = ((uiStartCUAddrSlice + uiCUAddrIncrement) < uiNumberOfCUsInFrame*rpcPic->getNumPartInCU()) ? (uiStartCUAddrSlice + uiCUAddrIncrement) : uiNumberOfCUsInFrame*rpcPic->getNumPartInCU();
    break;
  default:
    uiCUAddrIncrement     = rpcPic->getNumCUsInFrame();
    uiBoundingCUAddrSlice = uiNumberOfCUsInFrame*rpcPic->getNumPartInCU();
    break;
  }
  // WPP: if a slice does not start at the beginning of a CTB row, it must end within the same CTB row
  if (pcSlice->getPPS()->getNumSubstreams() > 1 && (uiStartCUAddrSlice % (rpcPic->getFrameWidthInCU()*rpcPic->getNumPartInCU()) != 0))
  {
    uiBoundingCUAddrSlice = min(uiBoundingCUAddrSlice, uiStartCUAddrSlice - (uiStartCUAddrSlice % (rpcPic->getFrameWidthInCU()*rpcPic->getNumPartInCU())) + (rpcPic->getFrameWidthInCU()*rpcPic->getNumPartInCU()));
  }
  pcSlice->setSliceCurEndCUAddr( uiBoundingCUAddrSlice );
}
Bool tileBoundary = false;
if ((m_pcCfg->getSliceMode() == FIXED_NUMBER_OF_LCU || m_pcCfg->getSliceMode() == FIXED_NUMBER_OF_BYTES) &&
    (m_pcCfg->getNumRowsMinus1() > 0 || m_pcCfg->getNumColumnsMinus1() > 0))
{
  UInt lcuEncAddr = (uiStartCUAddrSlice+rpcPic->getNumPartInCU()-1)/rpcPic->getNumPartInCU();
  UInt lcuAddr = rpcPic->getPicSym()->getCUOrderMap(lcuEncAddr);
  UInt startTileIdx = rpcPic->getPicSym()->getTileIdxMap(lcuAddr);
  UInt tileBoundingCUAddrSlice = 0;
  while (lcuEncAddr < uiNumberOfCUsInFrame && rpcPic->getPicSym()->getTileIdxMap(lcuAddr) == startTileIdx)
  {
    lcuEncAddr++;
    lcuAddr = rpcPic->getPicSym()->getCUOrderMap(lcuEncAddr);
  }
  tileBoundingCUAddrSlice = lcuEncAddr*rpcPic->getNumPartInCU();
  if (tileBoundingCUAddrSlice < uiBoundingCUAddrSlice)
  {
    uiBoundingCUAddrSlice = tileBoundingCUAddrSlice;
    pcSlice->setSliceCurEndCUAddr( uiBoundingCUAddrSlice );
    tileBoundary = true;
  }
}
// Dependent slice
UInt startCUAddrSliceSegment, boundingCUAddrSliceS
startCUAddrSliceSegment
= pcSlice-&getSliceSegmentCurStartCUAddr();
boundingCUAddrSliceSegment = uiNumberOfCUsInF
if (bEncodeSlice)
UInt uiCUAddrI
switch (m_pcCfg-&getSliceSegmentMode())
case FIXED_NUMBER_OF_LCU:
uiCUAddrIncrement
= m_pcCfg-&getSliceSegmentArgument();
boundingCUAddrSliceSegment
= ((startCUAddrSliceSegment + uiCUAddrIncrement) & uiNumberOfCUsInFrame*rpcPic-&getNumPartInCU() ) ? (startCUAddrSliceSegment + uiCUAddrIncrement) : uiNumberOfCUsInFrame*rpcPic-&getNumPartInCU();
case FIXED_NUMBER_OF_BYTES:
uiCUAddrIncrement
= rpcPic-&getNumCUsInFrame();
boundingCUAddrSliceSegment
= pcSlice-&getSliceSegmentCurEndCUAddr();
case FIXED_NUMBER_OF_TILES:
= rpcPic-&getPicSym()-&getTileIdxMap(
rpcPic-&getPicSym()-&getCUOrderMap(pcSlice-&getSliceSegmentCurStartCUAddr()/rpcPic-&getNumPartInCU())
uiCUAddrIncrement
tileTotalCount
= (rpcPic-&getPicSym()-&getNumColumnsMinus1()+1) * (rpcPic-&getPicSym()-&getNumRowsMinus1()+1);
for(tileIdxIncrement = 0; tileIdxIncrement & m_pcCfg-&getSliceSegmentArgument(); tileIdxIncrement++)
if((tileIdx + tileIdxIncrement) & tileTotalCount)
tileWidthInLcu
= rpcPic-&getPicSym()-&getTComTile(tileIdx + tileIdxIncrement)-&getTileWidth();
tileHeightInLcu
= rpcPic-&getPicSym()-&getTComTile(tileIdx + tileIdxIncrement)-&getTileHeight();
uiCUAddrIncrement += (tileWidthInLcu * tileHeightInLcu * rpcPic-&getNumPartInCU());
boundingCUAddrSliceSegment
= ((startCUAddrSliceSegment + uiCUAddrIncrement) & uiNumberOfCUsInFrame*rpcPic-&getNumPartInCU() ) ? (startCUAddrSliceSegment + uiCUAddrIncrement) : uiNumberOfCUsInFrame*rpcPic-&getNumPartInCU();
uiCUAddrIncrement
= rpcPic-&getNumCUsInFrame();
boundingCUAddrSliceSegment
= uiNumberOfCUsInFrame*rpcPic-&getNumPartInCU();
// WPP: if a slice segment does not start at the beginning of a CTB row, it must end within the same CTB row
if (pcSlice-&getPPS()-&getNumSubstreams() & 1 && (startCUAddrSliceSegment % (rpcPic-&getFrameWidthInCU()*rpcPic-&getNumPartInCU()) != 0))
boundingCUAddrSliceSegment = min(boundingCUAddrSliceSegment, startCUAddrSliceSegment - (startCUAddrSliceSegment % (rpcPic-&getFrameWidthInCU()*rpcPic-&getNumPartInCU())) + (rpcPic-&getFrameWidthInCU()*rpcPic-&getNumPartInCU()));
pcSlice-&setSliceSegmentCurEndCUAddr( boundingCUAddrSliceSegment );
UInt uiCUAddrI
switch (m_pcCfg-&getSliceSegmentMode())
case FIXED_NUMBER_OF_LCU:
uiCUAddrIncrement
= m_pcCfg-&getSliceSegmentArgument();
boundingCUAddrSliceSegment
= ((startCUAddrSliceSegment + uiCUAddrIncrement) & uiNumberOfCUsInFrame*rpcPic-&getNumPartInCU() ) ? (startCUAddrSliceSegment + uiCUAddrIncrement) : uiNumberOfCUsInFrame*rpcPic-&getNumPartInCU();
case FIXED_NUMBER_OF_TILES:
= rpcPic-&getPicSym()-&getTileIdxMap(
rpcPic-&getPicSym()-&getCUOrderMap(pcSlice-&getSliceSegmentCurStartCUAddr()/rpcPic-&getNumPartInCU())
uiCUAddrIncrement
tileTotalCount
= (rpcPic-&getPicSym()-&getNumColumnsMinus1()+1) * (rpcPic-&getPicSym()-&getNumRowsMinus1()+1);
for(tileIdxIncrement = 0; tileIdxIncrement & m_pcCfg-&getSliceSegmentArgument(); tileIdxIncrement++)
if((tileIdx + tileIdxIncrement) & tileTotalCount)
tileWidthInLcu
= rpcPic-&getPicSym()-&getTComTile(tileIdx + tileIdxIncrement)-&getTileWidth();
tileHeightInLcu
= rpcPic-&getPicSym()-&getTComTile(tileIdx + tileIdxIncrement)-&getTileHeight();
uiCUAddrIncrement += (tileWidthInLcu * tileHeightInLcu * rpcPic-&getNumPartInCU());
bounding
