Screen Content Coding with Primary and Secondary Reference Buffers for String Matching and Copying

2015-10-11 07:02TaoLinKailunZhouandLipingZhao
ZTE Communications 2015年4期

Tao Lin,Kailun Zhou,and Liping Zhao

(Tongji University,Shanghai 200092,China)

Screen Content Coding with Primary and Secondary Reference Buffers for String Matching and Copying

Tao Lin,Kailun Zhou,and Liping Zhao

(Tongji University,Shanghai 200092,China)

A screen content coding(SCC)algorithm that uses a primary reference buffer(PRB)and a secondary reference buffer(SRB)for string matching and string copying is proposed. PRB is typically the traditional reconstructed picture buffer which provides

tring pixels for the current pixels being coded.SRB stores a few of recently and frequently ref⁃erenced pixels for repetitive reference by the current pixels being coded.In the encoder,searching of optimal reference string is performed in both PRB and SRB,and either a PRB or SRB string is selected as an optimal reference string on a string⁃by⁃string basis.Compared with HM⁃16.4+SCM⁃40 refer⁃ence software,the proposed SCC algorithm can improve cod⁃ing performance measured by bit⁃distortion rate reduction of average 4.19%in all⁃intra configuration for text and graphics with motion category of test sequences defined by JCT⁃VC common test condition.

HEVC;Image Coding;Screen Content Coding;String Match⁃ing;Video Coding

1 Introduction

W ith continuous and rapid advancements in communications,networking,computers,semi⁃conductors,and displays technologies,screen content coding(SCC)becomes a key technolo⁃gy for some emerging popular applications such as cloud com⁃puting,cloud⁃mobile computing,Wi⁃Fi display,smartphone and tablet second display,ultra⁃thin clients,remote desktops,and screen sharing[1]-[3].The challenging requirement of SCC is to achieve both ultra⁃high visually lossless quality and ultra⁃high compression ratio up to 300:1-3000:1.In recent years,SCC has attracted increasing attention of researchers from both academia and industry[4]-[30].ISO/IEC MPEG and ITU⁃T VCEG have set up the Joint Collaborative Team(JCT)for a joint project of developing a SCC version of High Efficien⁃cy Video Coding(HEVC,also known as H.265)standard and a joint call⁃for⁃proposal has been issued in January 2014[14].

Characteristics of computer screen pictures are quite differ⁃ent from those of traditional pictures,for which traditional block⁃matching and transform⁃based hybrid coding technique can compress efficiently.In general,computer screen pictures are essentially a mix of two⁃type contents:discontinuous⁃tone content such as text,chart,graphics,and icon,and continuous⁃tone content such as photograph,movie/TV clips,natural pic⁃ture video sequences,and sophisticated light⁃shaded and tex⁃ture⁃mapped virtual⁃reality scenes.Continuous⁃tone content features relatively smooth edges,complicated textures,and thick lines with virtually unlimited colors.In contrast,discon⁃tinuous⁃tone content features very sharp edges,uncomplicated shapes,and thin lines with few colors,even one⁃pixel⁃wide sin⁃gle⁃color lines.General screen content pictures are often seen in web browsing,video conferencing with document sharing,of⁃fice document editing,engineering drawing,hardware design engineering,software programming,gaming,map navigating,address direction searching,and other computer use cases.

Actually,typical computer screens for daily use are often rich in small and sharp bitmap structures such as text,menu,icon,button,slide⁃bar,and grid.There are usually many simi⁃lar or identical patterns in a screen picture.These similar or identical patterns may have very different sizes from a few pix⁃els to a few thousands of pixels and very different shapes.Ex⁃isting techniques such as intra⁃prediction and intra block copy(IBC)[15]-[17]is not always efficient to code similar or identi⁃cal pattern within a picture,because they use only 1⁃D pattern or 2⁃D pattern of a few fixed sizes and a few fixed rectangle or square shapes.

In order to address the issue and improve the coding effi⁃ciency of repeated patterns with different sizes and shapes,this paper proposes a coding technique based on string⁃match⁃ing(also called intra string copy or ISC).It provides a very ge⁃neric and flexible solution to string matching of variable sizes and different shapes.

Two reference buffers are used in the proposed technique. One is primary reference buffer(PRB)that is typically the tra⁃ditional reconstructed picture buffer to provide reference string pixels for the current pixels being coded.The other is second⁃ary reference buffer(SRB),a lookup table(LUT)storing a few of recently and frequently referenced pixels for repetitive refer⁃ence by the current pixels being coded.For any starting pixel in an encoding process,searching of optimal reference string is performed in both PRB and SRB.Either a PRB string or a SRBstring is selected as an optimal reference string on a string⁃by⁃string basis.A PRB string has two string⁃matching parameters. One is an offset that specifies the relative position(2D coordi⁃nates),i.e.2D displacement from the reference string to the current string.The other is a length that specifies the number of pixels in the reference string.It is obvious that the reference string and the current string have the same length.A SRB string also has two string⁃matching parameters:an index that specifies the address of the reference pixel in the SRB and a length that specifies the duplication count of the reference pix⁃el to reconstruct the current string.The string⁃matching param⁃eters are then entropy⁃coded and put into the final bit⁃stream. At the decoder side,the string⁃matching parameters are actual⁃ly string⁃copying parameters.In the decoding process,the string⁃matching(i.e.string⁃copying)parameters are parsed,de⁃coded,and used to obtain the reference string pixels from ei⁃ther PRB or SRB.The values of the reference string pixels are then assigned to the current pixels being decoded to recon⁃struct the current string.

In Section 2,a general architecture of string⁃matching en⁃hanced coding system using both PRB and SRB is proposed and the details of horizontally and vertically scanned 2D⁃shape⁃preserved matching modes are described.Section 3 presents a flexible hash⁃table framework specially designed for speeding up reference string searching in PRB.Section 4 describes a method to seamlessly mix PRB search with SRB search inside a coding unit(CU),and to select one⁃by⁃one either an optimal PRB string or an optimal SRB string based on average rate⁃dis⁃tortion(RD)cost evaluation.In section 5,the proposed tech⁃nique is compared with HM⁃16.4+SCM⁃40 reference software[31]that is developed based on HEVC Software Model HM⁃16.4 by JCT for SCC testing.The experimental results show that the proposed technique can achieve significant bit⁃distor⁃tion rate(BD⁃rate)[32]-[33]reduction in lossy coding and bit⁃rate saving in lossless coding using the test suite defined by JCT Common Test Condition[30].We conclude the paper and introduce the future work on string⁃matching coding technique in section 6.In this paper,the proposed SCC algorithm is de⁃scribed for pictures of YUV4:4:4 and RGB formats,but it can also be applied to pictures of YUV4:2:2 and YUV4:2:0 formats with some modifications.This paper is partially based on JCT⁃video coding(VC)documents[18]-[27].

2 String-Matching Enhanced Coding System

A general architecture and major components of string⁃matching enhanced coding system are shown in Fig.1.

In the encoder of the string⁃matching enhanced coding sys⁃tem,the string⁃matching encoding subsystem performs color clustering,PRB string searching,and SRB string searching. The rest of the encoding system performs other traditional en⁃coding operations such as intra⁃prediction,inter⁃prediction,transform,quantization,entropy encoding,IBC,and palette coding.The input CU O is fed to both string⁃matching encod⁃ing subsystem and the rest of the encoding system.The string⁃matching encoding subsystem codes O to generate coded bit⁃stream b1and reconstructed CU P1.The rest of the encoding system also codes O to generate coded bit-stream b2and recon⁃structed CU P2.O,P1,b1,P2,and b2are sent to RD⁃cost⁃based selector that calculates the RD cost of two results and selects the one with the best RD performance as the final coding result for the CU.The corresponding coded bit-stream b1or b2is se⁃lected and put into the output bit-stream.Also,the correspond⁃ing reconstructed CU P1or P2is stored in reconstructed picture buffer,which is shared by both string⁃matching encoding sub⁃system as PRB for string⁃matching and the rest of the encoding system for inter prediction and IBC.The input CU O is also fed to a color cluster unit to obtain a few of most frequently oc⁃curred pixel colors.Some of the colors are put into the SRB LUT for SRB string searching.

In the decoder,the input bit-stream is first parsed by CU cod⁃ing type parser to get the CU_coding_type_flag that indicates whehter the current CU being decoded is coded by the pro⁃posed string⁃matching technique or by traditional coding tech⁃niques.If the CU is coded by the string⁃matching technique,the CU layer bit-stream is sent to the string⁃matching decoding subsystem that decodes and reconstructs the CU P1,which is stored in reconstructed picture buffer and is also the final re⁃constructed CU output Õ.If the CU is coded by traditional cod⁃ing techniques,the CU layer bit-stream is sent to the rest of the decoding system that performs traditional decoding tasks such as intra⁃prediction,inter⁃prediction,inverse⁃transform,de⁃quantization,IBC,and palette decoding.The CU P2is eventual⁃ly reconstructed,which is stored in reconstructed picture buf⁃fer and is also the final reconstructed CU output Õ.

The string⁃matching coding subsystem has two CU level matching modes:horizontally scanned 2D⁃shape⁃preserved matching and vertically scanned 2D⁃shape⁃preserved matching(Fig.2).A CU is pre⁃coded by both the modes.The mode mini⁃mizing RD cost is selected as the final string⁃matching mode for the CU.The CU size in Fig.2 is 16x16 pixels in the packed pixel format(e.g.a Y sample is followed by a U sample and then a V sample,or a G sample is followed by a B sample and then an R sample).

The vertically scanned 2D⁃shape⁃preserved matching mode is used to code CU m in Fig.2.In this mode,PRB is treated as a 2D plane while CU m is considered a vertically scanned 1D pixel string.Starting from the 1st pixel of CU m,the encoder searches the optimal(e.g.longest in lossless case)string in the PRB searching range that matches the pixel string in CU and also keeps exactly the same 2D shape as the pixel string in CU m.The string found in the searching range is called a refer⁃ence string and the pixel string in CU m is called current string.Fig.2 shows the first two current strings in CU m and their corresponding reference strings:

▲Figure 1.String⁃matching enhanced coding system architecture.

▲Figure 2.Two matching modes of string⁃matching coding.

1)The 1st reference(and current)string in yellow has 15 pix⁃ els and the 2D⁃shape⁃preserved reference string is located across CU 1 and CU h+1;

2)The 2nd reference(and current)string in cyan has 20 pixels and the 2D⁃shape⁃preserved reference string is located across CU 1,CU 2,and CU h+1.

The horizontally scanned 2D⁃shape⁃preserved matching mode is used to code CU m+1 in Fig.2.In this mode,PRB is treated as a 2D plane while CU m+1 is considered to be a hori⁃ zontally scanned 1D pixel string.Starting from the 1st pixel of CU m+1,the encoder searches the optimal string in the PRB searching range that matches the pixel string in CU m+1 and also keeps exactly the same 2D shape as the pixel string in CU m+1.The string found in the searching range is called a refer⁃ence string and the pixel string in CU m+1 is called the cur⁃rent string.Fig.2 shows the first three current strings in CU m+ 1 and their corresponding reference strings:

1)The 1st reference(and current)string in pink has 24 pixels and the 2D⁃shape⁃preserved reference string is located across CU 1 and CU 2;

2)The 2nd reference(and current)string in black has 15 pix⁃els and the 2D⁃shape⁃preserved reference string is located across CU h and CU h+1;

3)The 3rd reference(and current)string in orange has 18 pix⁃els and the 2D⁃shape⁃preserved reference string is located across CU 1 and CU 2.

There are usually two types of pixel scanning methods(and paths)available for a CU or a largest coding unit(LCU).One is raster⁃scan:a CU or LCU is scanned line by line,either hori⁃zontally or vertically,and all lines have the same scan direc⁃tion(Fig.2).The other is traverse⁃scan:a CU or LCU is also scanned line by line,either horizontally or vertically,but odd lines and even lines have opposite scan direction.Once a scan⁃ning method(and path)is determined,all pixels inside a CU or LCU are lined up following the scanning path.Thus,starting from a current pixel being coded inside a CU Pm,n,a current string curSm,ncan be defined following the scanning path.The pixels of curSm,nare designated as Sm,n(0),Sm,n(1),...,Sm,n(L⁃1),following the order of the pixels on the scanning path,where L is the length of the string in unit of pixel.Using the designa⁃tion,curSm,nand its pixels can be expressed as curSm,n={Sm,n(k):0≤k<L}.Given a current string curSm,nand a reference pixel Pi,j,a reference string refSi,j={Si,j(k):0≤k<L}is the string that starts from Pi,j,i.e.Si,j(0)=Pi,j,and has exactly the same 2D shape and scanning order as curSm,n.

Fig.3 illustrates an example of the current string curSm,n={Sm,n(k):0≤k<L}starting from the current pixel Pm,ninside a 8×8 CU of horizontal traverse⁃scan and its reference string refSi,j={Si,j(k):0≤k<L}starting from the reference pixel Pi,j.Both curSm,nand refSi,jhave identical 2D shape and length of L=9.

Obviously,given a current pixel Pm,n,the current string curSm,nof length L can be uniquely specified within a CU with known size and scanning method,and a reference string refSi,jis uniquely specified by a displacement vector(DVh,DVv)=(i⁃m,j⁃n).In Fig.3,DVh=i⁃m is the horizontal displacement be⁃tween curSm,nand refSi,j,and DVv=j⁃n is the vertical displace⁃ment between curSm,nand refSi,j.

reference string refSi,j={Si,j(k):0≤k<L}for a given current string curSm,n={Sm,n(k):0≤k<L}satisfies the constraint that the difference between Si,j(k)and Sm,n(k)is within a predetermined threshold for all k.One common option used for difference measurement is the absolute difference|Si,j(k)Y⁃Sm,n(k)Y|,|Si,j(k)U⁃Sm,n(k)U|,and|Si,j(k)V⁃Sm,n(k)V|,where the sub⁃scripts Y,U,V designate the Y color component,U col⁃or component,and V color component for the corresponding pixels Si,j(k)or Sm,n(k).

3 String-Matching Oriented Hash-Table Framework

▲Figure 3.An example of current string and its reference string.

String⁃matching coding performance heavily depends on the PRB searching range.The larger the searching range is,the better the coding performance can achieve.However,the lon⁃ ger the searching time is,the more the computing power re⁃quires.Hash⁃table can be used to speed up reference string searching.Therefore,the key to design a practical string⁃matching encoding subsystem is to have a single and efficient string searching oriented hash⁃table that should work and speed up the searching in all two matching modes.

In the string⁃matching oriented hash⁃table framework,the basic role of a hash⁃table is to quickly find the first matching reference pixel in the PRB searching range by table⁃look⁃up. First of all,we need to define a pixel⁃order for all pixels in the PRB searching range.Naturally,we use the order defined in the horizontally scanned string matching mode,that is,all pix⁃els are ordered one LCU followed by another LCU in LCU cod⁃ing order.Within an LCU,horizontal scanning is employed to order pixels.All pixels with the same hash⁃value are chained together,following the pixel⁃order.The hash⁃value of any cur⁃rent pixel being coded is calculated and the encoder only needs to search through the hash chain of the same hash⁃value,instead of all pixels of the entire PRB searching range,to find a potential matching reference pixel.This potential pixel is the first pixel of a potential reference string.In this way,the searching time can be significantly reduced.

In Fig.4,the pixels(in the searching range)lined up in pixel⁃order are P0,0,P1,0,P2,0,P3,0,...,Pi,j,Pi+1,j,Pi+2,j,Pi+3,j,...,Pi,j+1,...,followed by the current pixel being coded,Pm,n.The hash val⁃ues ha_val0,0,ha_val1,0,ha_val2,0,ha_val3,0,...,ha_vali,j,ha_vali+1,j,ha_vali+2,j,ha_vali+3,j,...,ha_vali,j+1are computed from the corre⁃sponding pixels.There are three hash chain examples in this figure.

1)Example 1

Three pixels P0,0,Pi,j,and Pi+2,jhave the same hash value,i.e. ha_val0,0=ha_vali,j=ha_vali+2,j=x.Coordinates(0,0),(i,j),and(i+2,j)of the three pixels are stored in the hash⁃table and form a hash chain of hash⁃value x.The hash⁃table actually has two parts:chain⁃head and chain⁃body.Among the three coordi⁃ nates,the coordinates(i+2,j)having the largest pix⁃el⁃order is stored in the chain⁃head slot of address x.The coordinates(i,j)having the second largest pixel⁃order is stored in chain⁃body location of coor⁃dinates(i+2,j).The coordinates(0,0)having the third largest pixel⁃order is stored in the chain⁃body location of coordinates(i,j).Finally,⁃1 is stored in the chain⁃body location of coordinates(0,0)to indi⁃cate the end of the chain.

2)Example 2

Two pixels P2,0and Pi+1,jhave the same hash val⁃ue,i.e.ha_val2,0=ha_vali+1,j=y.Coordinates(2,0)and(i+1,j)form a hash chain of hash⁃value y in the hash⁃table.In the hash chain,the coordinates(i+1,j)having the largest pixel⁃order is stored in the chain⁃head slot of address y.The coordinates(2,0)having the second largest pixel⁃order is stored in the chain⁃body location of coordinates(i+1,j).Finally,⁃1 isstored in the chain⁃body location of coordinates(2,0)to indi⁃cate the end of the chain.

▲Figure 4.Hash⁃table structure and contents.

3)Example 3

The third hash chain has only one pixel P1,0whose hash val⁃ue is ha_val1,0=z.Therefore,the coordinates(1,0)is stored in the chain⁃head slot of address z and⁃1 is stored in the chain⁃body location of coordinates(1,0)to indicate the end of the chain.

When the current pixel Pm,nis encoded and the current string starts from Pm,n,the hash value ha_valm,nof Pm,nis first computed and then used as the chain⁃head address to find the 1st coordinates in the hash chain of the same hash value.In the hash chain examples in 0,if ha_valm,n=x,the 1st coordi⁃nates is(i+2,j);If ha_valm,n=y,the 1st coordinates is(i+1,j);If ha_valm,n=z,the 1st coordinates is(1,0).The content of the 1st coordinates in the chain⁃body is the 2nd coordinates in the hash chain of the same hash value,and the content of the 2nd coordinates in the chain⁃body is the 3rd coordinates in the hash chain of the same hash value,and so on.Therefore,the string⁃matching encoder can use the hash value of the current pixel being coded to find all pixels and their locations that have the same hash value in the searching range.Moreover,a hash chain starts from a hash⁃head slot,the 1st coordinates have the largest pixel⁃order,the 2nd coordinates have the sec⁃ond largest pixel⁃order,and so on.

For a pixel P=(Y,U,V)or(G,B,R),where Y,U,V(or G,B,R)are three 8⁃bit color components of P,a 12⁃bit hash val⁃ ue ha_val is computed.For lossy coding,ha_val is computed by concatenating 4⁃bit most significant bit(MSB)of Y,U,V(or G,B,R)to form a 12⁃bit value.For lossless coding,ha_val is computed by cyclic redundancy check(CRC)⁃12 algorithm[34]to get a 12⁃bit value.Obviously,all the pixels are always located in the same hash chain,no matter they have the identi⁃cal 4⁃bit MSB component value in a lossy coding case or the identical 8⁃bit component value in a lossless coding case. Therefore,a hash chain with the same hash value provides a small and efficient candidate set of reference pixels for the starting pixel of a potential reference string.

4 PRB and SRB Based String-Matching

In the proposed string⁃matching technique,a current string being coded gets reference pixels from either PRB or SRB. When a current string gets the reference pixels from PRB,the reference pixels form a reference string that has exactly the same 2D shape and length(number of pixels)as the current string.The reference string can be in any valid location inside PRB.When a current string gets the reference pixels from SRB,it actually gets only one single reference pixel color from SRB LUT and all pixels of the entire string has the same color.

▲Figure 5.A CU coded by string⁃matching technique.

A CU coded by the string⁃matching technique can have both PRB and SRB strings(Fig.5).In the figure,the 1st string in purple dots is a 4⁃pixel PRB string.Its reference string is just above it and has the same horizontal line shape and length of 4pixels.The 2nd string in red dots is a 2⁃pixel SRB string.Its reference pixels are the 1st SRB LUT pixel color duplicated twice.The 3rd string in purple dots is a 43⁃pixel PRB string. Its reference string is located across the boundary between cur⁃rent coding tree unit(CTU)and above CTU.The PRB string and its reference string have exactly the same horizontal tra⁃verse⁃scan shape and length of 43 pixels.The 4th string in red dots is a 6⁃pixel SRB string.Its reference pixels are the 15th SRB LUT pixel color duplicated six times.The 5th string in purple dots is a 5⁃pixel PRB string.Its reference string is locat⁃ed across the boundary between current CTU and left CTU. The PRB string and its reference string have exactly the same horizontal line shape and length of 5 pixels.

In a string⁃matching encoder,the best matching reference string for a current starting pixel Pm,nbeing coded is found via the following steps:

1)Searching the best reference PRB string for a current string curSm,nvia three sub⁃steps:

a)Calculating the hash value ha_valm,nof Pm,n.

b)For the pixel coordinates(i,j)obtained from the hash chain with the same hash value ha_valm,n,determining the lon⁃gest matching reference string refSi,j={Si,j(k):0≤k<Li,j},i.e.de⁃termining the largest Li,jthat satisfies the constraint|Si,j(k)Y⁃Sm,n(k)Y|≤E,|Si,j(k)U⁃Sm,n(k)U|≤E,|Si,j(k)V⁃Sm,n(k)V|≤E for all k<Li,j,where E is a predetermined threshold.After going through all the pixel coordinates(i,j)on the hash chain of hash value ha_valm,n,multiple matching reference strings are found as best reference PRB string candidates.

c)The best reference PRB string is selected from the best reference PRB string candidates,based on average RD cost evaluation.For a given current string curSm,nand its reference string refSi,jof length L,the average RD cost is calculated by(1)

where bits(refSi,j)is the number of bits for coding the reference string refSi,j,λ is a weighting factor,and distortion(curSm,n,refSi,j)is the distortion between the current string curSm,nand the refer⁃ence string refSi,j.

2)Searching the best reference SRB string,which is straightfor⁃ward because a reference SRB string is a simple duplication of an SRB pixel color.

3)Either the best reference PRB string or the best reference SRB string is selected as the best matching reference string based on average RD cost evaluation.

If no PRB string or SRB string can be found for the current pixel Pm,nbeing coded,the pixel itself is coded directly as an unmatched pixel.

5 Experimental Results

As an implementation example,the proposed string⁃match⁃ing technique is integrated into HM⁃16.4+SCM⁃4.0 reference software[31].All experimental results were generated under the common test conditions and configurations defined in[30]for HEVC SCC project.

Thirteen test sequences(Table 1)are used in the experi⁃ment.The test sequences are classified into four categories:text and graphics with motion(TGM),mixed content(MC),camera captured(CC),and animation(ANI).Each test se⁃quence has a RGB color format version and a YCbCr(YUV)color format version.Therefore,there are 26 sequences in total used in the experiment.

To evaluate the overall coding performance,the HEVC BD⁃rate metric[32],[33]is used for lossy coding and bit⁃rate sav⁃ing metric is used for lossless coding.In lossy coding,an aver⁃age BD⁃rate reduction is calculated by three color components G/Y,B/U,and R/V for each color format and category.In loss⁃less coding,four bit⁃rate saving values(total,average,mini⁃mum,and maximum)are calculated for each color format and category.Both lossy coding and lossless coding experiments used three configurations:all intra(AI),random access(RA),and low delay B(LB).Encoding and decoding software runtime were also compared for evaluating the complexity of the encod⁃er and decoder.

Two coding methods were compared:

1)HM⁃16.4+SCM⁃4.0(SCM)with default setting.In particular,the IBC search range is full frame.

2)HM⁃16.4+SCM⁃4.0 integrated with string⁃matching tech⁃nique(SCM⁃SM).The string⁃matching search range is the current LCU and left LCU.

Table 2 shows coding performance comparison(BD⁃rate re⁃duction percentage in negative numbers)between SCM and SCM⁃SM in the lossy coding case.Table 3 shows coding per⁃formance comparison(bit⁃rate saving percentage in negative numbers)between SCM and SCM⁃SM in the lossless coding case.Each row of data in the two tables corresponds to a com⁃bination of one color format and one category.There are totally eight combinations.Each combination contains 1-7 sequenc⁃es.The encoding and decoding runtime ratios are also shown in the tables.

▼Table 1.Four⁃category test sequences

▼Table 2.Performance comparison between SCM and SCM⁃SM in the lossy coding case

▼Table 3.Performance comparison between SCM and SCM⁃SM in lossless coding case

The experimental results include:

1)In the lossy coding case,SCM⁃SM can achieve up to 4.33% for AI,2.68%for RA,2.04%for LB average BD⁃rate reduc⁃tion over SCM.

2)In the lossless coding case,SCM⁃SM can achieve up to 14.5%for AI,9.9%for RA,6.46%for LB maximum bit⁃rate saving and 5.17%for AI,3.47%for RA,2.76%for LB aver⁃age bit⁃rate saving over SCM.

3)The improvement of SCM⁃SM over SCM depends on the con⁃tents.In YUV TGM case,the average bit⁃rate saving is 5.17%in lossless coding case and the average BD⁃rate re⁃duction of components Y,U and V are 4.19%,4.31%and 4.33%,respectively in lossy coding case for AI configura⁃tion.SCM⁃SM also has good BD⁃rate reduction over SCM for mixed content in all configurations,but no coding perfor⁃mance improvement for camera captured and animation con⁃tents.

4)In the lossy coding case,encoding runtime increase is about 22%for AI,11%for RA,7%for LB.In lossless coding case,encoding runtime increase is about 52%for AI,11% for RA,7%for LB

6 Conclusions

This paper proposes a string⁃matching coding technique for SCC.Both PRB and SRB are used for string⁃matching.The re⁃sulting bit-stream is a combination of PRB string coding param⁃eters,SRB string coding parameters,and unmatched pixelmixed on a string⁃by⁃string basis.The experiments using com⁃mon test condition[30]show that the string⁃matching coding technique can achieve a lossy coding BD⁃rate reduction of up to 4.33%,2.68%,2.04%for AI,RA,and LB configurations re⁃spectively,and a lossless coding average bit⁃rate saving of up to 5.17%,3.47%,2.76%for AI,RA,and LB configurations re⁃spectively.

Our future work includes:1)implementing rate⁃control in string⁃matching enhanced coding system;2)optimizing string⁃matching coding to improve coding performance and reduce coding complexity further;3)achieving idempotent(re⁃loss⁃free)[35]coding in repetitive(multi⁃generation)compression using the string⁃matching enhanced coding system.

References

[1]T.Lin,K.Zhou,and S.Wang.“Cloudlet⁃screen computing:a client⁃server ar⁃chitecture with top graphics performance,”International Journal of Ad Hoc and Ubiquitous Computing,vol.13,no.2,pp.96-108,Jun.2013.doi:10.1504/IJA⁃HUC.2013.054174.

[2]Y.Lu,S.Li,and H.Shen,“Virtualized screen:a third element for cloud_mobile convergence,”IEEE Multimedia,vol.18,no.2,pp.4-11,Apr.2011.doi:10.1109/MMUL.2011.33.

[3]T.Lin and S.Wang.“Cloudlet⁃screen computing:a multi⁃core⁃based,cloud⁃computing⁃oriented,traditional⁃computing⁃compatible parallel computing par⁃adigm for the masses,”in IEEE International Conference on Multimedia and Ex⁃po,New York,USA,Jul.2009,pp.1805-1808.doi:10.1109/ICME.2009. 5202873.

[4]T.Lin,K.Zhou,X.Chen,and S.Wang,“Arbitrary shape matching for screen content coding,”Picture Coding Symposium,San Jose,USA,Dec.2013,pp. 369-372.doi:10.1109/PCS.2013.6737760.

[5]W.Zhu,J.Xu,W.Ding,et al.,“Adaptive LZMA⁃based coding for screen con⁃tent,”Picture Coding Symposium,San Jose,USA,Dec.2013,pp.373-376.doi:10.1109/PCS.2013.6737761.

[6]T.Lin,X.Chen,and S.Wang,“Pseudo⁃2⁃D⁃matching based dual⁃coder archi⁃tecture for screen contents coding,”IEEE International Conference on Multime⁃dia and Expo Workshops,San Jose,USA,Jul.2013,pp.1-4.doi:10.1109/IC⁃MEW.2013.6618315.

[7]S.Wang and T.Lin,“Compound image compression based on unified LZ and hy⁃brid coding,”IET Image Processing,vol.7,no.5,pp.484-499,May 2013.doi:10.1049/iet⁃ipr.2012.0439.

[8]T.Lin,P.Zhang,S.Wang,et al.,“Mixed chroma sampling⁃rate high efficiency video coding for full⁃chroma screen content,”IEEE Transactions on Circuits and Systems for Video Technology,vol.23,no.1,pp.173-185,Jan.2013.doi:10.1109/TCSVT.2012.2223871.

[9]W.Zhu,W.Ding,R.Xiong,et al.,“Compound image compression by multi⁃stage prediction,”IEEE Visual Communications and Image Processing Confer⁃ence,San Diego,USA,Nov.2012,pp.1-6.doi:10.1109/VCIP.2012.6410758.

[10]S.Wang and T.Lin,“United coding method for compound image compres⁃sion,”Multimedia Tools and Applications,vol.71,no.3,pp.1263-1282,Aug. 2014.doi:10.1007/s11042⁃012⁃1274⁃y.

[11]S.Wang and T.Lin,“United coding for compound image compression,”3rd In⁃ternational Congress on Image and Signal Processing,Yantai,China,pp.566-570,Oct.2010.doi:10.1109/CISP.2010.5647270.

[12]S.Wang and T.Lin,“A unified LZ and hybrid coding for compound image par⁃tial⁃lossless compression,”2nd International Conference on Image and Signal Processing,Tianjin,China,Oct.2009,pp.1-5.doi:10.1109/CISP.2009. 5301019.

[13]W.Ding,Y.Lu,and F.Wu,“Enable efficient compound image compression in h.264/AVC intra coding,”IEEE International Conference on Image Processing,San Antonio,USA,Oct.2007,vol.2,pp.337-340.doi:10.1109/ICIP.2007. 4379161.

[14]ISO/IEC JTC1/SC29/WG11 and ITU⁃T Q6/16,“Joint call for proposals for cod⁃ing of screen content,”ISO/IEC JTC 1/SC 29/WG 11(MPEG),San Jose,USA,Doc.N14175,Jan.2014.

[15]D.⁃K.Kwon and M.Budagavi,“Intra motion compensation with variable length intra MV coding,”JCT⁃VC,Vienna,Austria,JCTVC⁃N0206,Jul.2013.

[16]C.Pang,J.Sole,L.Guo,et al.,“Intra motion compensation with 2⁃D MVs,”JCT⁃VC,Vienna,Austria,JCTVC⁃N0256,Jul.2013.

[17]C.Pang,J.Sole,L.Guo,et al.,“Displacement vector signaling for intra block copying,”JCT⁃VC,Geneva,Switzerland,JCTVC⁃O0154,Oct.2013.

[18]T.Lin,S.Wang,P.Zhang,and K.Zhou,“AHG7:full⁃chroma(YUV444)dic⁃tionary+hybrid dual⁃coder extension of HEVC,”JCT⁃VC,Shanghai,China,JCTVC⁃K0133,Oct.2012.

[19]M.Budagavi,“Cross⁃check of JCTVC⁃K0133(dictionary+hybrid dual⁃coder extension of HEVC),”JCT⁃VC,Shanghai,China,JCTVC⁃K0329,Oct.2012.

[20]T.Lin,S.Wang,P.Zhang,and K.Zhou,“AHG8:P2M based dual⁃coder exten⁃sion of HEVC,”JCT⁃VC,Geneva,Switzerland,JCTVC⁃L0303,Jan.2013.

[21]J.Ye,S.Liu,S.Lei,et al.,“Improvements on 1D dictionary coding,”JCT⁃VC,Valencia,Spain,JCTVC⁃Q0124,2014.

[22]R.Cohen,“Crosscheck for JCTVC⁃Q0124 improvements on 1D dictionary cod⁃ing mode,”JCT⁃VC,Valencia,Spain,JCTVC⁃Q0125,2014.

[23]L.Zhao,X.Chen,T.Lin,et al.,“SCCE4:Results of subtest 3.3,”JCT⁃VC,Sapporo,Japan,JCTVC⁃R0060,2014.

[24]B.Li and J.Xu,“Cross⁃check of test 3.3(JCTVC⁃R0060),”JCT⁃VC,Sappo⁃ro,Japan,JCTVC⁃R0109,2014.

[25]K.Zhou,L.Zhao,X.Chen,and T.Lin,“Non⁃CE10:improvement on coding of ISC parameters and comparison to Palette coding,”JCT⁃VC,Strasbourg,France,JCTVC⁃S0250,Oct.2014.

[26]L.Zhao,K.Zhou,and T.Lin,“CE3:results of test B.4.2(minimum string length of 20 pixels)on intra string copy,”JCT⁃VC,Geneva,Switzerland,JCT⁃VC⁃T0136,Feb.2015.

[27]C.⁃H.Hung,Y.⁃J.Chang,J.⁃S.Tu,et al.,“CE3:crosscheck of CE3 test B.4.2(JCTVC⁃T0136),”JCT⁃VC,Geneva,Switzerland,JCTVC⁃T0179,Feb.2015.

[28]L.Zhao,K.Zhou,S.Wang,and T.Lin,“Non⁃CE3:improvement on intra string copy,”JCT⁃VC,Geneva,Switzerland,JCTVC⁃T0139,Feb.2015.

[29]R.⁃L.Liao,C.⁃C.Chen,W.⁃H.Peng,and H.⁃M.Hang,“Crosscheck of non⁃CE3:improvement on intra string copy(JCTVC⁃T0139),”JCT⁃VC,Geneva,Switzerland,JCTVC⁃T0200,Feb.2015.

[30]H.Yu,R.Cohen,K.Rapaka,and J.Xu,“Common conditions for screen con⁃tent coding tests,”JCT⁃VC,Strasbourg,France,JCTVC⁃S1015,Oct.2014.

[31]Heinrich Hertz Institute.(2015).Rec.ITU⁃T H.265|ISO/IEC 23008⁃2 High Efficiency Video Coding[Online].Available:https://hevc.hhi.fraunhofer.de/svn/ svn_HEVCSoftware/tags/HM⁃16.4+SCM⁃4.0

[32]G.Bjøntegaard,“Calculation of average PSNR differences between RD⁃Curves,”ITU⁃T SG16 Q.6 Document,VCEG⁃M33,Austin,USA,Apr.2001.

[33]G.Bjøntegaard,“Improvements of the BD⁃PSNR model,”ITU⁃T SG16 Q.6 Document,VCEG⁃AI11,Berlin,Germany,Jul.2008.

[34]T.V.Ramabadran and S.S.Gaitonde,“A tutorial on CRC computations,”IEEE Micro,vol.8,no.4,pp.62-75,Aug.1988.doi:10.1109/40.7773.

[35]T.Lin,“Achieving re⁃loss⁃free video coding,”IEEE Signal Processing Letters,vol.16,no.4,pp.323-326,Apr.2009.doi:10.1109/LSP.2009.2014285.

Manuscript received:2015⁃05⁃13

Biographies

Tao Lino Lin(lintao@tongji.edu.cn)received his MS and PhD degrees from Tohoku Uni⁃versity,Japan,in 1985 and 1989.He has been with VLSI Lab,Tongji University,China since 2003.In 2005,he was awarded“Chang Jiang Scholars”,the highest honor given by China Ministry of Education.From 1988 to 2002,he was a postdoc⁃toral researcher with University of California,Berkeley,and developed multimedia ICs and products at several companies in Silicon Valley,including Integrated De⁃vice Technology,Inc.,PMC⁃Sierra Inc.,Cypress Semiconductor Corp.,and NeoMag⁃ic Corp.He has been granted 24 US patents and 14 China patents.His current re⁃search interests include cloud⁃mobile computing,digital signal processing,audiovi⁃sual coding,and multimedia SoC design.

Kailun Zhou Zhou(vlsi@tongji.edu.cn)received his MS degree from Shanghai Jiaotong University,China in 2003.He is currently pursuing the PhD degree with Tongji Uni⁃versity,China.His current research interests include embedded system design,vid⁃eo coding,and ASIC architecture,design and verification.

Liping Zhao Zhao(vlsi@tongji.edu.cn)received her MS degree in computer science and technology from Hunan University,China in 2009.She is currently a PhD candidate in control science and engineering at VLSI lab of Tongji University,China.Her cur⁃rent research interests include screen content coding and video coding.

This work was supported in part by National Natural Science Foundation of China under Grant No.61201226 and 61271096,Natural Science Foundation of Shanghai under Grant No.12ZR1433800,and Specialized Research Fund for the Doctoral Program under Grant No.20130072110054.