path: root/libavcodec/mips
Commit message | Author | Age
* avutil/mips: [loongson] simplify macro TRANSPOSE_4H and TRANSPOSE_8B | Shiyou Yin | 2018-09-09
  Simplify macro TRANSPOSE_4H in mmiutils.h and add TRANSPOSE_8B as a common macro.
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: [loongson] optimize vp8 decoding in vp8dsp. | gxw | 2018-09-09
  Optimize the vp8 loop filter with mmi. Four functions are optimized:
  1. ff_vp8_h_loop_filter8uv_mmi
  2. ff_vp8_v_loop_filter8uv_mmi
  3. ff_vp8_h_loop_filter16_mmi
  4. ff_vp8_v_loop_filter16_mmi
  Vp8 decoding speed improved by about 50% (from 73 fps to 110 fps, tested on Loongson 3A3000).
  Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: [loongson] fix improper use of register constraints. | Shiyou Yin | 2018-09-07
  Constraint "g" means the compiler can store the variable in memory or in a register. If a variable with constraint "g" is used as an operand of an instruction that only supports register operands, the build may fail with an "invalid operands" error.
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: [loongson] reoptimize put and add pixels clamped functions. | Shiyou Yin | 2018-09-05
  Simplify the usage of the intermediate variable addr and remove the unused variable all64 in the following functions:
  1. ff_put_pixels_clamped_mmi
  2. ff_put_signed_pixels_clamped_mmi
  3. ff_add_pixels_clamped_mmi
  This optimization speeds up mpeg4 decoding by about 2% on the Loongson platform (tested with 3A3000).
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: [loongson] simplify the usage of intermediate variable addr. | Shiyou Yin | 2018-09-04
  Simplify the usage of the intermediate variable addr in the following functions:
  1. ff_put_pixels4_8_mmi
  2. ff_put_pixels8_8_mmi
  3. ff_put_pixels16_8_mmi
  4. ff_avg_pixels16_8_mmi
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec: [loongson] fix bug of mss2-wmv failed in fate test. | Shiyou Yin | 2018-09-04
  Failed case: mss2-wmv
  In the following functions, pmullh was used to multiply two 16-bit values, which can overflow:
  1. ff_vc1_inv_trans_8x8_dc_mmi
  2. ff_vc1_inv_trans_8x8_mmi
  3. ff_vc1_inv_trans_8x4_mmi
  4. ff_vc1_inv_trans_4x8_mmi
  5. ff_vc1_inv_trans_4x4_mmi
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
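The overflow named in the commit above can be illustrated in portable C. This is a minimal sketch, not code from the FFmpeg sources: it assumes that pmullh, like other packed "multiply low" instructions, keeps only the low 16 bits of each 16-bit by 16-bit product. The helper names `mul_low16`, `mul_full32` and `pmullh_overflow_demo`, and the sample operands, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one lane of a packed "multiply low halfword" operation
 * (assumed behaviour): the 32-bit product is truncated to 16 bits
 * and reinterpreted as a signed value. */
static int16_t mul_low16(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    uint16_t low = (uint16_t)((uint32_t)product & 0xFFFFu);
    /* Explicit two's-complement reinterpretation, no implementation-defined cast. */
    return (int16_t)(low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low);
}

/* Reference: the same multiplication kept at full 32-bit precision. */
static int32_t mul_full32(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Hypothetical operands: 3000 * 12 = 36000 does not fit in int16_t. */
void pmullh_overflow_demo(void)
{
    assert(mul_full32(3000, 12) == 36000);
    assert(mul_low16(3000, 12) == -29536); /* 36000 - 65536: wrapped */
    assert(mul_low16(100, 12) == 1200);    /* small products are unaffected */
}
```

Under that assumption, any product outside the int16_t range silently wraps, which is why such a multiplication needs a wider intermediate.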
* avcodec/mips: [loongson] optimize memset in h264dsp. | Shiyou Yin | 2018-09-02
  Optimize memset with mmi in the following functions:
  1. ff_h264_add_pixels4_8_mmi
  2. ff_h264_idct_add_8_mmi
  3. ff_h264_idct8_add_8_mmi
  This optimization improved h264 decoding performance by about 1.3% (tested on Loongson 3A3000).
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: [loongson] reoptimize h264_chroma_mc8_mmi v2. | Shiyou Yin | 2018-09-02
  Reoptimize functions ff_put_h264_chroma_mc8_mmi and ff_avg_h264_chroma_mc8_mmi.
  Performance of h264 decoding improved by about 5% (from 69 fps to 73 fps, tested on Loongson 3A3000).
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: [loongson] reoptimize simple idct with mmi. | Shiyou Yin | 2018-09-02
  Performance of mpeg4 decoding improved by about 23% (from 128 fps to 158 fps, tested on Loongson 3A3000).
  Reoptimized the following functions with mmi:
  1. ff_simple_idct_put_8_mmi
  2. ff_simple_idct_add_8_mmi
  3. ff_simple_idct_8_mmi
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: fix conflicting types error of ff_vc1_h_s_overlap_mmi. | Shiyou Yin | 2018-07-14
  In commit 975a1a8, function ff_vc1_h_s_overlap_mmi was refactored, but its declaration in libavcodec/mips/vc1dsp_mips.h was left unchanged.
  Change-Id: I90beae683511622a0cc1130ab1660ac8669ec3ef
  Signed-off-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
  Reviewed-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/vc1: fix overlap filter for frame interlaced pictures | Jerome Borsboom | 2018-06-29
  The overlap filter is not correct for vertical edges in frame interlaced I and P pictures. When filtering macroblocks with different FIELDTX values, we have to match the lines at both sides of the vertical border. In addition, we have to use the correct rounding values, depending on the line we are filtering.
  Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
* avcodec/mips: Improve hevc non-uni hz and vt mc msa functions | Kaustubh Raste | 2017-11-14
  Use mask buffer.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: cleanup unused macros | Kaustubh Raste | 2017-11-14
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc non-uni hv mc msa functions | Kaustubh Raste | 2017-11-08
  Use mask buffer.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni weighted 4 tap vt mc msa functions | Kaustubh Raste | 2017-11-08
  Use global mask buffer for appropriate mask load. Use immediate unsigned saturation for clip to max saving one vector register. Remove unused macro.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
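The "immediate unsigned saturation" idea that recurs in these commits can be sketched with a scalar model. The code below is an illustration only, not the MSA code from the commits: it assumes the saturation instruction clamps an unsigned lane to a bit width encoded in the instruction itself, so no register has to hold the maximum value. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Clip to [0, max_val] with the upper bound passed in as a value.
 * In vector code this bound would occupy a register. */
static int16_t clip_with_max_register(int16_t v, int16_t max_val)
{
    if (v < 0)
        v = 0;
    if (v > max_val)
        v = max_val;
    return v;
}

/* Scalar model of unsigned saturation to `bits` bits; the bit count
 * plays the role of the instruction's immediate operand. */
static uint16_t sat_u16(uint16_t v, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    uint16_t limit = (uint16_t)((1u << bits) - 1u);
    return v > limit ? limit : v;
}

/* Clip to [0, 255]: clamp negatives to zero, then saturate to 8 bits. */
static int16_t clip_with_saturation(int16_t v)
{
    if (v < 0)
        v = 0;
    return (int16_t)sat_u16((uint16_t)v, 8);
}
```

Both clip functions return the same result for an 8-bit maximum; the second form needs no separate operand for the value 255.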
* avcodec/mips: Improve hevc uni 4 tap hv mc msa functions | Kaustubh Raste | 2017-11-08
  Use global mask buffer for appropriate mask load. Remove unused macro and table.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi wgt 4 tap hv mc msa functions | Kaustubh Raste | 2017-11-08
  Use global mask buffer for appropriate mask load. Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi 4 tap hv mc msa functions | Kaustubh Raste | 2017-11-07
  Use global mask buffer for appropriate mask load. Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc avg mc 10, 30, 01 and 03 msa functions | Kaustubh Raste | 2017-11-07
  Align the mask buffer to 64 bytes. Load the specific destination bytes instead of MSA load and pack. Remove unused macros and functions.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni weighted 4 tap hz mc msa functions | Kaustubh Raste | 2017-11-05
  Use global mask buffer for appropriate mask load. Use immediate unsigned saturation for clip to max saving one vector register. Remove unused macro.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni 4 tap hz and vt mc msa functions | Kaustubh Raste | 2017-11-05
  Use global mask buffer for appropriate mask load.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi wgt 4 tap hz and vt mc msa functions | Kaustubh Raste | 2017-11-04
  Use global mask buffer for appropriate mask load.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi 4 tap hz and vt mc msa functions | Kaustubh Raste | 2017-11-04
  Use global mask buffer for appropriate mask load.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc avg mc 20, 21 and 23 msa functions | Kaustubh Raste | 2017-11-04
  Load the specific destination bytes instead of MSA load and pack. Remove unused macros and functions.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni weighted hv mc msa functions | Kaustubh Raste | 2017-11-03
  Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc avg mc 02, 12 and 32 msa functions | Kaustubh Raste | 2017-11-03
  Remove loops and unroll as block sizes are known. Load the specific destination bytes instead of MSA load and pack. Remove unused macro and functions.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
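Several commits in this log replace a generic routine with a block-size-specific one and unroll its loops. The scalar sketch below shows the general idea only; it is not the MSA code from these commits. It assumes a rounded average of source and destination pixels, and the names `avg_block_generic`, `avg_row4` and `avg_block_4x4` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic version: width and height are runtime parameters, so both
 * loops stay in the generated code. */
static void avg_block_generic(uint8_t *dst, const uint8_t *src,
                              ptrdiff_t stride, int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (uint8_t)((dst[x] + src[x] + 1) >> 1);
        dst += stride;
        src += stride;
    }
}

/* One row of four pixels, written out without a loop. */
static void avg_row4(uint8_t *dst, const uint8_t *src)
{
    dst[0] = (uint8_t)((dst[0] + src[0] + 1) >> 1);
    dst[1] = (uint8_t)((dst[1] + src[1] + 1) >> 1);
    dst[2] = (uint8_t)((dst[2] + src[2] + 1) >> 1);
    dst[3] = (uint8_t)((dst[3] + src[3] + 1) >> 1);
}

/* Block-size-specific version: the 4x4 size is fixed, so the rows are
 * unrolled and no loop counters are needed. */
static void avg_block_4x4(uint8_t *dst, const uint8_t *src, ptrdiff_t stride)
{
    assert(dst != NULL && src != NULL);
    avg_row4(dst, src);
    avg_row4(dst + stride, src + stride);
    avg_row4(dst + 2 * stride, src + 2 * stride);
    avg_row4(dst + 3 * stride, src + 3 * stride);
}
```

For a 4x4 block both versions write the same pixels; the specialised one trades generality for straight-line code.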
* avcodec/mips: Improve hevc uni vt and hv mc msa functions | Kaustubh Raste | 2017-11-01
  Remove unused macro.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi hz and hv mc msa functions | Kaustubh Raste | 2017-11-01
  Align the mask buffer.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi weighted copy, hz and vt mc msa functions | Kaustubh Raste | 2017-11-01
  Pack the data to half word before clipping. Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc chroma avg hv mc msa functions | Kaustubh Raste | 2017-10-30
  Replace generic with block size specific function. Load the specific destination bytes instead of MSA load and pack.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc avg mc 22, 11, 31, 13 and 33 msa functions | Kaustubh Raste | 2017-10-30
  Remove loops and unroll as block sizes are known. Load the specific destination bytes instead of MSA load and pack. Remove unused macro and functions.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi weighted hv mc msa functions | Kaustubh Raste | 2017-10-25
  Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc chroma copy and avg vert mc msa functions | Kaustubh Raste | 2017-10-25
  Replace generic with block size specific function. Load the specific destination bytes instead of MSA load and pack.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc put mc 11, 31, 13 and 33 msa functions | Kaustubh Raste | 2017-10-25
  Remove loops and unroll as block sizes are known.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni weighted vert mc msa functions | Kaustubh Raste | 2017-10-13
  Pack the data to half word before clipping. Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni horiz mc msa functions | Kaustubh Raste | 2017-10-13
  Update macros to remove adds.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi copy mc msa functions | Kaustubh Raste | 2017-10-13
  Load the specific destination bytes instead of MSA load and pack. Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc put mc 12, 32 and 22 msa functions | Kaustubh Raste | 2017-10-13
  Remove loops and unroll as block sizes are known. Removed unused functions.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc chroma avg horiz mc msa functions | Kaustubh Raste | 2017-10-13
  Replace generic with block size specific function. Load the specific destination bytes instead of MSA load and pack.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc uni copy mc msa functions | Kaustubh Raste | 2017-10-10
  Load the specific bytes instead of MSA load.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni-w horiz mc msa functions | Kaustubh Raste | 2017-10-10
  Load the specific destination bytes instead of MSA load and pack. Pack the data to half word before clipping. Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc put mc 21, 23 and 02 msa functions | Kaustubh Raste | 2017-10-10
  Remove loops and unroll as block sizes are known.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc chroma hv mc msa functions | Kaustubh Raste | 2017-10-10
  Replace generic with block size specific function.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc bi-weighted mc msa functions | Kaustubh Raste | 2017-10-10
  Replace generic with block size specific function.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: preload data in hevc sao edge 135 degree filter msa functions | Kaustubh Raste | 2017-10-10
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Cleanup unused functions | Kaustubh Raste | 2017-10-06
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc put mc 20, 01 and 03 msa functions | Kaustubh Raste | 2017-09-27
  Remove loops and unroll as block sizes are known.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc chroma vert mc msa functions | Kaustubh Raste | 2017-09-27
  Replace generic with block size specific function.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc weighted mc msa functions | Kaustubh Raste | 2017-09-27
  Replace generic with block size specific function.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Removed generic function call in avc intra msa functions | Kaustubh Raste | 2017-09-27
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>