path: root/libavutil/mips/generic_macros_msa.h
Commit message (author, date)
* mips: Fix potential illegal instruction error. (Shiyou Yin, 2021-05-07)
  MSA2 optimizations are attached to MSA macros in generic_macros_msa.h, and it is difficult to do a runtime check for them. Removing this code makes it more robust. H264 1080p decoding: 5.13x ==> 5.12x.
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avutil/mips/generic_macros_msa: Fix prob that 'ulw' and 'uld' unsupported by clang. (Shiyou Yin, 2020-07-30)
  GCC supports these two synthesized instructions, but clang does not yet. Use machine instructions instead to accommodate the clang compiler.
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
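  As an illustration only: 'ulw' and 'uld' perform unaligned 32-bit and 64-bit loads. The sketch below models that behavior in portable C with memcpy. It is not the code from this commit; the helper names load_u32_unaligned and load_u64_unaligned are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 32 bits from an address that may be unaligned.
 * memcpy lets the compiler pick whatever load sequence the target supports. */
static uint32_t load_u32_unaligned(const void *psrc)
{
    uint32_t val;

    assert(psrc != NULL);
    memcpy(&val, psrc, sizeof(val));
    return val;
}

/* Hypothetical helper: the 64-bit counterpart of the function above. */
static uint64_t load_u64_unaligned(const void *psrc)
{
    uint64_t val;

    assert(psrc != NULL);
    memcpy(&val, psrc, sizeof(val));
    return val;
}
```

  The loaded value keeps the byte order of the host, so callers that need a fixed endianness still have to convert it.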
* avcodec/mips: msa optimizations for vc1dsp (gxw, 2019-10-30)
  Performance of WMV3 decoding has sped up from 3.66x to 5.23x, tested on 3A4000.
  Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avutil/mips: refactor msa SLDI_Bn_0 and SLDI_Bn macros. (gxw, 2019-09-16)
  Details of the changes:
  1. The previous order of parameters was irregular and difficult to understand. Adjust the order of the parameters according to the rule: (RTYPE, input registers, input mask/input index/..., output registers). Most of the existing msa macros follow the rule.
  2. Remove the redundant macro SLDI_Bn_0 and use SLDI_Bn instead.
  Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avutil/mips: remove redundant code in TRANSPOSE16x8_UB_UB. (Shiyou Yin, 2019-08-15)
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avutil/mips: refine msa macros CLIP_*. (gxw, 2019-08-13)
  Details of the changes:
  1. Remove the local variable 'out_m' in 'CLIP_SH' and store the result in the source vector.
  2. Refine the implementation of macros 'CLIP_SH_0_255' and 'CLIP_SW_0_255'.
     Performance of VP8 decoding has sped up about 1.1% (from 7.03x to 7.11x).
     Performance of H264 decoding has sped up about 0.5% (from 4.35x to 4.37x).
     Performance of Theora decoding has sped up about 0.7% (from 5.79x to 5.83x).
  3. Remove the redundant macro 'CLIP_SH/Wn_0_255_MAX_SATU' and use 'CLIP_SH/Wn_0_255' instead, because there is no difference in the effect of these two macros.
  Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
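  To make the intent of the 0..255 clip concrete, here is a scalar model in portable C. It assumes a 128-bit vector of eight signed 16-bit lanes and writes the result back into the source array, as item 1 describes. The function name clip_s16x8_0_255 is hypothetical, and the real macro operates on MSA vector registers rather than arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define S16_LANES 8  /* a 128-bit vector holds eight 16-bit elements */

/* Hypothetical scalar model: clamp every lane to [0, 255] in place.
 * Step 1 raises negative lanes to 0; step 2 caps the rest at 255,
 * which matches an unsigned saturation to 8 bits. */
static void clip_s16x8_0_255(int16_t v[S16_LANES])
{
    assert(v != NULL);
    for (size_t i = 0; i < S16_LANES; i++) {
        int16_t x = v[i] < 0 ? 0 : v[i];
        v[i] = x > 255 ? 255 : x;
    }
}
```

  Because the lanes are non-negative after the first step, capping at 255 and saturating as unsigned 8-bit give the same result, which is consistent with the commit's statement that the '_MAX_SATU' variants were redundant.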
* avutil/mips: refactor msa load and store macros. (Shiyou Yin, 2019-07-19)
  Replace STnxm_UB and LDnxm_SH with new macros ST_{H/W/D}{1/2/4/8}. The old macros are difficult to use because they don't follow the same parameter passing rules.
  Changing details as following:
  1. remove LD4x4_SH.
  2. replace ST2x4_UB with ST_H4.
  3. replace ST4x2_UB with ST_W2.
  4. replace ST4x4_UB with ST_W4.
  5. replace ST4x8_UB with ST_W8.
  6. replace ST6x4_UB with ST_W2 and ST_H2.
  7. replace ST8x1_UB with ST_D1.
  8. replace ST8x2_UB with ST_D2.
  9. replace ST8x4_UB with ST_D4.
  10. replace ST8x8_UB with ST_D8.
  11. replace ST12x4_UB with ST_D4 and ST_W4.
  Examples of new macro: ST_H4(in, idx0, idx1, idx2, idx3, pdst, stride)
  ST_H4 store four half-word elements in vector 'in' to pdst with stride.
  About the macro name:
  1) 'ST' means store operation.
  2) 'H/W/D' means type of vector element is 'half-word/word/double-word'.
  3) Number '1/2/4/8' means how many elements will be stored.
  About the macro parameter:
  1) 'in0, in1...' 128-bits vector.
  2) 'idx0, idx1...' elements index.
  3) 'pdst' destination pointer to store to
  4) 'stride' stride of each store operation.
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
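  The ST_H4 description above can be modeled in scalar C as follows. This is a sketch under assumptions, not the macro itself: the vector is represented as an array of eight 16-bit elements, 'stride' is taken to be in bytes, and the names store_h and st_h4_model are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define H_LANES 8  /* a 128-bit vector holds eight half-word elements */

/* Hypothetical helper: store element 'idx' of 'in' to a possibly
 * unaligned destination. */
static void store_h(const uint16_t in[H_LANES], int idx, uint8_t *pdst)
{
    assert(idx >= 0 && idx < H_LANES);
    memcpy(pdst, &in[idx], sizeof(in[idx]));
}

/* Hypothetical scalar model of ST_H4: store four half-word elements,
 * selected by idx0..idx3, to pdst, pdst + stride, pdst + 2 * stride
 * and pdst + 3 * stride (stride assumed to be in bytes). */
static void st_h4_model(const uint16_t in[H_LANES],
                        int idx0, int idx1, int idx2, int idx3,
                        uint8_t *pdst, ptrdiff_t stride)
{
    store_h(in, idx0, pdst);
    store_h(in, idx1, pdst + stride);
    store_h(in, idx2, pdst + 2 * stride);
    store_h(in, idx3, pdst + 3 * stride);
}
```

  The parameter order mirrors the one given in the commit message: input vector, element indexes, destination pointer, stride.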
* avutil/mips: optimize UNPCK&SAD macros with MSA2.0 instruction. (Shiyou Yin, 2019-07-10)
  Loongson 3A4000 and 2k1000 support MSA2.0. This patch optimizes SAD_UB2_UH, UNPCK_R_SH_SW, UNPCK_SB_SH and UNPCK_SH_SW with MSA2.0 instructions.
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc bi weighted hv mc msa functions (Kaustubh Raste, 2017-10-25)
  Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc bi-weighted mc msa functions (Kaustubh Raste, 2017-10-10)
  Replace generic with block size specific function.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve avc weighted mc msa functions (Kaustubh Raste, 2017-09-27)
  Replace generic with block size specific function.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc uni-w copy mc msa functions (Kaustubh Raste, 2017-09-24)
  Load the specific destination bytes instead of MSA load and pack. Pack the data to half word before clipping. Use immediate unsigned saturation for clip to max saving one vector register.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve hevc sao band filter msa functions (Kaustubh Raste, 2017-09-15)
  Preload data in band filter 0-8 for better pipeline parallelization.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: Improve vp9 mc msa functions (Kaustubh Raste, 2017-09-08)
  Load the specific destination bytes instead of MSA load and pack.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* libavcodec/mips: Optimize avc idct 4x4 for msa (Kaustubh Raste, 2017-07-25)
  Removed memset call and improved performance.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* libavutil/mips: Updated msa generic macros (Kaustubh Raste, 2017-07-21)
  Reduced msa load-store code.
  Removed inline asm of GP load-store for 64 bit.
  Updated variable names in GP load-store macros for naming consistency.
  Corrected macro descriptions.
  Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
  Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avutil/mips/generic_macros_msa: rename macro variable which causes segfault for mips r6 (Shivraj Patil, 2016-10-05)
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 lpf functions (Shivraj Patil, 2015-07-23)
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for idctdsp functions (Shivraj Patil, 2015-07-07)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for idctdsp functions in new files idctdsp_msa.c and simple_idct_msa.c
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions (Shivraj Patil, 2015-07-06)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions in new file me_cmp_msa.c
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for mpegvideoencdsp functions (Shivraj Patil, 2015-07-06)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideoencdsp functions in new file mpegvideoencdsp_msa.c
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for mpegvideo functions (Shivraj Patil, 2015-07-01)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideo functions in new file mpegvideo_msa.c
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for pixblock functions (Shivraj Patil, 2015-06-29)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for pixblock functions in new file pixblockdsp_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for hpel functions (Shivraj Patil, 2015-06-19)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for hpel functions in new file hpeldsp_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for qpel functions (Shivraj Patil, 2015-06-18)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for qpel functions in new file qpeldsp_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC qpel functions (Shivraj Patil, 2015-06-13)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC qpel functions in new file h264qpel_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Added const to local static array.
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC idct functions (Shivraj Patil, 2015-06-11)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC idct functions in new file h264idct_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC intra prediction functions (Shivraj Patil, 2015-06-11)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC intra prediction functions in new file h264pred_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for AVC chroma mc functions (Shivraj Patil, 2015-06-11)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC chroma mc functions in new file h264chroma_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC intra prediction functions (Shivraj Patil, 2015-06-10)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC intra prediction functions in new file hevcpred_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC loop filter and sao functions (Shivraj Patil, 2015-06-10)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC loop filter and sao functions in new file hevc_lpf_sao_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  In this patch, in comparison with the previous patch, duplicated C functions are removed.
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC idct functions (Shivraj Patil, 2015-06-04)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC idct functions in new file hevc_idct_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC uni mc epel functions (Shivraj Patil, 2015-06-03)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uni mc epel functions.
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC uniw mc functions (Shivraj Patil, 2015-06-03)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uniw mc functions (qpel as well as epel) in new file hevc_mc_uniw_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC bi mc functions (Shivraj Patil, 2015-06-02)
  This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC bi mc functions (qpel as well as epel) in new file hevc_mc_bi_msa.c
  Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
  Adds HEVC specific macros (needed for this patch) in libavcodec/mips/hevc_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: Split uni mc optimizations to new file (Shivraj Patil, 2015-05-28)
  This patch moves the HEVC code for the uni mc cases to new file hevc_mc_uni_msa.c. (There are 5 sub-modules of HEVC mc functions in total. If all of them were added to one single file, its size would be huge (~750k) and difficult to maintain, so the code is split across multiple files.)
  This patch also adds new HEVC header file libavcodec/mips/hevc_macros_msa.h
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avutil/mips: Restructure of generic macros (Shivraj Patil, 2015-05-28)
  This patch includes restructuring of existing macros and addition of more generic macros. This change was necessary to avoid repeated review comments in the remaining patches which we were about to submit. This patch also reduces the number of code lines through maximum use of generic macros, and allows better code alignment and readability.
  These modifications to the commonly used libavutil/mips/generic_macros_msa.h impact the already accepted code, hence it is re-submitted in 2/4, 3/4 & 4/4.
  Overall, this patch set just upgrades the code with styling changes and brings it in sync with the latest MIPS-SIMD optimized codebase at our end.
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC uni copy, uni horizontal and uni vertical mc functions (Shivraj Patil, 2015-05-07)
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for H264 lpf and weight/biweight functions (Shivraj Patil, 2015-05-01)
  Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC copy and hv mc functions (Shivraj Patil, 2015-04-24)
  Incorporated review comment. Removed "__" from volatile.
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avutil/mips/generic_macros_msa: volatile doesnt need __ (Michael Niedermayer, 2015-04-20)
  Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for HEVC horizontal and vertical mc functions (Shivraj Patil, 2015-04-17)
  Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
  Reviewed-by: Nedeljko Babic <Nedeljko.Babic@imgtec.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>