author Janne Grunau <janne-libav@jannau.net> 2015-12-29 12:08:38 +0100
committer Janne Grunau <janne-libav@jannau.net> 2015-12-30 13:37:57 +0100
commit 8563f9887194b07c972c3475d6b51592d77f73f7
tree 97e0ebde5ed5c87ae7f5679137c5719e71eaed64 /libavcodec/x86
parent f0f54117c8f206e8045d301c2eb975b26e9f263d
x86: use emms after ff_int32_to_float_fmul_scalar_sse
Intel's Instruction Set Reference (as of September 2015) clearly states that cvtpi2ps switches to MMX state. Actual CPUs do not switch if the source is a memory location. The Instruction Set Reference from 1999 (Order Number 243191) describes this behaviour, but all later versions I've seen make no distinction between an MMX register and a memory location as the source. The documentation for the matching SSE2 instruction to convert to double (cvtpi2pd) was fixed (see the valgrind bug https://bugs.kde.org/show_bug.cgi?id=210264).

It will take time to get a clarification and fixes in place. In the meantime it makes sense to change ff_int32_to_float_fmul_scalar_sse to be correct according to the documentation. The vast majority of users will have SSE2, so a change to the SSE version has little effect.

Fixes fate-checkasm on x86 valgrind targets.

Valgrind 'bug' reported as https://bugs.kde.org/show_bug.cgi?id=357059
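
The rule the message describes (leave MMX state before returning from code that uses cvtpi2ps) can be sketched in C with compiler intrinsics. This is only an illustration, not the code from this commit: `int32_to_float_scaled` is a hypothetical helper, and it assumes an x86 compiler that provides `<xmmintrin.h>`. `_mm_cvtpi32_ps` and `_mm_empty` are the intrinsics documented for cvtpi2ps and emms, although a compiler may choose different instructions for them.

```c
#include <assert.h>
#include <xmmintrin.h> /* SSE intrinsics; also provides the MMX type __m64 and _mm_empty() */

/* Hypothetical helper (not part of libavcodec): convert n int32 samples to
 * float and scale them by mul. n must be a multiple of 2. */
static void int32_to_float_scaled(float *dst, const int *src, float mul, int n)
{
    const __m128 vmul = _mm_set1_ps(mul);
    float tmp[4];

    assert(n % 2 == 0);

    for (int i = 0; i < n; i += 2) {
        /* Pack two 32-bit integers into an MMX value (src[i] in the low half). */
        __m64 pair = _mm_set_pi32(src[i + 1], src[i]);
        /* Intrinsic for cvtpi2ps: converts the pair into the two low floats. */
        __m128 f = _mm_cvtpi32_ps(_mm_setzero_ps(), pair);
        f = _mm_mul_ps(f, vmul);
        _mm_storeu_ps(tmp, f);
        dst[i]     = tmp[0];
        dst[i + 1] = tmp[1];
    }

    /* Intrinsic for emms: leave MMX state so that later x87 floating-point
     * code sees a clean register stack, as the documentation requires. */
    _mm_empty();
}
```

Calling the helper with the samples 1, -2, 3, 4 and a scale of 0.5 should give 0.5, -1.0, 1.5 and 2.0; the final `_mm_empty()` plays the same role as the `emms` added in the diff below.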
Diffstat (limited to 'libavcodec/x86')
-rw-r--r-- libavcodec/x86/fmtconvert.asm 9
1 files changed, 8 insertions, 1 deletions
diff --git a/libavcodec/x86/fmtconvert.asm b/libavcodec/x86/fmtconvert.asm
index 03833220a0..2a3e4a5f74 100644
--- a/libavcodec/x86/fmtconvert.asm
+++ b/libavcodec/x86/fmtconvert.asm
@@ -61,7 +61,14 @@ cglobal int32_to_float_fmul_scalar, 4, 4, %1, dst, src, mul, len
mova [dstq+lenq+16], m2
add lenq, 32
jl .loop
- REP_RET
+%if notcpuflag(sse2)
+ ;; cvtpi2ps switches to MMX even if the source is a memory location
+ ;; possible an error in documentation since every tested CPU disagrees with
+ ;; that. Use emms anyway since the vast majority of machines will use the
+ ;; SSE2 variant
+ emms
+%endif
+ RET
%endmacro
INIT_XMM sse