| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
| |
Some new warnings regarding use of empty macro parameters has
been added, so adjust some x86inc code to silence those.
Fixes part of ticket #8771
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
|
| |
|
|
|
|
|
| |
On ELF platforms such symbols needs to be flagged as functions with the
correct visibility to please certain linkers in some scenarios.
|
|
|
|
|
| |
The standard section for read-only data on Windows is .rdata. Nasm will
flag non-standard sections as executable by default which isn't ideal.
|
|
|
|
|
|
|
| |
There are 32 pseudo-instructions for each floating-point comparison
instruction, but only 8 of them are actually valid in legacy-encoded mode.
The remaining 24 requires the use of VEX-encoded (v-prefixed) instructions
and can therefore be disregarded for this purpose.
|
|
|
|
|
|
|
|
|
| |
but not used
Fixes compilation of libavresample/x86/audio_mix.asm
Reviewed-by: Gramner
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AVX-512 consists of a plethora of different extensions, but in order to keep
things a bit more manageable we group together the following extensions
under a single baseline cpu flag which should cover SKL-X and future CPUs:
* AVX-512 Foundation (F)
* AVX-512 Conflict Detection Instructions (CD)
* AVX-512 Byte and Word Instructions (BW)
* AVX-512 Doubleword and Quadword Instructions (DQ)
* AVX-512 Vector Length Extensions (VL)
On x86-64 AVX-512 provides 16 additional vector registers, prefer using
those over existing ones since it allows us to avoid using `vzeroupper`
unless more than 16 vector registers are required. They also happen to
be volatile on Windows which means that we don't need to save and restore
existing xmm register contents unless more than 22 vector registers are
required.
Big thanks to Intel for their support.
|
|\
| |
| |
| |
| |
| |
| | |
* commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6':
asm: Consistently uppercase SECTION markers
Merged-by: James Almer <jamrial@gmail.com>
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When allocating stack space with an alignment requirement that is larger
than the current stack alignment we need to store a copy of the original
stack pointer in order to be able to restore it later.
If we chose to use another register for this purpose we should not pick
eax/rax since it can be overwritten as a return value.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| | |
Allows emulation to work when dst is equal to src2 as long as the
instruction is commutative, e.g. `addps m0, m1, m0`.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The yasm/nasm preprocessor only checks the first token, which means that
parameters such as `dword [rax]` are treated as identifiers, which is
generally not what we want.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| | |
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| | |
Those instructions are not commutative since they only change the first
element in the vector and leave the rest unmodified.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.
Currently only implemented for ELF.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The REP_RET workaround is only needed on old AMD cpus, and the labels clutter
up the symbol table and confuse debugging/profiling tools, so use EQU to
create SHN_ABS symbols instead of creating local labels. Furthermore, skip
the workaround completely in functions that definitely won't run on such cpus.
Note that EQU is just creating a local label when using nasm instead of yasm.
This is probably a bug, but at least it doesn't break anything.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
cpuflags is never undefined any more, it's set to 0 instead.
Also fix an incorrect comment.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| | |
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
When allocating stack space with a larger alignment than the known stack
alignment a temporary register is used for storing the stack pointer.
Ensure that this isn't one of the registers used for passing arguments.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* Correctly handle FMA instructions with memory operands.
* Print a warning if FMA instructions are used without the correct cpuflag.
* Simplify the instantiation code.
* Clarify documentation.
Only the last operand in FMA3 instructions can be a memory operand. When
converting FMA4 instructions to FMA3 instructions we can utilize the fact
that multiply is a commutative operation and reorder operands if necessary
to ensure that a memory operand is used only as the last operand.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| | |
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| | |
Makes it possible to use them in arithmetic expressions.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| | |
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| | |
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| | |
The bug was fixed in 1.3.0, so only perform the workaround in earlier versions.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| | |
Signed-off-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| | |
Signed-off-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Change ALLOC_STACK to always align the stack before allocating stack space for
consistency. Previously alignment would occur either before or after allocating
stack space depending on whether manual alignment was required or not.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Emulation requires a temporary register if arguments 1 and 4 are the same; this
doesn't obey the semantics of the original instruction, so we can't emulate
that in x86inc.
Also add pmacsdql emulation.
Signed-off-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Silences warning(s) like:
libavcodec/x86/fft.asm:93: warning: section flags ignored on
section redeclaration
The cause of this warning is that because `struc` and `endstruc`
attempts to revert to the previous section state [1].
The section state is stored in the macro __SECT__, defined by
x86inc.asm to be `.note.GNU-stack ...`, through the `SECTION`
directive [2].
Thus, the `.note.GNU-stack` section is defined twice
(once in x86inc.asm, once during `endstruc`), causing the warning.
That is the first part of the commit: using the primitive `[section]` format
for .note.GNU-stack etc., which does not update `__SECT__` [2].
That fixes only half of the problem. Even without any `SECTION` directives,
`__SECT__` is predefined as `.text`, which conflicting with the later
`SECTION_TEXT` (which expands to `.text align=16`).
[1]: http://www.nasm.us/doc/nasmdoc6.html#section-6.4
[2]: http://www.nasm.us/doc/nasmdoc6.html#section-6.3
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
|
| |
| |
| |
| |
| |
| | |
Previously there was a limit of two cpuflags.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
| |
| |
| | |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
| |
| |
| |
| |
| | |
This makes more sense for future implementations of templates with zmm registers.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Yasm:
src/libavfilter/x86/af_volume.asm:24: warning: Standard COFF does not support read-only data sections
src/libavfilter/x86/af_volume.asm:24: warning: Unrecognized qualifier `align'
Nasm:
src/libavfilter/x86/af_volume.asm:24: error: standard COFF does not support section alignment specification
src/libavutil/x86/x86inc.asm:92: ... from macro `SECTION_RODATA' defined here
Tested-by: Clément Bœsch <u@pkh.me>
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Simplifies writing assembly code that depends on available instructions.
LZCNT implies SSE2
BMI1 implies AVX+LZCNT
AVX2 implies BMI2
|
| |
| |
| |
| |
| | |
The use of rsp was pretty much hardcoded there and probably didn't work
otherwise with stack_size > 0.
|
| |
| |
| |
| |
| |
| |
| | |
Due to a peculiarity in the ModR/M addressing encoding, the r12 and r13
registers sometimes requires an additional byte when used as a base register.
r14 and r15 doesn't have that issue, so prefer using them.
|
| |
| |
| |
| | |
There's no point in emitting a rep prefix before ret on modern CPUs.
|
| |
| |
| |
| |
| |
| | |
We overload the `call` instruction with a macro, but it would misbehave when
the macro argument wasn't a valid identifier. Fix it by explicitly checking
if the argument is an identifier.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When allocating stack space with an alignment requirement that is larger
than the current stack alignment we need to store a copy of the original
stack pointer in order to be able to restore it later.
If we chose to use another register for this purpose we should not pick
eax/rax since it can be overwritten as a return value.
|
| |
| |
| |
| |
| | |
Allows emulation to work when dst is equal to src2 as long as the
instruction is commutative, e.g. `addps m0, m1, m0`.
|
| |
| |
| |
| |
| |
| | |
The yasm/nasm preprocessor only checks the first token, which means that
parameters such as `dword [rax]` are treated as identifiers, which is
generally not what we want.
|
| | |
|
| |
| |
| |
| |
| | |
Those instructions are not commutative since they only change the first
element in the vector and leave the rest unmodified.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some debuggers/profilers use this metadata to determine which function a
given instruction is in; without it they get can confused by local labels
(if you haven't stripped those). On the other hand, some tools are still
confused even with this metadata. e.g. this fixes `gdb`, but not `perf`.
Currently only implemented for ELF.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The REP_RET workaround is only needed on old AMD cpus, and the labels clutter
up the symbol table and confuse debugging/profiling tools, so use EQU to
create SHN_ABS symbols instead of creating local labels. Furthermore, skip
the workaround completely in functions that definitely won't run on such cpus.
Note that EQU is just creating a local label when using nasm instead of yasm.
This is probably a bug, but at least it doesn't break anything.
|
| |
| |
| |
| |
| |
| | |
cpuflags is never undefined any more, it's set to 0 instead.
Also fix an incorrect comment.
|
| | |
|
| |
| |
| |
| |
| |
| | |
When allocating stack space with a larger alignment than the known stack
alignment a temporary register is used for storing the stack pointer.
Ensure that this isn't one of the registers used for passing arguments.
|