| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
| |
All our ARM asm preserves alignment so setting this attribute
in a common location is simpler. This removes numerous warnings
when linking with armcc.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
| |
LDR with register offset and PC as base register is not available in
the Thumb instruction set so the addition must be done separately.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
| |
statements
|
|
|
|
|
|
|
| |
The Apple assembler refuses to assemble the 3-operand form
in Thumb2 even though it is valid syntax.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
|
|
| |
When building Thumb2 code, the end of a function, where the PIC
offsets are placed, need not be aligned. Although the values
are only accessed with instructions allowing unaligned addresses,
keeping them aligned is preferable.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
| |
This allows using a 16-bit opcode when generating Thumb2 code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
| |
|
|
|
|
|
|
| |
Can be used by DTS-HD, TrueHD and E-AC-3, among others.
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
| |
This separates code relying on inline from that relying on external
assembly and fixes instances where the coalesced check was incorrect.
|
|
|
|
|
| |
The SWAP macro does not work for explicit xmm/ymm usage, so instead just move
the scalar value from xmm2 to xmm0.
|
|
|
|
|
|
| |
actual code.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
| |
Also mention this change in APIchanges.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
13% faster on penryn, 16% on sandybridge, 15% on bulldozer
Not simd; a compiler should have generated this, but gcc didn't.
|
|
|
|
|
| |
Double does not have enough precision to represent all int64 numbers
exactly.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GCC 4.3 and later do the right thing with the plain C code. Earlier
versions in 32-bit mode generate one extra instruction, needlessly
zeroing what would be the high half of the shifted value. At least
two gcc configurations miscompile the inline asm in some situations.
In 64-bit mode, all gcc versions generate imul r64, r64 followed by
shr. On Intel i7 and later, this imul is faster 32-bit mul. On
older Intel and all AMD, it is slightly slower. On Atom it is much
slower.
Considering where the FASTDIV macro is used, any overall negative
performance impact of this change should be negligible. If anyone
cares, they should file a bug against gcc and get the instruction
selection fixed.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
| |
There is no point in having the user disable any fastdiv macros.
Besides the condition implementation was broken and only disabled
the C implementation, but no platform specific assembly versions.
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
| |
This avoids having the compiler redundantly mask the values to
the smaller size.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
| |
Fixed-point audio codecs often use saturating arithmetic, and
special instructions for these operations are common.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
| |
This makes struct AVDictionary fully opaque now that nothing
needs to access it directly any more.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
|
| |
This adds a function to retrieve the number of entries in a
dictionary and updates the places directly accessing what should
be an opaque struct to use this new function instead.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
|
|
| |
The only compiler I have that does not define the standard
offsetof() macro is "Bruce's C Compiler", a simple compiler
for producing 8/16-bit 8086 code, usually for use in early
stages of PC booting.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
| |
This list is incomplete (we also use UINT16_MAX), so there does
not appear to be any system we care about that needs these.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This macro is only used in two places, both in libavcodec, so this
is a more sensible place for it.
Two small tweaks to the macro are made:
- removing the trailing semicolon
- dropping unnecessary 'volatile' from the x86 asm
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
| |
These x86-specific macros do not belong in generic code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
| |
This puts x86-specific things in the x86/ subdirectory where they
belong.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
|
|
|
| |
Some compilers do not support the Q/R modifiers used to access
the low/high parts of a 64-bit register pair. Check for this
and disable all uses of it when not supported.
Fixes bug #337.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
|
| |
It appears that something goes wrong in old nasm versions when the
%+ operator is used in the last argument of a macro invocation and
this argument is tested with %ifdef within the macro. This patch
rearranges the macro arguments such that the %+ operator is never
used in the last argument.
|
|
|
|
|
|
|
| |
nasm does not support 'CPU foonop' directives. This adds a configure
test for the directive and uses it only if supported.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
| |
For some reason, nasm requires this. No harm done to yasm.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
| |
nasm prints a warning if the colon is missing.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
| |
This allows simplifying a few expressions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
| |
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
|
|
|
|
|
|
| |
Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to
"3dnowext", which is a more common name of the CPU flag, as reported
e.g. by the Linux kernel, unifies this.
|