author | Lynne <dev@lynne.ee> | 2021-02-27 04:11:04 +0100 |
---|---|---|
committer | Lynne <dev@lynne.ee> | 2021-02-27 04:21:05 +0100 |
commit | 8e94b7cff03539bcb4c360d2550a031a5378df03 (patch) | |
tree | d22c567906393c503244f1d43caa4d9c500528fe /libavutil/tx_template.c | |
parent | 9ddaf0c9f06fab9194161425a32615c4cfc2ec20 (diff) |
lavu/tx: invert permutation lookups
out[lut[i]] = in[i] lookups were 4.04 times(!) slower than
out[i] = in[lut[i]] lookups for an out-of-place FFT of length 4096.
The permutes remain unchanged for anything but out-of-place monolithic
FFT, as those benefit quite a lot from the current order (it means
there's only 1 lookup necessary to add to an offset, rather than
a full gather).
The code was based around non-power-of-two FFTs, so this wasn't
benchmarked early on.
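
To illustrate the two lookup orders the message compares, here is a minimal sketch in plain C. It is not the lavu/tx code: the function names `permute_scatter`, `permute_gather`, and `invert_lut` are hypothetical, and the buffers are plain `float` arrays rather than the template's complex types. The scatter form reads sequentially and writes through the table; the gather form reads through the table and writes sequentially. Both give the same output if the gather form is handed the inverse table.

```c
#include <assert.h>
#include <stddef.h>

/* Scatter form: sequential reads, table-indexed writes (out[lut[i]] = in[i]). */
static void permute_scatter(float *out, const float *in,
                            const int *lut, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[lut[i]] = in[i];
}

/* Gather form: table-indexed reads, sequential writes (out[i] = in[lut[i]]). */
static void permute_gather(float *out, const float *in,
                           const int *lut, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[lut[i]];
}

/* Build the inverse of a permutation table, so that inv[lut[i]] == i.
 * Assumes lut holds each index in [0, n) exactly once. */
static void invert_lut(int *inv, const int *lut, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(lut[i] >= 0 && (size_t)lut[i] < n);
        inv[lut[i]] = (int)i;
    }
}
```

Calling `permute_scatter(a, in, lut, n)` and `permute_gather(b, in, inv, n)`, where `inv` was filled by `invert_lut(inv, lut, n)`, leaves `a` and `b` identical. Only the memory access pattern differs, and that difference is what the benchmark in the message measures.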
Diffstat (limited to 'libavutil/tx_template.c')
-rw-r--r-- | libavutil/tx_template.c | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/libavutil/tx_template.c b/libavutil/tx_template.c
index 711013c352..0c76e0ed6f 100644
--- a/libavutil/tx_template.c
+++ b/libavutil/tx_template.c
@@ -410,7 +410,7 @@ static void monolithic_fft(AVTXContext *s, void *_out, void *_in,
         } while ((src = *inplace_idx++));
     } else {
         for (int i = 0; i < m; i++)
-            out[s->revtab[i]] = in[i];
+            out[i] = in[s->revtab[i]];
     }
 
     fft_dispatch[mb](out);
@@ -738,7 +738,7 @@ int TX_NAME(ff_tx_init_mdct_fft)(AVTXContext *s, av_tx_fn *tx,
     if (n != 1)
         init_cos_tabs(0);
     if (m != 1) {
-        if ((err = ff_tx_gen_ptwo_revtab(s)))
+        if ((err = ff_tx_gen_ptwo_revtab(s, n == 1 && !(flags & AV_TX_INPLACE))))
             return err;
         if (flags & AV_TX_INPLACE) {
             if (is_mdct) /* In-place MDCTs are not supported yet */
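
The second hunk passes a new flag to the table generator, set only for the out-of-place monolithic case (`n == 1` and no `AV_TX_INPLACE`). The sketch below shows roughly how a generator could honour such a flag. It is an assumption-laden illustration, not the body of `ff_tx_gen_ptwo_revtab`: `build_table` and its parameters are hypothetical, and which orientation counts as "inverted" is chosen here purely for the example.

```c
#include <stddef.h>

/* Hypothetical table builder. perm[i] is the destination index of input i.
 * invert_lookup == 0: store the table as given, for out[tab[i]] = in[i].
 * invert_lookup != 0: store the inverse table, for out[i] = in[tab[i]]. */
static void build_table(int *tab, const int *perm, size_t n, int invert_lookup)
{
    for (size_t i = 0; i < n; i++) {
        if (invert_lookup)
            tab[perm[i]] = (int)i;  /* consumer gathers through tab */
        else
            tab[i] = perm[i];       /* consumer scatters through tab */
    }
}
```

Deciding the orientation once at init time keeps the per-transform loop free of branches. The consuming loop just has to use the access form that matches the flag the table was built with.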