
That seems like an odd comparison; specialty hardware is often better, right?

Hey, do DSPs have special hardware to help with FFTs? (I’m actually asking, this isn’t a rhetorical question, I haven’t used one of the things but it seems like it could vaguely be helpful).






Xilinx has a very highly optimized core for the FFT. You are restricted to power-of-2 sizes, which usually isn't a problem because it's fairly common to zero-pad an FFT anyway to avoid highly aliased (i.e. hard-edged) binning.

The downside of implementing directly in hardware is that the size would be fixed.
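For anyone who hasn't done the zero-padding step by hand, here is a rough software sketch of it: pad the input up to the next power of 2, then run a plain radix-2 FFT. This is only an illustration in pure Python, not how the Xilinx core works internally; the names `next_pow2`, `zero_pad`, and `fft_radix2` are made up for this example.

```python
import cmath

def next_pow2(n):
    """Smallest power of two >= n (assumes n >= 1)."""
    return 1 << (n - 1).bit_length()

def zero_pad(samples, size=None):
    """Copy samples and append zeros up to `size` (default: next power of two)."""
    if size is None:
        size = next_pow2(len(samples))
    return list(samples) + [0.0] * (size - len(samples))

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])   # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd term
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

# Usage: a 5-sample input is padded to 8 points before the transform.
padded = zero_pad([1.0, 2.0, 3.0, 4.0, 5.0])
spectrum = fft_radix2(padded)
```

Padding only adds zeros, so the DC bin (`spectrum[0]`) is still the sum of the original samples; what changes is the number of output bins.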


They usually have dedicated acceleration hardware, yes: https://www.ti.com/lit/an/sprabb6b/sprabb6b.pdf?ts=174057874...

Yes, almost all DSPs I know of have native HW support for FFT, since it's the bread and butter of signal processing.

I remember hearing about logic to help with deinterleaving the results of the butterfly network after the FFT is done.

Yeah, bit-reversed addressing mode as seen on the dsPIC is an example of this.
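To make the bit-reversed ordering concrete, here is a small software sketch of the index permutation that such an addressing mode produces in hardware. An in-place radix-2 FFT reads or writes its data in this order, so without hardware support you need an explicit reorder pass like the one below. The function names are hypothetical and this is plain Python, not dsPIC code.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)  # shift result left, append lowest bit of i
        i >>= 1
    return r

def bit_reversed_order(n):
    """Bit-reversed index sequence for an n-point FFT (n must be a power of two)."""
    if n == 0 or n & (n - 1):
        raise ValueError("n must be a non-zero power of two")
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]

def reorder(data):
    """Return data permuted into bit-reversed index order."""
    return [data[j] for j in bit_reversed_order(len(data))]

# Usage: for 8 points the indices come out as 0, 4, 2, 6, 1, 5, 3, 7.
order = bit_reversed_order(8)
```

Applying `reorder` twice gives back the original sequence, since reversing the bits of an index twice returns the same index.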

(Discrete) Fast Fourier Transform implementations:

https://fftw.org/ ; FFTW: https://en.wikipedia.org/wiki/FFTW

gh topic: fftw: https://github.com/topics/fftw

xtensor-stack/xtensor-fftw is similar to numpy.fft: https://github.com/xtensor-stack/xtensor-fftw

NVIDIA cuFFTW, amd/amd-fftw, Intel MKL FFTW

NVIDIA CuFFT (GPU FFT) https://docs.nvidia.com/cuda/cufft/index.html

ROCm/rocFFT (GPU FFT) https://github.com/ROCm/rocFFT .. docs: https://rocm.docs.amd.com/projects/rocFFT/en/latest/

AMD FFT, Intel FFT: https://www.google.com/search?q=AMD+FFT , https://www.google.com/search?q=Intel+FFT

project-gemmi/benchmarking-fft: https://github.com/project-gemmi/benchmarking-fft

"An FFT Accelerator Using Deeply-coupled RISC-V Instruction Set Extension for Arbitrary Number of Points" (2023) https://ieeexplore.ieee.org/document/10265722 :

> with data loading from either specially designed vector registers (V-mode) or RAM off-the-core (R-mode). The evaluation shows the proposed FFT acceleration scheme achieves a performance gain of 118 times in V-mode and 6.5 times in R-mode respectively, with only 16% power consumption required as compared to the vanilla NutShell RISC-V microprocessor

"CSIFA: A Configurable SRAM-based In-Memory FFT Accelerator" (2024) https://ieeexplore.ieee.org/abstract/document/10631146

/? dsp hardware FFT: https://www.google.com/search?q=dsp+hardware+fft



