Published November 17, 2025 © MIT

14 Building an FFT on AMD AIE-ML using the DSP Library

In this tutorial, I show how to build a 1024-point FFT on the AMD AIE-ML using the DSP Library, which is available as open source.

Intermediate • Protip • 1 hour

Things used in this project

Software apps and online services

AMD Vitis Unified Software Platform

Story

Introduction

The Fast Fourier Transform (FFT) is a fundamental building block used in DSP systems. Although its algorithm is quite easily understood, the variants of the architectures can be a large time sink for hardware engineers today.

To help DSP engineers working on the AI Engine, AMD provides the DSP Library as part of the Vitis Libraries, an open-source repository hosted on GitHub:

https://github.com/Xilinx/Vitis_Libraries/tree/main/dsp

You can also find the comprehensive documentation here:

https://docs.amd.com/r/en-US/Vitis_Libraries/dsp/index.html

The DSP library is provided with multiple layers:

L1 contains the basic kernels
L2 contains sub-graphs, which can be called from a graph and which run on one or more AI Engine tiles

The recommendation is to work with the L2 elements which give a higher level of abstraction.

As part of the DSP Library, we can find an FFT optimized for the various AI Engine architectures (AIE, AIE-ML and AIE-MLv2).

In this tutorial we will show how we can implement a 1024-point FFT using the DSP Library.

AMD Vitis™ AI Engine Component

The elements from the Vitis Library can be called from AI Engine graph code.

The first step is to create a new AI Engine component in the Vitis Unified IDE (File > New Component > AI Engine). We can call this component fft_1024, leave it empty for the moment (without adding any source file) and target the xcve2302-sfva784-1LP-e-S part, which is the one on the Trenz TE0950.

Then we have to create our source files. We could create them using a simple text editor and write all the content manually. But here I will use a feature from the Vitis Unified IDE that can create a template based on some parameters.

When you have a new AI Engine component with no source file, the component settings offer a Generate AIE Prototype Code option.

Generate AIE Prototype Code option in the Vitis IDE

In the Generate AIE Prototype Code window, I configure the graph name, change the data types to 16-bit complex integer (cint16) and enable Generate Top Level graph and Simulation code.

Note: We will not need the kernel code as we will call the FFT from the DSP Library, but it gives us a good reference for the graph and the top-level file.

Generate AIE Prototype Code window

Once we click on Generate we have the new source files added to the component and the Top-level file set for our component.

AI Engine component with the newly created files

As a sanity check, we can run the x86 or AI Engine compiler to verify that the generated code is valid. Both builds succeed for me, so we can move forward.

Note: At this point, I am removing the 2 kernel source and header files my_kernel.cpp/.h as I will not be using them.

AMD DSP Library

As mentioned in the introduction, the DSP Library is available as part of the AMD Vitis Libraries repository on GitHub. To use it, you need to clone the repository on your machine.

You can directly clone it from a terminal using the following link:

https://github.com/Xilinx/Vitis_Libraries.git

Or you can clone the library directly from the Vitis Unified IDE: in the Libraries section, click on the download icon on the Vitis Accelerated Libraries repository line.

Download the Vitis Libraries from the Vitis Unified IDE

If you click on the pen icon on the same line, you can see where the library was downloaded. You can also use this window to change where the libraries are downloaded or to get a different branch.

Location of the local Vitis Libraries

Then from our AI Engine component we need to point to 3 folders of the DSP Library:

<download path>/vitis_libraries/dsp/L2/include/aie
<download path>/vitis_libraries/dsp/L1/include/aie
<download path>/vitis_libraries/dsp/L1/src/aie

This can be done from the aiecompiler.cfg file of our AI Engine component.

Adding the DSP Libraries folders to the AI Engine component

Instantiating the FFT from the AMD DSP Library

Now that the tool is set up to use the AMD DSP library we can call the DSP Library from our graph.

For that I have used the documentation which includes an example that I have adapted for our use case.

These are the pages of the documentation you might want to look at when implementing the FFT from the DSP Library:

The overview of the class: https://docs.amd.com/r/en-US/Vitis_Libraries/dsp/rst/class_xf_dsp_aie_fft_dit_1ch_fft_ifft_dit_1ch_graph.html_0
The code example for the FFT: https://docs.amd.com/r/en-US/Vitis_Libraries/dsp/user_guide/L2/func-fft-ifft-aie-only.html_7

First, in graph_FFT_1024.h, we can remove most of the lines related to the kernel to keep only the following code:

#include <adf.h>

using namespace adf;

class my_graph : public graph {
public:
  input_plio in;
  output_plio out;

  my_graph() {
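    // 64-bit PLIOs reading from and writing to text files for simulation
    in  = input_plio::create(plio_64_bits, "data/input.txt");
    out = output_plio::create(plio_64_bits, "data/output.txt");

    // TODO change connectivity to FFT: k is the removed kernel, so these
    // two lines are placeholders until the FFT connections replace them
    connect<>(in.out[0], k.in[0]);
    connect<>(k.out[0], out.in[0]);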
  }
};

Note: I am also changing the PLIOs to 64-bit interfaces, as we have seen in a previous tutorial that this configuration can give better performance.
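With a 64-bit PLIO and cint16 data, the simulation input file has to match the interface width. The sketch below shows one way to produce the text for data/input.txt, assuming one 64-bit word per line, written as two complex samples in the order "re im re im". The helper names (Cint16, format_plio64_cint16, make_impulse_frame) are hypothetical and are not part of Vitis or the DSP Library; check the line layout against the AI Engine simulator documentation for your version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Plain host-side stand-in for the AI Engine cint16 type (illustration only).
struct Cint16 {
    int16_t re;
    int16_t im;
};

// Hypothetical helper: formats cint16 samples for a 64-bit PLIO text file,
// assuming each line carries one 64-bit word, i.e. two samples "re im re im".
std::string format_plio64_cint16(const std::vector<Cint16>& samples) {
    assert(samples.size() % 2 == 0);  // a 64-bit word holds two cint16 samples
    std::ostringstream os;
    for (std::size_t i = 0; i < samples.size(); i += 2) {
        os << samples[i].re << ' ' << samples[i].im << ' '
           << samples[i + 1].re << ' ' << samples[i + 1].im << '\n';
    }
    return os.str();
}

// Hypothetical stimulus: one frame with a single non-zero sample at index 0.
std::vector<Cint16> make_impulse_frame(std::size_t point_size, int16_t amplitude) {
    std::vector<Cint16> frame(point_size, Cint16{0, 0});
    frame[0] = Cint16{amplitude, 0};
    return frame;
}
```

Writing the string returned by format_plio64_cint16(make_impulse_frame(1024, 1000)) to data/input.txt would give 512 lines, one frame for the 1024-point FFT.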

Then we can include the header file for the FFT:

#include "fft_ifft_dit_1ch_graph.hpp"

Then we just need to set a minimal set of parameters: the sample data type (DATA_TYPE_FFT), the twiddle data type (TWIDDLE_TYPE), the point size of the FFT (POINT_SIZE), whether the transform is an FFT or an inverse FFT (TP_FFT_NIFFT) and the output shift (TP_SHIFT).

#define DATA_TYPE_FFT cint16
#define TWIDDLE_TYPE cint16
#define POINT_SIZE 1024
#define TP_FFT_NIFFT 1
#define TP_SHIFT 10
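One plausible reading of the TP_SHIFT value is that it matches log2(POINT_SIZE): a 1024-point FFT can grow a value by a factor of up to 1024, so a right shift by 10 brings the worst case back into the 16-bit range. The sketch below checks this arithmetic in plain host C++. The names (kPointSize, kShift, log2_pow2, worst_case_dc_bin) are illustrative only, and the rounding and saturation applied by the library itself are not modeled here.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kPointSize = 1024;  // mirrors POINT_SIZE
constexpr unsigned kShift = 10;        // mirrors TP_SHIFT

// True when n is a power of two.
constexpr bool is_pow2(unsigned n) { return n != 0 && (n & (n - 1)) == 0; }

// log2 of a power of two, i.e. the number of radix-2 stages.
constexpr unsigned log2_pow2(unsigned n) { return n <= 1 ? 0 : 1 + log2_pow2(n / 2); }

static_assert(is_pow2(kPointSize), "this sketch assumes a power-of-two point size");
static_assert(log2_pow2(kPointSize) == kShift, "shift chosen here equals log2(point size)");

// Bin 0 of an FFT over a constant input is the sum of all n samples;
// the result is then scaled down by the output shift.
constexpr int64_t worst_case_dc_bin(int16_t amplitude, unsigned n, unsigned shift) {
    return (static_cast<int64_t>(amplitude) * n) >> shift;
}
```

Without the shift, a full-scale constant input gives 32767 * 1024 = 33553408 in bin 0, far beyond what a 16-bit value can hold; with a shift of 10 the same bin comes back to 32767. A smaller shift keeps more precision but risks overflow for large inputs.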

Then we can instantiate the FFT graph:

xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph<DATA_TYPE_FFT, TWIDDLE_TYPE, POINT_SIZE,TP_FFT_NIFFT,TP_SHIFT> fft_1024;

And finally connect the FFT directly to the PLIOs of our graph:

connect<>(in.out[0], fft_1024.in[0]);
connect<>(fft_1024.out[0], out.in[0]);

This is all we have to do to implement our 1024-point FFT.

Running AI Engine Compiler and Analyzing the output

Now that we have implemented the 1024-point FFT inside our graph, we can run the AI Engine compiler to verify that our code is correct and check the hardware implementation.

Note: I actually ran the compiler targeting the x86 simulation first to verify that my code is correct, as this is what I recommended in a previous article ;). As this built with no issue, I am moving to the AI Engine compiler.

From the graph report view we can see that the FFT was implemented using 1 tile. Multiple buffers connected to the kernel hold the twiddle data, and ping/pong buffers are connected on each side of the FFT.
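To get a feel for the size of the ping/pong buffers, the sketch below does the arithmetic in plain host C++. It assumes that each buffer holds one full 1024-point frame of cint16 samples and that an AIE-ML tile has 64 KB of data memory; both are assumptions to verify against the compiler report and the device documentation. The twiddle buffers are not counted, and all names are illustrative.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kPointSize = 1024;                 // samples per frame
constexpr std::size_t kBytesPerCint16 = 4;               // 16-bit real + 16-bit imaginary
constexpr std::size_t kTileDataMemoryBytes = 64 * 1024;  // assumed AIE-ML tile data memory

// Size of one buffer holding a full frame of cint16 samples.
constexpr std::size_t frame_bytes(std::size_t points) { return points * kBytesPerCint16; }

// A ping/pong pair doubles the storage so one buffer fills while the other is processed.
constexpr std::size_t ping_pong_bytes(std::size_t points) { return 2 * frame_bytes(points); }

// Input side plus output side of the FFT.
constexpr std::size_t io_buffer_bytes(std::size_t points) { return 2 * ping_pong_bytes(points); }

static_assert(io_buffer_bytes(kPointSize) <= kTileDataMemoryBytes,
              "I/O buffers alone would fit in one tile under these assumptions");
```

Under these assumptions the input and output buffers take 16 KB in total, so their placement across several tiles is a compiler choice rather than a capacity limit.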

DSP Lib 1024 pt FFT Graph view

Then, looking at the array view, we can see that our graph uses 1 tile for the compute but also 2 other tiles for their memory, where some of the buffers are placed.

DSP Lib 1024 point FFT Array view

The compiler usually spreads the buffers as it tries to achieve the best performance without "understanding" the design. This is probably something that we can improve in a future article.

Note 2: You can use the following project to rebuild an AMD Vitis workspace with the final version of the project after the steps described in this tutorial: https://github.com/xflorentw/AI_Engine_Basic/tree/main/02_FFT_AIE-ML

Run make all. Before running the command, you need to clone the Vitis_Libraries repository from GitHub and set the environment variable DSPLIB_ROOT to Vitis_Libraries/dsp.

Summary

In this tutorial we have seen how to instantiate a 1024-point FFT for the AIE-ML using the DSP Library. The next step for us will be to simulate it. For this we will use a Python test bench to generate stimuli and golden data. This is what I will show in the next tutorial.

If you are looking for a more advanced FFT example, you might want to look at this example from Tom Simpson:

https://www.hackster.io/dsp2/amd-versal-ai-engine-2-gsps-4k-point-fft-11ab7d

Disclaimers

AMD, Versal, and Vitis are trademarks or registered trademarks of Advanced Micro Devices, Inc.
Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Credits

Florent Werbrouck
