This project demonstrates an implementation of configurable FIR digital filters in VHDL using Vivado. The design features a 32-tap parallel architecture that processes 16-bit inputs and produces 32-bit filtered outputs, with pipelined stages to ease timing closure on the FPGA.
The highlight is the direct path from MATLAB coefficient generation to the VHDL implementation: you design your filter in MATLAB, quantize the coefficients, and copy them straight into the VHDL code. The tutorial covers everything from simulation validation using Xilinx Model Composer to hardware implementation on the PYNQ-Z2 board.
I demonstrate both low-pass and high-pass filter configurations, validate them with a mixed two-tone test signal produced by DDS generators, and show real hardware results captured with an Integrated Logic Analyzer (ILA).
You can find the full video below
FIR Filter Architecture Overview
This figure shows the implementation of a parallel FIR filter with constant coefficients. The output of an FIR filter of order or length L for an input time-series x[n] is calculated using the convolution sum equation.
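For reference, the convolution sum can be written as follows. This takes L to be the number of taps and uses h[k] for the tap weights and y[n] for the output; those symbols are notation assumed here, not taken from the original figure.

```latex
y[n] = \sum_{k=0}^{L-1} h[k]\, x[n-k]
```

For the 32-tap design described in this post, L = 32, so each output sample is a weighted sum of the current input and the 31 previous inputs.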
An FIR filter consists of a series of tapped delay lines, adders, and multipliers. The FIR coefficients, also known as tap weights, determine the filter's response. Typically, the adders are implemented in multiple pipeline stages to improve the timing performance of the filter.
In the VHDL code, we will closely follow this graphical representation of the FIR filter. You can find the VHDL code in the provided GitHub repository.
VHDL Entity and Data Width Management
The proposed filter operates on integer values. The VHDL entity defines a 16-bit input, and after filtering, because of the multiple stages of multiplication and addition, the output signal is expanded to 32 bits. All processes in the VHDL code operate synchronously on the rising edge of the provided clock.
In the VHDL code, each stage processes delayed samples and passes the results to the next stage. At each stage, the data bit width changes depending on the applied operations. To handle this, we need separate arrays to store the data for each stage. Here, we define the array types required to hold the intermediate data across stages.
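One plausible way to account for the growth from 16 to 32 bits is sketched below in Python. It assumes full-precision signed arithmetic with no truncation between stages, and it uses the 11-bit coefficient width of this design. The function names are hypothetical and do not come from the repository.

```python
def product_width(data_bits: int, coeff_bits: int) -> int:
    """Width of the full-precision product of two signed operands."""
    return data_bits + coeff_bits


def sum_width(term_bits: int, num_terms: int) -> int:
    """Width that holds the sum of num_terms signed values without overflow."""
    # Each doubling of the number of terms adds one bit of growth.
    growth = (num_terms - 1).bit_length()
    return term_bits + growth


DATA_BITS = 16   # input width stated in the text
COEFF_BITS = 11  # coefficient width used by this design
NUM_TAPS = 32    # number of taps stated in the text

prod_bits = product_width(DATA_BITS, COEFF_BITS)
out_bits = sum_width(prod_bits, NUM_TAPS)
print(prod_bits, out_bits)  # prints "27 32"
```

Under these assumptions, each multiplier produces a 27-bit product and summing 32 of them adds 5 more bits, which is consistent with the 32-bit output of the entity.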
The code allows you to design your FIR filter using Python or MATLAB and then easily copy the coefficients into the VHDL code. The filter consists of 32 taps, each with an 11-bit width. You will need to quantize your FIR coefficients to 11 bits before using them in the VHDL implementation.
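As a rough illustration of the quantization step, here is a minimal Python sketch. It assumes the taps are scaled so that the largest one uses the full signed 11-bit range; the MATLAB script in the repository may scale differently. The function name and the example tap values are made up for the example.

```python
def quantize_taps(taps, bits=11):
    """Scale floating-point taps to signed integers that fit in `bits` bits."""
    max_int = 2 ** (bits - 1) - 1      # 1023 for 11 bits
    min_int = -(2 ** (bits - 1))       # -1024 for 11 bits
    peak = max(abs(t) for t in taps)
    if peak == 0:
        return [0 for _ in taps]
    scale = max_int / peak             # the largest tap maps to +/- max_int
    quantized = [round(t * scale) for t in taps]
    # Clamp defensively so rounding can never leave the representable range.
    return [max(min_int, min(max_int, q)) for q in quantized]


# Made-up symmetric taps, only to exercise the function.
example_taps = [-0.01, 0.05, 0.25, 0.42, 0.25, 0.05, -0.01]
quantized_taps = quantize_taps(example_taps)
print(quantized_taps)
```

Scaling to the peak keeps as much resolution as the 11 bits allow, but it also changes the filter gain, so the output has to be interpreted with that scale factor in mind.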
Different processes in the design handle specific tasks with the help of the defined arrays. The first process is the delay line, which shifts input samples through the buffer: each new input pushes the existing samples one stage forward.
Filter Processing Stages
Multiply Stage: Each delayed sample is multiplied by its corresponding coefficient, producing 32 parallel product values for summation.
Accumulating Stage: To compute the final output, all product values must be accumulated. However, the summation is performed across multiple stages to enhance pipelining and improve system timing, with sufficient registers inserted between the adders.
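The three stages (delay line, parallel multiply, staged accumulation) can be modeled in a few lines of Python. This is a behavioral sketch, not a translation of the VHDL: it ignores pipeline latency and register placement, and it assumes the adders are arranged as a pairwise tree, which the text does not state explicitly. All names are hypothetical.

```python
from collections import deque


def adder_tree(values):
    """Sum a power-of-two-length list pairwise, one tree level at a time."""
    level = list(values)
    while len(level) > 1:
        # Each pass corresponds to one adder stage in a pipelined design.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def fir_filter(samples, taps):
    """Behavioral model: delay line -> parallel multiply -> adder tree."""
    delay_line = deque([0] * len(taps), maxlen=len(taps))
    outputs = []
    for x in samples:
        delay_line.appendleft(x)  # newest sample enters, oldest drops out
        products = [d * c for d, c in zip(delay_line, taps)]
        outputs.append(adder_tree(products))
    return outputs


# Placeholder integer taps; an impulse input should reproduce them in order.
taps = list(range(1, 33))
impulse = [1] + [0] * 39
response = fir_filter(impulse, taps)
print(response[:4], response[30:34])
```

Feeding an impulse through the model is a quick sanity check: the output should replay the taps one per sample and then return to zero, which is also a useful first test for the hardware simulation.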
You can find the MATLAB code in the GitHub repository as well.
In the provided MATLAB code, you need to configure the following parameters:
- Sample Rate: This is the sampling frequency, which in our design corresponds to the clock frequency driving the system.
- Cutoff Frequency: Specify the cutoff frequency for the filter.
- Number of Taps: Set the number of taps for the filter (32 in this design). More taps give a sharper transition between passband and stopband, at the cost of more multipliers and adders.
- Tap Bit Width: Define the bit width for the filter coefficients, which affects the quantization accuracy.
- Filter Type: Select the type of filter you need. This code currently supports only high-pass and low-pass filter designs as simple examples.
If you need more advanced filter types or designs, refer to the MATLAB documentation for creating custom filters.
Running the code returns the quantized taps, which you can copy into your FIR filter. It also returns the filter's frequency response and the taps in the time domain.
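The MATLAB script itself is not reproduced here. As a rough stand-in, the sketch below designs a low-pass filter in plain Python with the same parameters (sample rate, cutoff, number of taps) and evaluates its response at two frequencies. It uses a Hamming-windowed sinc, which is an assumption about the design method and not necessarily what the repository code does; the function names are hypothetical.

```python
import math


def lowpass_taps(sample_rate_hz, cutoff_hz, num_taps):
    """Hamming-windowed sinc low-pass design (floating-point taps)."""
    fc = cutoff_hz / sample_rate_hz            # cutoff in cycles per sample
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        m = n - center
        if m == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)                           # normalize to unity gain at DC
    return [t / gain for t in taps]


def magnitude_db(taps, freq_hz, sample_rate_hz):
    """Magnitude of the frequency response at a single frequency, in dB."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = -sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20 * math.log10(math.hypot(re, im))


design_taps = lowpass_taps(100e6, 5e6, 32)
for f in (500e3, 10e6):
    print(f"{f / 1e6:.1f} MHz: {magnitude_db(design_taps, f, 100e6):.1f} dB")
```

The floating-point taps produced this way still need to be quantized to the coefficient bit width before they go into the VHDL code.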
Vivado Simulation
Download the two VHDL files from the GitHub repository and add them as new sources to your project. Next, right-click on the block design and add FIR_Filter as a module to your design.
Signal Generation with DDS Compiler
Let's begin by generating the signals for our testbench:
1. Add a DDS Compiler to the block design. In the Configuration tab, set the system clock to 100 MHz.
2. Since we only need to generate sine waves, in the Implementation tab, select the Sine option for the output and deselect the "has phase output" option. In the Output Frequency tab, set the frequency for Channel 1 to 500 kHz.
3. Add another DDS Compiler to the block design. Repeat the same setup, but this time set the output frequency to 10 MHz.
4. Add an Adder to the design. Set its input bit width to 8 and its output bit width to 9.
5. Connect the outputs of both DDS Compilers to the input of the Adder. This will generate the mixed signal.
The output of the adder is 9 bits wide, while the FIR filter expects 16-bit signed inputs. To ensure compatibility before feeding the mixed signals into the FIR filter, we need to cast the 9-bit result to a 16-bit signed value. For this purpose, I've provided a casting block that simplifies the conversion process. Just set the input bit width to 9 and the output bit width to 16.
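The mixing and casting steps can be mimicked in Python as a sanity check on the bit widths. The sketch assumes idealized 8-bit signed DDS outputs and assumes the casting block performs plain sign extension; the actual block in the repository may behave differently. All names are hypothetical.

```python
import math


def dds_sine(freq_hz, sample_rate_hz, num_samples, bits=8):
    """Signed sine samples that fit in `bits` bits (idealized DDS output)."""
    amplitude = 2 ** (bits - 1) - 1            # 127 for 8 bits
    return [
        round(amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(num_samples)
    ]


def sign_extend(value, in_bits, out_bits):
    """Return the out_bits-wide two's-complement pattern of an in_bits value."""
    assert out_bits >= in_bits
    raw = value & ((1 << in_bits) - 1)         # raw in_bits-wide bit pattern
    if raw & (1 << (in_bits - 1)):             # sign bit set: fill upper bits
        raw |= ((1 << out_bits) - 1) & ~((1 << in_bits) - 1)
    return raw


def to_signed(raw, bits):
    """Interpret a raw bit pattern as a signed integer."""
    return raw - (1 << bits) if raw & (1 << (bits - 1)) else raw


FS = 100e6
slow = dds_sine(500e3, FS, 200)
fast = dds_sine(10e6, FS, 200)
mixed = [a + b for a, b in zip(slow, fast)]           # needs 9 signed bits
cast16 = [sign_extend(m, 9, 16) for m in mixed]       # 16-bit patterns
print(min(mixed), max(mixed))
```

Sign extension preserves the numeric value, so the filter sees the same signal as the adder output, only carried in a wider word.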
Add a simulation clock generator to the design and set its frequency to 100 MHz. This frequency will also be the sample rate of the design.
I recommend adding some ports to the design to make it easier to observe the simulation results.
Finally, add a top wrapper to your design, then run the simulation to observe the behavior.
Simulation Results
Low-Pass Filter Test
For the first test, we configure the FIR filter as a low-pass filter. Use the MATLAB code to generate 32 taps for a low-pass filter with a cutoff frequency of 5 MHz.
As shown in the simulation results:
- The first sine wave has a period of 100 ns, confirming it is a 10 MHz signal.
- The second sine wave has a period of 2000 ns, confirming it is a 500 kHz signal.
- You can also observe the mixed signal, where the faster signal rides on top of the slower one, so its baseline rises and falls at the frequency of the slower signal.
- You'll observe that the FIR filter is behaving as a low-pass filter, attenuating the faster signal, leaving only the desired lower-frequency component in the output.
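The periods read off the waveform can be cross-checked with a short calculation. The sketch below only uses the frequencies and the 100 MHz sample rate from the text; the helper names are hypothetical.

```python
def period_ns(freq_hz):
    """Signal period in nanoseconds."""
    return 1e9 / freq_hz


def samples_per_period(freq_hz, sample_rate_hz):
    """Number of clock samples in one period of the signal."""
    return sample_rate_hz / freq_hz


FS = 100e6  # sample rate, equal to the simulation clock
for f in (10e6, 500e3):
    print(period_ns(f), samples_per_period(f, FS))
```

The 10 MHz tone spans only 10 samples per period, while the 500 kHz tone spans 200, which is why the fast tone looks much coarser in the waveform viewer.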
High-Pass Filter Test
For the second test, we configure the FIR filter as a high-pass filter. Use the MATLAB code to generate 32 taps for a high-pass filter with a cutoff frequency of 5 MHz. Set the new taps in the VHDL code and run the simulation again.
As shown in the simulation results:
This time you will observe that the FIR filter acts as a high-pass filter, removing the slower signal while the faster signal remains!
High-Pass FIR Filter Configuration
In this section, we replace the FIR taps with high-pass filter coefficients. Go to the MATLAB code and configure it to generate quantized high-pass filter taps. Then, copy the generated taps from MATLAB and paste them into the VHDL code.
After inserting the new coefficients, Vivado will prompt you to update the IPs, which will refresh the filter module in the block design. Once you run the simulation, you will see that the FIR filter now functions as a high-pass filter—suppressing the slower signal while preserving the faster one.
The FIR_Filter and casting blocks are fully synthesizable, so there's no need to modify them. However, you must provide a clock source for the design. You can use any clock source, such as the PS-PL clock from the ZYNQ IP block or an external clock generated using the Clocking Wizard.
If you choose to use the ZYNQ IP block, ensure that the ZYNQ IP is properly initialized; otherwise, the PL (Programmable Logic) side will not receive a clock. For example, you can run a simple "Hello World" program on one of the ARM cores, which will activate the clock for the PL side.
To keep the process straightforward and avoid using Vitis, I chose to use an external clock available on the evaluation board.
The PYNQ-Z2 also has an external 125 MHz reference clock connected to pin H16 of the PL. The external reference clock allows the PL to be used independently of the PS.
Add a Clocking Wizard to your design and set its output to 100 MHz. Make the input clock external. Later, after synthesis, you will need to assign the proper pin to the input clock based on your evaluation board setup.
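The Clocking Wizard picks the multiply and divide values itself, but the arithmetic behind turning 125 MHz into 100 MHz is easy to check by hand. The sketch below uses one illustrative set of values (multiply by 8, divide by 10); these are assumptions, not the settings the wizard will necessarily choose, and the resulting VCO frequency still has to fall within the range allowed by the device data sheet.

```python
def synthesized_clock_mhz(f_in_mhz, mult, divclk, out_div):
    """Return (VCO frequency, output frequency) of a multiply/divide clock synthesizer."""
    vco = f_in_mhz * mult / divclk
    return vco, vco / out_div


# Illustrative values only: 125 MHz * 8 / 1 = 1000 MHz VCO, then / 10.
vco_mhz, f_out_mhz = synthesized_clock_mhz(125.0, mult=8, divclk=1, out_div=10)
print(vco_mhz, f_out_mhz)
```

If the 125 MHz input is not entered correctly in the wizard, the same ratio is applied to the wrong reference and the filter's sample rate, and therefore its cutoff frequency, shifts proportionally.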
Debugging Setup with ILA
Remove all the external pins that we added for the simulation. We need to include an ILA (Integrated Logic Analyzer) in the design to monitor the signals of interest. Add debugging probes for all the ports that you want to study and then run auto connection; it will add a logic analyzer and connect its inputs to the selected ports.
Don't forget to add a top wrapper to your design; you can then proceed to synthesis.
For the implementation, we configure the FIR filter as a low-pass filter. Use the MATLAB code to generate 32 taps for a low-pass filter with a cutoff frequency of 5 MHz.
Pin Assignment and Bitstream Generation
After synthesis is completed, open the synthesized design and navigate to the I/O window. Assign the proper pin to the input clock based on your evaluation board schematic.
We can now generate the bitstream. After loading the bitstream onto the device, we should see the same results as in the simulation!