insop, bisector
Created November 17, 2020 © Apache-2.0

Towards efficient BERT inference on Alveo FPGA with TVM

We implemented a Vitis HLS kernel for BERT-large model inference on the Alveo U50 card, using Apache TVM as the software stack.

Expert • Full instructions provided
Things used in this project

Hardware components

AMD Alveo U50 ×1

Software apps and online services

Jupyter Notebook
Used for experimenting with the models.
Nimbix cloud service
Nimbix cloud service for the Alveo accelerator.

Schematics

HLS Matrix multiplication for BERT large model

This is the main repository. It contains a highly efficient HLS implementation of large matrix multiplication that takes full advantage of the HBM on the Alveo card.
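As a rough illustration of the kind of loop structure a large matrix multiplication kernel is built around, here is a minimal blocked (tiled) multiply in plain C++. It is a sketch under assumptions, not the code from the repository: the function name `tiled_matmul`, the tile size `kTile`, and the row-major layout are all hypothetical, and no HLS pragmas or HBM bank assignments are shown. In an HLS design, each tile would typically be staged in on-chip buffers so that off-chip memory is read in long bursts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge. A real kernel would size this to its on-chip buffers.
constexpr std::size_t kTile = 4;

// Computes C (m x n) = A (m x k) * B (k x n). All matrices are row-major.
std::vector<float> tiled_matmul(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);

    // Walk the output in tiles, accumulating one tile of the K dimension at a time.
    for (std::size_t i0 = 0; i0 < m; i0 += kTile) {
        const std::size_t i1 = std::min(i0 + kTile, m);
        for (std::size_t j0 = 0; j0 < n; j0 += kTile) {
            const std::size_t j1 = std::min(j0 + kTile, n);
            for (std::size_t k0 = 0; k0 < k; k0 += kTile) {
                const std::size_t k1 = std::min(k0 + kTile, k);
                // Multiply one tile of A by one tile of B into the C tile.
                for (std::size_t i = i0; i < i1; ++i) {
                    for (std::size_t kk = k0; kk < k1; ++kk) {
                        const float a_ik = a[i * k + kk];
                        for (std::size_t j = j0; j < j1; ++j) {
                            c[i * n + j] += a_ik * b[kk * n + j];
                        }
                    }
                }
            }
        }
    }
    return c;
}
```

Calling `tiled_matmul(a, b, 2, 3, 2)` on a 2×3 and a 3×2 matrix returns the four entries of the 2×2 product in row-major order. The tiling only changes the order of the accumulations, not the result for small integer-valued inputs.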

Code

Apache TVM fork for the project

This is the Apache TVM fork, which includes the changes made for this project.

Simple Implementation for Transformer with C and Python

This repository implements a Transformer in C and Python for educational purposes.
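For readers unfamiliar with the Transformer, its core operation is scaled dot-product attention, softmax(QKᵀ / √d) V. The following is a minimal single-head sketch in C++. It is not taken from the repository: the `Matrix` alias and the function name `attention` are hypothetical, and masking, multiple heads, and the learned projections are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
// q is (n_q x d), k is (n_k x d), v is (n_k x d_v).
Matrix attention(const Matrix& q, const Matrix& k, const Matrix& v) {
    assert(!q.empty() && !k.empty() && k.size() == v.size());
    const std::size_t d = q[0].size();
    assert(d > 0 && k[0].size() == d);
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));

    Matrix out(q.size(), std::vector<float>(v[0].size(), 0.0f));
    for (std::size_t i = 0; i < q.size(); ++i) {
        // Scaled similarity between query i and every key.
        std::vector<float> score(k.size(), 0.0f);
        for (std::size_t j = 0; j < k.size(); ++j) {
            float dot = 0.0f;
            for (std::size_t t = 0; t < d; ++t) dot += q[i][t] * k[j][t];
            score[j] = dot * scale;
        }
        // Softmax, subtracting the row maximum for numerical stability.
        const float max_score = *std::max_element(score.begin(), score.end());
        float sum = 0.0f;
        for (float& s : score) {
            s = std::exp(s - max_score);
            sum += s;
        }
        // Output row i is the attention-weighted sum of the value rows.
        for (std::size_t j = 0; j < k.size(); ++j) {
            const float w = score[j] / sum;
            for (std::size_t c = 0; c < v[j].size(); ++c) out[i][c] += w * v[j][c];
        }
    }
    return out;
}
```

With a zero query, every key scores the same, so the output is simply the average of the value rows. The two inner products here (QKᵀ and the weighted sum over V) are matrix multiplications, which is why matmul throughput dominates BERT inference cost.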

TVM VTA test code on Alveo with OpenCL

This repository builds a simple VTA on the Alveo using OpenCL.

Credits

insop

bisector
