With the rapid advancement of academic research and technological innovation, a vast number of papers are published each year. For researchers, effectively sifting through, understanding, and synthesizing key information from these documents has become increasingly important but also more challenging. Traditional methods of literature review often rely on manual reading and summarization, which can be time-consuming, labor-intensive, and inefficient. Therefore, developing a tool that assists users in quickly skimming, comprehending, and extracting the core content of papers is particularly important.
In recent years, large language models have seen significant development, providing new possibilities for addressing these challenges. The outstanding performance of these models makes them an ideal choice for building efficient literature-assistance tools. This project aims to leverage large language models to develop an intelligent application for assisting with paper reading. This application will efficiently summarize, organize, synthesize, and abstract relevant information from scholarly documents, thereby enhancing the productivity of researchers.
How it works

This project leverages AMD's AI PC, which is equipped with a Neural Processing Unit (NPU) specifically designed to accelerate neural network inference. The NPU supports the deployment and application of a wide range of models, providing robust hardware support for high-performance artificial intelligence tasks. Building on this foundation, we have deployed a large language model, internlm/Agent-FLAN-7b. This model was trained on the Agent-FLAN dataset on top of Llama2-7b, and it exhibits strong agent capabilities and natural language processing skills.
To enhance the model's comprehension and response capabilities, we adopted the LlamaIndex framework to build a Retrieval-Augmented Generation (RAG) system. This framework converts the user's PDF files into an external knowledge base and retrieves relevant information based on the user's queries. The retrieved information is then integrated into the prompt, enabling the model to generate more accurate and meaningful responses grounded in the document content.
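The core retrieval flow looks roughly like the minimal sketch below. It assumes a recent llama-index release with its default components; the actual pipeline in web_demo.py wires in the locally deployed Agent-FLAN-7b model and the m3e-small embedding model instead of the library defaults, and "paper.pdf" is just a placeholder path.

# Minimal RAG sketch with llama-index (library defaults, not the project's exact code).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# 1. Turn the user's PDF into documents that form the external knowledge base.
documents = SimpleDirectoryReader(input_files=["paper.pdf"]).load_data()

# 2. Build a vector index over the document chunks.
index = VectorStoreIndex.from_documents(documents)

# 3. Retrieve the chunks relevant to a question and hand them to the LLM as context.
query_engine = index.as_query_engine()
print(query_engine.query("What problem does this paper try to solve?"))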
Driver Install

The UM790 should come with the Integrated Processing Unit (IPU) driver pre-installed, but if you're unsure, you can reinstall it. There are two known installation paths:
1. One is the download link mentioned in the tutorial on the AMD Ryzen AI website: https://ryzenai.docs.amd.com/en/latest/inst.html
2. The other is the driver tool from the Minisforum official website under UM790 Support: https://www.minisforum.com/new/support?lang=cn#/support/page/download/79
After installing the driver, if you don't see the IPU device listed in the Device Manager, it's likely because the IPU feature is not enabled in the BIOS settings. Follow these steps to enable it:
1. Enter BIOS:
Press and hold the Delete key during startup to enter the BIOS settings.
2. Navigate to IPU Settings:
Go to Setup > Advanced > CPU Configuration, and enable the IPU feature.
3. Save and Restart:
Save the changes and restart the computer.
After restarting, you should be able to see the AMD IPU Device listed under System Devices in the Device Manager.
Remember: all of the following commands should be run in a CMD (Command Prompt) environment.
First, install the Ryzen AI Software by following the official instructions.
Then, clone our project to your computer:

git clone https://github.com/fengzhaoxin/paper_helper.git

To successfully run the LLM, move everything in the RyzenAI_Package folder to the conda env folder (YOUR_ENV_PATH\Lib\site-packages). You can run

conda env list

to find the path of your conda env.
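If you prefer to script that copy step, the following is a minimal Python sketch; it assumes the RyzenAI_Package folder is in the current directory and that the target conda env is the one currently activated.

# Sketch: copy everything from RyzenAI_Package into the active env's site-packages.
import shutil
import sysconfig
from pathlib import Path

src = Path("RyzenAI_Package")                 # assumed to be in the current directory
dst = Path(sysconfig.get_paths()["purelib"])  # YOUR_ENV_PATH\Lib\site-packages of the active env

for item in src.iterdir():
    if item.is_dir():
        shutil.copytree(item, dst / item.name, dirs_exist_ok=True)
    else:
        shutil.copy2(item, dst / item.name)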
Finally, install the other required packages, such as llama-index and streamlit.
Model Quant

After successfully setting up the environment, we can quantize the model. Everything we need is in the "model_quant" folder.
We first activate the ryzenai-transformers env, then run "run_awq.py" for quantization.
conda activate ryzenai-transformers
setup_local.bat
cd model_quant
python run_awq.py --w_bit 3 --lm_head --flash_attention --model ../Agent-FLAN-7b --output ./Agent-FLAN-7b

Here "--model ../Agent-FLAN-7b" is the trained model folder, and "--output ./Agent-FLAN-7b" is the output model name.
After quantization we get the quantized model, which we move to the model folder. We also need to download the embedding model "m3e-small" and move its tokenizer into the model folder as well.
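One way to fetch the embedding model is through huggingface_hub, sketched below; the repo id "moka-ai/m3e-small" and the target folder are assumptions to adjust to your own layout.

# Sketch: download the m3e-small embedding model from the Hugging Face Hub.
# Repo id and local_dir are assumptions; adjust them to your setup.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="moka-ai/m3e-small", local_dir="./model/m3e-small")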
First, apply for a LlamaCloud API key on LlamaCloud (llamaindex.ai) and put it in your config.ini file, just like your OpenAI key:
[llamma]
api_key = llx-xxxxxxxxx

Now we have finished all the preparation and can run "web_demo.py" for a test. Have fun!
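For reference, the key can be read from config.ini with Python's standard configparser; this is only a sketch, and the actual loading code in web_demo.py may differ.

# Sketch: read the LlamaCloud key from config.ini (section name as shown above).
import configparser

config = configparser.ConfigParser()
config.read("config.ini")
llama_cloud_key = config["llamma"]["api_key"]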
streamlit run web_demo.py

Demonstration video




