xsran
Created July 31, 2024

Deploy Llama model in AMD Radeon PRO W7900

Things used in this project

Hardware components

AMD Radeon™ Pro W7900 GPU
×1

Story

Code

LLM UI

Python
import gradio
import requests


def run(prompt):
    # Send the prompt to the local completion endpoint (llama.cpp-style
    # /completion API) and ask for up to 128 generated tokens.
    url = "http://127.0.0.1:8099/completion"
    data = {"prompt": prompt, "n_predict": 128}
    resp = requests.post(url, json=data)
    data = resp.json()
    # Separate the generated text from the rest of the response metadata.
    content = data.pop("content")
    return content, data


# A simpler single-call alternative using gradio.Interface (unused):
# demo = gradio.Interface(
#     fn=run,
#     inputs=["text"],
#     outputs=["text", "json"]
# )


with gradio.Blocks() as demo:
    # Prompt input box.
    inp = gradio.Textbox(
        label="Prompt",
        placeholder="prompt",
        lines=3
    )

    btn_run = gradio.Button("Run")
    out = gradio.Textbox()   # generated text
    extra = gradio.Json()    # remaining response fields (timings, settings, ...)

    # Send the prompt through run() and display both outputs.
    btn_run.click(fn=run, inputs=inp, outputs=[out, extra])

# demo.launch(server_name="0.0.0.0")
# share=True additionally publishes the app through a temporary public Gradio link.
demo.launch(server_name="0.0.0.0", share=True)
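
The UI assumes a llama.cpp-style HTTP server is already listening on 127.0.0.1:8099; the /completion request above follows that server's API. As a quick sanity check before launching the UI, you can post a prompt to the endpoint directly. A minimal sketch, under the same assumption about the backend's address and port:

import requests

# Hit the same /completion endpoint the UI targets (assumes the Llama
# backend is already running on 127.0.0.1:8099).
resp = requests.post(
    "http://127.0.0.1:8099/completion",
    json={"prompt": "Hello from the W7900!", "n_predict": 32},
    timeout=120,
)
resp.raise_for_status()
body = resp.json()
print(body["content"])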

server

Go
No preview (download only).

Credits

xsran