While large language models (LLMs) have proven powerful at handling user requests, their functionality is often confined to chat interfaces, which limits how much they can affect users' day-to-day work. Our project aims to emulate an LLM operating system (OS) in which the agent crafts tools tailored to users' specific needs. These tools can be reused, run continuously in the background, and shared across platforms, making for a more versatile and personalized operating system.
Overall Design
In our design, users interact with the LLM to craft tools, which, once successfully created, are stored in a database. We classify these tools into three types:
- Prompt Tools: Based on pre-defined prompts to achieve specific functionalities.
- Script Tools: Execute desired functions via Python scripts or Bash files.
- System Tools: Run in the background and can actively interact with the user.
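To make the three tool types and the database storage more concrete, here is a minimal sketch of how a crafted tool might be represented and persisted. The write-up does not specify a schema or a database engine, so the `Tool` fields, the `tools` table, the helper names, and the use of SQLite are all assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass
from enum import Enum


class ToolType(Enum):
    PROMPT = "prompt"  # based on a pre-defined prompt
    SCRIPT = "script"  # a Python script or Bash file
    SYSTEM = "system"  # runs in the background


@dataclass
class Tool:
    # Hypothetical record layout; the real project may store more fields.
    name: str
    tool_type: ToolType
    body: str  # prompt text or script source


def init_db(conn):
    """Create the (assumed) tools table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tools ("
        "name TEXT PRIMARY KEY, tool_type TEXT NOT NULL, body TEXT NOT NULL)"
    )


def save_tool(conn, tool):
    """Store a tool, overwriting any earlier tool with the same name."""
    conn.execute(
        "INSERT OR REPLACE INTO tools (name, tool_type, body) VALUES (?, ?, ?)",
        (tool.name, tool.tool_type.value, tool.body),
    )
    conn.commit()


def load_tool(conn, name):
    """Return the stored tool with this name, or None if it is unknown."""
    row = conn.execute(
        "SELECT name, tool_type, body FROM tools WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    return Tool(row[0], ToolType(row[1]), row[2])


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    save_tool(conn, Tool("summarize", ToolType.PROMPT, "Summarize the text."))
    print(load_tool(conn, "summarize"))
```

Keying the table on the tool name keeps reuse simple: crafting a tool again under the same name replaces the earlier version rather than creating a duplicate.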
The tool crafting process, illustrated in the flowchart below, is fundamentally a human-in-the-loop process. Unlike a fully automated AI agent, this approach leverages human insight, paving the way for more powerful AI learning in tool creation.
Our project is implemented in Python. Because building a GUI with Tkinter proved complex, we opted for a locally hosted Flask web app instead. The main components are:
- Local LLM Host: We set up a local host for the LLM.
- Local Shortcut: A local shortcut sends items to the clipboard for implementation.
- State Machine: A state machine guides the agent through the flowchart, ensuring the LLM follows the correct procedure to craft the tools. This can be further enhanced with more powerful LLMs to enable smoother state transitions.
By incorporating these elements, we aim to extend the capabilities of LLMs beyond the chat box, enabling them to create impactful, user-specific tools that enhance user experience and functionality.
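The state machine mentioned above can be sketched as a small transition table. The actual states and transitions come from the project's flowchart, which is not reproduced here, so the state names, event names, and the `CraftingMachine` class below are hypothetical placeholders chosen only to show the idea of a human-in-the-loop review step.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states; the real flowchart may differ.
    GATHER = auto()  # collecting the user's requirements
    DRAFT = auto()   # the LLM drafts the tool
    REVIEW = auto()  # the user reviews the draft
    STORED = auto()  # the tool has been saved to the database


# Allowed (state, event) pairs and the state each one leads to.
TRANSITIONS = {
    (State.GATHER, "requirements_ready"): State.DRAFT,
    (State.DRAFT, "draft_ready"): State.REVIEW,
    (State.REVIEW, "approved"): State.STORED,
    (State.REVIEW, "rejected"): State.DRAFT,
}


class CraftingMachine:
    """Tracks where the agent is in the (assumed) crafting procedure."""

    def __init__(self):
        self.state = State.GATHER

    def step(self, event):
        """Apply an event, refusing any transition not in the table."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} is not allowed in state {self.state.name}"
            )
        self.state = TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    machine = CraftingMachine()
    for event in ["requirements_ready", "draft_ready", "rejected",
                  "draft_ready", "approved"]:
        print(event, "->", machine.step(event).name)
```

Rejecting unknown transitions is what keeps the LLM on the intended procedure: if the model proposes an out-of-order step, the machine raises an error instead of silently moving on.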



