Abstract
We are developing Tab, a wearable AI device designed to function as a personal assistant and life coach. Tab helps manage and remember personal interactions and information, enhancing personal development and productivity. It records conversations and uses AI to process and organize the information into a structured database, providing insights and proactive suggestions based on the user's daily interactions.
1. Introduction
Managing personal interactions and information can be challenging. Tab aims to solve this problem by recording conversations and using AI to process and organize the information. This helps users recall details from past conversations, brainstorm ideas, plan for the future, and make decisions. Tab has context of everything happening around you; it is like ChatGPT with the context of your life. Tab enhances personal development and productivity by providing insights and proactive suggestions.
1.1 Problem Statement
In today's fast-paced world, individuals often struggle with:
- Remembering important details from conversations
- Organizing and retrieving personal information effectively
- Making data-driven decisions in personal and professional life
- Managing time and tasks efficiently, especially for those with ADHD
1.2 Proposed Solution
Tab offers a comprehensive solution by:
- Recording and processing conversations
- Creating a structured, searchable database of personal interactions
- Providing AI-powered insights and suggestions
- Enhancing memory recall and decision-making capabilities
1.3 Key Features
- Memory and Information Management
- Personal Development and Productivity Enhancement
- Privacy-Focused Design
2. Methodology
2.1 System Architecture
Our system consists of three main components:
1. Hardware Device (Raspberry Pi Zero W)
2. Web Application
3. Backend Infrastructure (Supabase)
2.2 Technical Specifications
2.2.1 Web App
Platform: Web
Tech stack: Next.js
Database: Supabase (PostgreSQL)
2.2.2 Hardware Device
Raspberry Pi Zero W
A mic to record user's voice
A micro SD card
UM790 Pro (development machine)
Ports: Mini HDMI, Micro USB OTG, Micro USB power
GPIO: 40-pin header
2.2.3 Software:
Operating system (Windows 11)
Next.js for the app development
Supabase for backend services, including database management, authentication, and compute functions.
Docker: For deployment
Programming languages (C++, TypeScript, PL/pgSQL)
Ollama or OpenAI GPT-3.5 (language model)
sentence-transformers/all-MiniLM-L6-v2 (for embedding generation) or OpenAI embeddings
Whisper Large V3 (for transcription)
Hardware information and setup
We are using the UM790 Pro for the entire development of our project.
CPU: AMD Ryzen 9 7940HS
GPU: AMD Radeon 780M
System memory: 16GB x 2
Storage: 512GB
3. Implementation
Building Tab was a substantial task. We mainly had three components to work on: first, setting up the backend; second, building the mobile/web app; and lastly, the trickiest one for us, the hardware.
The plan is to first work on the software part, and within that, we prefer doing the backend first. For the backend, we used Supabase, which made our work a little easier. Supabase is an open-source Firebase alternative, a “backend-as-a-service” that allows you to set up a Postgres database, authentication, edge functions, vector embeddings, and more for free (at first) and with extreme ease!
3.1 Backend Setup
Here is an overview of the key components of our Supabase backend:
1. Functions
Process Audio (`process-audio/index.ts`)
This function handles audio file uploads, processes them using OpenAI's Whisper model for transcription, and stores the results in the Supabase database. It supports multipart/form-data for uploads and includes error handling for various stages of the process.
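To make this concrete, here is a minimal sketch of what such a function could look like. This is not the code from the repo: the `records` table name, the `file` form field, and the use of the hosted `whisper-1` model are illustrative assumptions.

```ts
// Hypothetical sketch of supabase/functions/process-audio/index.ts.
// Assumptions: audio arrives under a "file" form field and transcripts go into a "records" table.
import { createClient } from "npm:@supabase/supabase-js@2";

Deno.serve(async (req) => {
  try {
    // Parse the multipart/form-data upload
    const form = await req.formData();
    const file = form.get("file");
    if (!(file instanceof File)) {
      return new Response(JSON.stringify({ error: "missing audio file" }), { status: 400 });
    }

    // Transcribe with OpenAI's hosted Whisper model
    const whisperForm = new FormData();
    whisperForm.append("file", file);
    whisperForm.append("model", "whisper-1");
    const whisperRes = await fetch("https://api.openai.com/v1/audio/transcriptions", {
      method: "POST",
      headers: { Authorization: `Bearer ${Deno.env.get("OPENAI_API_KEY")}` },
      body: whisperForm,
    });
    if (!whisperRes.ok) throw new Error(`Whisper error: ${whisperRes.status}`);
    const { text } = await whisperRes.json();

    // Store the transcript in Supabase
    const supabase = createClient(
      Deno.env.get("SUPABASE_URL")!,
      Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
    );
    const { error } = await supabase.from("records").insert({ text });
    if (error) throw error;

    return new Response(JSON.stringify({ text }), {
      headers: { "Content-Type": "application/json" },
    });
  } catch (err) {
    return new Response(JSON.stringify({ error: String(err) }), { status: 500 });
  }
});
```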
- Chat Handler
Manages chat interactions by generating responses using OpenAI's GPT models. It fetches relevant records from the database to provide context-aware responses and streams these responses back to the client.
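A simplified, non-streaming sketch of such a handler is shown below (the actual function streams its responses). The `match_records` RPC name and the `records` schema are assumptions standing in for whatever pgvector similarity function the repo defines.

```ts
// Hypothetical sketch of the chat function: embed the query, retrieve similar
// transcripts via a pgvector RPC, then answer with GPT using them as context.
import OpenAI from "npm:openai";
import { createClient } from "npm:@supabase/supabase-js@2";

Deno.serve(async (req) => {
  const { query } = await req.json();
  const openai = new OpenAI({ apiKey: Deno.env.get("OPENAI_API_KEY") });
  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );

  // Embed the user query
  const embedding = (await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: query,
  })).data[0].embedding;

  // Fetch the most similar transcripts (assumed match_records RPC over pgvector)
  const { data: records } = await supabase.rpc("match_records", {
    query_embedding: embedding,
    match_count: 5,
  });
  const context = (records ?? []).map((r: { text: string }) => r.text).join("\n");

  // Generate a context-aware answer
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: `Answer using this context from the user's recordings:\n${context}` },
      { role: "user", content: query },
    ],
  });

  return new Response(
    JSON.stringify({ answer: completion.choices[0].message.content }),
    { headers: { "Content-Type": "application/json" } },
  );
});
```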
2. Common Utilities
- Supabase Client
Configures and returns a Supabase client instance, used across different functions to interact with the Supabase database.
- CORS Configuration
Defines CORS headers used in HTTP responses to handle cross-origin requests.
- Error Handling
Contains custom error classes and a function to handle errors, formatting them into structured HTTP responses.
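For illustration, the shared CORS headers and Supabase client factory might look roughly like the following (a sketch under assumptions about file layout, not the repo's exact code):

```ts
// Sketch of the shared Supabase client factory and CORS headers used by the functions.
import { createClient } from "npm:@supabase/supabase-js@2";

// Standard CORS headers attached to function responses
export const corsHeaders = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Headers": "authorization, x-client-info, apikey, content-type",
};

// Returns a Supabase client configured from the function's environment
export function getSupabaseClient() {
  return createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );
}
```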
3. Database Migrations
- Schema Definitions and Modifications (`migrations/*.sql`):
SQL scripts to create and modify database schemas, tables, and policies. These scripts handle everything from creating tables and adding columns to setting up row-level security and foreign key constraints.
4. Configuration
- Supabase Configuration
Contains settings for the Supabase project, including API ports, authentication settings, and database configurations. It also specifies project-specific settings like JWT expiry and file size limits.
Check out the repo for the code: https://github.com/adc77/tab
We will use Supabase as our database (with vector search, pgvector), authentication, and cloud functions for processing information.
We already had a Supabase account, so there was no need to create one. We click "New Project", give it a name, and note the database password for future reference.
Once the project is created, we get the anon public API key and the Project URL. Copy them both, as we will need them in a bit.
Now, open the Authentication tab in the navbar; note that it can take a few moments for Supabase to finish setting up the project.
There, we will see the "user management" UI. Click "Add User" -> "Add new user", fill in an email and password, and make sure to check the "auto-confirm" option.
By now, we have four things: the email and password for our Supabase user, the Supabase URL, and the anon API key.
Then, in a terminal, cd into the supabase folder:
cd ./supabase
We will install Supabase and set up the CLI. We can follow their guide here (https://supabase.com/docs/guides/cli/getting-started?platform=macos#installing-the-supabase-cli), but in short:
run
brew install supabase/tap/supabase
to install the CLI (or check other options)
We already have Docker Desktop installed (we won't interact with it directly; we just need the Docker daemon running in the background to deploy Supabase functions).
Now that we have the CLI, we log in to our Supabase account by running supabase login - this pops up a browser window that walks us through authentication.
Next, we link the Supabase CLI to our newly created project by running supabase link --project-ref <project-id> (we can find the project ID in the Supabase web UI, or by running supabase projects list, where it appears under "reference id") - we can skip (press Enter at) the database password prompt.
Now we need to apply the Tab DB schema to our newly created, empty database. We can do this by simply running supabase db push. We can verify it worked by going to the Supabase project -> Tables -> and seeing that new tables were created.
Now let’s deploy our functions! supabase functions deploy --no-verify-jwt (see issue re:security)
Since we plan to first use OpenAI as our foundation model provider, we also need to run the following command to make sure the functions have everything they need to run properly: supabase secrets set OPENAI_API_KEY=<openai-api-key>.
We already have an OpenRouter account. One can go to OpenRouter to get an API key, then run supabase secrets set OPENROUTER_API_KEY=<openrouter-api-key>.
3.2 Web App Development
Now we can start working on the web app.
The code for the app can be found here: https://github.com/adc77/tab
Components of the web app:
Button: Customizable Button component using the class-variance-authority library for CSS variations. Supports multiple styles, sizes, and can render as a child component using @radix-ui/react-slot.
Drawer: Customizable Drawer component using the vaul library, with subcomponents like DrawerTrigger, DrawerContent, and DrawerOverlay.
Input & Textarea: Reusable Input and Textarea components with customizable styles and attributes, using cn for conditional class names.
Label: Label component using @radix-ui/react-label with styles defined through class-variance-authority.
Popover: Popover component with subcomponents like PopoverTrigger and PopoverContent, using @radix-ui/react-popover for positioning and styling, with animation options.
Skeleton: Skeleton component for placeholder UI during content loading, featuring a pulsing animation effect.
Toaster: Toaster component using the sonner library, integrated with next-themes for customizable toast styles.
Chat: Main chat component managing messages, conversation IDs, and UI states, integrating ChatLog, PromptForm, and navigation elements. Uses Supabase for backend interactions.
ChatDots: Animated dots indicating processing or typing, accepting a size prop.
ChatLog: Displays chat messages using MarkdownIt for rendering and highlight.js for syntax highlighting, with framer-motion animations.
ConversationHistory: Manages and displays conversation history, allowing selection, viewing, and deletion of past conversations. Uses react-query for data fetching from Supabase.
LoginForm & LogoutButton: Handles user authentication and logout functionality, interacting with Supabase authentication API and managing UI states and routing. Styled to fit the application's theme.
NavMenu & SideMenu: Navigation and sidebar menus using popover and drawer UI patterns, displaying child components (buttons/links) and the ConversationHistory component inside the drawer.
NewConversationButton: Button to initiate a new conversation, triggering a createNewConversation function passed as a prop.
PromptForm: Form for entering and sending messages, with a textarea and send button, handling input state and submission action. Supports Enter key submission, excluding when Shift is held.
Prose: Utility component applying predefined styles to children, consistently styling text content using Tailwind CSS classes.
ThemeProvider & ThemeToggle: Manages theme switching based on user or system settings, with a button toggling between light and dark themes using the useTheme hook from next-themes.
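As an illustration of the PromptForm behavior described above, here is a minimal sketch (component and prop names are assumptions, not the exact code from the repo):

```tsx
// Minimal PromptForm sketch: Enter submits, Shift+Enter inserts a newline.
import { useState } from "react";

export function PromptForm({ onSubmit }: { onSubmit: (message: string) => void }) {
  const [input, setInput] = useState("");

  const send = () => {
    if (!input.trim()) return;
    onSubmit(input);
    setInput("");
  };

  return (
    <form onSubmit={(e) => { e.preventDefault(); send(); }}>
      <textarea
        value={input}
        onChange={(e) => setInput(e.target.value)}
        onKeyDown={(e) => {
          // Submit on Enter, but allow Shift+Enter for multi-line input
          if (e.key === "Enter" && !e.shiftKey) {
            e.preventDefault();
            send();
          }
        }}
        placeholder="Ask Tab anything..."
      />
      <button type="submit">Send</button>
    </form>
  );
}
```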
Pages
index: Main entry page. Uses useSupabase hook for authentication state, rendering LoginForm for unauthenticated users or Chat for authenticated users. Handles loading states and conditional rendering.
_app: Customizes the Next.js App component, wrapping all pages with global providers for theming and data fetching, including ThemeProvider and QueryClientProvider from react-query, and integrates a global toaster.
_document: Customizes HTML document structure used by Next.js, augmenting <html> and <body> tags, setting language, and adding attributes for proper rendering and script execution.
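To show how the global providers fit together, here is a rough sketch of what `_app` might look like (import paths and provider props are assumptions):

```tsx
// Sketch of pages/_app.tsx wiring the global providers described above.
import type { AppProps } from "next/app";
import { ThemeProvider } from "next-themes";
import { QueryClient, QueryClientProvider } from "react-query";
import { Toaster } from "sonner";
import "../styles/global.css";

const queryClient = new QueryClient();

export default function App({ Component, pageProps }: AppProps) {
  return (
    <QueryClientProvider client={queryClient}>
      <ThemeProvider attribute="class" defaultTheme="system" enableSystem>
        <Component {...pageProps} />
        {/* Global toast notifications */}
        <Toaster />
      </ThemeProvider>
    </QueryClientProvider>
  );
}
```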
Styles
global.css: Configured for Tailwind CSS, includes directives for Tailwind's base, components, and utilities layers. Defines custom CSS variables for themes, specifying UI colors. Contains styles for rendering LaTeX equations using KaTeX and applies utility classes using @apply from Tailwind CSS.
Utils
cnHelper: Provides a cn function combining class names using clsx and merging with Tailwind CSS classes using twMerge, managing dynamic class names efficiently.
useSupabaseConfig: Hooks for managing Supabase configuration and state, fetching and setting Supabase URL and token from local storage, providing these values with loading states to components.
range: Utility function generating an array of numbers within a specified range, similar to Python's range function, supporting custom start, end, and step values.
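For reference, plausible implementations of the `cn` and `range` utilities look like this (sketches, not the repo code):

```ts
// cn: merge conditional class names and resolve conflicting Tailwind classes
import { clsx, type ClassValue } from "clsx";
import { twMerge } from "tailwind-merge";

export function cn(...inputs: ClassValue[]) {
  return twMerge(clsx(inputs));
}

// range: like Python's range(), with optional start and step values
export function range(start: number, end?: number, step = 1): number[] {
  const lo = end === undefined ? 0 : start;
  const hi = end === undefined ? start : end;
  const out: number[] = [];
  for (let i = lo; step > 0 ? i < hi : i > hi; i += step) out.push(i);
  return out;
}
```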
Now we can deploy the app. The easiest way is through Vercel, or we can run it locally.
If we want to run it locally, we'll follow these steps from the root folder:
We'll navigate to the app directory:
cd ./app
Then we'll install the dependencies and run the development server:
npm i
npm run dev
Once we have our app instance up and running, we'll head to its address (your-app-address.com/). We should see a screen prompting us for connection details.
On this screen, we'll enter the four required details that we obtained during the Supabase setup:
Supabase URL
Supabase Anon API Key
Email
Password
With these steps completed, we'll be ready to start interacting with our assistant through the app interface. There won't be any query responses yet, since no data has been stored; the transcriptions of audio recorded through our hardware device will be stored in the database.
After launching the app, we can move to the toughest part of our project: working on the hardware.
3.3 Hardware Assembly
Raspberry Pi Zero W Setup:
Now that we have our hardware components, we're ready to assemble our Raspberry Pi Zero W audio recording device. Let's go through the process step by step:
These are the components we need:
-Raspberry Pi Zero 2 W starter kit
-MicroSD card
-Microphone
-Battery (we will use a power bank)
-Soldering kit
Assembling the parts:
1. Connecting the pin headers to Raspberry Pi
We start with a bare Raspberry Pi board.
Our goal is to attach the header connector to our Raspberry Pi.
We need to solder the header connector.
After soldering, we end up with a board with the full header attached.
2. Attaching header pins to microphone
We need to attach pins to our microphone so we can connect it to the Raspberry Pi later. We could either use header pins or solder wires on directly; we will use pins.
From our microphone kit, we received a strip of 6 yellow pins.
Step 1: Since our microphone has 2 rows of pins, we break the 6-pin strip into 2 pieces.
Step 2: Solder the pins from the kit onto the microphone in the same way.
3. Connecting Microphone to our Raspberry Pi
Along with our microphone, we received several jumper wires.
We divide them into 2 groups of 3 and connect these wires to the header pins on our microphone.
Now it's time to connect the microphone to the Raspberry Pi.
Each pin has a specific function, which is why we need to follow the wiring schema for our microphone and connect it correctly.
4. Connecting Raspberry Pi with a power bank as an alternative to pi sugar battery
We could power the Pi with a PiSugar battery, but due to budget constraints we will use a power bank instead, connected to the Pi's Micro USB power port.
5. Assembling everything
Insert the SD card into the Raspberry Pi Zero W.
Power on the device to check that everything works; a light should blink.
After putting everything together, we have successfully set up our Raspberry Pi Zero W for development on our project.
3.4 Software Installation on Hardware
With our hardware assembly complete, we're ready to move on to installing the software. We need to set up our Raspberry Pi Zero W with the necessary software to get the device up and running and start recording audio. We'll follow these steps:
1. First, we'll download the Raspberry Pi Imager from the official Raspberry Pi website.
2. Next, we'll download the Raspberry Pi OS Lite (64-bit) image from the provided link.
3. To prepare our SD card, we'll unzip the OS image and insert the SD card into our computer. We'll run the Raspberry Pi Imager, select our unzipped OS image, and choose our device. In the settings, we'll enable SSH, add our username and password, and input our WiFi settings. Then we'll write the image to the SD card.
4. We'll boot our Raspberry Pi by inserting the prepared SD card and powering it on. We'll wait for it to boot up completely.
5. To connect via SSH, we'll open a terminal on our computer and use the command:
ssh username@raspberrypi.local
We'll replace 'username' with the one we set up earlier.
6. Now we'll transfer our project files. From a separate terminal on our computer, we'll use:
scp -r /path/to/your/project/raspizerow username@raspberrypi:~/
7. For installing dependencies, we'll navigate to the project directory on our Raspberry Pi and run the installation script:
cd raspizerow
chmod +x install.sh
./install.sh
We'll answer 'yes' to auto-loading the module at boot, 'no' to rebooting now, and enter our SUPABASE_URL and AUTH_TOKEN when prompted.
8. After the installation, we'll reboot our Raspberry Pi:
sudo reboot
9. Finally, to run our application, we'll navigate back to the project directory, make the compile script executable, build, and start the application:
cd raspizerow
chmod +x compile.sh
./compile.sh
./main
Optionally, we can start the application under GDB for debugging instead (gdb ./main); within GDB, we'll use the 'run' command to start the application in debug mode.
With these steps completed, we have our Raspberry Pi Zero W set up and running our application, ready for development and testing.
After running the application with ./main, it should start recording whatever is within range of the microphone, but we weren't able to see any recordings in Supabase. We were unable to use the hardware because the Raspberry Pi Zero W would not connect to WiFi (more about this is discussed in section 5.1), so we used a Python script to perform the recording task instead, and it worked well.
4. Results
After completing the entire process of building the hardware, developing the web app, and setting up the backend, we faced some technical issues with the hardware. As a workaround, we used a Python script to test our system. The script successfully recorded voice inputs, and we could see the transcripts stored in Supabase.
The web app functioned smoothly, with fast query retrieval and high-quality responses. The final results were promising: everything worked as expected except the hardware we built, and the language model's responses were excellent, showing that the system is ready for use at our work desk.
Here are some snippets of the conversations with the webapp:
I turned on Tab while working at the desk and it recorded all the conversations I had. Later, I asked it some questions based on those conversations and it responded brilliantly.
The final results show that everything worked as expected, the responses from the LLM were quite good, and we can already use the system at our work desk. We are trying our best to fix the hardware issues to ensure our solution works seamlessly in all environments.
5. Discussion
5.1 Challenges Faced
We faced many challenges while setting up the hardware. We tried many things, but in the end our hardware didn't work as intended.
- When we first powered on the Pi, its LED blinked 7 times, which meant we needed to download a compatible OS image for it.
- After hacking around and trying multiple times, we found a working image. Our next task was to connect the Pi to WiFi so data could be transferred from our hardware to the backend.
- We tried connecting to WiFi from the terminal, but even after trying everything, we couldn't find the cause of the problem.
- Because of this, our hardware device didn't work, and we had to look for an alternative, i.e., the Python recorder client, to do the recording task.
5.3 Future Improvements
- Successfully running the hardware we built, so that we can record voice from anywhere just by wearing Tab as a necklace.
- Enhanced natural language processing capabilities
- Integration with other personal productivity tools
- Development of a more compact and stylish hardware design
- Tab can be equipped with a mini camera in the Raspberry Pi's camera slot. This addition would enable the device to capture visual context, allowing multimodal models to better understand the user's environment and interactions. This enhanced context awareness would improve the accuracy and relevance of the insights and suggestions provided by Tab. Just imagine having an extra set of eyes and ears: it would be like having a personal assistant who can hear, understand, and see everything around you.
- Adding a speaker to the Raspberry Pi, similar to the microphone, would enable direct interaction between the user and the device. With a text-to-speech (TTS) model, Tab could respond to user queries and provide information verbally, eliminating the need for an app interface. This direct communication would make Tab more user-friendly and accessible, fostering a more natural and intuitive interaction experience.
5.4 Lessons learned
We learned the value of adopting a parallel development process rather than a sequential approach. We spent much of our time building the web app and setting up the backend, which left us with limited time for hardware development. As a result, we couldn't fully address the challenges we encountered, which led to some unfinished aspects of our hardware project. This experience has given us valuable insights for future endeavors.
6. Conclusion
Creating Tab, our wearable AI personal assistant and life coach, has been an exciting journey. Tab is designed to help users manage personal interactions and information by recording conversations, processing them with AI, and organizing the data to offer useful insights and suggestions. Tab is like having a second brain with the context of everything happening in your life.
We built the hardware, developed a web app, and set up the backend. Though we faced significant hardware challenges, using a Python script for recording allowed us to test our system's core features, like voice recording and storing transcripts in Supabase.
The web app performed well, with quick query responses and high-quality results. This confirmed that the system works and is ready for use at our work desk. However, the hardware issues need to be resolved to ensure it works seamlessly everywhere.
Next steps include fixing the hardware problems, making Tab more compact and stylish, and adding features like a mini camera and speaker. This will make Tab even more helpful and easy to use.
Despite the hiccups, we've made great progress with Tab, and it clearly has the potential to make managing personal interactions and information much easier and more productive. Based on the results so far, we believe Tab can become a device that knows more about you than you do yourself: a game-changer that enhances your daily life by keeping track of your interactions and offering valuable insights and suggestions at the right time.
It will be your best friend.