Imagine talking to your calendar and having it actually understand you... I created an audio transmitter app that records 15-second voice notes and sends them to an AI agent that automatically:
- Creates calendar events from your voice
- Sets up recurring reminders
- Finds relevant articles and videos for your tasks
- Basically becomes your personal AI assistant
Example: Say "Create a recurring reminder for 7 days to learn Python, new topic every day" and boom! You get a full week of Python learning sessions with curated resources attached. The tech stack is actually pretty fun:
- Cardputer handles the audio recording (simple button controls)
- It sends the audio to my AI server
- LangGraph agent processes everything
- Google Calendar integration makes it real
What I love about this is how it bridges physical hardware with AI. You get the tactile feel of a physical device but the smarts of modern AI. And here's the crazy part: creating apps with Cardputer is WAY easier than building Android or iOS apps. All you need is a single .py file! No complex IDEs, no app store approvals, no dealing with mobile OS quirks. You can literally create something similar in less than a day and have your own edge AI device. The barrier to entry for hardware development has never been this low. Check out the code and see for yourself! I have tried to use only open-source or free frameworks:
1. Agent harness: LangChain LangGraph
2. STT: generous free tier on Eleven Labs
3. Google Calendar MCP
4. Tavily: generous free tier
5. Ollama: for hosting my models on my personal PC (RTX 3060 GPU) or Raspberry Pi, no subscription.
6. M5Stack Cardputer and UIFlow2 for building the app.
Check it out:
UIFlow Project:
HTTP: https://uiflow2.m5stack.com/?pkey=0ed0d26ee55041b981e48d53e3b80a92
UDP: https://uiflow2.m5stack.com/?pkey=f4da1d7386ed49ea977130ec68e813f2
Calendar AI Agent: https://github.com/Bkbest/calendar-agent
Additional instructions: https://github.com/Bkbest/cardputer-audio-transmitter-udp
AI Calendar Management Agent with M5Stack Cardputer

Overview
This project combines real-time audio processing with intelligent calendar management, allowing users to send voice commands that are transcribed and converted into calendar events, tasks, and organized schedules. The system uses LangGraph for agent orchestration and integrates with Google Calendar and Tavily search for comprehensive calendar management.

Demo Video
Watch how this AI Calendar Management Agent can be used with the Cardputer:

M5Stack Cardputer Integration

Features
- 15-Second Audio Recording: Record voice memos up to 15 seconds long
- Audio Transmission: Send recorded audio to any listening HTTP server
- Simple Controls: Intuitive button-based interface for recording, playback, and transmission
- AI Calendar Integration: Works seamlessly with the Calendar AI Agent for intelligent task management
- Start Recording: Click the Go button once to begin recording audio
- Playback Recording: Hold the Go button to play back your recorded audio
- Transmit Audio: Press the Spacebar to submit the recorded audio for transmission
- Re-record: Click the Go button again to record a new sample if you're not satisfied with the current one
1. Press Go (click) → Start 15-second recording
2. Hold Go → Review recorded audio
3. Press Spacebar → Transmit to server
4. Repeat → Record new sample if needed

Architecture
The system consists of several key components:
- Audio Servers: Supports both UDP and HTTP protocols for receiving audio data
- LangGraph Agent: Orchestrates calendar management workflows
- Audio Transcription: Converts speech to text using Eleven Labs API
- Calendar Integration: Manages Google Calendar events and tasks
- Search Integration: Provides real-time information for calendar planning
- Dual Protocol Support: UDP for real-time streaming and HTTP for reliable chunked transfers
- Real-time audio processing via UDP protocol
- HTTP Chunked Streaming: 8000-byte chunks with session management
- Speech-to-text transcription using Eleven Labs
- Intelligent calendar event creation and management
- Task planning and todo list management
- Internet search for enhanced calendar context
- Multi-calendar support with Google Calendar integration
- Virtual file system for context persistence
The system supports two audio input methods:
UDP Audio Server (src/udp_audio_server.py)
Traditional UDP-based audio streaming for real-time applications.

How it Works
- Listens on port 9876 for incoming UDP packets containing audio data
- Manages client sessions with packet buffering and timeout handling
- Processes audio in chunks and waits 2 seconds after the last packet before processing
- Handles concurrent clients using thread pools
- ClientSession: Manages individual client connections and audio buffering
- AudioConversionService: Handles audio format conversion and transcription
- UdpAudioServer: Main server class that orchestrates the entire process
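The session handling described above can be sketched roughly as follows. This is only an illustration, not the project's code: the class name `SessionBuffer`, its methods, and the injectable clock are hypothetical, and only the 2-second inactivity window comes from the description.

```python
import time

INACTIVITY_SECONDS = 2.0  # from the description: wait 2 s after the last packet


class SessionBuffer:
    """Hypothetical per-client buffer; not the project's ClientSession."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the timeout can be tested
        self._packets = {}    # client address -> list of packet payloads
        self._last_seen = {}  # client address -> time of the latest packet

    def add_packet(self, addr, payload):
        """Store one UDP payload and refresh the client's inactivity timer."""
        self._packets.setdefault(addr, []).append(payload)
        self._last_seen[addr] = self._clock()

    def pop_finished(self):
        """Return {addr: audio_bytes} for clients idle for the full window."""
        now = self._clock()
        finished = {}
        for addr in list(self._last_seen):
            if now - self._last_seen[addr] >= INACTIVITY_SECONDS:
                finished[addr] = b"".join(self._packets.pop(addr))
                del self._last_seen[addr]
        return finished
```

In a real server, `add_packet` would probably be called from the `socket.recvfrom` loop, with `pop_finished` polled periodically to hand completed recordings to the transcription step.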
- Client sends audio packets to UDP port 9876
- Server buffers packets per client session
- After 2 seconds of inactivity, complete audio is processed
- Audio is converted to WAV format
- Eleven Labs API transcribes the audio to text
- Transcription is passed to the LangGraph agent for processing
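The WAV conversion step in the flow above might look something like this minimal sketch. It assumes the device sends raw 16-bit mono PCM at 16 kHz; the source does not state the actual sample rate, channel count, or sample width, so treat those defaults as placeholders, and the function name `pcm_to_wav` as hypothetical.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the WAV file bytes.

    sample_rate, channels and sample_width are assumed values; adjust them
    to match what the recording device actually produces.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

The resulting bytes can be written to disk or handed to a transcription client that accepts a file-like object.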
HTTP Audio Server (src/http_audio_server.py)
Modern HTTP-based chunked audio streaming for reliable transfers and better network compatibility.

How it Works
- Listens on port 9876 for HTTP POST requests
- Receives audio in 8000-byte chunks with session-based management
- Supports explicit completion markers for reliable session termination
- RESTful API design with status checking capabilities
- Chunked Transfer: Audio data sent in 8000-byte chunks
- Session Management: Unique session IDs track multiple simultaneous recordings
- Completion Detection: Two methods - implicit (last chunk with data) or explicit (empty final marker)
- Error Handling: Missing chunk detection and detailed error responses
- Status Tracking: Real-time progress monitoring via REST endpoints
- POST /audio/chunk - Receive audio chunks
- GET /audio/status/<session_id> - Check processing status
- GET /health - Server health check
Content-Type: application/octet-stream
X-Session-ID: unique-session-identifier
X-Chunk-Number: 0-based chunk number
X-Is-Last: true/false (for explicit completion)

Audio Processing Flow
- Client creates unique session ID
- Audio data sent in 8000-byte chunks via HTTP POST
- Server buffers chunks per session
- Final marker (empty POST with X-Is-Last: true) triggers completion
- Server verifies all chunks are received
- Audio is converted to WAV format
- Eleven Labs API transcribes the audio to text
- Transcription is passed to the LangGraph agent for processing
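On the server side, the buffering and "verify all chunks are received" steps above could be sketched as below. This is an assumption-laden illustration rather than the code in the repository: `ChunkSession` and its methods are hypothetical names, and the rule that an empty body acts only as a completion marker is inferred from the flow description.

```python
class ChunkSession:
    """Hypothetical per-session buffer keyed by the X-Chunk-Number value."""

    def __init__(self):
        self.chunks = {}       # chunk number -> bytes
        self.complete = False  # set once a request arrives with X-Is-Last: true

    def add_chunk(self, number, data, is_last):
        """Record one POST body; an empty body only marks completion."""
        if data:
            self.chunks[number] = data
        if is_last:
            self.complete = True

    def missing_chunks(self):
        """Return chunk numbers absent from the range 0..highest received."""
        if not self.chunks:
            return []
        return [n for n in range(max(self.chunks) + 1) if n not in self.chunks]

    def assemble(self):
        """Join chunks in order; raise if the session is unfinished or has gaps."""
        if not self.complete:
            raise ValueError("session not marked complete")
        missing = self.missing_chunks()
        if missing:
            raise ValueError(f"missing chunks: {missing}")
        return b"".join(self.chunks[n] for n in sorted(self.chunks))
```

Keying the buffer by chunk number rather than arrival order means a retried or out-of-order POST does not corrupt the audio.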
# Example for M5Stack Cardputer using requests2
import requests2

def send_audio_chunks(audio_data, server_url):
    # generate_session_id() is not shown here; it is assumed to return a unique string
    session_id = generate_session_id()
    chunk_size = 8000
    # Send the recording in 8000-byte slices, one POST per slice
    for i in range(0, len(audio_data), chunk_size):
        chunk = audio_data[i:i+chunk_size]
        chunk_number = i // chunk_size
        is_last = i + chunk_size >= len(audio_data)
        headers = {
            'Content-Type': 'application/octet-stream',
            'X-Session-ID': session_id,
            'X-Chunk-Number': str(chunk_number),
            'X-Is-Last': str(is_last).lower()
        }
        response = requests2.post(f"{server_url}/audio/chunk",
                                  data=chunk, headers=headers)
    # Send final marker (recommended)
    final_headers = {
        'Content-Type': 'application/octet-stream',
        'X-Session-ID': session_id,
        'X-Chunk-Number': str(len(audio_data) // chunk_size),
        'X-Is-Last': 'true'
    }
    requests2.post(f"{server_url}/audio/chunk",
                   data=b'', headers=final_headers)

2. LangGraph Agent Invocation
The LangGraph agent is invoked automatically when audio transcription is complete for both UDP and HTTP servers.
Invocation Process
Located in both server files (UDP: lines 355-372, HTTP: similar pattern):
import random
import string

# Generate random thread ID for session isolation
random_thread_id = ''.join(random.choices(string.ascii_letters + string.digits, k=8))
config = {"configurable": {"thread_id": random_thread_id}, "recursion_limit": 50}
# Invoke agent with transcription and available tools
output = graph.invoke({"messages": transcript, 'tools': MyTools().getAllTools()}, config=config)

Agent Workflow
The agent workflow is defined in src/AI_Scope_Agent/basic_agent.py:
- Input Processing: Receives transcribed text and available tools
- Tool Requirement Analysis: Determines if calendar tools are needed
- Tool Execution: Uses appropriate calendar management tools
- Response Generation: Returns confirmation and results to client
- START → llm_with_tools → tool_node (if tools needed) → llm_with_tools
- Uses conditional routing based on tool requirements
- Maintains conversation state and context
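To make the routing above concrete, here is a library-free sketch of the same loop in plain Python. It is not the LangGraph graph from basic_agent.py: the function `run_agent`, the message dictionaries, and the `tool_calls` shape are all hypothetical, and the limit of 50 is borrowed from the `recursion_limit` in the invocation config but counted here per loop iteration as a simplification.

```python
def run_agent(messages, llm_with_tools, tools, recursion_limit=50):
    """Loop llm_with_tools -> tool_node until the model stops requesting tools.

    llm_with_tools: callable taking the message list and returning a reply dict.
    tools: dict mapping a tool name to a plain Python callable.
    """
    for _ in range(recursion_limit):
        reply = llm_with_tools(messages)
        messages = messages + [reply]
        tool_calls = reply.get("tool_calls", [])
        if not tool_calls:
            # Conditional edge: no tools requested, so the run ends here
            return messages
        for call in tool_calls:
            # tool_node: run each requested tool and append its result
            result = tools[call["name"]](**call["args"])
            messages = messages + [
                {"role": "tool", "name": call["name"], "content": result}
            ]
    raise RuntimeError("recursion limit reached")
```

The important design point is that the tool results go back to the model, which then decides whether to call more tools or produce the final confirmation.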
UDP Server: src/udp_audio_server.py (line 412)
HTTP Server: src/http_audio_server.py (line 438)
ELEVEN_LABS_API_KEY = "<key>"  # Replace with your actual Eleven Labs API key
To update: Replace the placeholder "<key>" with your actual Eleven Labs API key in both server files.
File: src/AI_Tools/tools.py
Tavily API Key (line 252):
tavily_client = TavilyClient(api_key="<api-key>")  # Replace with your Tavily API key
Google Calendar API (lines 96-101):
credentials = get_google_credentials(
    token_file="token.json",
    scopes=["https://www.googleapis.com/auth/calendar"],
    client_secrets_file="gcp-oauth.keys.json.json",
)
Calendar ID Configuration (line 241):
{"id": "<primary-calendar-id>", "summary": "<primary-calendar-summary>", "timeZone": "<timezone>"}
To update:
- Replace "<api-key>" with your Tavily API key
- Ensure gcp-oauth.keys.json.json contains your Google Calendar API credentials
Important Note: The search_calendar_event tool from MCP does not work properly, so we have created a custom implementation. Make sure to update the calendar entry in the SEARCH_CALENDAR_EVENT tool (line 241 in tools.py) with your actual Google Calendar ID, summary, and timezone.
The system prompt is located in src/AI_Sys_Prompt/system_prompt_todo_req.txt and defines the AI agent's behavior and capabilities.
The system prompt instructs the AI to act as an AI Calendar Management Assistant with the following responsibilities:
- Convert user requests into well-planned calendar tasks and events
- Create calendar events for specific tasks and recurring schedules
- Find optimal time slots based on user availability
- Organize learning plans with structured timelines
- Manage virtual file system for context persistence
- Provide clear final confirmations
- Python 3.8+
- Eleven Labs API key
- Tavily API key
- Google Calendar API credentials
Install required packages for HTTP server:
pip install -r requirements.txt
Configure Google Calendar OAuth:
- Place your client secrets in src/gcp-oauth.keys.json.json
cd src
python udp_audio_server.py
The server will start listening on port 9876 for UDP audio input.
HTTP Server (Recommended)
cd src
python http_audio_server.py
The server will start listening on port 9876 for HTTP audio chunks.
6. Usage
UDP Protocol
- Send audio packets to localhost:9876 via UDP
- Wait for transcription and agent processing
- Receive calendar management responses via UDP
HTTP Protocol
- Send audio chunks to http://localhost:9876/audio/chunk via HTTP POST
- Use session management with unique session IDs
- Monitor progress via GET /audio/status/<session_id>
- Send final marker to trigger processing
# Test the HTTP server implementation
python test_http_server.py



