Build a personal AI assistant that learns and remembers everything about the user: their preferences, habits, work context, and conversation history.

What You’ll Build

A personal AI assistant that:
  • Remembers user preferences (dietary restrictions, work schedule, communication style)
  • Maintains context across multiple chat sessions
  • Provides personalized recommendations based on user history
  • Handles multiple conversation topics while maintaining context

Choose Your Implementation

Prerequisites

Never hardcode API keys in your code. Use environment variables.

Python Implementation

Step 1: Project Setup

mkdir personal-ai && cd personal-ai
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install supermemory openai fastapi uvicorn python-dotenv streamlit requests
Create a .env file:
SUPERMEMORY_API_KEY=your_supermemory_key_here
OPENAI_API_KEY=your_openai_key_here

Step 2: Backend (FastAPI)

Create main.py. Let’s build it step by step:

Import Dependencies

from fastapi import FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI
from supermemory import Supermemory
import json
import os
import uuid
from dotenv import load_dotenv
  • FastAPI: Web framework for building the API endpoint
  • StreamingResponse: Enables real-time response streaming (words appear as they’re generated)
  • AsyncOpenAI: OpenAI client that supports async/await for non-blocking operations
  • Supermemory: Client for storing and retrieving long-term memories
  • uuid: Used to derive stable, deterministic user IDs from emails (via UUIDv5)

Initialize Application and Clients

load_dotenv()
app = FastAPI()

openai_client = AsyncOpenAI(api_key=os.getenv("OPENAI_API_KEY"))
supermemory_client = Supermemory(api_key=os.getenv("SUPERMEMORY_API_KEY"))
load_dotenv() loads API keys from your .env file into environment variables. We create two clients:
  • OpenAI client: Handles conversations and generates responses
  • Supermemory client: Stores and retrieves user-specific memories
These are separate because you can swap providers independently (e.g., switch from OpenAI to Anthropic without changing memory logic).

Define System Prompt

SYSTEM_PROMPT = """You are a highly personalized AI assistant.

MEMORY MANAGEMENT:
1. When users share personal information, store it immediately
2. Search for relevant context before responding
3. Use past conversations to inform current responses

Always be helpful while respecting privacy."""
This prompt guides the assistant’s behavior. It tells the AI to:
  • Be proactive about learning user preferences
  • Always search memory before responding
  • Respect privacy boundaries
The system prompt is injected at the start of every conversation, so the AI consistently follows these rules.

Create Identity Helpers

def normalize_email(email: str) -> str:
    return (email or "").strip().lower()

def stable_user_id_from_email(email: str) -> str:
    norm = normalize_email(email)
    if not norm:
        raise ValueError("Email is required")
    return uuid.uuid5(uuid.NAMESPACE_DNS, norm).hex
Why normalize? "User@Mail.com" and " user@mail.com " should map to the same person, so we trim whitespace and lowercase for consistency.
Why UUIDv5? It’s deterministic: the same email always produces the same ID. This means:
  • User memories persist across sessions
  • No raw emails in logs or database tags
  • Privacy-preserving yet stable identity
We use the standard uuid.NAMESPACE_DNS namespace, so the email-to-ID mapping is fixed and reproducible anywhere it is computed (the Streamlit frontend later derives the same ID).
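A quick standalone check (reusing the two helpers above) shows the determinism:
import uuid

def normalize_email(email: str) -> str:
    return (email or "").strip().lower()

def stable_user_id_from_email(email: str) -> str:
    # UUIDv5 of the normalized email: same input always yields the same ID
    return uuid.uuid5(uuid.NAMESPACE_DNS, normalize_email(email)).hex

# Differently formatted spellings of one address collapse to a single ID
assert stable_user_id_from_email("User@Mail.com") == stable_user_id_from_email(" user@mail.com ")
print(f"user_{stable_user_id_from_email('user@mail.com')}")  # this becomes the container tag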

Memory Search Function

async def search_user_memories(query: str, container_tag: str) -> str:
    try:
        results = supermemory_client.search.memories(
            q=query,
            container_tag=container_tag,
            limit=5
        )
        if results.results:
            context = "\n".join([r.memory for r in results.results])
            return f"Relevant memories:\n{context}"
        return "No relevant memories found."
    except Exception as e:
        return f"Error searching memories: {e}"
This searches the user’s memory store for context relevant to their current message. Parameters:
  • q: The search query (usually the user’s latest message)
  • container_tag: Isolates memories per user (e.g., user_abc123)
  • limit=5: Returns top 5 most relevant memories
Why search before responding? The AI can provide personalized answers based on what it knows about the user (e.g., dietary preferences, work context, communication style).
Error handling: if the memory search fails, we return a fallback message instead of crashing, so the conversation continues even when memory has a hiccup.
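As a usage sketch (inside an async context; the container tag and returned memories are illustrative), the chat endpoint below calls it like this:
memory_context = await search_user_memories(
    "What should I eat for breakfast?",
    container_tag="user_abc123",
)
# memory_context might look like:
# "Relevant memories:\nUser is vegetarian\nUser works out at 6 AM"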

Memory Storage Function

async def add_user_memory(content: str, container_tag: str, email: str = None):
    try:
        supermemory_client.memories.add(
            content=content,
            container_tag=container_tag,
            metadata={"type": "personal_info", "email": normalize_email(email) if email else None}
        )
    except Exception as e:
        print(f"Error adding memory: {e}")
Stores new information about the user. Parameters:
  • content: The text to remember
  • container_tag: User isolation tag
  • metadata: Additional context (type of info, associated email)
Why metadata? It makes memories easier to filter and organize later (e.g., “show me all personal_info memories”).
Error handling: we log errors but don’t crash; failing to save one memory shouldn’t break the entire conversation.
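A usage sketch (inside an async context, with illustrative values):
await add_user_memory(
    "User is vegetarian and prefers quick meals",
    container_tag="user_abc123",
    email="User@Mail.com",  # normalized to user@mail.com before it lands in metadata
)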

Main Chat Endpoint

@app.post("/chat")
async def chat_endpoint(data: dict):
    messages = data.get("messages", [])
    email = data.get("email")
    
    if not messages:
        raise HTTPException(status_code=400, detail="No messages provided")
    if not email:
        raise HTTPException(status_code=400, detail="Email required")
This endpoint receives the chat request. It expects:
  • messages: Full conversation history [{role: "user", content: "..."}]
  • email: User’s email for identity
Why require email? Without it, we can’t create a stable user ID, meaning no persistent personalization.
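For reference, a request body this endpoint accepts looks like the following (illustrative values); it matches what the Streamlit frontend sends later:
payload = {
    "messages": [
        {"role": "user", "content": "Hi! My name is Sarah."},
        {"role": "assistant", "content": "Nice to meet you, Sarah!"},
        {"role": "user", "content": "What should I eat for breakfast?"},
    ],
    "email": "sarah@example.com",
}
# requests.post("http://localhost:8000/chat", json=payload, stream=True)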

Derive User Identity

    try:
        user_id = stable_user_id_from_email(email)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    
    container_tag = f"user_{user_id}"
Convert email → stable user ID → container tag. The container tag (user_abc123) isolates this user’s memories from everyone else’s. Each user has their own “memory box.”

Search and Inject Memories

    user_message = messages[-1]["content"]
    memory_context = await search_user_memories(user_message, container_tag)
    
    enhanced_messages = [
        {"role": "system", "content": f"{SYSTEM_PROMPT}\n\n{memory_context}"}
    ] + messages
We take the user’s latest message, search for relevant memories, then inject them into the system prompt. Example:
Original: "What should I eat for breakfast?"

Enhanced system message:
"You are a helpful assistant... [system prompt]

Relevant memories:
- User is vegetarian
- User works out at 6 AM
- User prefers quick meals"
Now the AI can answer: “Try overnight oats with plant-based protein—perfect for post-workout!”

Stream OpenAI Response

    try:
        response = await openai_client.chat.completions.create(
            model="gpt-4o",
            messages=enhanced_messages,
            temperature=0.7,
            stream=True
        )
Key parameters:
  • model="gpt-4o": Fast, capable model
  • messages: Full conversation + memory context
  • temperature=0.7: Balanced output (lower values are more deterministic, higher values more varied)
  • stream=True: Enables word-by-word streaming
Why stream? Users see responses appear in real-time instead of waiting for the complete answer. Much better UX.

Handle Streaming

        async def generate():
            try:
                async for chunk in response:
                    if chunk.choices[0].delta.content:
                        content = chunk.choices[0].delta.content
                        yield f"data: {json.dumps({'content': content})}\n\n"
            except Exception as e:
                yield f"data: {json.dumps({'error': str(e)})}\n\n"
This async generator:
  1. Receives chunks from OpenAI as they’re generated
  2. Extracts the text content from each chunk
  3. Formats it as Server-Sent Events (SSE): data: {...}\n\n
  4. Yields it to the client
SSE format is a web standard for server→client streaming. The frontend can process each chunk as it arrives.
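On the wire, the stream for the breakfast example above looks roughly like this (chunk boundaries will vary):
data: {"content": "Try"}

data: {"content": " overnight"}

data: {"content": " oats with plant-based protein!"}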

Optional Memory Storage

            if "remember this" in user_message.lower():
                await add_user_memory(user_message, container_tag, email=email)
After streaming completes, check if the user explicitly asked to remember something. If yes, store it. Why opt-in? Gives users control over what gets remembered. You could also make this automatic based on content analysis.
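If you want the automatic variant mentioned above, one simple (hypothetical) heuristic is to store any message that reads like a personal statement; the marker list below is only an example:
PERSONAL_MARKERS = ("my name is", "i am ", "i'm ", "i prefer", "i work", "remember")

def looks_memory_worthy(text: str) -> bool:
    lowered = text.lower()
    return any(marker in lowered for marker in PERSONAL_MARKERS)

# then, in place of the "remember this" check above:
# if looks_memory_worthy(user_message):
#     await add_user_memory(user_message, container_tag, email=email)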

Return Streaming Response

        return StreamingResponse(generate(), media_type="text/plain")
    
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
StreamingResponse keeps the HTTP connection open and sends chunks as they’re generated. The frontend receives them in real-time.

Local Development Server

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
Run with python main.py and the server starts on port 8000. Binding to 0.0.0.0 listens on all network interfaces, so you can reach the server from other devices on your network (useful for testing).

Step 3: Frontend (Streamlit)

Create streamlit_app.py:

Complete Frontend Code

import streamlit as st
import requests
import json
import uuid

st.set_page_config(page_title="Personal AI Assistant", page_icon="🤖", layout="wide")

def normalize_email(email: str) -> str:
    return (email or "").strip().lower()

def stable_user_id_from_email(email: str) -> str:
    return uuid.uuid5(uuid.NAMESPACE_DNS, normalize_email(email)).hex

# Session state
if 'messages' not in st.session_state:
    st.session_state.messages = []
if 'user_name' not in st.session_state:
    st.session_state.user_name = None
if 'email' not in st.session_state:
    st.session_state.email = None
if 'user_id' not in st.session_state:
    st.session_state.user_id = None

st.title("🤖 Personal AI Assistant")
st.markdown("*Your AI that learns and remembers*")

with st.sidebar:
    st.header("👤 User Profile")
    
    if not st.session_state.user_name or not st.session_state.email:
        name = st.text_input("What should I call you?")
        email = st.text_input("Email", placeholder="you@example.com")
        
        if st.button("Get Started"):
            if name and email:
                st.session_state.user_name = name
                st.session_state.email = normalize_email(email)
                st.session_state.user_id = stable_user_id_from_email(st.session_state.email)
                st.session_state.messages.append({
                    "role": "user",
                    "content": f"Hi! My name is {name}."
                })
                st.rerun()
            else:
                st.warning("Please enter both fields.")
    else:
        st.write(f"**Name:** {st.session_state.user_name}")
        st.write(f"**Email:** {st.session_state.email}")
        if st.button("Reset Conversation"):
            st.session_state.messages = []
            st.rerun()

if st.session_state.user_name and st.session_state.email:
    for message in st.session_state.messages:
        with st.chat_message(message["role"]):
            st.markdown(message["content"])
    
    if prompt := st.chat_input("Message..."):
        st.session_state.messages.append({"role": "user", "content": prompt})
        with st.chat_message("user"):
            st.markdown(prompt)
        
        with st.chat_message("assistant"):
            try:
                response = requests.post(
                    "http://localhost:8000/chat",
                    json={
                        "messages": st.session_state.messages,
                        "email": st.session_state.email
                    },
                    stream=True,
                    timeout=30
                )
                
                if response.status_code == 200:
                    full_response = ""
                    for line in response.iter_lines():
                        if line:
                            try:
                                data = json.loads(line.decode('utf-8').replace('data: ', ''))
                                if 'content' in data:
                                    full_response += data['content']
                            except ValueError:  # skip malformed chunks
                                continue
                    
                    st.markdown(full_response)
                    st.session_state.messages.append({"role": "assistant", "content": full_response})
                else:
                    st.error(f"Error: {response.status_code}")
            except Exception as e:
                st.error(f"Error: {e}")
else:
    st.info("Please enter your profile in the sidebar")

Step 4: Run It

Terminal 1 - Start backend:
python main.py
Terminal 2 - Start frontend:
streamlit run streamlit_app.py
Open http://localhost:8501 in your browser.

TypeScript Implementation

Step 1: Project Setup

npx create-next-app@latest personal-ai --typescript --tailwind --app
cd personal-ai
npm install @supermemory/tools ai @ai-sdk/openai
Create .env.local:
SUPERMEMORY_API_KEY=your_supermemory_key_here
OPENAI_API_KEY=your_openai_key_here

Step 2: API Route

Create app/api/chat/route.ts. Let’s break it down:

Import Dependencies

import { streamText } from 'ai'
import { createOpenAI } from '@ai-sdk/openai'
import { supermemoryTools } from '@supermemory/tools/ai-sdk'
  • streamText: Vercel AI SDK function that handles streaming responses and tool calling
  • createOpenAI: Factory function to create an OpenAI provider
  • supermemoryTools: Pre-built tools for memory search and storage

Initialize OpenAI Provider

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY!
})
Creates an OpenAI provider configured with your API key. The ! tells TypeScript “this definitely exists” (because we set it in .env.local). This provider object will be passed to streamText to specify which AI model to use.

Define System Prompt

const SYSTEM_PROMPT = `You are a highly personalized AI assistant.

When users share personal information, remember it using the addMemory tool.
Before responding, search your memories using searchMemories to provide personalized help.
Always be helpful while respecting privacy.`
This guides the AI’s behavior and tells it:
  • When to use tools: Search memories before responding, add memories when users share info
  • Personality: Be helpful and personalized
  • Boundaries: Respect privacy
The AI SDK uses this to decide when to call searchMemories and addMemory tools automatically.

Create POST Handler

export async function POST(req: Request) {
  try {
    const { messages, email } = await req.json()
Next.js App Router convention: export an async function named after the HTTP method. This handles POST requests to /api/chat. We extract:
  • messages: Chat history array [{role, content}]
  • email: User identifier

Validate Input

    if (!messages?.length) {
      return new Response('No messages provided', { status: 400 })
    }
    if (!email) {
      return new Response('Email required', { status: 400 })
    }
Why validate? Prevents crashes from malformed requests. We need:
  • At least one message to respond to
  • An email to isolate user memories
Without email, we can’t maintain personalization across sessions.

Create Container Tag

    const containerTag = `user_${email.toLowerCase().trim()}`
Convert the email to a container tag for memory isolation. This is simpler than the Python version: we skip UUID generation for brevity. In production, you might want to hash the email for privacy:
// Optional: More privacy-preserving approach
import crypto from 'crypto'
const containerTag = `user_${crypto.createHash('sha256').update(email).digest('hex').slice(0, 16)}`

Call streamText with Tools

    const result = streamText({
      model: openai('gpt-4o'),
      messages,
      tools: supermemoryTools(process.env.SUPERMEMORY_API_KEY!, {
        containerTags: [containerTag]
      }),
      system: SYSTEM_PROMPT
    })
This is where the magic happens! Let’s break down each parameter:
model: openai('gpt-4o')
  • Specifies which AI model to use
  • The AI SDK handles the API calls
messages
  • Full conversation history
  • Format: [{role: "user"|"assistant", content: "..."}]
tools: supermemoryTools(...)
  • Gives the AI access to memory operations
  • The AI SDK automatically:
    • Decides when to call tools based on the conversation
    • Calls searchMemories when it needs context
    • Calls addMemory when users share information
    • Handles tool execution and error handling
containerTags: [containerTag]
  • Scopes all memory operations to this specific user
  • Ensures User A can’t access User B’s memories
system: SYSTEM_PROMPT
  • Guides the AI’s behavior and tool usage
How tools work:
  1. User: “Remember that I’m vegetarian”
  2. AI SDK detects this is memory-worthy
  3. Automatically calls addMemory("User is vegetarian")
  4. Stores in Supermemory with the user’s container tag
  5. Responds: “Got it, I’ll remember that!”
Later:
  1. User: “What should I eat?”
  2. AI SDK calls searchMemories("food preferences")
  3. Retrieves: “User is vegetarian”
  4. Responds: “How about a delicious veggie stir-fry?”
No manual tool handling needed! The AI SDK manages the entire flow.

Return Streaming Response

    return result.toAIStreamResponse()
toAIStreamResponse() converts the streaming result into a format the frontend can consume. It:
  • Sets appropriate headers for streaming
  • Formats data for the useChat hook
  • Handles errors gracefully
This returns immediately (doesn’t wait for completion), and chunks stream to the client as they’re generated.

Error Handling

  } catch (error: any) {
    console.error('Chat error:', error)
    return new Response(error.message, { status: 500 })
  }
}
Catches any errors (API failures, tool errors, etc.) and returns a clean error response. Why log to console? In production, you’d send this to a monitoring service (Sentry, DataDog, etc.) to track issues.
Key Differences from Python:
  • Memory Search: Python calls search_user_memories() manually; the TypeScript AI SDK calls the searchMemories tool automatically
  • Memory Add: Python calls add_user_memory() manually; the TypeScript AI SDK calls the addMemory tool automatically
  • Tool Decision: in Python you decide when to search/add; in TypeScript the AI decides based on conversation context
  • Streaming: Python formats SSE manually; TypeScript’s toAIStreamResponse() handles it
  • Error Handling: Python uses try/except in each function; the AI SDK handles tool errors
Python = Manual Control
You explicitly search and add memories. More control, more code.
TypeScript = AI-Driven
The AI decides when to use tools. Less code, more “magic.”

Step 3: Chat UI

Replace app/page.tsx:

Complete Frontend Code

'use client'
import { useChat } from 'ai/react'
import { useState } from 'react'

export default function ChatPage() {
  const [email, setEmail] = useState('')
  const [userName, setUserName] = useState('')
  const [tempEmail, setTempEmail] = useState('')
  const [tempName, setTempName] = useState('')
  
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
    body: { email }
  })
  
  if (!email) {
    return (
      <div className="flex items-center justify-center min-h-screen p-4">
        <div className="w-full max-w-md space-y-4 p-6 bg-white rounded-lg shadow-lg">
          <h1 className="text-2xl font-bold text-center">🤖 Personal AI Assistant</h1>
          <input
            type="text"
            placeholder="Your name"
            value={tempName}
            onChange={(e) => setTempName(e.target.value)}
            className="w-full px-4 py-2 border rounded-lg"
          />
          <input
            type="email"
            placeholder="your@email.com"
            value={tempEmail}
            onChange={(e) => setTempEmail(e.target.value)}
            className="w-full px-4 py-2 border rounded-lg"
          />
          <button
            onClick={() => {
              if (tempName && tempEmail) {
                setUserName(tempName)
                setEmail(tempEmail.toLowerCase().trim())
              }
            }}
            className="w-full px-4 py-2 bg-blue-600 text-white rounded-lg hover:bg-blue-700"
          >
            Get Started
          </button>
        </div>
      </div>
    )
  }
  
  return (
    <div className="flex flex-col h-screen max-w-4xl mx-auto p-4">
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((message) => (
          <div
            key={message.id}
            className={`p-4 rounded-lg ${
              message.role === 'user' 
                ? 'bg-blue-100 ml-auto max-w-[80%]' 
                : 'bg-gray-100 mr-auto max-w-[80%]'
            }`}
          >
            <p className="whitespace-pre-wrap">{message.content}</p>
          </div>
        ))}
      </div>
      
      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Tell me about yourself..."
          className="flex-1 p-3 border rounded-lg"
        />
        <button 
          type="submit"
          className="px-6 py-3 bg-blue-600 text-white rounded-lg hover:bg-blue-700"
        >
          Send
        </button>
      </form>
    </div>
  )
}

Step 4: Run It

npm run dev
Open http://localhost:3000

Testing Your Assistant

Try these conversations to test memory:
Personal Preferences:
User: "I'm Sarah, a product manager. I prefer brief responses."
[Later]
User: "What's a good way to prioritize features?"
Assistant: [Should reference PM role and brevity preference]
Dietary & Lifestyle:
User: "Remember I'm vegan and work out at 6 AM."
[Later]
User: "Suggest a quick breakfast."
Assistant: [Should suggest vegan options for pre/post workout]
Work Context:
User: "I'm working on a React project with TypeScript."
[Later]
User: "Help me with state management."
Assistant: [Should suggest TypeScript-specific solutions]
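To exercise this end to end against the Python backend, a small script like the following works (a sketch; it assumes the server is running on localhost:8000 and uses an illustrative email):
import json
import requests

EMAIL = "sarah@example.com"  # illustrative
history = []

def ask(text: str) -> str:
    history.append({"role": "user", "content": text})
    resp = requests.post(
        "http://localhost:8000/chat",
        json={"messages": history, "email": EMAIL},
        stream=True,
        timeout=30,
    )
    reply = ""
    for line in resp.iter_lines():
        if line and line.startswith(b"data: "):
            event = json.loads(line[len(b"data: "):])
            reply += event.get("content", "")
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("Remember this: I'm vegan and work out at 6 AM."))  # triggers the opt-in storage
print(ask("Suggest a quick breakfast."))  # should lean on the stored preference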

Verify Memory Storage

Python

Create check_memories.py:
from supermemory import Supermemory
import os
import uuid
from dotenv import load_dotenv

load_dotenv()
client = Supermemory(api_key=os.getenv("SUPERMEMORY_API_KEY"))

# Derive the same stable user_id the backend uses (replace with your email)
email = "you@example.com"
user_id = uuid.uuid5(uuid.NAMESPACE_DNS, email.strip().lower()).hex
container_tag = f"user_{user_id}"

memories = client.memories.list(
    container_tags=[container_tag],
    limit=20,
    sort="updatedAt",
    order="desc"
)

print(f"Found {len(memories.memories)} memories:")
for i, memory in enumerate(memories.memories):
    full = client.memories.get(id=memory.id)
    print(f"\n{i + 1}. {full.content}")

TypeScript

Create scripts/check-memories.ts:
// The TypeScript route tags memories with the normalized email (see Step 2)
const email = "you@example.com".toLowerCase().trim()
const containerTag = `user_${email}`

const response = await fetch('https://api.supermemory.ai/v3/memories', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.SUPERMEMORY_API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    containerTags: [containerTag],
    limit: 20,
    sort: 'updatedAt',
    order: 'desc'
  })
})

const data = await response.json()
console.log(`Found ${data.memories?.length || 0} memories`)

Troubleshooting

Memory not persisting?
  • Verify container tags are consistent
  • Check API key has write permissions
  • Ensure email is properly normalized
Responses not personalized?
  • Increase search limit to find more memories
  • Check that memories are being added
  • Verify system prompt guides tool usage
Performance issues?
  • Reduce search limits
  • Implement caching for frequent queries
  • Use appropriate thresholds

Built with Supermemory. Customize based on your needs.