I Got Tired of Re-Explaining My Codebase to AI — So I Built a Memory Layer
After ~18 years of shipping products, I noticed an invisible tax: re-explaining my project to AI assistants every session. Here's how I fixed it.
It's 9:47 AM. I'm reopening my IDE to continue yesterday's work on an auth system. I ask Claude to pick up where we left off.
"What authentication approach are you using? JWT or sessions? Which OAuth provider? What's your database?"
We literally discussed this yesterday. For an hour.
This kept happening to me. Every. Single. Session.
Disclosure: I'm the founder of ContextStream — I built this because I couldn't stand paying this tax anymore.
The problem nobody budgets for: AI amnesia
AI coding assistants are incredible inside a single chat. They can reason about architecture, write production code, catch bugs.
But the moment you close the window? Total amnesia.
After ~18 years of shipping products, I've learned to notice invisible productivity taxes. This one was huge:
- Re-explaining my stack
- Re-listing architectural decisions
- Re-attaching the same context files
- Re-arguing patterns we already settled
I started tracking it. I was spending 10–15 minutes per session just getting the assistant back up to speed.
Why the obvious "solutions" didn't solve it
I tried all the usual workarounds:
Chat history: noisy, not portable across tools, and I still had to scroll and re-read.
Built-in memory toggles: tied to one product; I bounce between tools depending on the task.
Pasting context every time: it works, but defeats the point of having an assistant.
What I actually needed was a memory layer that:
- Captures decisions as I make them
- Retrieves the right context automatically
- Works across the AI tools I use
What I built: a memory layer behind my AI tools
I spent the last year building ContextStream — a memory layer that sits behind my AI tools via MCP (Model Context Protocol), an open standard that lets AI clients call "tool servers" to fetch context.
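To make that concrete, here's roughly what a tool server looks like with the official MCP TypeScript SDK. This is a generic sketch, not ContextStream's code: the `remember_decision` tool and its fields are invented for illustration.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// A toy MCP server exposing a single tool an AI client can call.
const server = new McpServer({ name: "memory-sketch", version: "0.1.0" });

// Hypothetical tool: the assistant calls this to persist a decision.
server.tool(
  "remember_decision",
  { decision: z.string(), scope: z.string().optional() },
  async ({ decision, scope }) => {
    // A real server would write to durable storage here.
    console.error(`stored: ${decision} (scope: ${scope ?? "global"})`);
    return { content: [{ type: "text", text: "Decision saved." }] };
  }
);

// Clients spawn the server and talk to it over stdio
// (stdout carries the protocol, so logs go to stderr).
await server.connect(new StdioServerTransport());
```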
The core insight is simple:
Storage is cheap. Retrieval is hard.
If you dump everything into context, token costs explode and the model gets confused. The only thing that works is delivering the right context at the right time.
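Here's the shape of that idea in code, as a toy illustration (my sketch, not ContextStream's actual ranking): score stored memories against the current query, then pack only the best ones into a fixed token budget instead of dumping everything.

```typescript
// One stored memory plus whatever relevance score your retriever assigns.
interface Memory {
  text: string;   // e.g. "We're using JWT with refresh tokens"
  tokens: number; // rough token count of the text
  score: number;  // relevance to the current query
}

// Greedily pack the most relevant memories into a fixed token budget,
// instead of dumping everything into the prompt.
function selectContext(memories: Memory[], tokenBudget: number): Memory[] {
  const picked: Memory[] = [];
  let used = 0;
  for (const m of [...memories].sort((a, b) => b.score - a.score)) {
    if (used + m.tokens <= tokenBudget) {
      picked.push(m);
      used += m.tokens;
    }
  }
  return picked;
}

// Usage: selectContext(allMemories, 2_000) -> the handful worth sending.
```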
So ContextStream captures three kinds of "project memory" (see the type sketch after this list):
- Decisions — "We're using JWT with refresh tokens"
- Context — indexed code/docs so the assistant can retrieve what matters
- Connections — which decisions affect which modules
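One way to picture those three kinds as data, using illustrative types I made up rather than ContextStream's actual schema:

```typescript
// Decisions: things we settled, stated once.
interface Decision {
  id: string;
  statement: string; // "We're using JWT with refresh tokens"
  madeAt: Date;
}

// Context: indexed chunks of code/docs that retrieval pulls from.
interface ContextChunk {
  id: string;
  source: string; // file path or doc URL
  text: string;
}

// Connections: which decisions affect which modules or files.
interface Connection {
  decisionId: string;
  affects: string[]; // e.g. ["src/auth/middleware.ts"]
}
```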
A tiny "before → after"
Before:
"JWT or sessions? Which provider? Which database?"
After:
"Last time we chose JWT with refresh tokens. OAuth provider is X. The auth code lives in …. Want me to continue with the refresh rotation + middleware?"
That's the bar I wanted: start where we left off, not at square one.
Setup (the happy path is one command)
```bash
npx -y @contextstream/mcp-server@latest setup
```

That's it — it configures MCP for the tool you're using.
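For reference, MCP-aware clients read a JSON config in the common `mcpServers` shape, so the setup step amounts to writing an entry like this (the file location and the `contextstream` key name here are my assumptions; check your tool's docs):

```json
{
  "mcpServers": {
    "contextstream": {
      "command": "npx",
      "args": ["-y", "@contextstream/mcp-server@latest"]
    }
  }
}
```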
What actually changed for me
The obvious win: no more re-explaining.
The surprising win: consistency.
Before, my assistant would suggest camelCase on Monday and snake_case on Wednesday. Now it remembers "this codebase uses camelCase" and stays consistent.
And when bugs resurface (they always do), it can pull the previous fix back into view:
"We saw this before — the issue was X, and we fixed it by Y."
If you've felt this too…
I'm building this in public. The free tier gives you enough operations to see if it clicks.
If you try it, I'd love one piece of feedback:
What's the #1 thing you wish your AI assistant would remember about your project?
- Project: contextstream.io
- MCP server repo: github.com/contextstream/mcp-server