How I Leveraged Auction History Research to 3X My Freelance Developer Rates
October 1, 2025
Building a SaaS? Let me tell you how I turned my obsession with rare coins into a thriving platform. It started with a drawer full of old coins and a simple question: *Why is it so damn hard to track a coin’s history?*
Now collectors and dealers use my tool daily to uncover auction histories and provenances — even for coins from the pre-TPG era. No magic. Just persistence, some clever tech, and a whole lot of coin nerd energy.
Why This Problem Was Worth Solving
As a lifelong coin collector and SaaS builder, I kept tripping over the same problem: no single source of truth for coin provenance. Want to know where that 1907 Saint-Gaudens double eagle came from? Good luck. Your options:
- Spend hours digging through Heritage, Stack’s, or Goldberg archives
- Download blurry, scanned PDFs from the Newman Portal
- Flip through dusty catalogs you paid a small fortune for
- Pay $200/hour to an expert for 10 minutes of their time
It wasn’t just annoying. It was limiting. Most collectors couldn’t afford the time or money to do real research. So they guessed. Or worse — they bought blind.
I asked myself: *Can I fix this?*
Not with another spreadsheet. Not with a better scanner. But with a provenance engine — one that learns from decades of auction data, slab photos, and collector wisdom. And could I build it solo, bootstrapped, without a single VC dollar?
Turns out — yes.
Choosing the Right Tech Stack for a Niche but High-Value SaaS
This wasn’t a CRUD app. It was a data puzzle. I needed to pull from auction sites, parse scanned catalogs, match images, and link coins across decades. The stack had to be fast, cheap, and scale quietly.
1. Backend: Node.js + Express + PostgreSQL
I went with Node.js — not because it’s trendy, but because it handles async scraping like a champ. Express kept the API lean. And PostgreSQL with JSONB let me store messy OCR text and structured metadata side by side.
Storing OCR results? Yes. But also being able to search “1905-O dime MS65” without breaking a sweat? That’s the win.
// Real-world example: A parsed auction lot
{
  "lot_id": "HA-1999-0452",
  "year": 1999,
  "grade": "PCGS MS65",
  "ocr_text": "1905-O Dime, rainbow toning...",
  "image_urls": ["..."],
  "provenance_hints": ["Blay Collection", "GC Auction"]
}
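With the lot stored as JSONB next to the raw OCR text, a query like “1905-O dime MS65” can be split into a structured grade filter and a fuzzy text match. Here is a minimal sketch of that idea; the table and column names (`auction_lots`, `metadata`, `ocr_text`) are illustrative assumptions, not the production schema:

```javascript
// Hypothetical helper: split a free-text query like "1905-O dime MS65"
// into a grade filter on the JSONB column and an ILIKE match on OCR text.
function buildLotQuery(userQuery) {
  const gradeMatch = userQuery.match(/\b(?:PCGS|NGC)?\s*(MS|PR|AU|XF|VF)(\d{2})\b/i);
  const grade = gradeMatch ? `${gradeMatch[1].toUpperCase()}${gradeMatch[2]}` : null;
  const terms = (gradeMatch ? userQuery.replace(gradeMatch[0], '') : userQuery).trim();

  const clauses = ['ocr_text ILIKE $1'];
  const values = [`%${terms}%`];
  if (grade) {
    clauses.push(`metadata->>'grade' LIKE $${values.length + 1}`);
    values.push(`%${grade}%`);
  }
  return {
    text: `SELECT lot_id FROM auction_lots WHERE ${clauses.join(' AND ')}`,
    values,
  };
}
```

In production you would hand this object to a client like node-postgres, but the shape of the query is the point here: the messy OCR text and the structured grade live side by side and get filtered together.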
2. Scraping & Data Ingestion: Puppeteer + Cheerio + Custom OCR Pipeline
Heritage and Stack’s don’t play nice with APIs. So I built a headless browser with Puppeteer to crawl their archives, grab lot details, and snag thumbnails.
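Once Puppeteer has rendered an archive page, the lot links still have to be pulled out of the HTML. A tiny, testable sketch of that step (the URL pattern and ID format are assumptions; the real crawl parses the DOM with Cheerio rather than regex):

```javascript
// Hypothetical helper: pull lot IDs out of a crawled archive page.
// The "/lots/HA-YYYY-NNNN" pattern is an illustrative assumption.
function extractLotIds(html) {
  const ids = [];
  const re = /href="\/lots\/(HA-\d{4}-\d{4})"/g;
  let m;
  while ((m = re.exec(html)) !== null) ids.push(m[1]);
  return ids;
}
```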
For 1940s-era catalogs — scanned as 300dpi PDFs with coffee stains — I used Tesseract.js for OCR. But I didn’t stop there. I trained it on numismatic terms: “MS63,” “Cameo,” “1905-O,” “PCGS,” “CAC.”
Result? A 30% boost in accuracy on low-res, compressed scans. A small tweak. A big difference.
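One way to get that kind of boost is a post-OCR correction pass that snaps noisy tokens to a numismatic vocabulary. The sketch below is illustrative only, assuming a small vocabulary and a one-edit tolerance; it is not the actual Tesseract training setup:

```javascript
// Illustrative post-OCR cleanup: snap a noisy token to the nearest
// term in a numismatic vocabulary, allowing at most one edit.
const VOCAB = ['MS63', 'MS65', 'PCGS', 'NGC', 'CAC', 'Cameo', '1905-O'];

// Standard Levenshtein edit distance.
function editDistance(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,
        dp[i][j - 1] + 1,
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)
      );
    }
  }
  return dp[a.length][b.length];
}

function correctToken(token) {
  let best = token;
  let bestDist = 2; // only accept matches within one edit
  for (const term of VOCAB) {
    const d = editDistance(token.toUpperCase(), term.toUpperCase());
    if (d < bestDist) { best = term; bestDist = d; }
  }
  return best;
}
```

So a scan that reads “PCG5” snaps back to “PCGS”, while ordinary words like “rainbow” pass through untouched.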
3. AI Layer: Fine-Tuned GPT-3.5 + CLIP for Visual Matching
This is where things got interesting. I didn’t just want to extract text — I wanted to understand it.
I fine-tuned a custom GPT-3.5 model on thousands of auction entries. Not with fancy infrastructure. Just prompt engineering and real coin data.
Example: I’d feed it a messy line from a 1990s catalog:
Prompt: “Parse: ‘1905-O Dime, PCGS MS65, Blay Collection, rainbow toning, ex. GC Auction 2001.’ Extract: year, mintmark, grade, collection, previous auction, visual description.”
And it would return clean JSON — every time. Even if the original text was buried in a paragraph about shipping costs or bid increments.
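Model output can drift, so anything the model returns should be validated before it touches the database. A minimal guard might look like this; the field names are assumptions for illustration, not the exact schema:

```javascript
// Hypothetical guard: validate the model's JSON reply before storing it.
const REQUIRED_FIELDS = ['year', 'mintmark', 'grade', 'collection', 'previous_auction'];

function parseModelReply(reply) {
  let parsed;
  try {
    parsed = JSON.parse(reply);
  } catch {
    return { ok: false, error: 'not valid JSON' };
  }
  const missing = REQUIRED_FIELDS.filter((f) => !(f in parsed));
  return missing.length
    ? { ok: false, error: `missing fields: ${missing.join(', ')}` }
    : { ok: true, entry: parsed };
}
```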
Then came the visual piece. I used OpenAI’s CLIP model to match slab images. User uploads a photo of their coin’s slab? The system finds it in Heritage’s 2005 archives — even if the cert number changed after a regrade.
No cert number? No problem. The image talks.
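Under the hood, CLIP matching reduces to a nearest-neighbor search over embedding vectors. A minimal sketch, assuming the embeddings already exist (e.g. produced by a CLIP inference service):

```javascript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank archived slab embeddings against an uploaded photo's embedding.
function bestMatches(queryEmbedding, archive, topK = 3) {
  return archive
    .map((item) => ({ ...item, score: cosineSimilarity(queryEmbedding, item.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

At production scale a vector index would replace the linear scan, but the matching logic is the same: the closest embedding wins, cert number or not.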
Building a Lean, Searchable Provenance Database
Data alone isn’t useful. Searchable data is. But old catalogs are image-based. Descriptions are inconsistent. Provenance chains break with every resale. So I built two systems:
1. Full-Text Search with PostgreSQL + Trigram Matching
Fuzzy search matters. A user types “1905O dime”? They still need to see results for “1905-O Dime” or “1905 O Dime.”
Enter pg_trgm — a PostgreSQL extension for trigram matching. It measures similarity between strings. Works like a charm.
-- Enable trigram
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Fuzzy match on OCR text
SELECT * FROM catalog_entries
WHERE ocr_text ILIKE '%1905%O%'
ORDER BY similarity(ocr_text, '1905-O Dime') DESC;
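To see what pg_trgm is actually computing, here is a toy JavaScript version of trigram similarity. It is simplified (the real extension works per word and has its own padding rules), but the intuition carries over: two strings are similar when they share many three-character windows.

```javascript
// Toy trigram similarity, in the spirit of pg_trgm (simplified).
function trigrams(s) {
  const padded = `  ${s.toLowerCase()} `;
  const set = new Set();
  for (let i = 0; i + 3 <= padded.length; i++) set.add(padded.slice(i, i + 3));
  return set;
}

function trigramSimilarity(a, b) {
  const ta = trigrams(a), tb = trigrams(b);
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / (ta.size + tb.size - shared); // Jaccard-style overlap
}
```

“1905O dime” and “1905-O Dime” share most of their trigrams, so they score high against each other and low against unrelated descriptions.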
2. Provenance Graph Engine
Coins move. They get resold. They regrade. Their history gets scattered.
So I built a graph layer using PostgreSQL’s recursive CTEs. Now, if a coin appeared in Blay’s collection in 1995, then Heritage in 1999, then GC in 2020, the system connects the dots.
And the AI helps. It looks at grade, description, and image — and suggests matches, even when cert numbers don’t line up. “This looks like the same coin. Want to confirm?”
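In JavaScript terms, the walk that the recursive CTE performs looks roughly like this; the record shape (`id`, `previousId`, `source`) is an assumption for illustration:

```javascript
// Sketch of the provenance chain walk. Each confirmed appearance links
// back to the earlier appearance it was matched against.
function buildChain(appearances, startId) {
  const byId = new Map(appearances.map((a) => [a.id, a]));
  const chain = [];
  let current = byId.get(startId);
  while (current) {
    chain.push(current);
    current = current.previousId ? byId.get(current.previousId) : undefined;
  }
  return chain.reverse(); // earliest appearance first
}
```

Given appearances linked Blay 1995 → Heritage 1999 → GC 2020, starting from the 2020 sale walks the links backward and returns the full chain earliest-first.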
Product Roadmap: From MVP to Scalable SaaS
I didn’t build the whole thing at once. I followed collector-first validation — build small, test fast, learn faster.
Phase 1: The “One Coin” MVP (2 Weeks)
First version? A Google Form. Users entered a cert number. I ran a script. They got a PDF with matching auction lots. No UI. No dashboard. Just results.
I sent it to 30 collectors. Feedback: “It works. But I can’t always find the cert number.”
Fair. So I listened.
Phase 2: Image Upload + AI Matching (4 Weeks)
I added image upload. User drags in a slab photo. CLIP runs. Within a week, a collector found a 1913-S Buffalo Nickel — missing from Heritage’s archive for 20 years.
That coin? Worth $150,000. He became my first paying user. And 49 more followed.
Phase 3: Provenance Chains & Dealer Network (8 Weeks)
I added a curated directory of experts — specialists in early dollars, patterns, colonials. Users can request a “Provenance Check.” I charge $50. 70% goes to the expert.
It’s not just revenue. It’s trust. Humans verify the AI. That’s what collectors want.
Phase 4: Public Archive + Community Curation (Ongoing)
I opened the archive. Users can submit findings. Experts (paid moderators) review and approve. Every entry improves the AI.
More data → better matches → more users → more data. A real flywheel.
Getting to Market Faster: Bootstrapping Without Burnout
Solo founder. No team. No funding. So I moved fast — and stayed lean.
- No custom UI: Retool for admin panels and dashboards. Saved 4 weeks.
- Serverless first: Scraping and AI on AWS Lambda with SQS. Cost: $50/month at launch.
- Stripe in 2 days: Free tier (10 searches/month). $19/month for unlimited.
- Pre-sold access: Sold $99 “Founder” licenses before launch. Funded the next 3 months.
Key Takeaways for SaaS Founders in Niche Markets
- Pain is your product compass: I spent 8 hours per coin on research. That’s why people paid.
- AI needs data, not hype: GPT works — but only after I trained it on real auction entries.
- Human + AI > AI alone: AI finds leads. Humans verify. That’s the combo.
- Community is your QA team: Collectors flagged edge cases, suggested features, and became advocates.
- Charge early, charge often: A $50 expert review? Profitable. And it builds trust.
Conclusion: From Hobby to Scalable SaaS
This started as a way to save myself time. Now it’s a bootstrapped SaaS with over 1,200 collectors and 50+ dealers using it monthly.
The tech? Clever, but not revolutionary. The real win? Domain expertise + lean execution + AI that actually works.
Today, a user uploads a slab photo of a 1905-O Dime. In 20 seconds, they see its entire journey — Blay Collection, GC Auction, Heritage — with images, grades, and expert notes.
That’s not just a database. That’s a story. And people will pay to read it.
To founders in niche markets: Your obsession is your edge. Find the expensive, tedious problem. Solve it with smart, scrappy tech. And charge for it — because someone’s tired of guessing.