Scrapingdome
Case study - Court records

Distress signals from NC court records, structured for a forensic title research workflow.

For a US real estate investor specializing in distress acquisitions, CivicMine surfaces ownership-breakdown signals (unknown heirs, foreclosure filings, lis pendens, heirship affidavits, requests for notice) from court records, Register of Deeds, and county parcel data across North Carolina and Texas. The architecture is built per platform, so a new county is a configuration change, not a rebuild.

Problem

Distress signals live in three layers of public data, each on a different platform.

Distress-acquisition investors win deals by surfacing ownership-breakdown signals before competitors: unknown heirs, foreclosure notices, lis pendens, affidavits of heirship, requests for notice. These signals live across three layers of public data.

Court filings expose case-level signals: cause of action, judgment type, party names. The Register of Deeds exposes instrument-level signals: recorded documents, often with text that only an attorney would read carefully. County tax and parcel data is the bridge that turns a court party name or a deed book reference into a specific property with an owner and an assessed value.

Each county is a different problem on the surface: courts on the state platform, Register of Deeds on a county-specific vendor (Wake County uses Peraton, Guilford County uses BIS Software, other counties pick their own), parcel files in different formats. Treated as a per-county scraping problem, cost scales linearly with the number of counties. Treated as a platform problem, each new county on a known platform adds near-zero marginal cost.

Approach

Four layers, each built around the platform rather than the county.

01

Court records layer

A single adapter for Tyler Enterprise Justice. NC eCourts migrated all 100 NC counties to the platform in October 2025; the same product family (Tyler Odyssey) powers court systems in 22 or more US states. The Judgment Search API exposes lis pendens, claim of lien, partition, and unknown-heir party names as structured JSON without authentication, and Smart Search provides case-detail enrichment behind a free registration.
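As a rough illustration of why a new county is only a filter parameter, the sketch below reduces already-fetched judgment records to court signals. The field names (`county`, `judgment_type`, `parties`, `case_number`) and the keyword lists are assumptions made for this example, not the actual Judgment Search response schema.

```python
from dataclasses import dataclass

# Assumed keyword map; the real payload fields and values may differ.
SIGNAL_KEYWORDS = {
    "lis_pendens": ("lis pendens",),
    "claim_of_lien": ("claim of lien",),
    "partition": ("partition",),
}
UNKNOWN_HEIR_MARKERS = ("unknown heir",)


@dataclass(frozen=True)
class CourtSignal:
    case_number: str
    county: str
    signal_type: str


def extract_signals(records, counties):
    """Turn judgment records (dicts) into signals for the configured counties."""
    signals = []
    for rec in records:
        # Coverage is controlled only by this county filter.
        if rec.get("county") not in counties:
            continue
        judgment = rec.get("judgment_type", "").lower()
        for signal_type, keywords in SIGNAL_KEYWORDS.items():
            if any(k in judgment for k in keywords):
                signals.append(
                    CourtSignal(rec["case_number"], rec["county"], signal_type)
                )
        # Unknown-heir cases are identified from party names, not judgment type.
        parties = " ".join(rec.get("parties", [])).lower()
        if any(m in parties for m in UNKNOWN_HEIR_MARKERS):
            signals.append(
                CourtSignal(rec["case_number"], rec["county"], "unknown_heirs")
            )
    return signals
```

Under these assumptions, extending coverage means passing a larger `counties` set; the extraction logic itself does not change.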

02

Register of Deeds layer

Per-vendor adapters for the document recording systems. Wake County runs on Peraton (a stateful ASP.NET application), Guilford County on BIS Software (direct URL access by instrument number). Recorded document PDFs flow downstream for classification. Wake County uses a /HEIR suffix convention in the grantor field that pre-filters heirship affidavits before any AI cost.
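The /HEIR pre-filter can be sketched as a plain string check that runs before any document reaches the classifier. The instrument dictionaries and the `grantor` key are hypothetical; only the /HEIR suffix convention itself comes from the Wake County data described above.

```python
def is_heir_flagged(grantor: str) -> bool:
    """True when the grantor field ends with the /HEIR suffix."""
    return grantor.strip().upper().endswith("/HEIR")


def split_by_prefilter(instruments):
    """Separate instruments into direct matches and ones needing AI review.

    `instruments` is an iterable of dicts with an assumed 'grantor' key.
    """
    direct, needs_ai = [], []
    for inst in instruments:
        if is_heir_flagged(inst.get("grantor", "")):
            direct.append(inst)
        else:
            needs_ai.append(inst)
    return direct, needs_ai
```

Only the second list would incur classification cost; the first can be emitted as a high-confidence signal directly.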

03

Tax and parcel reconciliation

Daily bulk file ingestion from each county. Deed book and page references inside court filings and recorded instruments resolve to specific properties, owners, parcel IDs, and assessed values. Owner-name fuzzy matching handles court records that lack a deed reference. The output of this layer is the property-anchored row that downstream consumers actually want.
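A minimal sketch of the two-step resolution, assuming parcel rows carry `deed_book`, `deed_page`, and `owner` fields (hypothetical names). It tries the exact deed reference first and falls back to fuzzy owner-name matching with the standard library's `difflib`; the production matcher and its 0.85 threshold may well differ.

```python
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Uppercase, drop commas, and sort tokens so 'SMITH, JOHN' == 'John Smith'."""
    return " ".join(sorted(name.upper().replace(",", " ").split()))


def resolve_parcel(signal, parcels, min_ratio=0.85):
    """Return (parcel, method) for a signal, or (None, 'unresolved')."""
    # Step 1: exact lookup on deed book and page.
    by_deed = {(p["deed_book"], p["deed_page"]): p for p in parcels}
    key = (signal.get("deed_book"), signal.get("deed_page"))
    if key in by_deed:
        return by_deed[key], "deed_reference"

    # Step 2: fuzzy owner-name match for records without a deed reference.
    target = normalize_name(signal.get("party_name", ""))
    if not target:
        return None, "unresolved"
    best, best_ratio = None, 0.0
    for p in parcels:
        ratio = SequenceMatcher(None, target, normalize_name(p["owner"])).ratio()
        if ratio > best_ratio:
            best, best_ratio = p, ratio
    if best is not None and best_ratio >= min_ratio:
        return best, "owner_name_fuzzy"
    return None, "unresolved"
```

Returning the match method alongside the parcel lets downstream scoring treat a deed-reference match as stronger evidence than a name match.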

04

AI classification and lead scoring

Claude API classifies document text for ambiguous signal types: heirship affidavits, scrivener affidavits, service-by-publication language inside general filings, correction deeds. Every classification returns a signal type, a confidence score, and the source-text excerpt. A composite score weights signal type, signal count on the same property, and recency, so the daily digest ranks properties by conviction rather than by ingestion order.
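One way the composite score could be assembled from signal type, signal count, and recency is sketched below. The type weights, the logarithmic count boost, and the 30-day half-life are illustrative assumptions, not the weights used in the delivered pipeline.

```python
import math
from datetime import date

# Assumed weights per signal type; unknown types fall back to 0.1.
TYPE_WEIGHTS = {
    "foreclosure_filing": 1.0,
    "unknown_heirs": 0.9,
    "heirship_affidavit": 0.8,
    "lis_pendens": 0.7,
    "request_for_notice": 0.5,
}


def composite_score(signals, today, half_life_days=30.0):
    """Score one property from its signals.

    Each signal is a dict with 'type', 'confidence' (0..1), and 'filed' (date).
    """
    if not signals:
        return 0.0
    # Strongest single signal, discounted by classifier confidence.
    strongest = max(
        TYPE_WEIGHTS.get(s["type"], 0.1) * s["confidence"] for s in signals
    )
    # Multiple signals on the same property raise conviction sublinearly.
    count_boost = 1.0 + math.log(len(signals))
    # Exponential decay based on the most recent filing.
    newest_age = min((today - s["filed"]).days for s in signals)
    recency = 0.5 ** (max(newest_age, 0) / half_life_days)
    return strongest * count_boost * recency


def rank_properties(signals_by_parcel, today):
    """Return parcel IDs ordered from highest to lowest composite score."""
    return sorted(
        signals_by_parcel,
        key=lambda pid: composite_score(signals_by_parcel[pid], today),
        reverse=True,
    )
```

With this shape, a property carrying two recent signals outranks one with a single signal of the same type, which is the behavior the daily digest relies on.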

Scale and outcome

In production

Live in two states, addressable across the Tyler portfolio.

Wake and Guilford counties validated the architecture in North Carolina; the model extended to Texas counties on the same engagement without rewriting the core. The investor renewed under a monthly maintenance retainer after delivery and is adding counties on the same architecture.

NC + TX: live across two states on a single architecture
100: NC counties addressable on Tyler Enterprise Justice
22+: US states using the Tyler court platform family

Daily delivery with composite signal scoring. Each new county on a Tyler court system is a filter parameter; each new Register of Deeds vendor is an adapter that covers every county on that vendor.

What this proves

The platform, not the county, is the unit that scales.

Building per-county turns every new county into a new project. Building per-platform turns it into a configuration change. The architecture validated end-to-end in North Carolina extended to Texas without rewriting the core, and the client expanded coverage on the same engagement.

This is the pattern CivicMine applies across verticals. The unit of work is the platform, not the geography. Tyler Enterprise Justice for court records. Socrata for state-level open data. Granicus for municipal meetings. Each platform-level adapter unlocks the data of every jurisdiction running on that platform; each new jurisdiction is configuration.

Questions answered in this engagement

How this pipeline works in practice.

How does adding a new NC county work?

Court data: a filter parameter change in the Tyler Enterprise Justice adapter, zero additional development since all 100 NC counties migrated to the same platform in October 2025. Register of Deeds: a per-vendor adapter. If the new county uses a vendor we have already built for (Peraton for Wake, BIS Software for Guilford), it is a configuration change. A new vendor takes a few days of work and covers every county on that vendor.

What about counties on other court platforms?

Some counties run on non-Tyler systems: locally built Clerk-of-Courts sites or public-record vendors specific to a state. We map each county to its court platform and ROD vendor before building, so a non-Tyler court system becomes a new adapter alongside the Tyler one. All adapters feed the same downstream classification, entity resolution, and lead scoring.
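The county-to-platform mapping step might look like the following sketch, which reports which adapters would have to be built before a county can be onboarded. The identifiers (`CountyConfig`, the adapter key strings, `missing_adapters`) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountyConfig:
    state: str
    county: str
    court_platform: str
    rod_vendor: str


# Adapters assumed to exist already, keyed by platform or vendor.
COURT_ADAPTERS = {"tyler_enterprise_justice"}
ROD_ADAPTERS = {"peraton", "bis_software"}

COUNTIES = [
    CountyConfig("NC", "Wake", "tyler_enterprise_justice", "peraton"),
    CountyConfig("NC", "Guilford", "tyler_enterprise_justice", "bis_software"),
]


def missing_adapters(cfg: CountyConfig):
    """List (layer, key) pairs that still need an adapter for this county."""
    missing = []
    if cfg.court_platform not in COURT_ADAPTERS:
        missing.append(("court", cfg.court_platform))
    if cfg.rod_vendor not in ROD_ADAPTERS:
        missing.append(("rod", cfg.rod_vendor))
    return missing
```

An empty result means the county is a configuration entry only; a non-empty result names the adapter work needed, and that adapter then serves every other county on the same platform or vendor.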

How accurate is AI classification for legal documents?

Direct code matches (foreclosure notice instruments, request-for-notice filings, the /HEIR grantor convention in Wake County affidavits) need no AI judgment and surface as high-confidence signals. Document text classification handles ambiguous cases like service-by-publication language inside general affidavits; Claude API returns a signal type, a confidence score, and the source-text excerpt so every lead carries the evidence with it. Low-confidence leads are flagged for manual review rather than auto-asserted.
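The routing between auto-asserted leads and manual review can be sketched as below. The `Classification` fields mirror the three values described above (signal type, confidence, excerpt); the `source` field and the 0.8 review threshold are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    signal_type: str
    confidence: float   # 0.0 to 1.0
    excerpt: str        # source text that supports the signal
    source: str = "ai"  # "direct_match" for code or convention matches


def route_lead(c: Classification, review_threshold: float = 0.8) -> str:
    """Return 'auto' for leads that can be asserted, else 'manual_review'."""
    if not 0.0 <= c.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {c.confidence}")
    # Direct code matches need no AI judgment.
    if c.source == "direct_match":
        return "auto"
    # AI classifications must clear the threshold and carry their evidence.
    if c.confidence >= review_threshold and c.excerpt.strip():
        return "auto"
    return "manual_review"
```

Requiring a non-empty excerpt before auto-asserting keeps the rule that every lead carries its evidence with it.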

Can this scale to other states?

The court layer ports natively. Tyler Odyssey and Tyler Enterprise Justice power court records in 22 or more US states, and the Judgment Search API pattern is consistent across deployments. Register of Deeds varies more across states, but the per-vendor adapter approach contains the work. Texas was the first state expansion and ran on the same architecture without rewriting the core.

Who owns the pipeline and how is it maintained?

The client receives the full codebase, Docker environment, configuration, and documentation at handoff. The pipeline runs independently on a standard VPS. A monthly maintenance retainer is available and covers monitoring, scraper fixes when county systems change their markup, classification prompt tuning as new signal patterns emerge, and minor adjustments. The retainer is optional; the system is designed to be self-sufficient with the documentation provided.

Contact

Need a similar pipeline for your state or signal type? Tell us about it.