Rural Service Lead Engine
AI-validated rural-service lead scoring: **5.9–6.2x lift** vs HGAC baseline (v5, validated 2026-05-17). 3.66M parcel polygons across **11 Texas CADs** (county appraisal districts) + 924M broadband records → ranked rural-Mac-style targets ready for the dialer.
Rural service businesses (septic, well-drilling, propane, satellite TV, off-grid solar) need to find addresses that lack sewer, fiber, or cable service. After five iterations, the v5 model delivers AI-validated lift: Mac Septic's customers are 5.9x more concentrated in the high-rural score bucket than permits from the Houston-area HGAC (Houston-Galveston Area Council) baseline. The comparison is apples-to-apples because both cohorts receive the same spatial enrichment against 1.75M parcel polygons. A logistic regression independently confirms the signal (AUC 0.86, lift 4.09x). 4,145 rural-Mac-style targets are ranked and ready for Mac's dialer queue. The same engine generalizes to any rural-service vertical by swapping the validation cohort.
Rural Service Lead Engine v5 is a scoring layer over a 4.25-billion-row federal-and-state cluster: 924M FCC broadband records, 1.75M parcel polygons (HCAD 1.46M Harris + Comal 102K + Hays 119K + Bandera 32K + Kerr 34K = full Texas Hill Country + Houston coverage), 2,644 TIGER urban-area polygons, 33K census ZCTAs, and 228K TX OSSF (on-site sewage facility) septic permits. For every address it computes a rural_septic_score from 0 to 100 that combines TIGER urban-area overlap (anti-signal, -35), polygon-derived lot_acres (primary signal, up to +35), population density, broadband absence, and joint-condition interactions.
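As a rough illustration of how such a score could be assembled, here is a minimal sketch. Only the -35 urban-area penalty, the +35 cap on the lot-size signal, the 10+ acre top bin, and the 0-100 range come from the description above. The baseline of 50, the lower lot-size bins, the density and broadband weights, and the interaction term are hypothetical placeholders, as are the names `AddressFeatures` and `lot_points`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressFeatures:
    in_urban_area: bool            # address falls inside a TIGER urban-area polygon
    lot_acres: Optional[float]     # polygon-derived lot size; None if no parcel matched
    pop_density_per_sq_mi: float
    has_wired_broadband: bool


def lot_points(lot_acres: Optional[float]) -> int:
    """Binned lot-size signal, capped at +35 for 10+ acres.

    The 10+ acre top bin and the +35 cap follow the text; the lower
    bin edges and point values are assumptions for illustration.
    """
    if lot_acres is None:
        return 0
    if lot_acres >= 10:
        return 35
    if lot_acres >= 2:
        return 20
    if lot_acres >= 0.5:
        return 10
    return 0


def rural_septic_score(f: AddressFeatures) -> int:
    score = 50                                  # assumed neutral baseline
    if f.in_urban_area:
        score -= 35                             # anti-signal from the text
    score += lot_points(f.lot_acres)            # primary signal, up to +35
    if f.pop_density_per_sq_mi < 100:
        score += 10                             # assumed weight
    if not f.has_wired_broadband:
        score += 5                              # assumed weight
    # Hypothetical joint-condition interaction: large rural lot with no broadband.
    if (not f.in_urban_area and (f.lot_acres or 0) >= 10
            and not f.has_wired_broadband):
        score += 5
    return max(0, min(100, score))              # clamp to the 0-100 range


rural = AddressFeatures(False, 12.0, 20.0, False)
urban = AddressFeatures(True, 0.2, 3000.0, True)
print(rural_septic_score(rural), rural_septic_score(urban))
```

Binning the lot size rather than scaling it linearly means a very large polygon cannot earn more than the top-bin credit.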
The validation story: v1 used broadband absence alone (0.29x lift, failed). v2 added urban-area overlap and density (2.29x lift, usable). v3 tried an address-based parcel join (1.56x, a regression caused by a 0.012% match rate). v4 used an HCAD spatial join (0.43x, inverted because Mac's customers are in the Hill Country, not Harris County). v5 added Hill Country parcel polygons so that both cohorts are enriched symmetrically (5.99x vs all-TX, 5.91x vs HGAC: SHIP). An independent logistic regression (AUC 0.86, lift 4.09x) confirms the result, with no leakage.
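The lift figures can be read as a ratio of bucket shares. The sketch below assumes lift is the share of the validation cohort scoring in the high-rural bucket divided by the same share for the baseline cohort. The threshold of 70 and the toy scores are made up for illustration and are not the project's data.

```python
from typing import Iterable


def bucket_share(scores: Iterable[float], threshold: float = 70) -> float:
    """Fraction of a cohort whose score is at or above the threshold."""
    scores = list(scores)
    if not scores:
        raise ValueError("cohort is empty")
    return sum(s >= threshold for s in scores) / len(scores)


def lift(target_scores: Iterable[float],
         baseline_scores: Iterable[float],
         threshold: float = 70) -> float:
    """Share of the target cohort in the bucket relative to the baseline share."""
    base = bucket_share(baseline_scores, threshold)
    if base == 0:
        raise ValueError("baseline has no members in the bucket; lift is undefined")
    return bucket_share(target_scores, threshold) / base


# Toy cohorts: 3 of 4 target scores and 1 of 8 baseline scores reach the bucket.
target = [90, 80, 75, 40]
baseline = [85, 30, 20, 10, 15, 25, 35, 45]
print(lift(target, baseline))  # 0.75 / 0.125 = 6.0
```

A lift below 1, as in v1 and v4, means the baseline cohort lands in the bucket more often than the validation cohort does.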
Output: 4,145 Mac-customer-confirmed rural targets in the dialer-ready queue, plus 10,000 ranked TX OSSF permits as expansion candidates. Symmetric spatial enrichment is the key: both cohorts get the same parcel-polygon treatment, so the lift number reflects a real difference rather than an enrichment artifact.
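To make "symmetric" concrete, the sketch below runs one enrichment function, against one parcel layer, over both cohorts. It is a pure-Python illustration under stated assumptions: coordinates are planar and in feet, parcels are simple rings without holes, and lookup is a linear scan. A production spatial join would use a spatial index. The function names and sample geometry are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Ring = Sequence[Point]

SQ_FT_PER_ACRE = 43_560


def polygon_area(ring: Ring) -> float:
    """Shoelace area of a simple polygon, in the square of the coordinate unit."""
    n = len(ring)
    twice = sum(ring[i][0] * ring[(i + 1) % n][1]
                - ring[(i + 1) % n][0] * ring[i][1] for i in range(n))
    return abs(twice) / 2


def contains(ring: Ring, pt: Point) -> bool:
    """Ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge straddles the ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def enrich(points: Sequence[Point], parcels: Sequence[Ring]) -> List[Optional[float]]:
    """Return lot_acres per point, or None when no parcel contains the point."""
    out: List[Optional[float]] = []
    for pt in points:
        acres = None
        for ring in parcels:
            if contains(ring, pt):
                acres = polygon_area(ring) / SQ_FT_PER_ACRE
                break
        out.append(acres)
    return out


# One 660 ft x 660 ft parcel (10 acres); the same layer serves both cohorts.
parcels = [[(0, 0), (660, 0), (660, 660), (0, 660)]]
validation_acres = enrich([(100, 100)], parcels)
baseline_acres = enrich([(1000, 1000)], parcels)
print(validation_acres, baseline_acres)
```

Because both cohorts pass through the same `enrich` call with the same parcel layer, any gap in polygon coverage affects them through one code path instead of two.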
- > 5.9–6.2x lift vs HGAC baseline (v5 validation runs, 2026-05-17)
- > Mac parcel coverage 77.8% (8,630 customers → 6,718 with derived lot_acres)
- > 3.66M parcel polygons across 11 CADs: HCAD, DCAD, TAD, TravisCAD, WilliamsonCAD, HaysCAD, CCAD, BastropCAD, BurnetCAD, KerrCAD, BanderaCAD
- > 4,145 Mac-customer-confirmed rural targets ready for dialer
- > 10,000 ranked TX OSSF expansion candidates exported
- > 924M FCC broadband location-provider-technology records
- > Logistic regression: AUC 0.86, precision-lift 4.09x
- > 228,255 TX OSSF permits scored
- → FL + NC multi-state v5 (FL DOR 10.8M polygons + NC OneMap 5.72M polygons loading now)
- → Census-precision geocoding deployed (85% address-precision)
- → Outcomes feedback loop live in Mac CRM (conversion-by-score-band view, ML retraining path)
- → Real-time /v1/rural-score/lookup endpoint live for any US address
- → Sign first external rural-vertical pilot (well drilling, propane) using same engine
- ! HaysCAD polygons are coarse (median 653 acres): subdivision-aggregate geometry inflates per-lot acreage, but the score formula bins at 10+ acres, so oversized polygons earn no credit beyond the top bin
- ! 27 Bandera customers share a single 67.9-acre polygon (subdivision-level geometry). This is cosmetic; the signal still says "rural Hill Country"
- ! Per-address geocoding (Mapbox/Census) would push Mac coverage from the current 77.8% toward 95%+ at ~$0.005/address
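The coverage and geocoding figures in the list above can be checked with a few lines of arithmetic. The customer counts and the ~$0.005 unit price come from the list. Whether geocoding would be run on unmatched customers only or on the whole customer file is not stated, so both totals are shown as assumptions.

```python
import math

MAC_CUSTOMERS = 8_630          # from the coverage bullet
WITH_LOT_ACRES = 6_718         # customers with a derived lot_acres
COST_PER_ADDRESS_USD = 0.005   # approximate unit price quoted above
TARGET_COVERAGE = 0.95


def coverage_pct(matched: int, total: int) -> float:
    """Coverage as a percentage of the cohort."""
    return 100 * matched / total


def geocoding_cost_usd(n_addresses: int,
                       unit_cost: float = COST_PER_ADDRESS_USD) -> float:
    """Flat per-address cost; ignores volume tiers or minimums (an assumption)."""
    return n_addresses * unit_cost


unmatched = MAC_CUSTOMERS - WITH_LOT_ACRES
# Additional matches needed to reach the 95% target.
needed_for_target = math.ceil(TARGET_COVERAGE * MAC_CUSTOMERS) - WITH_LOT_ACRES

print(f"coverage: {coverage_pct(WITH_LOT_ACRES, MAC_CUSTOMERS):.1f}%")
print(f"unmatched customers: {unmatched}")
print(f"extra matches needed for 95%: {needed_for_target}")
print(f"cost, unmatched only: ${geocoding_cost_usd(unmatched):.2f}")
print(f"cost, all customers: ${geocoding_cost_usd(MAC_CUSTOMERS):.2f}")
```

Either way the spend is small relative to the coverage gain, which is the point of the last caveat.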