[ BETA ] // Government Data

Tank Vision OCR Pipeline

Dual-GPU OCR pipeline with learned-rule extraction; turns county septic-permit PDFs into structured rows.


[ WHY_IT_MATTERS ]

Most counties refuse Public Information Act (PIA) requests for on-site sewage facility (OSSF) data or quote thousands of dollars to fulfill them. Tank Vision reaches the same outcome by turning the freely available permit PDFs into structured rows, with no ongoing per-record cost.

[ OVERVIEW ]

Tank Vision is a self-improving OCR pipeline at /home/will/ocr-pipeline. It takes county septic-permit PDFs (initially from the Comal County CCEO in Texas), runs dual-GPU OCR, applies a learned-rules extractor whose rule and example files (learned_rules.json, learned_examples_comal.json) grow over time, and writes structured rows directly into the permits database.
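The learned-rules step described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the rule shape (field name mapped to an ordered list of regex patterns), the field names, the helper names extract_row and learn_rule, and the sample OCR text are all assumptions, since the real schema of learned_rules.json is not shown here.

```python
import re

# ASSUMED rule shape: field name -> ordered list of regex patterns, each with
# one capture group. The real learned_rules.json schema may differ; in practice
# a dict like this would be loaded with json.load().
RULES = {
    "permit_number": [r"Permit\s*(?:No\.?|#)\s*:?\s*(\d+)"],
    "tank_size_gal": [r"(\d{3,5})\s*(?:gallons?|gal)\b"],
}


def extract_row(ocr_text, rules):
    """Return one structured row; fields with no matching pattern stay None."""
    row = {}
    for field, patterns in rules.items():
        row[field] = None
        for pattern in patterns:
            match = re.search(pattern, ocr_text, flags=re.IGNORECASE)
            if match:
                row[field] = match.group(1).strip()
                break  # first matching pattern wins
    return row


def learn_rule(rules, field, pattern):
    """Add a pattern for a field (hypothetical 'learning' step), skipping duplicates."""
    re.compile(pattern)  # raises re.error if the pattern is invalid
    patterns = rules.setdefault(field, [])
    if pattern not in patterns:
        patterns.append(pattern)


# Hypothetical OCR output for a single permit page.
sample = "PERMIT NO: 118234\nTank capacity 1000 gallons"
row = extract_row(sample, RULES)
```

Under these assumptions, a field that comes back as None marks a permit layout the rules do not yet cover; adding a pattern for it with learn_rule is one way the rule set could grow over time.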

The pipeline is the cost-avoidance answer to county PIA fee quotes such as Kendall County's $3,456 quote (declined April 2026). Each new county adds incremental scrape effort but zero ongoing per-record cost.

Active coverage now spans Texas and Tennessee: Comal County (TX) plus a Tennessee cohort across Hamilton (Chattanooga), Davidson (Nashville), Rutherford (Murfreesboro), Shelby (Memphis), Williamson (Franklin), Wilson (Lebanon), Maury, and a Johnson City 4-county bundle. Hays, Williamson (TX), and Bexar are next in the queue.

[ BY_THE_NUMBERS ]
  • Comal CCEO septic permits actively processed (TX)
  • TN coverage: Hamilton (Chattanooga), Davidson (Nashville), Rutherford (Murfreesboro), Shelby (Memphis), Williamson (Franklin), Wilson (Lebanon), Maury, plus Johnson City 4-county bundle
  • Dual-GPU launch script (launch_dual_gpu.sh)
  • Self-improving learned rules + examples corpus
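One common way to split OCR work across two GPUs is sketched below. The contents of launch_dual_gpu.sh are not shown here, so this is an assumption about the general approach, not a description of that script: the helper names shard_round_robin and worker_env and the sample file names are hypothetical. CUDA_VISIBLE_DEVICES is the standard NVIDIA environment variable for restricting a process to specific GPUs.

```python
import os


def shard_round_robin(pdf_paths, num_gpus=2):
    """Split PDFs into one work list per GPU, alternating in sorted order."""
    shards = [[] for _ in range(num_gpus)]
    for index, path in enumerate(sorted(pdf_paths)):
        shards[index % num_gpus].append(path)
    return shards


def worker_env(gpu_index, base_env=None):
    """Copy the environment and pin the worker process to a single GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


# Hypothetical input files; each shard would be handed to one OCR worker
# process started with the matching worker_env(gpu_index).
pdfs = ["permit_003.pdf", "permit_001.pdf", "permit_002.pdf"]
shards = shard_round_robin(pdfs)
```

Sorting before sharding keeps the assignment deterministic, so a rerun sends each PDF to the same GPU worker.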
[ NEXT_90_DAYS ]
  • Add Hays + Williamson (TX) + Bexar counties
  • Generalize learned-rules engine across counties
  • Auto-detect and queue new permits from county portals