MVP 2024: public programs → budget lines → evidence → review queue.
Public
DIPRES Evaluation Bridge
Auditable system connecting evaluated Chilean public programs with budget lines, execution data, and documentary evidence, treating each bridge as a reviewable hypothesis.
- 706: BIPS Monitoreo 2024 programs
- 420: defensible bridges
- 60%: defensible MVP coverage
Constraints
- This case is written from local project documentation; a public repo, demo, and data package are pending confirmation.
- Metrics come from a local MVP 2024 run and should not be read as official program-level accounting.
- The program-budget bridge is an auditable hypothesis: an exact_match status describes the linkage only and does not imply that the budget line funds that program exclusively.
TL;DR
- Builds a bridge table across BIPS/DIPRES programs, the Budget Law, execution data, budget notes, and documentary evidence.
- Each match keeps rule, score, status, source, URL, hash, and text fragment so it can be reviewed or challenged.
- Separates programmatic linkage from financial scope so an aggregate line is not mistaken for exclusive program spending.
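A minimal sketch of what one bridge row could look like, assuming a flat record. The four status values are the ones this case names; the class name, the field names, the financial_scope values, and the example data are hypothetical and only illustrate how linkage and financial scope can be kept apart.

```python
from dataclasses import dataclass

# Status values named in this case.
MATCH_STATUSES = {"exact_match", "high_confidence", "ambiguous", "unmatched"}
# Hypothetical vocabulary: the case names the field, not its values.
FINANCIAL_SCOPES = {"exclusive", "shared", "unknown"}


@dataclass(frozen=True)
class Bridge:
    program_id: str        # evaluated program (BIPS/DIPRES side)
    budget_line_id: str    # budget line (Budget Law side)
    rule: str              # matching rule that produced the bridge
    score: float
    status: str
    financial_scope: str   # kept separate from status on purpose
    source: str
    url: str
    sha256: str            # hash of the raw artifact backing the evidence
    text_fragment: str

    def __post_init__(self):
        # Reject values outside the controlled vocabularies.
        if self.status not in MATCH_STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if self.financial_scope not in FINANCIAL_SCOPES:
            raise ValueError(f"unknown financial_scope: {self.financial_scope}")


# Illustrative data only: an exact name match on a line shared with other programs.
example = Bridge(
    program_id="P-001",
    budget_line_id="L-100",
    rule="normalized_name_equality",
    score=1.0,
    status="exact_match",
    financial_scope="shared",
    source="Budget Law 2024",
    url="https://example.org/budget-law-2024.pdf",
    sha256="0" * 64,
    text_fragment="Programa de Alimentación Escolar",
)
```

The example row is deliberately an exact_match with a shared scope: the linkage is strong, but the amount on that line cannot be read as spending on that program alone.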
Reusable patterns
- Conservative ingestion: store raw artifacts, SHA-256 hashes, and metadata before interpretation.
- Institution-scoped matching by ministry, service, and year instead of global fuzzy matching over sensitive public data.
- Explicit statuses (exact_match, high_confidence, ambiguous, unmatched) plus a manual review queue.
- Separate match status and financial_scope to communicate uncertainty without hiding useful evidence.
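The conservative ingestion pattern above can be sketched with the standard library alone. The function name and the metadata fields are assumptions for illustration, not the project's actual schema; the point is that the hash is computed over the raw bytes before any parsing happens.

```python
import hashlib
from datetime import datetime, timezone


def register_artifact(raw: bytes, source_url: str, content_type: str) -> dict:
    """Describe a raw artifact before anything interprets its content."""
    return {
        "sha256": hashlib.sha256(raw).hexdigest(),  # hash of the untouched bytes
        "size_bytes": len(raw),
        "source_url": source_url,
        "content_type": content_type,
        "fetched_at": datetime.now(timezone.utc).isoformat(),
    }


# Illustrative payload and URL only.
record = register_artifact(
    b"<html><body>Ficha de programa</body></html>",
    "https://example.org/programa/123",
    "text/html",
)
```

Because the hash depends only on the bytes, re-fetching an unchanged document yields the same sha256, and any later evidence fragment can be traced back to the exact artifact it came from.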
Context
Chile evaluates and monitors public programs, but its budget classification does not provide a formal key connecting each evaluated program to budget lines.
The problem is not only technical: a public program and a budget program are different entities and can have many-to-many relationships.
The system documents which bridges are defensible, which are ambiguous, and where traceability breaks down given the available public sources.
Decisions
- Model each bridge as an auditable hypothesis, not as accounting certification.
- Store HTML, PDF, XLSX, XML, CSV, or API responses as raw hashed artifacts before parsing.
- Build program entities from BIPS/DIPRES and budget lines from the Budget Law, execution data, and budget notes.
- Scope automatic matching to the correct institutional universe and block known false positives as regression tests.
- Keep manual review and a change log for ambiguous or high-impact decisions.
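The scoping and blocklist decisions above might look like the following sketch. It uses difflib from the standard library as a stand-in similarity measure; the thresholds, the blocked pair, the dictionary keys, and the function names are all hypothetical, and the project's real rules may differ.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds and blocklist, for illustration only.
HIGH_CONFIDENCE_MIN = 0.90
AMBIGUOUS_MIN = 0.70
BLOCKED_PAIRS = {("P-002", "L-200")}  # known false positives, kept as regression cases


def similarity(a: str, b: str) -> float:
    """Similarity of two names after lowercasing and collapsing whitespace."""
    norm_a, norm_b = " ".join(a.lower().split()), " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def classify(score: float) -> str:
    if score == 1.0:
        return "exact_match"
    if score >= HIGH_CONFIDENCE_MIN:
        return "high_confidence"
    if score >= AMBIGUOUS_MIN:
        return "ambiguous"
    return "unmatched"


def match_program(program: dict, budget_lines: list) -> dict:
    # Candidates must share ministry, service, and year with the program.
    scope = (program["ministry"], program["service"], program["year"])
    scored = sorted(
        (
            (similarity(program["name"], line["name"]), line["id"])
            for line in budget_lines
            if (line["ministry"], line["service"], line["year"]) == scope
            and (program["id"], line["id"]) not in BLOCKED_PAIRS
        ),
        reverse=True,
    )
    if not scored:
        return {"program_id": program["id"], "budget_line_id": None,
                "score": 0.0, "status": "unmatched"}
    best_score, best_id = scored[0]
    status = classify(best_score)
    # A tie at the top means the rule cannot choose: route it to manual review.
    if status != "unmatched" and len(scored) > 1 and scored[1][0] == best_score:
        status = "ambiguous"
    return {"program_id": program["id"],
            "budget_line_id": None if status == "unmatched" else best_id,
            "score": round(best_score, 3), "status": status}
```

Filtering by institution before scoring means an identical name in another ministry never becomes a candidate, and a blocked pair stays blocked even when its names match perfectly.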
Architecture
- DuckDB stores raw_artifact, program dimensions, bridge_programa_presupuesto, and review queues.
- The system consumes existing local upstream pipelines for DIPRES budget execution and Financial Reports instead of duplicating them.
- Outputs separate traced, ambiguous, and non-comparable amounts to avoid false accounting totals.
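One way to keep the three kinds of amounts apart in the output is sketched below in plain Python rather than DuckDB. The bucketing rule, the row keys, and the amounts are assumptions for illustration; the case states that the buckets exist, not how rows are assigned to them.

```python
def bucket(row: dict) -> str:
    """Assign a bridged amount to a reporting bucket (hypothetical rule)."""
    if row["status"] == "ambiguous":
        return "ambiguous"
    if row["financial_scope"] != "exclusive":
        # The line also funds other things, so its amount is not program spending.
        return "non_comparable"
    return "traced"


def summarize(rows: list) -> dict:
    totals = {"traced": 0, "ambiguous": 0, "non_comparable": 0}
    for row in rows:
        totals[bucket(row)] += row["amount_clp"]
    return totals  # deliberately no grand total across buckets


# Illustrative rows only.
rows = [
    {"budget_line_id": "L-100", "status": "exact_match",
     "financial_scope": "exclusive", "amount_clp": 1_000},
    {"budget_line_id": "L-101", "status": "high_confidence",
     "financial_scope": "shared", "amount_clp": 5_000},
    {"budget_line_id": "L-102", "status": "ambiguous",
     "financial_scope": "exclusive", "amount_clp": 2_500},
]
totals = summarize(rows)
```

Returning the buckets without a combined figure is the design point: adding traced and non-comparable amounts together would produce exactly the false accounting total the architecture is meant to avoid.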
Outcome
- The local 2024 run documents 706 monitored programs, 420 with a defensible bridge, and 286 without a defensible bridge under conservative rules.
- The roughly 40% of programs without a bridge (286 of 706) is communicated as a traceability gap, not as a scraping failure or proof of missing budget.
- The case demonstrates reusable public-data infrastructure: contracts, sources, hashes, review, and methodological policy.
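The headline percentages follow directly from the counts reported for the local 2024 run; the short check below only reproduces that arithmetic and shows the rounding behind the 60% and 40% figures.

```python
monitored = 706   # programs in the local 2024 run
defensible = 420  # programs with a defensible bridge

without_bridge = monitored - defensible
coverage = defensible / monitored
gap = without_bridge / monitored

print(without_bridge)     # 286
print(f"{coverage:.1%}")  # 59.5%
print(f"{gap:.1%}")       # 40.5%
```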