Method
How the graph is built, and what it does not claim.
OpenInfluence is a join over public records. The value is in the joining, so the method has to be legible: where each fact came from, how two records became one entity, and where we draw the line on what the data can say.
From register to graph
- 01
Collect from the source of record
Every figure starts at a public register: the AEC's donation returns, each state's lobbyist register, ministerial diaries, AusTender, parliamentary registers of interests. Collectors run on a daily cron. Where a site resists automation we drive our own headless browser; we do not scrape anything that isn't a published record. We also enrich entities with public professional profiles (LinkedIn, X) via a licensed data provider, using only publicly visible information, never private or login-gated content.
- 02
Keep the raw file, hash it, version it
On each ingest the original file is stored untouched in object storage with a content hash and a fetch timestamp, and the rows parsed from it carry a link back to that file. Re-runs are idempotent: if the source hasn't changed, nothing moves; if it has, the old record is marked superseded rather than overwritten. Some earlier bulk-loaded data predates this and is being replayed to attach its source link. How far that has reached is published, per dataset, on the coverage page.
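A minimal sketch of the idempotent ingest described above, not the production collector. `RawStore`, `RawFile`, the in-memory dictionaries standing in for object storage, the SHA-256 choice and the example URL are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class RawFile:
    source_url: str
    sha256: str            # content hash of the untouched bytes
    fetched_at: datetime   # fetch timestamp
    superseded: bool = False


class RawStore:
    """Hypothetical in-memory stand-in for object storage plus a version index."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}            # content hash -> original bytes
        self.versions: Dict[str, List[RawFile]] = {}  # source URL -> version history

    def ingest(self, source_url: str, payload: bytes) -> Optional[RawFile]:
        """Store the payload if it is new; return None when the source is unchanged."""
        digest = hashlib.sha256(payload).hexdigest()
        history = self.versions.setdefault(source_url, [])
        if history and history[-1].sha256 == digest:
            return None  # idempotent re-run: nothing moves
        if history:
            history[-1].superseded = True  # old record is kept, not overwritten
        self.blobs[digest] = payload
        record = RawFile(source_url, digest, datetime.now(timezone.utc))
        history.append(record)
        return record


# Usage with a placeholder URL and made-up file contents.
url = "https://example.org/returns.csv"
store = RawStore()
first = store.ingest(url, b"donor,amount\nA,100\n")
again = store.ingest(url, b"donor,amount\nA,100\n")    # unchanged source
changed = store.ingest(url, b"donor,amount\nA,150\n")  # source has changed
```

Parsed rows would then carry the `sha256` of the file they came from, which is what makes the link back to the source machine-traceable.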
- 03
Resolve identities into one entity
The same firm can appear as 'GRA Partners', as 'G.R.A. Partners Pty Ltd' and as a bare ABN in three different registers. We resolve those into one entity using deterministic keys first (ABN, exact name) and a language model only for the ambiguous remainder. Every model decision is written to a ledger with its inputs, the confidence, and the parser version, so a merge can be audited and reversed.
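The ordering of that step can be sketched as follows. This is an illustration under stated assumptions, not the resolver itself: `resolve`, `normalise_abn`, the `ask_model` callback, the 0.9 threshold and the placeholder ABN are hypothetical, and treating two different ABNs as a definite non-match is a design choice of the sketch.

```python
import re
from typing import Callable, Optional


def normalise_abn(abn: Optional[str]) -> Optional[str]:
    """Keep digits only, so '11 111 111 111' and '11111111111' compare equal."""
    if not abn:
        return None
    digits = re.sub(r"\D", "", abn)
    return digits if len(digits) == 11 else None  # an ABN has 11 digits


def resolve(a: dict, b: dict,
            ask_model: Callable[[dict, dict], float],
            threshold: float = 0.9) -> dict:
    """Deterministic keys first; the model only sees the ambiguous remainder."""
    abn_a, abn_b = normalise_abn(a.get("abn")), normalise_abn(b.get("abn"))
    if abn_a and abn_b:
        # Both records carry an ABN: the key decides, the model is never asked.
        return {"merge": abn_a == abn_b, "method": "abn", "confidence": 1.0}
    if a["name"] == b["name"]:
        return {"merge": True, "method": "exact_name", "confidence": 1.0}
    confidence = ask_model(a, b)  # hypothetical model call returning 0..1
    return {"merge": confidence >= threshold, "method": "model",
            "confidence": confidence}


def stub_model(a: dict, b: dict) -> float:
    """Stand-in for the language model; always answers 0.96."""
    return 0.96


# Placeholder ABN, not a real registration.
by_key = resolve({"name": "GRA Partners", "abn": "11 111 111 111"},
                 {"name": "G.R.A. Partners Pty Ltd", "abn": "11111111111"},
                 stub_model)
by_model = resolve({"name": "GRA Partners"},
                   {"name": "G.R.A. Partners Pty Ltd"},
                   stub_model)
```

Returning the method alongside the decision keeps it visible which merges came from a key and which came from the model.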
- 04
Join the records into a graph
Once identities are stable, the records connect: a donor carries its donations, an organisation carries its lobbying and its meetings, a person carries their directorships and their moves between office and firm. The path between a donor and a decision becomes a query, not a hunch.
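To make "a query, not a hunch" concrete, here is a small sketch of a path query over joined records. The edge list, node identifiers and relationship labels are invented for illustration, and breadth-first search is one assumed way to run the query, not necessarily how the graph is stored or searched.

```python
from collections import defaultdict, deque

# Hypothetical joined records as (source, relationship, target) triples.
edges = [
    ("donor:example", "donated_to", "party:example"),
    ("donor:example", "client_of", "org_8t3k"),
    ("org_8t3k", "met_with", "minister:example"),
    ("minister:example", "made", "decision:example"),
]


def find_path(edges, start, goal):
    """Return the shortest chain of records from start to goal, or None."""
    graph = defaultdict(list)
    for src, label, dst in edges:
        graph[src].append((label, dst))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, label, nxt)]))
    return None  # no disclosed chain connects the two


path = find_path(edges, "donor:example", "decision:example")
```

Each hop in the returned path is a disclosed record, so the result is a list of documents to read, not a claim about cause.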
The decision ledger
When a language model decides that two records are the same entity, the decision is recorded, not hidden. Each row carries the candidates it compared, the confidence it assigned, and the parser version that produced it. A merge is a fact you can inspect.
An illustrative ledger row:
- candidate a: "GRA Partners" · NSW lobbyist register
- candidate b: "G.R.A. Partners Pty Ltd" · AusTender
- key: ABN 41 003 ••• •••
- decision: merge → org_8t3k
- confidence: 0.96
- parser: resolve@v7
The influence index
The influence index is a proxy computed from disclosed activity (donations, meetings, contracts and lobbying), weighted and normalised. It is a way to sort and compare, not a verdict.
It is not a measure of wrongdoing, corruption, or success at influencing any decision. A high index means a lot is on the public record, nothing more.
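For readers who want a feel for what "weighted and normalised" can mean, here is one possible shape. It is purely illustrative: the weights, the min-max normalisation, the function names and the sample figures are all assumptions, and none of them are the published formula.

```python
from typing import Dict, List

# Hypothetical weights; the real weighting is not stated on this page.
WEIGHTS = {"donations": 0.4, "meetings": 0.2, "contracts": 0.3, "lobbying": 0.1}


def min_max(values: List[float]) -> List[float]:
    """Scale values to 0..1; a constant column contributes nothing."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def influence_index(activity: Dict[str, Dict[str, float]],
                    weights: Dict[str, float] = WEIGHTS) -> Dict[str, float]:
    """Normalise each component across entities, then take a weighted sum."""
    names = list(activity)
    if not names:
        return {}
    scores = {name: 0.0 for name in names}
    for component, weight in weights.items():
        raw = [activity[name].get(component, 0.0) for name in names]
        for name, scaled in zip(names, min_max(raw)):
            scores[name] += weight * scaled
    return scores


# Made-up figures for three made-up entities.
sample = {
    "org_a": {"donations": 100000, "meetings": 10, "contracts": 2, "lobbying": 4},
    "org_b": {"donations": 0, "meetings": 0, "contracts": 0, "lobbying": 0},
    "org_c": {"donations": 50000, "meetings": 5, "contracts": 1, "lobbying": 2},
}
scores = influence_index(sample)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Whatever the actual weights are, a score of this kind only rises with the volume of disclosed activity, which is why it supports sorting and comparison and nothing stronger.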
Where we stand
Public-record spine
Every record comes from a disclosed public document. We are attaching a machine-traceable link from each row back to the exact file it was parsed from, and we publish how far that has reached on the coverage page. Nothing in the graph is inferred or alleged.
No inference of intent
We connect what was disclosed. We do not allege motive, corruption or wrongdoing, and we don't score anyone's politics.
Auditable resolution
Identity merges are logged with confidence and inputs. A wrong join is a bug we can find and fix, not a black box.
Right of reply
An entity that believes a record is wrong can point us to the source. We correct against the register, not against pressure.
OpenInfluence is a research tool over public records, not legal advice and not an allegation about any person or organisation. For the full picture of what is in and what is missing, see coverage.