Data Model
Fund Analyst Intelligence is built around a small set of canonical entities.
These entities make the monthly workflow reproducible and auditable.
They also keep integrations stable.
This page describes the conceptual data model.
It is independent of any specific database technology.
It focuses on what must exist in production.
Design goals
The data model must support:
- a living fund profile with versioned history
- evidence-first claims linked to artefacts
- deterministic monthly cycles and snapshots
- structured deltas and exceptions
- review and approval workflow states
- reproducible reporting outputs
- policy versioning for validation and materiality
Core entities
1. Fund
Represents the canonical fund object.
Key attributes
- fund_id (internal stable id)
- identifiers (ISIN, share class ids, internal codes)
- manager and umbrella relationships
- strategy classification and tags
- domicile and legal structure
- status (active, watch, offboarded)
Notes
The fund_id must remain stable over time.
External identifiers can change or be incomplete, so they must not serve as the primary key.
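A minimal sketch of the stable-id idea, assuming an in-memory registry. The `FundRegistry` class, its method names, and the placeholder identifier values are hypothetical and only illustrate that external identifiers can be replaced while `fund_id` stays fixed.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fund:
    fund_id: str  # internal stable id, assigned once and never changed
    identifiers: dict[str, str] = field(default_factory=dict)  # scheme -> value
    status: str = "active"  # active, watch, offboarded


class FundRegistry:
    """Hypothetical in-memory registry; a real system would use a database."""

    def __init__(self) -> None:
        self._funds: dict[str, Fund] = {}
        self._by_external: dict[tuple[str, str], str] = {}

    def register(self, identifiers: dict[str, str]) -> Fund:
        # The internal id is generated here and never derived from an external id.
        fund = Fund(fund_id=str(uuid.uuid4()), identifiers=dict(identifiers))
        self._funds[fund.fund_id] = fund
        for scheme, value in fund.identifiers.items():
            self._by_external[(scheme, value)] = fund.fund_id
        return fund

    def update_identifier(self, fund_id: str, scheme: str, value: str) -> None:
        # Replace an external identifier without touching fund_id.
        fund = self._funds[fund_id]
        old = fund.identifiers.get(scheme)
        if old is not None:
            self._by_external.pop((scheme, old), None)
        fund.identifiers[scheme] = value
        self._by_external[(scheme, value)] = fund_id

    def resolve(self, scheme: str, value: str) -> Optional[Fund]:
        fund_id = self._by_external.get((scheme, value))
        return self._funds.get(fund_id) if fund_id else None
```

In this sketch a replaced identifier stops resolving. Keeping old identifiers as aliases would be an equally valid design choice.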
2. Artefact
A versioned source document or captured online reference.
Key attributes
- artefact_id
- fund_id
- cycle_id (nullable for baseline libraries)
- source_type (upload, email, online, internal)
- origin (policy-defined source descriptor)
- version and supersedes_artefact_id
- timestamps: ingested, retrieved (if online)
- storage pointer and access scope
- metadata (file type, pages, hashes)
Notes
Artefacts are the foundation of evidence and reproducibility.
They must be immutable once stored.
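Immutability and versioning can be sketched with a frozen dataclass and a content hash. The `store_artefact` helper is hypothetical, and SHA-256 is an assumed hash choice; the source only says that hashes are part of the metadata.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Artefact:
    artefact_id: str
    fund_id: str
    source_type: str  # upload, email, online, internal
    version: int
    sha256: str
    cycle_id: Optional[str] = None  # nullable for baseline libraries
    supersedes_artefact_id: Optional[str] = None


def store_artefact(
    artefact_id: str,
    fund_id: str,
    source_type: str,
    content: bytes,
    cycle_id: Optional[str] = None,
    previous: Optional[Artefact] = None,
) -> Artefact:
    """Create a new artefact record; a newer version points back at the old one."""
    return Artefact(
        artefact_id=artefact_id,
        fund_id=fund_id,
        source_type=source_type,
        version=1 if previous is None else previous.version + 1,
        sha256=hashlib.sha256(content).hexdigest(),
        cycle_id=cycle_id,
        supersedes_artefact_id=None if previous is None else previous.artefact_id,
    )
```

A corrected document is stored as a new record that supersedes the old one, so the earlier version stays available for audit.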
3. Claim
A statement extracted from an artefact, linked to evidence.
Key attributes
- claim_id
- fund_id
- cycle_id
- claim_type (term, fee, liquidity, personnel, strategy, etc.)
- text (or normalised representation)
- evidence pointer: artefact_id + location reference
- confidence signals
- status: supported, conflicted, missing, stale, needs_verification
Notes
Claims are distinct from “fields”.
Claims preserve what sources said.
Fields represent the validated canonical state.
4. FieldValue
A structured attribute used in the fund profile.
Key attributes
- field_value_id
- fund_id
- cycle_id
- field_key (e.g., liquidity.notice_days)
- typed value (string/number/date/enum/object)
- evidence references (one-to-many claims or artefacts)
- status and validation marks
- override flags and reviewer notes
Notes
Field values should be typed and schema-bound.
This enables deterministic validation and comparison.
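As an illustration of schema-bound fields, the sketch below checks a value against a small schema. Only `liquidity.notice_days` comes from this page; the other field keys, the `FIELD_SCHEMA` mapping, and `validate_field` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any

# Hypothetical schema: field_key -> expected Python type.
FIELD_SCHEMA: dict[str, type] = {
    "liquidity.notice_days": int,
    "fees.management_pct": float,     # hypothetical key
    "terms.next_dealing_date": date,  # hypothetical key
}


@dataclass
class FieldValue:
    field_value_id: str
    fund_id: str
    cycle_id: str
    field_key: str
    value: Any
    evidence_claim_ids: list[str] = field(default_factory=list)


def validate_field(fv: FieldValue) -> list[str]:
    """Return validation problems; an empty list means the value passes."""
    problems: list[str] = []
    expected = FIELD_SCHEMA.get(fv.field_key)
    if expected is None:
        problems.append(f"unknown field_key: {fv.field_key}")
    elif isinstance(fv.value, bool) or not isinstance(fv.value, expected):
        # bool is a subclass of int in Python, so it is rejected explicitly.
        problems.append(
            f"{fv.field_key}: expected {expected.__name__}, "
            f"got {type(fv.value).__name__}"
        )
    if not fv.evidence_claim_ids:
        problems.append(f"{fv.field_key}: no evidence references")
    return problems
```

Because the check depends only on the schema and the value, the same input always yields the same list of problems.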
Workflow entities
5. Cycle
A monthly (or event-driven) run of the pipeline.
Key attributes
- cycle_id
- fund_id
- cadence: monthly, quarterly_synthesis, event
- timestamps: started, completed
- state: draft, review, approved, published, failed
- rule-set versions applied
- materiality policy version applied
- operator and reviewer assignments
Notes
The cycle is the production unit of work.
It must be uniquely identifiable and reproducible.
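The cycle states can be enforced with a small transition table. The state names come from this page, but the allowed transitions below are an assumption, for example that a reviewer can send a cycle back to draft.

```python
from dataclasses import dataclass
from enum import Enum


class CycleState(Enum):
    DRAFT = "draft"
    REVIEW = "review"
    APPROVED = "approved"
    PUBLISHED = "published"
    FAILED = "failed"


# Assumed transition table; the real workflow may differ.
ALLOWED_TRANSITIONS: dict[CycleState, set[CycleState]] = {
    CycleState.DRAFT: {CycleState.REVIEW, CycleState.FAILED},
    CycleState.REVIEW: {CycleState.DRAFT, CycleState.APPROVED, CycleState.FAILED},
    CycleState.APPROVED: {CycleState.PUBLISHED, CycleState.FAILED},
    CycleState.PUBLISHED: set(),
    CycleState.FAILED: set(),
}


@dataclass
class Cycle:
    cycle_id: str
    fund_id: str
    cadence: str              # monthly, quarterly_synthesis, event
    ruleset_version: str      # recorded so the run can be reproduced
    materiality_version: str  # recorded so the run can be reproduced
    state: CycleState = CycleState.DRAFT

    def transition(self, new_state: CycleState) -> None:
        """Move to new_state, or raise ValueError if the move is not allowed."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(
                f"cycle {self.cycle_id}: {self.state.value} -> {new_state.value} "
                "is not allowed"
            )
        self.state = new_state
```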
6. Snapshot
The approved state of a fund at a point in time.
Key attributes
- snapshot_id
- fund_id
- cycle_id (the approving cycle)
- snapshot timestamp
- canonical field set (structured)
- reference to evidence pack
- approval metadata
Notes
Snapshots should only be created on approval.
Draft cycles must not alter history.
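One way to guarantee that drafts never alter history is to make the snapshot store refuse any cycle that is not approved. The `SnapshotStore` class and its id scheme are hypothetical, and the cycle state is passed as a plain string to keep the sketch self-contained.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: str
    fund_id: str
    cycle_id: str  # the approving cycle
    taken_at: datetime
    fields: Mapping[str, Any]  # read-only view of the canonical field set


class SnapshotStore:
    """Hypothetical append-only store of approved snapshots per fund."""

    def __init__(self) -> None:
        self._history: dict[str, list[Snapshot]] = {}

    def create(
        self, fund_id: str, cycle_id: str, cycle_state: str, fields: dict[str, Any]
    ) -> Snapshot:
        if cycle_state != "approved":
            raise ValueError(
                f"cycle {cycle_id} is '{cycle_state}'; snapshots require approval"
            )
        snapshot = Snapshot(
            snapshot_id=f"{fund_id}:{cycle_id}",  # assumed id scheme
            fund_id=fund_id,
            cycle_id=cycle_id,
            taken_at=datetime.now(timezone.utc),
            fields=MappingProxyType(dict(fields)),  # copy, then expose read-only
        )
        self._history.setdefault(fund_id, []).append(snapshot)
        return snapshot

    def latest(self, fund_id: str) -> Optional[Snapshot]:
        history = self._history.get(fund_id, [])
        return history[-1] if history else None
```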
7. Delta
A structured change event computed between snapshots.
Key attributes
- delta_id
- fund_id
- cycle_id
- from_snapshot_id / to_snapshot_id
- field_key or change category
- old value / new value (typed)
- magnitude and direction (if applicable)
- classification tags
- evidence references
- materiality score inputs
Notes
Deltas are the basis of change logs and quarterly synthesis.
They must be deterministic.
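A deterministic delta computation can be sketched as a pure function over two field sets, with keys visited in sorted order so the output does not depend on dictionary ordering. The reduced `Delta` shape and `compute_deltas` are illustrative assumptions, and the sign of `magnitude` stands in for direction.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Delta:
    field_key: str
    old_value: Any
    new_value: Any
    magnitude: Optional[float]  # new - old for numeric fields, else None


def _is_number(value: Any) -> bool:
    # bool is excluded because it is a subclass of int in Python.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compute_deltas(
    old_fields: Mapping[str, Any], new_fields: Mapping[str, Any]
) -> list[Delta]:
    """Compare two canonical field sets and return changes sorted by field_key."""
    deltas: list[Delta] = []
    for key in sorted(set(old_fields) | set(new_fields)):
        old, new = old_fields.get(key), new_fields.get(key)
        if old == new:
            continue
        magnitude = float(new - old) if _is_number(old) and _is_number(new) else None
        deltas.append(Delta(key, old, new, magnitude))
    return deltas
```

This sketch treats a missing field and a field set to None as the same thing. A production model would probably distinguish them.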
8. Exception
A structured issue requiring review or follow-up.
Key attributes
- exception_id
- fund_id
- cycle_id
- category (consistency, missing evidence, stale, conflict, policy breach)
- severity
- description and recommended action
- linked deltas/claims/fields
- status: open, resolved, monitoring, escalated
- owner and SLA metadata
- decision notes and closure evidence
Notes
Exceptions keep quality honest by recording known issues explicitly instead of hiding them.
They also provide operational steering signals.
Output entities
9. Report
A generated output artefact with a defined template.
Key attributes
- report_id
- fund_id
- cycle_id
- report_type (monthly_memo, quarterly_pack, evidence_pack)
- template_version
- generated timestamp
- approval stamp reference
- storage pointer
- reproducibility pointer (cycle record reference)
Notes
Reports are production artefacts.
They must be linked back to the cycle that produced them.
10. Alert
An escalation event derived from deltas and exceptions.
Key attributes
- alert_id
- fund_id
- cycle_id (nullable for event-driven)
- alert type (material_change, exception_escalation, followup_breach, monitor)
- severity and payload
- evidence links
- owner and resolution status
- timestamps for escalation and closure
Notes
Alerts must be low-noise and auditable.
They should be resolvable and tracked to closure.
Policy entities
11. SourcePolicy
Defines allowed sources and evidence requirements.
Key attributes
- policy_id
- allow-lists and source types
- recency requirements by field/category
- retention expectations
- conflict resolution preferences
- approver roles
Notes
Source policy changes must be versioned.
They affect reproducibility and governance.
12. ValidationRuleSet
Defines deterministic checks and constraints.
Key attributes
- ruleset_id
- rule definitions
- severity mapping
- field schemas and required-field sets
- test coverage references (engineering practice)
Notes
Rulesets must be versioned and tested.
Cycles record which ruleset they used.
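A versioned ruleset can be modelled as an immutable bundle of required fields and check functions. The `Rule` shape, the example rule, and the severity labels below are assumptions, not the product's actual rule format.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class Rule:
    rule_id: str
    severity: str
    check: Callable[[Mapping[str, Any]], bool]  # returns True when the rule passes
    message: str


@dataclass(frozen=True)
class ValidationRuleSet:
    ruleset_id: str
    version: str
    required_fields: tuple[str, ...]
    rules: tuple[Rule, ...]

    def run(self, fields: Mapping[str, Any]) -> list[tuple[str, str, str]]:
        """Return (rule_id, severity, message) findings in a fixed order."""
        findings: list[tuple[str, str, str]] = []
        for key in self.required_fields:
            if key not in fields:
                findings.append((f"required:{key}", "high", f"missing required field {key}"))
        for rule in self.rules:
            if not rule.check(fields):
                findings.append((rule.rule_id, rule.severity, rule.message))
        return findings


# Hypothetical example ruleset.
RULESET = ValidationRuleSet(
    ruleset_id="rs-example",
    version="1",
    required_fields=("liquidity.notice_days",),
    rules=(
        Rule(
            rule_id="notice_days_non_negative",
            severity="high",
            check=lambda f: f.get("liquidity.notice_days", 0) >= 0,
            message="notice period cannot be negative",
        ),
    ),
)
```

A cycle would store `ruleset_id` and `version` alongside its findings, so a later rerun can apply exactly the same checks.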
13. MaterialityPolicy
Defines thresholds and alerting rules.
Key attributes
- materiality_id
- category severities
- thresholds and scoring logic
- confidence requirements
- escalation and follow-up SLAs
Notes
Materiality is a governance choice.
It must be explicit and stable.
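To make "explicit" concrete, the sketch below expresses a materiality policy as data plus one classification function. All threshold values, severity numbers, and the `classify` logic are hypothetical; the source does not define the scoring logic.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MaterialityPolicy:
    materiality_id: str
    version: str
    thresholds: Mapping[str, float]       # field_key -> minimum absolute change
    category_severity: Mapping[str, int]  # category -> severity (higher is worse)
    alert_severity: int    # severity at which unthresholded changes are material
    min_confidence: float  # below this, a change needs verification first

    def classify(
        self,
        field_key: str,
        category: str,
        magnitude: Optional[float],
        confidence: float,
    ) -> str:
        """Return 'material', 'immaterial', or 'needs_verification'."""
        if confidence < self.min_confidence:
            return "needs_verification"
        threshold = self.thresholds.get(field_key)
        if threshold is not None and magnitude is not None:
            return "material" if abs(magnitude) >= threshold else "immaterial"
        # Non-numeric or unthresholded changes fall back to category severity.
        severity = self.category_severity.get(category, 0)
        return "material" if severity >= self.alert_severity else "immaterial"


# Hypothetical example values.
POLICY = MaterialityPolicy(
    materiality_id="mat-example",
    version="1",
    thresholds={"liquidity.notice_days": 15.0},
    category_severity={"liquidity": 3, "personnel": 2},
    alert_severity=3,
    min_confidence=0.8,
)
```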
Minimal contracts for integrations
External systems typically need:
- Fund identifiers and tags
- latest Snapshot field set
- Delta list for a cycle or period
- Exception queue with status and ownership
- Report registry with links
- Alert feed for escalations
- policy references for governance audits
These objects should be exposed through stable, versioned APIs or exports.
Evidence references should never be dropped.
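As a sketch of a versioned export that never drops evidence, the function below serialises a cycle's deltas to JSON and refuses any delta without evidence references. The payload layout, the `EXPORT_SCHEMA_VERSION` constant, and `export_deltas` are assumptions made for illustration.

```python
import json
from typing import Any

EXPORT_SCHEMA_VERSION = "1.0"  # hypothetical contract version


def export_deltas(fund_id: str, cycle_id: str, deltas: list[dict[str, Any]]) -> str:
    """Serialise deltas for an external consumer, keeping evidence references."""
    for delta in deltas:
        if not delta.get("evidence"):
            raise ValueError(f"delta {delta.get('delta_id')} has no evidence references")
    payload = {
        "schema_version": EXPORT_SCHEMA_VERSION,
        "fund_id": fund_id,
        "cycle_id": cycle_id,
        # Sorting by delta_id keeps the export stable between runs.
        "deltas": sorted(deltas, key=lambda d: d["delta_id"]),
    }
    return json.dumps(payload, sort_keys=True)
```

Consumers can check `schema_version` before parsing, which lets the contract evolve without silently breaking integrations.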
Definition of done
The data model is production-grade when:
- cycles, snapshots, and artefacts form a reproducible chain
- fields are typed and schema-bound
- claims preserve evidence links and uncertainty states
- deltas are deterministic and structured
- exceptions support resolution workflows with auditability
- reports are registered artefacts linked to cycle records
- policies are versioned and referenced by cycles
This model is the backbone of Fund Analyst Intelligence in production.
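Several of the criteria above can be checked mechanically. The sketch below assumes records are plain dictionaries keyed by id, with hypothetical key names, and reports broken links in the chain between cycles, snapshots, artefacts, reports, and rulesets.

```python
from typing import Any, Mapping


def check_chain(
    cycles: Mapping[str, Mapping[str, Any]],
    snapshots: Mapping[str, Mapping[str, Any]],
    artefacts: Mapping[str, Mapping[str, Any]],
    reports: Mapping[str, Mapping[str, Any]],
    ruleset_ids: set[str],
) -> list[str]:
    """Return a sorted list of broken references; empty means the chain holds."""
    problems: list[str] = []
    for cycle_id, cycle in cycles.items():
        if cycle["ruleset_id"] not in ruleset_ids:
            problems.append(f"cycle {cycle_id}: unknown ruleset {cycle['ruleset_id']}")
    for snapshot_id, snapshot in snapshots.items():
        cycle = cycles.get(snapshot["cycle_id"])
        if cycle is None:
            problems.append(f"snapshot {snapshot_id}: missing cycle")
        elif cycle["state"] not in ("approved", "published"):
            problems.append(f"snapshot {snapshot_id}: cycle not approved")
    for artefact_id, artefact in artefacts.items():
        cycle_id = artefact.get("cycle_id")  # nullable for baseline libraries
        if cycle_id is not None and cycle_id not in cycles:
            problems.append(f"artefact {artefact_id}: missing cycle")
    for report_id, report in reports.items():
        if report["cycle_id"] not in cycles:
            problems.append(f"report {report_id}: missing cycle")
    return sorted(problems)
```

A check like this could run as part of release or audit tooling. It covers only referential integrity, not typing or determinism.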