Data Model

Canonical entities for Fund Analyst Intelligence: funds, artefacts, claims, cycles, snapshots, deltas, exceptions, reports, and policies.

Fund Analyst Intelligence is built around a small set of canonical entities.
These entities make the monthly workflow reproducible and auditable.
They also keep integrations stable.

This page describes the conceptual data model.
It is independent of any specific database technology.
It focuses on what must exist in production.

Design goals

The data model must support:

a living fund profile with versioned history
evidence-first claims linked to artefacts
deterministic monthly cycles and snapshots
structured deltas and exceptions
review and approval workflow states
reproducible reporting outputs
policy versioning for validation and materiality

Core entities

1. Fund

Represents the canonical fund object.

Key attributes

fund_id (internal stable id)
identifiers (ISIN, share class ids, internal codes)
manager and umbrella relationships
strategy classification and tags
domicile and legal structure
status (active, watch, offboarded)

Notes

The fund_id must remain stable over time.
External identifiers can change or be incomplete.

2. Artefact

A versioned source document or captured online reference.

Key attributes

artefact_id
fund_id
cycle_id (nullable for baseline libraries)
source_type (upload, email, online, internal)
origin (policy-defined source descriptor)
version and supersedes_artefact_id
timestamps: ingested, retrieved (if online)
storage pointer and access scope
metadata (file type, pages, hashes)

Notes

Artefacts are the foundation of evidence and reproducibility.
They must be immutable once stored.
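
As a minimal sketch of immutability plus versioning, the record below is a frozen dataclass, and a new version is created as a separate record that points back to the one it supersedes. The field name content_sha256, the helper new_version, and the sample ids are assumptions made for illustration; only the attribute list above comes from the model.

```python
import hashlib
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Artefact:
    artefact_id: str
    fund_id: str
    source_type: str                      # upload, email, online, internal
    version: int
    content_sha256: str                   # assumed name for the stored content hash
    cycle_id: Optional[str] = None        # nullable for baseline libraries
    supersedes_artefact_id: Optional[str] = None


def new_version(previous: Artefact, artefact_id: str, content: bytes) -> Artefact:
    """Create a new record that supersedes `previous` instead of editing it."""
    return Artefact(
        artefact_id=artefact_id,
        fund_id=previous.fund_id,
        source_type=previous.source_type,
        version=previous.version + 1,
        content_sha256=hashlib.sha256(content).hexdigest(),
        cycle_id=previous.cycle_id,
        supersedes_artefact_id=previous.artefact_id,
    )


# Hypothetical sample data.
v1 = Artefact("art-1", "fund-1", "upload", 1,
              hashlib.sha256(b"factsheet v1").hexdigest())
v2 = new_version(v1, "art-2", b"factsheet v2")
```

The original record is never modified, so any claim that points at art-1 still resolves to the same bytes after art-2 arrives.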

3. Claim

A statement extracted from an artefact, linked to evidence.

Key attributes

claim_id
fund_id
cycle_id
claim_type (term, fee, liquidity, personnel, strategy, etc.)
text (or normalised representation)
evidence pointer: artefact_id + location reference
confidence signals
status: supported, conflicted, missing, stale, needs_verification

Notes

Claims are distinct from “fields”.
Claims preserve what sources said.
Fields represent the validated canonical state.
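
The distinction can be illustrated with a small sketch in which two sources disagree and both statements are kept as claims. The EvidencePointer type, the location format, and the sample ids and texts are hypothetical; the status values are the ones listed above.

```python
from dataclasses import dataclass
from enum import Enum


class ClaimStatus(Enum):
    SUPPORTED = "supported"
    CONFLICTED = "conflicted"
    MISSING = "missing"
    STALE = "stale"
    NEEDS_VERIFICATION = "needs_verification"


@dataclass(frozen=True)
class EvidencePointer:
    artefact_id: str
    location: str          # e.g. a page or section reference; format is assumed


@dataclass(frozen=True)
class Claim:
    claim_id: str
    fund_id: str
    cycle_id: str
    claim_type: str        # term, fee, liquidity, personnel, strategy, ...
    text: str              # what the source said, verbatim or normalised
    evidence: EvidencePointer
    status: ClaimStatus


# Hypothetical sample data: two artefacts state different notice periods.
claims = [
    Claim("c1", "fund-1", "2024-05", "liquidity", "Notice period: 30 days",
          EvidencePointer("art-1", "p.4"), ClaimStatus.CONFLICTED),
    Claim("c2", "fund-1", "2024-05", "liquidity", "Notice period: 45 days",
          EvidencePointer("art-2", "p.2"), ClaimStatus.CONFLICTED),
]

# Both source statements survive as claims; neither overwrites the other.
conflicted = [c.claim_id for c in claims if c.status is ClaimStatus.CONFLICTED]
```

A reviewer would then decide which value, if any, becomes the validated field.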

4. FieldValue

A structured attribute used in the fund profile.

Key attributes

field_value_id
fund_id
cycle_id
field_key (e.g., liquidity.notice_days)
typed value (string/number/date/enum/object)
evidence references (one-to-many claims or artefacts)
status and validation marks
override flags and reviewer notes

Notes

Field values should be typed and schema-bound.
This enables deterministic validation and comparison.
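
One way to picture schema binding is a lookup from field_key to an expected type, checked before a value is accepted. This is a sketch under assumptions: FIELD_SCHEMA and validate_field are hypothetical names, and only liquidity.notice_days is a field key taken from this page.

```python
from datetime import date

# Hypothetical schema: field_key -> expected Python type.
FIELD_SCHEMA = {
    "liquidity.notice_days": int,
    "fees.management_pct": float,     # assumed key, for illustration only
    "terms.inception_date": date,     # assumed key, for illustration only
}


def validate_field(field_key: str, value: object) -> object:
    """Return the value unchanged if it matches the schema, else raise."""
    if field_key not in FIELD_SCHEMA:
        raise KeyError(f"unknown field_key: {field_key}")
    expected = FIELD_SCHEMA[field_key]
    # bool is a subclass of int in Python; this sketch has no boolean
    # fields, so booleans are rejected outright.
    if isinstance(value, bool) or not isinstance(value, expected):
        raise TypeError(
            f"{field_key} expects {expected.__name__}, got {type(value).__name__}"
        )
    return value
```

With this in place, validate_field("liquidity.notice_days", 30) passes, while the string "30" is rejected rather than silently coerced.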

Workflow entities

5. Cycle

A monthly, quarterly-synthesis, or event-driven run of the pipeline.

Key attributes

cycle_id
fund_id
cadence: monthly, quarterly_synthesis, event
timestamps: started, completed
state: draft, review, approved, published, failed
rule-set versions applied
materiality policy version applied
operator and reviewer assignments

Notes

The cycle is the production unit of work.
It must be uniquely identifiable and reproducible.
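
The cycle states listed above can be treated as a small state machine. The sketch below is illustrative only: the state names come from this page, but the allowed transitions in ALLOWED_TRANSITIONS and the function advance are assumptions, not a documented workflow.

```python
# Assumed transition table; the page lists the states but not the edges.
ALLOWED_TRANSITIONS = {
    "draft": {"review", "failed"},
    "review": {"draft", "approved", "failed"},
    "approved": {"published", "failed"},
    "published": set(),
    "failed": set(),
}


def advance(state: str, target: str) -> str:
    """Return the new state, or raise if the transition is not allowed."""
    if state not in ALLOWED_TRANSITIONS:
        raise ValueError(f"unknown state: {state}")
    if target not in ALLOWED_TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {target}")
    return target
```

Under these assumptions a cycle cannot jump from draft to published without passing through review and approval.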

6. Snapshot

The approved state of a fund at a point in time.

Key attributes

snapshot_id
fund_id
cycle_id (the approving cycle)
snapshot timestamp
canonical field set (structured)
reference to evidence pack
approval metadata

Notes

Snapshots should only be created on approval.
Draft cycles must not alter history.
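
A minimal sketch of this rule follows: a snapshot is appended to history only when the cycle is approved, and its field set is copied and exposed read-only. The names create_snapshot and CycleNotApproved, the id scheme, and the list-based history are assumptions for illustration.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Dict, List, Mapping


class CycleNotApproved(Exception):
    """Raised when a snapshot is requested for a cycle that is not approved."""


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: str
    fund_id: str
    cycle_id: str                  # the approving cycle
    fields: Mapping[str, object]   # read-only canonical field set


def create_snapshot(history: List[Snapshot], fund_id: str, cycle_id: str,
                    cycle_state: str, fields: Dict[str, object]) -> Snapshot:
    """Append a snapshot to history, but only for an approved cycle."""
    if cycle_state != "approved":
        raise CycleNotApproved(f"cycle {cycle_id} is in state {cycle_state!r}")
    snapshot = Snapshot(
        snapshot_id=f"{fund_id}:{cycle_id}",      # hypothetical id scheme
        fund_id=fund_id,
        cycle_id=cycle_id,
        fields=MappingProxyType(dict(fields)),    # copy, then expose read-only
    )
    history.append(snapshot)
    return snapshot
```

Because the field set is copied at approval time, later edits to the working draft do not leak into the stored snapshot.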

7. Delta

A structured change event computed between snapshots.

Key attributes

delta_id
fund_id
cycle_id
from_snapshot_id / to_snapshot_id
field_key or change category
old value / new value (typed)
magnitude and direction (if applicable)
classification tags
evidence references
materiality score inputs

Notes

Deltas are the basis of change logs and quarterly synthesis.
They must be deterministic.
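
Determinism here mainly means that the same pair of snapshots always yields the same deltas in the same order. The sketch below compares two field sets in sorted key order; compute_deltas and its output keys are hypothetical, and a field absent from one side is treated as None, which is a simplification.

```python
from typing import Dict, List


def _is_number(value: object) -> bool:
    # bool is a subclass of int, so exclude it from numeric comparison.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compute_deltas(old: Dict[str, object], new: Dict[str, object]) -> List[dict]:
    """Compare two field sets and return changes ordered by field_key."""
    deltas = []
    for key in sorted(old.keys() | new.keys()):   # fixed order, independent of input order
        before, after = old.get(key), new.get(key)
        if before == after:
            continue
        delta = {"field_key": key, "old_value": before, "new_value": after}
        if _is_number(before) and _is_number(after):
            # Magnitude and direction only apply to numeric fields.
            delta["magnitude"] = abs(after - before)
            delta["direction"] = "up" if after > before else "down"
        deltas.append(delta)
    return deltas
```

Sorting the keys removes any dependence on dictionary insertion order, so a rerun over the same snapshots reproduces the change log exactly.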

8. Exception

A structured issue requiring review or follow-up.

Key attributes

exception_id
fund_id
cycle_id
category (consistency, missing evidence, stale, conflict, policy breach)
severity
description and recommended action
linked deltas/claims/fields
status: open, resolved, monitoring, escalated
owner and SLA metadata
decision notes and closure evidence

Notes

Exceptions keep quality honest by recording known issues explicitly rather than hiding them.
They also provide operational steering signals.

Output entities

9. Report

A generated output artefact with a defined template.

Key attributes

report_id
fund_id
cycle_id
report_type (monthly_memo, quarterly_pack, evidence_pack)
template_version
generated timestamp
approval stamp reference
storage pointer
reproducibility pointer (cycle record reference)

Notes

Reports are production artefacts.
They must be linked back to the cycle that produced them.

10. Alert

An escalation event derived from deltas and exceptions.

Key attributes

alert_id
fund_id
cycle_id (nullable for event-driven alerts)
alert_type (material_change, exception_escalation, followup_breach, monitor)
severity and payload
evidence links
owner and resolution status
timestamps for escalation and closure

Notes

Alerts must be low-noise and auditable.
They should be resolvable and tracked to closure.

Policy entities

11. SourcePolicy

Defines allowed sources and evidence requirements.

Key attributes

policy_id
allow-lists and source types
recency requirements by field/category
retention expectations
conflict resolution preferences
approver roles

Notes

Source policy changes must be versioned.
They affect reproducibility and governance.

12. ValidationRuleSet

Defines deterministic checks and constraints.

Key attributes

ruleset_id
rule definitions
severity mapping
field schemas and required-field sets
test coverage references (engineering practice)

Notes

Rulesets must be versioned and tested.
Cycles record which ruleset they used.
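
As an illustration, a ruleset can be modelled as a versioned, immutable collection of checks whose result carries the ruleset id and version, so the cycle can record them. The Rule type, the version field, the rule ids, and run_ruleset are assumptions for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Rule:
    rule_id: str
    severity: str
    check: Callable[[Dict[str, object]], bool]   # True means the rule passes


@dataclass(frozen=True)
class ValidationRuleSet:
    ruleset_id: str
    version: str
    rules: Tuple[Rule, ...]


def run_ruleset(ruleset: ValidationRuleSet, fields: Dict[str, object]) -> dict:
    """Run every rule and return failures with the ruleset identity attached."""
    failures = [(r.rule_id, r.severity) for r in ruleset.rules if not r.check(fields)]
    return {
        "ruleset_id": ruleset.ruleset_id,
        "ruleset_version": ruleset.version,   # stored on the cycle record
        "failures": failures,
    }


# Hypothetical rules for illustration.
RULESET = ValidationRuleSet(
    ruleset_id="core",
    version="1.0.0",
    rules=(
        Rule("required.notice_days", "error",
             lambda f: "liquidity.notice_days" in f),
        Rule("range.notice_days", "warning",
             lambda f: f.get("liquidity.notice_days", 0) >= 0),
    ),
)
```

Rules are plain functions over the field set, which makes each one straightforward to unit test in isolation.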

13. MaterialityPolicy

Defines thresholds and alerting rules.

Key attributes

materiality_id
category severities
thresholds and scoring logic
confidence requirements
escalation and follow-up SLAs

Notes

Materiality is a governance choice.
It must be explicit and stable.
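
One reading of "explicit" is that no field gets an implicit default threshold. The sketch below follows that reading: a field without a configured threshold raises an error instead of being silently treated as immaterial. The thresholds mapping, the version field, is_material, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MaterialityPolicy:
    materiality_id: str
    version: str
    thresholds: Mapping[str, float]   # field_key -> minimum absolute change


def is_material(policy: MaterialityPolicy, field_key: str,
                old: float, new: float) -> bool:
    """Return True when the absolute change meets the configured threshold."""
    if field_key not in policy.thresholds:
        # No implicit defaults: every assessed field needs an explicit threshold.
        raise KeyError(f"no materiality threshold defined for {field_key}")
    return abs(new - old) >= policy.thresholds[field_key]


# Hypothetical policy for illustration.
POLICY = MaterialityPolicy(
    materiality_id="default",
    version="2024.1",
    thresholds={"liquidity.notice_days": 10},
)
```

Under this policy a notice period moving from 30 to 45 days is material, while a move from 30 to 35 days is not.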

Minimal contracts for integrations

External systems typically need:

Fund identifiers and tags
latest Snapshot field set
Delta list for a cycle or period
Exception queue with status and ownership
Report registry with links
Alert feed for escalations
policy references for governance audits

These objects should be exposed through stable, versioned APIs or exports.
Evidence references should never be dropped.
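
A minimal sketch of such an export, assuming a JSON payload: the exporter stamps a contract version and refuses to emit a delta that has lost its evidence references. The function export_delta and the keys api_version and evidence_refs are hypothetical names.

```python
import json


def export_delta(delta: dict, api_version: str = "v1") -> str:
    """Serialise a delta for an external system, keeping evidence references."""
    if not delta.get("evidence_refs"):
        raise ValueError("refusing to export a delta without evidence references")
    payload = {"api_version": api_version, **delta}
    # sort_keys gives byte-stable output for identical input.
    return json.dumps(payload, sort_keys=True)


# Hypothetical sample data.
exported = export_delta({
    "delta_id": "d-1",
    "fund_id": "fund-1",
    "field_key": "liquidity.notice_days",
    "old_value": 30,
    "new_value": 45,
    "evidence_refs": ["art-2#p.2"],
})
```

Failing loudly at the export boundary is one way to make sure downstream consumers never receive a change they cannot trace back to a source.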

Definition of done

The data model is production-grade when:

cycles, snapshots, and artefacts form a reproducible chain
fields are typed and schema-bound
claims preserve evidence links and uncertainty states
deltas are deterministic and structured
exceptions support resolution workflows with auditability
reports are registered artefacts linked to cycle records
policies are versioned and referenced by cycles

This model is the backbone of Fund Analyst Intelligence in production.