Most data matching and reconciliation tooling focuses on flat, CSV-like data sets. However, much of modern data flow involves richer data structures, exchanged via standard formats:
- XML (including industry-specific schemas - FpML, XBRL)
- JSON (including single-object payloads, and object-per-line formats - logs, MongoDB, etc.)
- Protobufs / Parquet, etc.
We have built such solutions in the past using a number of approaches. While it is always possible to transform a rich data structure into a flattened representation, so that CSV-based tools can process it, this often both loses context and makes the system brittle to changes.
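The brittleness is easy to demonstrate. Below is a minimal sketch (hypothetical field names, not any particular schema) of flattening a nested record into dotted-path columns: as soon as a list is involved, the column names encode positions, so inserting one element renames every column after it.

```python
# A trade with a nested list of legs (illustrative field names).
trade = {
    "tradeId": "T-1001",
    "legs": [
        {"type": "fixed", "rate": 0.025},
        {"type": "float", "index": "SOFR"},
    ],
}

def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a single dict of dotted-path columns."""
    out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            out.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            out.update(flatten(value, f"{prefix}{index}."))
    else:
        out[prefix.rstrip(".")] = obj
    return out

flat = flatten(trade)
# Columns like "legs.0.rate" and "legs.1.index" are positional:
# inserting a new leg at the front would shift every "legs.N.*" column.
```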
A better approach is to use a structured matching tool. We have such a tool on the roadmap, leveraging enterprise knowledge hard-won on client projects.
If interested, reach out and we can discuss prioritising your use case.
Depending on where you stand, all data matching/recs are structured: the data is more than just keys and values, and it changes over time. The changes might be entirely new transactions or new sales, for example, in which case the latest data can be viewed as a larger version of the previous set.
Each data run / extraction / query can be viewed in isolation, if convenient.
However, you can also view the runs as monthly buckets within a single larger set, distinguished by an extra column / sub-structure.
Which representation makes most sense depends on the use case, and also on how versioning interacts with the structure – see versioning.
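The two views above can be sketched in a few lines; the records and field names here are made up for illustration. The same monthly extractions are held either as isolated runs or merged into one set where the month becomes an extra column:

```python
# View 1: each monthly extraction kept in isolation.
runs = {
    "2024-01": [{"sku": "A", "qty": 3}],
    "2024-02": [{"sku": "A", "qty": 3}, {"sku": "B", "qty": 1}],
}

# View 2: one larger data set, with the bucket as an extra "month" column.
combined = [
    {"month": month, **row}
    for month, rows in runs.items()
    for row in rows
]
```

Reconciling run-against-run suits the first view; reconciling one large set against another (e.g. two systems' full histories) suits the second.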
Array of objects:
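A minimal illustration (hypothetical field names): child records sit in a list, so they are identified only by position unless some field within each object serves as a key.

```python
# Array of objects: children identified by position.
payload = {
    "trades": [
        {"id": "T1", "notional": 1_000_000},
        {"id": "T2", "notional": 500_000},
    ]
}

# To match reliably, a tool must be told which field is the key,
# rather than relying on list order being stable between extracts.
by_id = {t["id"]: t for t in payload["trades"]}
```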
Keyed sub objects:
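The same data again (illustrative names), but with each child stored directly under its identifier: lookups and matching need no positional assumptions, and reordering the extract cannot break the reconciliation.

```python
# Keyed sub-objects: children stored under their identifier.
payload = {
    "trades": {
        "T1": {"notional": 1_000_000},
        "T2": {"notional": 500_000},
    }
}

# Matching two extracts reduces to comparing values under shared keys.
other = {"trades": {"T1": {"notional": 1_000_000}}}
missing = set(payload["trades"]) - set(other["trades"])
```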
Content-keyed sub objects (common in IRS / financial swaps):
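Here children are identified by an attribute of their content rather than by position or an external ID - for example, the legs of an interest rate swap keyed by leg type. The field names below are made up for illustration, not taken from FpML:

```python
# Content-keyed sub-objects: the key ("fixed" / "floating") describes
# the content of each child, so matching pairs legs by type, not position.
swap = {
    "swapId": "S-42",
    "legs": {
        "fixed": {"rate": 0.025, "payer": "PartyA"},
        "floating": {"index": "SOFR", "payer": "PartyB"},
    },
}

# Two representations of the same swap can be reconciled leg-by-leg
# even if one system emits the legs in the opposite order.
fixed_leg = swap["legs"]["fixed"]
```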