Sample data

Betsy Lorton edited this page Aug 29, 2025 · 2 revisions

The M2 repository contains a set of data files with randomly generated fake data. The data is not representative of any actual credit or loan scenarios. Instead, it is intended to provide a semi-realistic placeholder in the Metro2 Evaluator Tool's UI, so users can explore the functionality of the application (including evaluator results, filtering, and the administrator page) without needing to find or produce Metro2-conforming data.

When a user runs docker compose up, the seed data script runs the parser to ingest these fake data files and then runs all available evaluators on the parsed data.

How we generated sample data

The data files in the sample_data/ folder were generated using the methods in parse_m2/data_generator.py. These files do not fully conform to the Metro2 data standard as described in the Credit Reporting Resource Guide (CRRG). Instead, the generated data contains only the fields that are used by the Metro2 Evaluator Tool's parser. Fields that are ignored by the parser are filled with a filler character (.). The files do not contain Trailer segments, since those are ignored by the parser.
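
As a rough illustration of the filler approach described above, the following minimal sketch builds a fixed-width record in which every position starts as the filler character and only the supplied fields are overwritten. The function name build_record, the record length, and the field offsets are all hypothetical; they are not taken from parse_m2/data_generator.py or from the Metro2 layout.

```python
FILLER = "."  # filler character for positions the parser ignores


def build_record(length, fields):
    """Return a fixed-width record of `length` characters.

    `fields` is a list of (start, value) pairs, where `start` is a
    0-based offset. Offsets here are hypothetical, not real Metro2
    field positions.
    """
    record = [FILLER] * length
    for start, value in fields:
        end = start + len(value)
        if start < 0 or end > length:
            raise ValueError(f"field at offset {start} does not fit in the record")
        # Overwrite the filler characters with the field's value.
        record[start:end] = value
    return "".join(record)


# Hypothetical 20-character record with two populated fields.
example = build_record(20, [(0, "AB"), (10, "123")])
```

Everything outside the two populated spans stays as filler, so a parser that reads only those spans never sees the difference.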

The included fields are filled in one of three ways. Some hold static values, such as Identification Number, Portfolio Type, and Terms Frequency. Some are filled by choosing a value from a list of valid values, such as Account Type and Special Comment Code. Many numeric and date fields, such as Credit Limit, Date Opened, and Date of Last Payment, take a value chosen randomly from a selected range.
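
The three strategies above (static value, choice from a list, random value from a range) might be sketched as follows. This is an illustrative sketch only: the function names, field keys, code lists, and ranges are placeholders chosen for the example, and are not the values or the API used by parse_m2/data_generator.py.

```python
import random
from datetime import date, timedelta

# Placeholder values for illustration; not taken from the repository.
STATIC_FIELDS = {"portfolio_type": "R", "terms_frequency": "M"}
ACCOUNT_TYPES = ["18", "07", "15"]


def random_date(rng, start, end):
    """Pick a date uniformly between `start` and `end`, inclusive."""
    span_days = (end - start).days
    return start + timedelta(days=rng.randint(0, span_days))


def generate_fields(rng):
    """Build one record's field values using the three strategies."""
    fields = dict(STATIC_FIELDS)                         # static values
    fields["account_type"] = rng.choice(ACCOUNT_TYPES)   # choice from a list
    fields["credit_limit"] = rng.randint(500, 50_000)    # numeric range
    fields["date_opened"] = random_date(                 # date range
        rng, date(2015, 1, 1), date(2023, 12, 31)
    )
    return fields


# A seeded generator makes the fake data reproducible between runs.
sample = generate_fields(random.Random(0))
```

Passing in a seeded random.Random instance, rather than using the module-level functions, is a design choice made for this sketch so that the same seed always yields the same sample data.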
