ERP Load Testing

Generate high-volume data for ERP system testing.

Overview

SyntheticData generates realistic transaction volumes for:

Load testing
Stress testing
Performance benchmarking
System integration testing

Configuration

High Volume Generation

global:
  seed: 42
  industry: manufacturing
  start_date: 2024-01-01
  period_months: 12
  worker_threads: 8                  # Tune to the available CPU cores

transactions:
  target_count: 1000000              # 1 million entries

  line_items:
    distribution: empirical

  amounts:
    min: 100
    max: 10000000
    distribution: log_normal

  sources:
    manual: 0.15
    automated: 0.65
    recurring: 0.15
    adjustment: 0.05

  temporal:
    month_end_spike: 2.5
    quarter_end_spike: 3.0
    year_end_spike: 4.0

document_flows:
  p2p:
    enabled: true
    flow_rate: 0.35
  o2c:
    enabled: true
    flow_rate: 0.35

master_data:
  vendors:
    count: 2000
  customers:
    count: 5000
  materials:
    count: 10000

output:
  format: csv
  compression: none                  # Fastest for import
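
The `sources` weights in this configuration add up to 1.0 (0.15 + 0.65 + 0.15 + 0.05). A quick pre-flight check can catch a typo before a long generation run. The helper below is a hypothetical sketch, not part of SyntheticData; it assumes the weights are meant to sum to 1.0 and that each one lies in [0, 1]. Whether the generator itself enforces this is not stated here.

```python
import math

def check_source_weights(sources: dict[str, float], tol: float = 1e-9) -> float:
    """Return the total weight, raising ValueError if the mix is invalid."""
    for name, weight in sources.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name}: weight {weight} is outside [0, 1]")
    total = math.fsum(sources.values())
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"source weights sum to {total}, expected 1.0")
    return total

# Values copied from the high-volume configuration above.
sources = {"manual": 0.15, "automated": 0.65, "recurring": 0.15, "adjustment": 0.05}
total = check_source_weights(sources)
print(f"source weights OK (total = {total})")
```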

SAP ACDOCA Format

output:
  files:
    journal_entries: false
    acdoca: true                     # SAP Universal Journal format

Volume Sizing

Transaction Volume Guidelines

Company Size	Annual Entries	Per Day	Configuration
Small	10,000	~30	`target_count: 10000`
Medium	100,000	~300	`target_count: 100000`
Large	1,000,000	~3,000	`target_count: 1000000`
Enterprise	10,000,000	~30,000	`target_count: 10000000`
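
The "Per Day" column is the annual count spread over the year and rounded for readability. A minimal sketch of that arithmetic, assuming a 365-day calendar (the table does not state its divisor; with roughly 250 working days the daily figures would be about 46% higher):

```python
def entries_per_day(annual_entries: int, days_per_year: int = 365) -> float:
    """Average daily entry count for a given annual volume."""
    if days_per_year <= 0:
        raise ValueError("days_per_year must be positive")
    return annual_entries / days_per_year

tiers = {"Small": 10_000, "Medium": 100_000, "Large": 1_000_000, "Enterprise": 10_000_000}
for name, annual in tiers.items():
    print(f"{name:<10} {annual:>10,} per year  ~{entries_per_day(annual):,.0f} per day")
```

This gives about 27, 274, 2,740, and 27,397 entries per day, which the table rounds to ~30, ~300, ~3,000, and ~30,000.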

Master Data Guidelines

Size	Vendors	Customers	Materials
Small	100	200	500
Medium	500	1,000	5,000
Large	2,000	10,000	50,000
Enterprise	10,000	100,000	500,000

Load Testing Scenarios

1. Steady State Load

Normal daily operation:

transactions:
  target_count: 100000

  temporal:
    month_end_spike: 1.0             # No spikes
    quarter_end_spike: 1.0
    year_end_spike: 1.0
    working_hours_only: true

2. Peak Period Load

Month-end closing:

global:
  start_date: 2024-01-25
  period_months: 1                   # Focus on month-end

transactions:
  target_count: 50000

  temporal:
    month_end_spike: 5.0             # 5x normal volume
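
To get a feel for what a multiplier means for the target system, the sketch below estimates how much of a period's volume lands in the spike window. It assumes, for illustration only, that a spike multiplier scales the per-day weight of the closing days relative to a baseline of 1.0, and that the window is three days long. Neither the weighting model nor the window length is specified here, so treat the result as a rough planning figure.

```python
def spike_share(days_in_period: int, spike_days: int, multiplier: float) -> float:
    """Fraction of total volume that falls on the spike days.

    Assumption: ordinary days have weight 1.0, spike days have weight
    `multiplier`, and volume is split in proportion to weight.
    """
    if days_in_period <= 0 or not 0 <= spike_days <= days_in_period:
        raise ValueError("need days_in_period > 0 and 0 <= spike_days <= days_in_period")
    normal_weight = days_in_period - spike_days
    spike_weight = spike_days * multiplier
    return spike_weight / (normal_weight + spike_weight)

# 31-day period, hypothetical 3-day closing window, month_end_spike: 5.0
share = spike_share(days_in_period=31, spike_days=3, multiplier=5.0)
spike_entries = round(50_000 * share)  # target_count from the scenario above
print(f"{share:.1%} of volume, about {spike_entries:,} entries, in the closing window")
```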

3. Year-End Stress

Year-end closing simulation:

global:
  start_date: 2024-12-01
  period_months: 1

transactions:
  target_count: 200000

  temporal:
    month_end_spike: 3.0
    quarter_end_spike: 4.0
    year_end_spike: 10.0             # Extreme spike

4. Batch Import

Large batch import testing:

transactions:
  target_count: 500000

  sources:
    automated: 1.0                   # All system-generated

output:
  compression: none                  # For fastest import

Manufacturing ERP Testing (v0.6.0)

Production Order Load

Generate production orders with WIP tracking, routings, and standard costing:

global:
  industry: manufacturing
  start_date: 2024-01-01
  period_months: 12
  worker_threads: 8

transactions:
  target_count: 500000

manufacturing:
  enabled: true
  production_orders:
    orders_per_month: 200              # High volume
    avg_batch_size: 150
    yield_rate: 0.96
    rework_rate: 0.04
  costing:
    labor_rate_per_hour: 42.0
    overhead_rate: 1.75
  routing:
    avg_operations: 6
    setup_time_hours: 2.0

document_flows:
  p2p:
    enabled: true
    flow_rate: 0.40                    # Heavy procurement

subledger:
  inventory:
    enabled: true
    valuation_methods:
      - standard_cost
      - weighted_average

This configuration exercises production order creation, goods issue to production, goods receipt from production, WIP valuation, and standard cost variance posting.

Three-Way Match with Source-to-Pay

Test the full procurement lifecycle from sourcing through payment:

source_to_pay:
  enabled: true
  sourcing:
    projects_per_year: 20
  rfx:
    min_invited_vendors: 5
    max_invited_vendors: 12
  contracts:
    min_duration_months: 12
    max_duration_months: 24
  p2p_integration:
    off_contract_rate: 0.10            # 10% maverick spending
    catalog_enforcement: true

document_flows:
  p2p:
    enabled: true
    flow_rate: 0.40
    three_way_match:
      quantity_tolerance: 0.02
      price_tolerance: 0.01
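
A three-way match compares the purchase order, the goods receipt, and the vendor invoice. The sketch below shows one common way tolerances like these are applied. It assumes both tolerances are relative (2% on quantity, 1% on unit price), with the received quantity checked against the purchase order, the invoiced quantity against the receipt, and the invoice price against the purchase order. The `Line` type and `three_way_match` function are illustrative names, not SyntheticData APIs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Line:
    quantity: float
    unit_price: float

def within(actual: float, expected: float, tolerance: float) -> bool:
    """True if `actual` deviates from `expected` by at most `tolerance` (relative)."""
    if expected == 0:
        return actual == 0
    return abs(actual - expected) <= tolerance * abs(expected)

def three_way_match(po: Line, receipt_qty: float, invoice: Line,
                    quantity_tolerance: float = 0.02,
                    price_tolerance: float = 0.01) -> bool:
    """Match PO, goods receipt, and invoice within the configured tolerances."""
    qty_received_ok = within(receipt_qty, po.quantity, quantity_tolerance)
    qty_invoiced_ok = within(invoice.quantity, receipt_qty, quantity_tolerance)
    price_ok = within(invoice.unit_price, po.unit_price, price_tolerance)
    return qty_received_ok and qty_invoiced_ok and price_ok

po = Line(quantity=100, unit_price=25.00)
matched = three_way_match(po, receipt_qty=99, invoice=Line(99, 25.10))   # 0.4% price deviation
blocked = three_way_match(po, receipt_qty=99, invoice=Line(99, 25.50))   # 2% price deviation
print(matched, blocked)
```

Invoices that fail the match are the ones an ERP would typically block for payment, so raising or lowering the tolerances changes how much exception handling the load test exercises.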

HR and Payroll Testing (v0.6.0)

Payroll Processing Load

Generate payroll runs, time entries, and expense reports:

master_data:
  employees:
    count: 500
    hierarchy_depth: 6

hr:
  enabled: true
  payroll:
    enabled: true
    pay_frequency: "biweekly"          # 26 pay periods per year
    benefits_enrollment_rate: 0.75
    retirement_participation_rate: 0.55
  time_attendance:
    enabled: true
    overtime_rate: 0.15
  expenses:
    enabled: true
    submission_rate: 0.40
    policy_violation_rate: 0.05

This exercises payroll journal entry generation (salary, tax withholdings, benefits deductions), time and attendance record creation, and expense report approval workflows.
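
For capacity planning it helps to estimate how many payroll records a run will produce. A rough sketch, assuming one payslip per employee per pay period and ignoring hires and terminations during the year; the generator's actual record counts may differ:

```python
# Only "biweekly" appears in the configuration above; the other keys are
# assumed here for illustration.
PAY_PERIODS_PER_YEAR = {"weekly": 52, "biweekly": 26, "semimonthly": 24, "monthly": 12}

def payroll_volume(employees: int, pay_frequency: str, benefits_rate: float) -> dict:
    """Estimate yearly payslip count and the number of benefits enrollees."""
    periods = PAY_PERIODS_PER_YEAR[pay_frequency]
    return {
        "pay_periods": periods,
        "payslips": employees * periods,
        "benefits_enrollees": round(employees * benefits_rate),
    }

estimate = payroll_volume(employees=500, pay_frequency="biweekly", benefits_rate=0.75)
print(estimate)
```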

Expense Report Compliance

Test expense policy enforcement with elevated violation rates:

hr:
  enabled: true
  expenses:
    enabled: true
    submission_rate: 0.60              # 60% of employees submit
    policy_violation_rate: 0.15        # Elevated violation rate for testing

anomaly_injection:
  enabled: true
  generate_labels: true

Procurement Testing (v0.6.0)

Vendor Scorecard and Qualification

Generate the full source-to-pay cycle for procurement system testing:

source_to_pay:
  enabled: true
  qualification:
    pass_rate: 0.80
    validity_days: 365
  scorecards:
    frequency: "quarterly"
    grade_a_threshold: 85.0
    grade_c_threshold: 55.0
  catalog:
    preferred_vendor_flag_rate: 0.65
    multi_source_rate: 0.30

vendor_network:
  enabled: true
  depth: 3

Sales Quote Pipeline

Test quote-to-order conversion with the O2C flow:

sales_quotes:
  enabled: true
  quotes_per_month: 100
  win_rate: 0.30
  validity_days: 45

document_flows:
  o2c:
    enabled: true
    flow_rate: 0.40

Won quotes automatically feed into the O2C document flow as sales orders.
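
When checking a generated dataset, the number of sales orders that originate from quotes will not hit the expected value exactly. The sketch below estimates the mean and a tolerance band. It assumes a 12-month period and that each quote is won independently with probability `win_rate` (a binomial model); the generator's actual sampling scheme is not described here.

```python
import math

def expected_won_quotes(quotes_per_month: int, months: int, win_rate: float) -> tuple[float, float]:
    """Mean and standard deviation of won quotes under a binomial model."""
    n = quotes_per_month * months
    mean = n * win_rate
    std = math.sqrt(n * win_rate * (1.0 - win_rate))
    return mean, std

mean, std = expected_won_quotes(quotes_per_month=100, months=12, win_rate=0.30)
low, high = mean - 3 * std, mean + 3 * std
print(f"expect about {mean:.0f} sales orders from quotes (3-sigma range {low:.0f} to {high:.0f})")
```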

Performance Monitoring

Generation Metrics

# Measure wall-clock generation time
time datasynth-data generate --config config.yaml --output ./output

# Measure peak memory (GNU time on Linux prints "Maximum resident set size"; on macOS use -l instead of -v)
/usr/bin/time -v datasynth-data generate --config config.yaml --output ./output

# Watch progress (-v)
datasynth-data generate --config config.yaml --output ./output -v

Import Metrics

Track these during ERP import:

Metric	Description
Import rate	Records per second
Memory usage	Peak RAM during import
CPU utilization	Processor load
I/O throughput	Disk read/write speed
Lock contention	Database lock waits
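
Import rate is the easiest of these metrics to capture from the client side. The sketch below wraps any batch loader and reports records per second. `measure_import_rate` and `fake_load` are hypothetical names; replace the stand-in loader with whatever call writes a batch into the ERP staging area. Memory, CPU, I/O, and lock metrics have to come from the target system's own monitoring.

```python
import time
from typing import Callable, Iterable, Sequence

def measure_import_rate(batches: Iterable[Sequence],
                        load_batch: Callable[[Sequence], None]) -> dict:
    """Run `load_batch` over each batch and report throughput."""
    records = 0
    start = time.perf_counter()
    for batch in batches:
        load_batch(batch)
        records += len(batch)
    elapsed = time.perf_counter() - start
    rate = records / elapsed if elapsed > 0 else float("inf")
    return {"records": records, "seconds": elapsed, "records_per_second": rate}

# Stand-in loader that only sleeps; swap in the real import call.
def fake_load(batch: Sequence) -> None:
    time.sleep(0.001)

stats = measure_import_rate([[0] * 1000 for _ in range(5)], fake_load)
print(f"{stats['records']:,} records at {stats['records_per_second']:,.0f} records/s")
```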

Data Import Strategies

SAP S/4HANA

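# Generate ACDOCA format (requires acdoca: true under output.files in config.yaml)
datasynth-data generate --config config.yaml --output ./output

# Import output/transactions/acdoca.csv with SAP Data Services or LSMW
# Note: SAP does not recommend LSMW on S/4HANA; the Migration Cockpit is the usual alternative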

Oracle EBS

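-- Create staging table
CREATE TABLE XX_JE_STAGING (
    document_id VARCHAR2(36),
    posting_date DATE,
    account VARCHAR2(20),
    debit NUMBER,
    credit NUMBER
);

-- Load via SQL*Loader. The lines below belong in a separate control file,
-- not in the SQL script above. The header skip, column order, and date mask
-- are assumptions; match them to the actual CSV header.
OPTIONS (SKIP=1)
LOAD DATA
INFILE 'journal_entries.csv'
APPEND
INTO TABLE XX_JE_STAGING
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(document_id, posting_date DATE "YYYY-MM-DD", account, debit, credit)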

Microsoft Dynamics

# Use Data Management Framework
# Import journal_entries.csv via Data Entity

Validation

Post-Import Checks

-- Verify record count
SELECT COUNT(*) FROM journal_entries;

-- Verify balance
SELECT SUM(debit) - SUM(credit) AS imbalance
FROM journal_entries;

-- Check date range
SELECT MIN(posting_date), MAX(posting_date)
FROM journal_entries;
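
A net-zero total can still hide two documents whose errors offset each other. The sketch below checks the balance per document instead. It uses only the standard library and assumes columns named `document_id`, `debit`, and `credit`, as in the Oracle staging table above; these names are assumptions, so adjust them to the actual CSV header. `Decimal` is used to avoid floating-point rounding in the sums.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

def unbalanced_documents(rows, id_col="document_id", debit_col="debit", credit_col="credit"):
    """Return {document_id: debit - credit} for documents that do not net to zero."""
    net = defaultdict(Decimal)
    for row in rows:
        debit = Decimal(row[debit_col] or "0")    # empty cell counts as zero
        credit = Decimal(row[credit_col] or "0")
        net[row[id_col]] += debit - credit
    return {doc: diff for doc, diff in net.items() if diff != 0}

# Inline sample so the sketch runs on its own; in practice pass
# csv.DictReader(open(path, newline="")) for the generated file.
sample = io.StringIO(
    "document_id,debit,credit\n"
    "JE-1,100.00,\n"
    "JE-1,,100.00\n"
    "JE-2,50.00,\n"
    "JE-2,,49.00\n"
)
problems = unbalanced_documents(csv.DictReader(sample))
print(problems)
```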

Reconciliation

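import pandas as pd

# `connection` must be an open SQLAlchemy connectable for the target
# database; create it before running this check.

# Compare source and target
source = pd.read_csv('output/transactions/journal_entries.csv')
target = pd.read_sql('SELECT * FROM journal_entries', connection)

# Verify counts
assert len(source) == len(target), "Record count mismatch"

# Verify totals (the column is debit_amount in the CSV and debit in the target table)
assert abs(source['debit_amount'].sum() - target['debit'].sum()) < 0.01, "Debit total mismatch"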

Batch Processing

Chunked Generation

For very large volumes, generate in chunks:

# Generate 10 batches, each with its own seed
# (1M entries each when config.yaml sets target_count: 1000000)
for i in {1..10}; do
    datasynth-data generate \
        --config config.yaml \
        --output "./output/batch_$i" \
        --seed $((42 + i))
done
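
The same loop can be driven from Python when the batches need to be scheduled or logged by a test harness. The sketch below only builds the command lines, using the flags from the shell loop above; `batch_commands` is a hypothetical helper. Giving every batch a distinct seed matters on the assumption that a fixed seed reproduces the same data, in which case reusing one seed would yield identical batches.

```python
import shlex

def batch_commands(config: str, output_root: str, batches: int,
                   base_seed: int = 42) -> list[list[str]]:
    """Build one `datasynth-data generate` command per batch, each with its own seed."""
    commands = []
    for i in range(1, batches + 1):
        commands.append([
            "datasynth-data", "generate",
            "--config", config,
            "--output", f"{output_root}/batch_{i}",
            "--seed", str(base_seed + i),
        ])
    return commands

commands = batch_commands("config.yaml", "./output", batches=10)
for cmd in commands:
    # To execute, pass `cmd` to subprocess.run(cmd, check=True) so a failed
    # batch stops the run instead of being silently skipped.
    print(shlex.join(cmd))
```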

Parallel Import

# Import chunks in parallel
# (import_job is a placeholder for your site-specific import command)
for batch in ./output/batch_*; do
    import_job "$batch" &
done
wait
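
The shell loop starts every import at once, and a bare `wait` does not report which jobs failed. Where the target system can only absorb a few concurrent loads, a bounded worker pool is one alternative. The sketch below is a minimal illustration: `import_batches` and `demo_import` are hypothetical names, and the stand-in job should be replaced by the real import call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

def import_batches(batch_dirs, import_job, max_workers=4):
    """Run `import_job(path)` for each batch with at most `max_workers` in flight.

    Returns (succeeded, failed), where `failed` maps each path to its exception.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(import_job, path): path for path in batch_dirs}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
                succeeded.append(path)
            except Exception as exc:  # record the failure and keep going
                failed[path] = exc
    return sorted(succeeded), failed

# Hypothetical stand-in for the site-specific import; fails for one batch.
def demo_import(path: Path) -> None:
    if path.name == "batch_3":
        raise RuntimeError("simulated import failure")

dirs = [Path(f"./output/batch_{i}") for i in range(1, 6)]
ok, errors = import_batches(dirs, demo_import, max_workers=2)
print(f"{len(ok)} imported, {len(errors)} failed: {[p.name for p in errors]}")
```

Threads suit this case because each job mostly waits on the database or a subprocess; `max_workers` is the knob to align with the lock contention observed during import.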
