Fraud Patterns & ACFE Taxonomy

SyntheticData includes comprehensive fraud pattern modeling aligned with the Association of Certified Fraud Examiners (ACFE) Report to the Nations. This enables generation of realistic fraud scenarios for training machine learning models and testing audit analytics.

ACFE Fraud Taxonomy

The ACFE occupational fraud classification divides fraud into three main categories, each with distinct characteristics:

Asset Misappropriation (86% of cases)

The most common type of fraud, involving theft of organizational assets:

fraud:
  enabled: true
  acfe_category: asset_misappropriation
  schemes:
    cash_fraud:
      - skimming           # Sales not recorded
      - larceny            # Cash stolen after recording
      - shell_company      # Fictitious vendors
      - ghost_employee     # Non-existent employees
      - expense_schemes    # Personal expenses as business
    non_cash_fraud:
      - inventory_theft
      - fixed_asset_misuse

Corruption (33% of cases)

Schemes involving conflicts of interest and bribery:

fraud:
  enabled: true
  acfe_category: corruption
  schemes:
    - purchasing_conflict  # Undisclosed vendor ownership
    - sales_conflict       # Kickbacks from customers
    - invoice_kickback     # Vendor payment schemes
    - bid_rigging          # Collusion with vendors
    - economic_extortion   # Demands for payment

Financial Statement Fraud (10% of cases)

The least common but most costly fraud type:

fraud:
  enabled: true
  acfe_category: financial_statement
  schemes:
    overstatement:
      - premature_revenue      # Revenue before earned
      - fictitious_revenues    # Fake sales
      - concealed_liabilities  # Hidden obligations
      - improper_asset_values  # Overstated assets
    understatement:
      - understated_revenues   # Hidden sales
      - overstated_expenses    # Inflated costs

ACFE Calibration

Generated fraud data is calibrated to match ACFE statistics:

Metric	ACFE Value	Configuration
Median Loss	$117,000	`acfe.median_loss`
Median Duration	12 months	`acfe.median_duration_months`
Tip Detection	42%	`detection_method.tip`
Internal Audit Detection	16%	`detection_method.internal_audit`
Management Review Detection	12%	`detection_method.management_review`

fraud:
  acfe_calibration:
    enabled: true
    median_loss: 117000
    median_duration_months: 12
    detection_methods:
      tip: 0.42
      internal_audit: 0.16
      management_review: 0.12
      external_audit: 0.04
      accident: 0.06

Collusion & Conspiracy Modeling

SyntheticData models multi-party fraud networks with coordinated schemes:

Collusion Ring Types

#![allow(unused)]
fn main() {
pub enum CollusionRingType {
    // Internal collusion
    EmployeePair,           // approver + processor
    DepartmentRing,         // 3-5 employees
    ManagementSubordinate,  // manager + subordinate

    // Internal-external
    EmployeeVendor,         // purchasing + vendor contact
    EmployeeCustomer,       // sales rep + customer
    EmployeeContractor,     // project manager + contractor

    // External rings
    VendorRing,             // bid rigging (2-4 vendors)
    CustomerRing,           // return fraud
}
}

Conspirator Roles

Each conspirator in a ring has a specific role:

Initiator: Conceives scheme, recruits others
Executor: Performs fraudulent transactions
Approver: Provides approvals/overrides
Concealer: Hides evidence, manipulates records
Lookout: Monitors for detection
Beneficiary: External recipient of proceeds

Configuration

fraud:
  collusion:
    enabled: true
    ring_types:
      - type: employee_vendor
        probability: 0.15
        min_members: 2
        max_members: 4
      - type: department_ring
        probability: 0.08
        min_members: 3
        max_members: 5
    defection_probability: 0.05
    escalation_rate: 0.10

Management Override

Senior-level fraud with override patterns:

fraud:
  management_override:
    enabled: true
    perpetrator_levels:
      - senior_manager
      - cfo
      - ceo
    override_types:
      revenue:
        - journal_entry_override
        - revenue_recognition_acceleration
        - reserve_manipulation
      expense:
        - capitalization_abuse
        - expense_deferral
    pressure_sources:
      - financial_targets
      - market_expectations
      - covenant_compliance

Fraud Triangle

The fraud triangle (Pressure, Opportunity, Rationalization) is modeled:

fraud:
  fraud_triangle:
    pressure:
      source: financial_targets
      intensity: high
    opportunity:
      factors:
        - weak_internal_controls
        - management_override_capability
        - lack_of_oversight
    rationalization:
      type: temporary_adjustment  # "We'll fix it next quarter"

Red Flag Generation

Probabilistic fraud indicators with calibrated Bayesian probabilities:

Red Flag Strengths

Strength	P(fraud\|flag)	Examples
Strong	> 0.5	Matched home address vendor/employee
Moderate	0.2 - 0.5	Vendor with no physical address
Weak	< 0.2	Round number invoices

Configuration

fraud:
  red_flags:
    enabled: true
    inject_rate: 0.15  # 15% of transactions get flags
    patterns:
      strong:
        - name: matched_address_vendor_employee
          p_flag_given_fraud: 0.90
          p_flag_given_no_fraud: 0.001
        - name: sequential_check_numbers
          p_flag_given_fraud: 0.80
          p_flag_given_no_fraud: 0.01
      moderate:
        - name: approval_just_under_threshold
          p_flag_given_fraud: 0.70
          p_flag_given_no_fraud: 0.10
      weak:
        - name: round_number_invoice
          p_flag_given_fraud: 0.40
          p_flag_given_no_fraud: 0.20

Evaluation Benchmarks

ACFE-Calibrated Benchmarks

#![allow(unused)]
fn main() {
// General fraud detection
let bench = acfe_calibrated_1k();

// Collusion-focused benchmark
let bench = acfe_collusion_5k();

// Management override detection
let bench = acfe_management_override_2k();
}

Benchmark Metrics

#![allow(unused)]
fn main() {
pub struct AcfeAlignment {
    /// Category distribution MAD vs ACFE
    pub category_distribution_mad: f64,
    /// Median loss ratio (actual / expected)
    pub median_loss_ratio: f64,
    /// Duration distribution KS statistic
    pub duration_distribution_ks: f64,
    /// Detection method chi-squared
    pub detection_method_chi_sq: f64,
}
}

Output Files

File	Description
`collusion_rings.json`	Collusion network details with members, roles
`red_flags.csv`	Red flag indicators with probabilities
`management_overrides.json`	Management override schemes
`fraud_labels.csv`	Enhanced fraud labels with ACFE category

Best Practices

Start with ACFE calibration: Use default ACFE statistics for realistic distribution
Enable collusion gradually: Start with simple rings before complex networks
Use red flags for training: Red flags provide weak supervision signals
Validate against benchmarks: Use ACFE benchmarks to verify model performance
Consider detection difficulty: Use detection_difficulty labels for curriculum learning

Keyboard shortcuts

SyntheticData Documentation