Fraud Patterns & ACFE Taxonomy

SyntheticData models fraud patterns aligned with the occupational fraud taxonomy of the Association of Certified Fraud Examiners (ACFE) Report to the Nations. This lets you generate realistic fraud scenarios for training machine learning models and testing audit analytics.

ACFE Fraud Taxonomy

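The ACFE occupational fraud classification divides fraud into three main categories, each with distinct characteristics. A single case can fall under more than one category, so the percentages below sum to more than 100%: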

Asset Misappropriation (86% of cases)

The most common type of fraud, involving theft of organizational assets:

fraud:
  enabled: true
  acfe_category: asset_misappropriation
  schemes:
    cash_fraud:
      - skimming           # Sales not recorded
      - larceny            # Cash stolen after recording
      - shell_company      # Fictitious vendors
      - ghost_employee     # Non-existent employees
      - expense_schemes    # Personal expenses as business
    non_cash_fraud:
      - inventory_theft
      - fixed_asset_misuse

Corruption (33% of cases)

Schemes involving conflicts of interest and bribery:

fraud:
  enabled: true
  acfe_category: corruption
  schemes:
    - purchasing_conflict  # Undisclosed vendor ownership
    - sales_conflict       # Kickbacks from customers
    - invoice_kickback     # Vendor payment schemes
    - bid_rigging          # Collusion with vendors
    - economic_extortion   # Demands for payment

Financial Statement Fraud (10% of cases)

The least common category, but the one with the highest median loss per case:

fraud:
  enabled: true
  acfe_category: financial_statement
  schemes:
    overstatement:
      - premature_revenue      # Revenue before earned
      - fictitious_revenues    # Fake sales
      - concealed_liabilities  # Hidden obligations
      - improper_asset_values  # Overstated assets
    understatement:
      - understated_revenues   # Hidden sales
      - overstated_expenses    # Inflated costs
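
As a rough illustration of how the three categories above might be represented in code, the sketch below pairs each `acfe_category` key with the share quoted in its heading. The `AcfeCategory` type and its methods are hypothetical names chosen for this example, not part of the SyntheticData API.

```rust
/// Hypothetical representation of the three ACFE categories described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AcfeCategory {
    AssetMisappropriation,
    Corruption,
    FinancialStatement,
}

impl AcfeCategory {
    pub const ALL: [AcfeCategory; 3] = [
        AcfeCategory::AssetMisappropriation,
        AcfeCategory::Corruption,
        AcfeCategory::FinancialStatement,
    ];

    /// The value used for `acfe_category` in the YAML snippets above.
    pub fn config_key(self) -> &'static str {
        match self {
            AcfeCategory::AssetMisappropriation => "asset_misappropriation",
            AcfeCategory::Corruption => "corruption",
            AcfeCategory::FinancialStatement => "financial_statement",
        }
    }

    /// Share of cases quoted in the section headings above.
    pub fn documented_share(self) -> f64 {
        match self {
            AcfeCategory::AssetMisappropriation => 0.86,
            AcfeCategory::Corruption => 0.33,
            AcfeCategory::FinancialStatement => 0.10,
        }
    }
}

/// Sum of the quoted shares. It exceeds 1.0 because one case can
/// belong to several categories at once.
pub fn total_documented_share() -> f64 {
    AcfeCategory::ALL.iter().map(|c| c.documented_share()).sum()
}
```

Because the shares overlap, they should not be treated as a single categorical distribution when sampling.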

ACFE Calibration

Generated fraud data is calibrated to match ACFE statistics:

| Metric | ACFE Value | Configuration key |
|--------|------------|-------------------|
| Median Loss | $117,000 | acfe.median_loss |
| Median Duration | 12 months | acfe.median_duration_months |
| Tip Detection | 42% | detection_method.tip |
| Internal Audit Detection | 16% | detection_method.internal_audit |
| Management Review Detection | 12% | detection_method.management_review |

fraud:
  acfe_calibration:
    enabled: true
    median_loss: 117000
    median_duration_months: 12
    detection_methods:
      tip: 0.42
      internal_audit: 0.16
      management_review: 0.12
      external_audit: 0.04
      accident: 0.06
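
The configured detection weights above sum to 0.80, not 1.0. The sketch below shows one way a generator could turn them into a sampled detection method: walk the cumulative distribution with a uniform draw and send the uncovered 20% to a catch-all. The `Other` bucket, the type, and the function names are assumptions made for this example; how SyntheticData treats the remainder is not stated here.

```rust
/// Detection channels from the configuration above, plus a catch-all.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DetectionMethod {
    Tip,
    InternalAudit,
    ManagementReview,
    ExternalAudit,
    Accident,
    /// Hypothetical bucket for the probability mass the config leaves unlisted.
    Other,
}

/// Weights copied from the `detection_methods` block above.
pub const CONFIGURED: [(DetectionMethod, f64); 5] = [
    (DetectionMethod::Tip, 0.42),
    (DetectionMethod::InternalAudit, 0.16),
    (DetectionMethod::ManagementReview, 0.12),
    (DetectionMethod::ExternalAudit, 0.04),
    (DetectionMethod::Accident, 0.06),
];

/// Maps a uniform draw `u` in [0, 1) to a detection method by walking the
/// cumulative weights. Draws past the configured mass fall into `Other`.
pub fn pick_detection_method(u: f64) -> DetectionMethod {
    let mut cumulative = 0.0;
    for &(method, weight) in CONFIGURED.iter() {
        cumulative += weight;
        if u < cumulative {
            return method;
        }
    }
    DetectionMethod::Other
}

/// Probability mass not covered by the configured methods.
pub fn unlisted_mass() -> f64 {
    1.0 - CONFIGURED.iter().map(|(_, w)| w).sum::<f64>()
}
```

In practice `u` would come from the generator's random number source. Keeping the remainder in its own bucket leaves the configured values equal to the ACFE figures, whereas renormalizing would inflate each of them.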

Collusion & Conspiracy Modeling

SyntheticData models multi-party fraud networks with coordinated schemes:

Collusion Ring Types

pub enum CollusionRingType {
    // Internal collusion
    EmployeePair,           // approver + processor
    DepartmentRing,         // 3-5 employees
    ManagementSubordinate,  // manager + subordinate

    // Internal-external
    EmployeeVendor,         // purchasing + vendor contact
    EmployeeCustomer,       // sales rep + customer
    EmployeeContractor,     // project manager + contractor

    // External rings
    VendorRing,             // bid rigging (2-4 vendors)
    CustomerRing,           // return fraud
}

Conspirator Roles

Each conspirator in a ring has a specific role:

  • Initiator: Conceives scheme, recruits others
  • Executor: Performs fraudulent transactions
  • Approver: Provides approvals/overrides
  • Concealer: Hides evidence, manipulates records
  • Lookout: Monitors for detection
  • Beneficiary: External recipient of proceeds
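
The roles above could be captured in an enum along these lines. This is a sketch only: `ConspiratorRole`, its methods, and `missing_roles` are hypothetical names for illustration, and the library's real types may differ.

```rust
/// Hypothetical enum mirroring the six roles listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ConspiratorRole {
    Initiator,
    Executor,
    Approver,
    Concealer,
    Lookout,
    Beneficiary,
}

impl ConspiratorRole {
    /// Short description, taken from the role list above.
    pub fn responsibility(self) -> &'static str {
        match self {
            ConspiratorRole::Initiator => "Conceives scheme, recruits others",
            ConspiratorRole::Executor => "Performs fraudulent transactions",
            ConspiratorRole::Approver => "Provides approvals/overrides",
            ConspiratorRole::Concealer => "Hides evidence, manipulates records",
            ConspiratorRole::Lookout => "Monitors for detection",
            ConspiratorRole::Beneficiary => "External recipient of proceeds",
        }
    }

    /// Only the beneficiary is described as external to the organization.
    pub fn is_external(self) -> bool {
        matches!(self, ConspiratorRole::Beneficiary)
    }
}

/// Returns the roles from `required` that no member of the ring fills.
pub fn missing_roles(
    members: &[ConspiratorRole],
    required: &[ConspiratorRole],
) -> Vec<ConspiratorRole> {
    required
        .iter()
        .copied()
        .filter(|role| !members.contains(role))
        .collect()
}
```

A helper like `missing_roles` can be used to check generated rings, for example to confirm that every ring contains at least an initiator and an executor, if that is a property you want your data to have.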

Configuration

fraud:
  collusion:
    enabled: true
    ring_types:
      - type: employee_vendor
        probability: 0.15
        min_members: 2
        max_members: 4
      - type: department_ring
        probability: 0.08
        min_members: 3
        max_members: 5
    defection_probability: 0.05
    escalation_rate: 0.10
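
Before generating data, it can help to sanity-check ring entries like the ones above. The sketch below assumes a hypothetical `RingTypeConfig` struct that mirrors the YAML fields. The validation rules (probability within [0, 1], at least two members, `min_members` not above `max_members`) are this example's assumptions, not documented constraints of SyntheticData.

```rust
/// Hypothetical mirror of one entry under `ring_types` in the YAML above.
#[derive(Debug, Clone, PartialEq)]
pub struct RingTypeConfig {
    pub ring_type: String,
    pub probability: f64,
    pub min_members: u32,
    pub max_members: u32,
}

impl RingTypeConfig {
    /// Checks the entry against the assumed rules and reports the first violation.
    pub fn validate(&self) -> Result<(), String> {
        // A NaN probability also fails this range check.
        if !(0.0..=1.0).contains(&self.probability) {
            return Err(format!(
                "{}: probability {} is outside [0, 1]",
                self.ring_type, self.probability
            ));
        }
        if self.min_members < 2 {
            return Err(format!(
                "{}: a collusion ring needs at least 2 members",
                self.ring_type
            ));
        }
        if self.min_members > self.max_members {
            return Err(format!(
                "{}: min_members {} exceeds max_members {}",
                self.ring_type, self.min_members, self.max_members
            ));
        }
        Ok(())
    }
}

/// The two ring entries from the configuration above.
pub fn documented_examples() -> Vec<RingTypeConfig> {
    vec![
        RingTypeConfig {
            ring_type: "employee_vendor".to_string(),
            probability: 0.15,
            min_members: 2,
            max_members: 4,
        },
        RingTypeConfig {
            ring_type: "department_ring".to_string(),
            probability: 0.08,
            min_members: 3,
            max_members: 5,
        },
    ]
}
```

Both documented entries pass these checks. An entry with `min_members: 6` and `max_members: 5`, for instance, would be rejected with a message naming the ring type.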

Management Override

Models fraud committed by senior personnel who override internal controls:

fraud:
  management_override:
    enabled: true
    perpetrator_levels:
      - senior_manager
      - cfo
      - ceo
    override_types:
      revenue:
        - journal_entry_override
        - revenue_recognition_acceleration
        - reserve_manipulation
      expense:
        - capitalization_abuse
        - expense_deferral
    pressure_sources:
      - financial_targets
      - market_expectations
      - covenant_compliance

Fraud Triangle

The fraud triangle (pressure, opportunity, rationalization) is modeled through the following configuration:

fraud:
  fraud_triangle:
    pressure:
      source: financial_targets
      intensity: high
    opportunity:
      factors:
        - weak_internal_controls
        - management_override_capability
        - lack_of_oversight
    rationalization:
      type: temporary_adjustment  # "We'll fix it next quarter"

Red Flag Generation

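Red flags are probabilistic fraud indicators. Each pattern is defined by two likelihoods, P(flag | fraud) and P(flag | no fraud), from which the posterior P(fraud | flag) follows by Bayes' rule for a given fraud base rate.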

Red Flag Strengths

| Strength | P(fraud \| flag) | Example |
|----------|------------------|---------|
| Strong | > 0.5 | Vendor address matches an employee's home address |
| Moderate | 0.2 - 0.5 | Vendor with no physical address |
| Weak | < 0.2 | Round-number invoices |

Configuration

fraud:
  red_flags:
    enabled: true
    inject_rate: 0.15  # 15% of transactions get flags
    patterns:
      strong:
        - name: matched_address_vendor_employee
          p_flag_given_fraud: 0.90
          p_flag_given_no_fraud: 0.001
        - name: sequential_check_numbers
          p_flag_given_fraud: 0.80
          p_flag_given_no_fraud: 0.01
      moderate:
        - name: approval_just_under_threshold
          p_flag_given_fraud: 0.70
          p_flag_given_no_fraud: 0.10
      weak:
        - name: round_number_invoice
          p_flag_given_fraud: 0.40
          p_flag_given_no_fraud: 0.20
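
The configuration stores likelihoods, while the strength table is expressed as P(fraud | flag). The two are linked by Bayes' rule once a fraud base rate is fixed. The sketch below shows the conversion. The function names are hypothetical, and the 5% base rate in the usage note is an assumption picked for illustration, not a SyntheticData default.

```rust
/// Posterior P(fraud | flag) from the prior P(fraud) and the two likelihoods
/// configured per pattern. Returns NaN if the flag can never occur
/// (both terms of the denominator are zero).
pub fn posterior_fraud_given_flag(
    prior: f64,
    p_flag_given_fraud: f64,
    p_flag_given_no_fraud: f64,
) -> f64 {
    let flagged_and_fraud = p_flag_given_fraud * prior;
    let flagged_and_clean = p_flag_given_no_fraud * (1.0 - prior);
    flagged_and_fraud / (flagged_and_fraud + flagged_and_clean)
}

/// Maps a posterior to the bands of the strength table above.
pub fn strength_band(posterior: f64) -> &'static str {
    if posterior > 0.5 {
        "strong"
    } else if posterior >= 0.2 {
        "moderate"
    } else {
        "weak"
    }
}
```

With an assumed base rate of 5%, `matched_address_vendor_employee` (0.90 and 0.001) gives a posterior of about 0.98, `approval_just_under_threshold` (0.70 and 0.10) about 0.27, and `round_number_invoice` (0.40 and 0.20) about 0.10, which matches the strong, moderate, and weak bands. The posterior depends heavily on the base rate, so a pattern can change band when the prior changes.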

Evaluation Benchmarks

ACFE-Calibrated Benchmarks

// General fraud detection
let bench = acfe_calibrated_1k();

// Collusion-focused benchmark
let bench = acfe_collusion_5k();

// Management override detection
let bench = acfe_management_override_2k();

Benchmark Metrics

pub struct AcfeAlignment {
    /// Category distribution MAD vs ACFE
    pub category_distribution_mad: f64,
    /// Median loss ratio (actual / expected)
    pub median_loss_ratio: f64,
    /// Duration distribution KS statistic
    pub duration_distribution_ks: f64,
    /// Detection method chi-squared
    pub detection_method_chi_sq: f64,
}
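
To make two of these metrics concrete, the sketch below computes a category-distribution deviation and a median loss ratio from raw values. It reads "MAD" as mean absolute deviation, which is an assumption (median absolute deviation is another common reading). The free functions are hypothetical helpers, not the library's implementation.

```rust
/// Mean absolute deviation between observed and expected category shares.
/// Panics if the slices differ in length or are empty.
pub fn category_distribution_mad(observed: &[f64], expected: &[f64]) -> f64 {
    assert_eq!(observed.len(), expected.len(), "share vectors must align");
    assert!(!observed.is_empty(), "at least one category is required");
    let total: f64 = observed
        .iter()
        .zip(expected)
        .map(|(o, e)| (o - e).abs())
        .sum();
    total / observed.len() as f64
}

/// Median of a slice, or `None` if it is empty.
pub fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Ratio of the generated median loss to the expected one (actual / expected).
pub fn median_loss_ratio(losses: &[f64], expected_median: f64) -> Option<f64> {
    median(losses).map(|m| m / expected_median)
}
```

A `median_loss_ratio` near 1.0 and a MAD near 0.0 would indicate that the generated data tracks the calibration targets. For example, observed shares of 0.80, 0.30, and 0.10 against expected shares of 0.86, 0.33, and 0.10 give a MAD of 0.03.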

Output Files

| File | Description |
|------|-------------|
| collusion_rings.json | Collusion network details with members, roles |
| red_flags.csv | Red flag indicators with probabilities |
| management_overrides.json | Management override schemes |
| fraud_labels.csv | Enhanced fraud labels with ACFE category |

Best Practices

  1. Start with ACFE calibration: Use default ACFE statistics for realistic distribution
  2. Enable collusion gradually: Start with simple rings before complex networks
  3. Use red flags for training: Red flags provide weak supervision signals
  4. Validate against benchmarks: Use ACFE benchmarks to verify model performance
  5. Consider detection difficulty: Use detection_difficulty labels for curriculum learning