Benchmark

This page records a real benchmark run comparing runMCMC(...; fast_blocks=true) and runMCMC(...; fast_blocks=false) for the same target-scale question:

  • N = 50,000 individuals
  • target P = 2,000,000 markers
  • target chain length L = 2000

Date of run: February 26, 2026.

Goal

Estimate wall-clock MCMC time for a single-trait BayesC run at target scale, and compare fast-block vs standard BayesC.

Environment

  • Platform: UCD farm cluster (Slurm)
  • Julia: 1.12.1
  • Model: y1 = intercept + geno
  • Data: synthetic dense Float32 genotype matrix and synthetic phenotype
  • Settings:
    • method="BayesC"
    • memory_guard=:warn
    • outputEBV=false
    • output_heritability=false

Fast-Block Benchmark (fast_blocks=true)

Design

  • Slurm job: 30951323
  • Warmup run before timed cases
  • Timed grid:
    • P ∈ {100,000, 200,000}
    • input chain_length ∈ {223, 446}
    • REPS = 2
  • Because fast_blocks=true and N=50,000:
    • block_size = floor(sqrt(50000)) = 223
    • outer iterations are floor(chain_length / 223) (1 or 2 in timed runs)
  • Extrapolation:
    1. For each P, fit t_mcmc = a + b * outer
    2. Convert target to outer iterations: floor(2000 / 223) = 8
    3. Fit target time vs P using P = 100,000 and P = 200,000, then extrapolate to P = 2,000,000

Measured t_mcmc (seconds)

RepPchain_length inputoutert_mcmc
1100,000223176.515
1100,000446284.821
1200,0002231147.150
1200,0004462171.600
2100,000223170.930
2100,000446284.973
2200,0002231142.255
2200,0004462169.454

Extrapolated Target Time

  • SUMMARY_EST target_seconds = 3449.100
  • target_hours = 0.958 (about 57.5 minutes)
  • Replicate-based range: about 3274 to 3624 seconds (0.91 to 1.01 hours)

Standard BayesC Benchmark (fast_blocks=false)

Design

  • Slurm job: 30951596
  • Warmup run before timed cases
  • Timed grid:
    • P ∈ {100,000, 200,000}
    • chain_length ∈ {2, 4}
    • REPS = 2
  • Extrapolation:
    1. For each P, fit t_mcmc = a + b * chain_length
    2. Predict t_mcmc at chain_length = 2000
    3. Fit those predicted times vs P and extrapolate to P = 2,000,000

Measured t_mcmc (seconds)

RepPchain_lengtht_mcmc
1100,000227.457
1100,000440.214
1200,000255.186
1200,0004107.764
2100,000235.555
2100,000453.252
2200,000256.058
2200,0004109.762

Extrapolated Target Time

  • SUMMARY_EST target_seconds = 735340.060
  • target_hours = 204.261 (about 8.5 days)
  • Replicate-based range: about 701539 to 769141 seconds (194.9 to 213.7 hours)

Summary Comparison

PathEstimated MCMC time at target (N=50k, P=2M, L=2000)
fast_blocks=true3449 sec (0.958 h)
fast_blocks=false735340 sec (204.261 h)

Approximate speedup from block mode in this benchmark setup: about 213x.

Notes and Interpretation

  • These are empirical estimates from short-chain calibration runs, not exact guarantees.
  • For fast_blocks=true, chain interpretation is two-level (outer iterations + inner block repeats).
  • The two Slurm jobs landed on different CPU node types, so absolute times include some hardware variance.
  • Even with that caveat, the gap between block and non-block is very large for this target-scale scenario.