
The following test cases provide representative evolving AMR mesh hierarchies. The data files trace*_all.dat are useful for evaluating different dynamic partitioning schemes for block-structured mesh hierarchies. The format of trace*_all.dat was proposed by Johan Steensland, Sandia National Laboratories.

- Roe solver with carbuncle fix in 2nd order multidimensional Wave Propagation Method with MUSCL slope limiting
- Base grid 480x120, 3 additional levels of refinement, refinement factors 2,2,4
- Contour plots with levels at final time: Full domain, Detail, schlieren plots: Detail
- Results trace.txt, out.txt, solver.in: Ramp2dBw4.tar.gz (BlockWidth=4, Efficiency=70%), Ramp2dBw2.tar.gz (BlockWidth=2, Efficiency=70%), Ramp2dBw4_050.tar.gz (BlockWidth=4, Efficiency=60%)
- Real imbalance results on 8, 16, 32, 64 processors ALC with current parallelization strategy: Ramp2dBw4ALC.tar.gz (BlockWidth=4, Efficiency=70%)
- SFC statistics result: PDFs, gnuplot
- SFC videos: 8cpus, 16cpus, 32cpus
- Source codes
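As a quick check on the grid parameters listed above, the resolution of a uniform grid matching the finest level is the base grid multiplied by the product of the refinement factors. The sketch below works this out for the 480x120 base grid with factors 2,2,4; the helper name `finest_resolution` is introduced here for illustration only and is not part of the test-case sources.

```python
from math import prod
from typing import Sequence, Tuple


def finest_resolution(base: Tuple[int, int], factors: Sequence[int]) -> Tuple[int, int]:
    """Return the uniform-grid resolution equivalent to the finest level.

    base    -- number of cells of the base grid in x and y
    factors -- refinement factor of each additional level, coarse to fine
    """
    scale = prod(factors)  # cumulative refinement relative to the base grid
    return (base[0] * scale, base[1] * scale)


# Base grid 480x120 with refinement factors 2,2,4 (cumulative factor 16)
print(finest_resolution((480, 120), [2, 2, 4]))  # (7680, 1920)
```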

- Robust two-component Roe-HLL solver in 2nd order multidimensional Wave Propagation Method with MUSCL slope limiting
- Base grid 240x120, 3 additional levels of refinement, refinement factors 2,2,2
- Schlieren plots of full domain (turbulent interface fully refined): t=0.5, t=1.0, t=1.5, t=2.0, t=2.5
- Results trace.txt, out.txt, solver.in: ShockTurb2dBw4.tar.gz (BlockWidth=4), ShockTurb2dBw2.tar.gz (BlockWidth=2)
- Real imbalance results on 8, 16, 32, 64 processors ALC with current parallelization strategy: ShockTurb2dBw4ALC.tar.gz (BlockWidth=4)
- SFC statistics result: PDFs, gnuplot
- SFC videos: 8cpus
- Source codes

- Robust two-component Roe-HLL solver in 2nd order multidimensional Wave Propagation Method with MUSCL slope limiting
- Base grid 200x200, [0,8]x[0,8], 4 additional levels of refinement, refinement factors 2,2,4,2
- Plots at t=2.1: Contour plots on levels, Schlieren plot, schlieren plots of origin: t=0.0, t=0.3, t=0.6, t=0.9, t=2.1, contour plots with levels of origin: t=0.3, t=0.6, t=0.9
- Results trace.txt, out.txt, solver.in: ConvShock2dBw4.tar.gz (BlockWidth=4), ConvShock2dBw2.tar.gz (BlockWidth=2), ConvShock2dBw4OldClusterer.tar.gz (BlockWidth=4, old Berger-Rigoutsos clustering algorithm)
- This is the only example where the cost of the BR algorithm takes up a significant portion of the overall time. The example reveals that the new implementation by Johan is about twice as expensive as the old one, at least for large problem sizes. In this example it takes more than 50% of the overall time. We should take a closer look at the new code again.
- Real imbalance results on 8, 16, 32, 64 processors ALC with current parallelization strategy: ConvShock2dBw4ALC.tar.gz (BlockWidth=4)
- SFC statistics result: PDFs, gnuplot
- SFC videos: 8cpus, 16cpus, 32cpus
- Source codes

- Robust two-component Roe-HLL solver with 2nd order MUSCL slope limiting and Godunov dimensional splitting, source term for one-step chemistry
- Base grid 220x100, 2 rectangular regions cut out, 5 additional levels of refinement, refinement factors 2,2,2,2,2, block width 4 (default)
- Schlieren plots of full domain: t=0.1, t=0.2, t=0.3, t=0.35, t=0.4, t=0.5, contour plots with levels at t=0.35: Full domain, Detail
- Results trace.txt, out.txt, solver.in: DetChan2dBw4.tar.gz (BlockWidth=4, Efficiency=85%), DetChan2dBw2.tar.gz (BlockWidth=2, Efficiency=85%), DetChan2dBw4_065.tar.gz (BlockWidth=4, Efficiency=65%)
- Real imbalance results on 8, 16, 32, 64 processors ALC with current parallelization strategy: DetChan2dBw4ALC.tar.gz (BlockWidth=4, Efficiency=85%)
- SFC statistics result: PDFs, gnuplot
- SFC videos: 8cpus
- Source codes

- Van Leer flux vector splitting with 2nd order MUSCL slope limiting and Godunov dimensional splitting
- Base grid 200x160, 3 additional levels of refinement, refinement factors 2,2,2
- Contour plots with levels at final time: Full domain, Detail, schlieren plots: Full domain, Detail
- Results trace.txt, out.txt, solver.in: Spheres2dBw4.tar.gz (BlockWidth=4), Spheres2dBw2.tar.gz (BlockWidth=2)
- Real imbalance results on 8, 16, 32, 64 processors ALC with current parallelization strategy: Spheres2dBw4ALC.tar.gz (BlockWidth=4)
- SFC statistics result: PDFs, gnuplot
- SFC videos: 8cpus
- Source codes

The balance.txt files contain: [step] [physical time] [maximal relative workload] [minimal relative workload] [difference of both]

Because of a small programming error, the maximal and minimal workloads in all files reflect the average over time.
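A minimal sketch of reading the five balance.txt columns into named records follows. It assumes whitespace-separated numeric columns with one record per line, which is not stated above, and it silently skips any line that does not fit. The names `BalanceRecord` and `parse_balance` are hypothetical.

```python
from typing import List, NamedTuple


class BalanceRecord(NamedTuple):
    step: int
    time: float        # physical time
    max_work: float    # maximal relative workload
    min_work: float    # minimal relative workload
    difference: float  # difference of both, as stored in the file


def parse_balance(text: str) -> List[BalanceRecord]:
    """Parse balance.txt content; assumes whitespace-separated numeric columns."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # not a five-column record
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue  # non-numeric line, e.g. a header
        records.append(BalanceRecord(int(values[0]), *values[1:]))
    return records
```

The fifth column is kept exactly as stored rather than recomputed from the third and fourth.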

The trace*_all.dat files contain: [steps] [max sum work] [sum max work] [avg work] [avg sync] [avg orphan work] [avg move] [max sum work balance] [sum max work balance]
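The nine columns can be loaded by name for plotting or comparison, for example as sketched below. The layout is assumed to be whitespace-separated numeric columns, one record per line; the identifiers in `TRACE_COLUMNS` are simply the bracketed labels above turned into Python names, and `read_trace_columns` is a hypothetical helper.

```python
from typing import Dict, List

# Column labels from the description above, in file order.
TRACE_COLUMNS = [
    "steps",
    "max_sum_work",
    "sum_max_work",
    "avg_work",
    "avg_sync",
    "avg_orphan_work",
    "avg_move",
    "max_sum_work_balance",
    "sum_max_work_balance",
]


def read_trace_columns(text: str) -> Dict[str, List[float]]:
    """Return one list of values per column; assumes whitespace-separated numbers."""
    columns: Dict[str, List[float]] = {name: [] for name in TRACE_COLUMNS}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(TRACE_COLUMNS):
            continue  # skip lines that are not nine-column records
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue  # skip non-numeric lines
        for name, value in zip(TRACE_COLUMNS, values):
            columns[name].append(value)
    return columns
```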

Given a box hierarchy *bh*, the metrics are based on the following calculations (by Randolf Rotta):

- orphans (level l, proc p): intersection(bh[level=l,proc!=p], bh[level=l+1,proc=p])
- movement (level l, proc p): intersection(bh_old[level=l,proc!=p], bh[level=l,proc=p]) + intersection(bh_old[level=l,proc!=p], orphans[level=l,proc=p]) (what processor p *receives*; note that this does not account for boundaries!)
- synchronization (level l, proc p): intersection(grow(bh[level=l,proc!=p]), bh[level=l,proc=p]) + intersection(grow(intersection(adjust(bh[level=l+1,proc!=p]), bh[level=l,proc=p])), bh[level=l,proc=p]) + intersection(bh[level=l,proc!=p], bh[level=l+1,proc=p]) (what processor p *sends*, including orphans)
- work (level l, proc p): bh[level=l,proc=p] + orphans[level=l,proc=p]
- balance: max/average
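The orphan, work, and balance definitions above can be sketched for 2D boxes as follows. This is only an illustration under stated assumptions, not the code used to produce the trace files: all boxes are assumed to be given in one common index space (so a level l+1 box can be intersected directly with a level l box), boxes within a level are assumed not to overlap, and sizes are measured in cells. The `Box` layout and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout: (level, proc, x0, y0, x1, y1),
# covering the half-open index range [x0, x1) x [y0, y1).
Box = Tuple[int, int, int, int, int, int]


def cells(b: Box) -> int:
    """Number of cells in a box."""
    return (b[4] - b[2]) * (b[5] - b[3])


def overlap_cells(a: Box, b: Box) -> int:
    """Number of cells in the intersection of two boxes (0 if disjoint)."""
    w = min(a[4], b[4]) - max(a[2], b[2])
    h = min(a[5], b[5]) - max(a[3], b[3])
    return max(w, 0) * max(h, 0)


def orphans(bh: Sequence[Box], level: int, proc: int) -> int:
    """intersection(bh[level=l,proc!=p], bh[level=l+1,proc=p]) in cells."""
    coarse = [b for b in bh if b[0] == level and b[1] != proc]
    fine = [b for b in bh if b[0] == level + 1 and b[1] == proc]
    return sum(overlap_cells(c, f) for c in coarse for f in fine)


def work(bh: Sequence[Box], level: int, proc: int) -> int:
    """bh[level=l,proc=p] + orphans[level=l,proc=p] in cells."""
    own = sum(cells(b) for b in bh if b[0] == level and b[1] == proc)
    return own + orphans(bh, level, proc)


def balance(loads: Sequence[float]) -> float:
    """max/average over the per-processor loads."""
    return max(loads) / (sum(loads) / len(loads))


# Made-up hierarchy: two level-0 boxes on procs 0 and 1, one level-1 box on proc 0
# that reaches into the region owned by proc 1 on level 0.
bh: List[Box] = [(0, 0, 0, 0, 4, 4), (0, 1, 4, 0, 8, 4), (1, 0, 3, 1, 6, 3)]
loads = [work(bh, 0, p) for p in (0, 1)]
print(loads)           # [20, 16]
print(balance(loads))  # 20 / 18, about 1.11
```

In this made-up hierarchy the level-1 box on processor 0 overlaps 4 cells of the level-0 box owned by processor 1, so those 4 cells are counted as orphans for processor 0 on level 0.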

RalfDeiterding - 18 May 2005

Copyright © 1997-2020 California Institute of Technology.