The setting
A physically based flame simulator reproduces visual fire
behavior convincingly but doesn't directly produce sound: time-
stepping a 3D combustion simulation at audio sample rates is
impractically expensive, and small-scale combustion noise comes
from thermo-acoustics that those simulators don't resolve at
typical resolutions (Chadwick & James 2011, §1).
The paper's workaround (§§2.2, 3) is a simplified
analytical sound model. Combustion noise can be derived from a
wave equation forced by the rate of change of the heat release
rate (Crighton et al. 1992; equation 2 in the paper). Under the
premixed-flame assumption (Strahle 1972), the heat release
happens essentially at the moving flame front, so the volume
integral over the combustion region can be rewritten as a
surface integral over that front. Ignoring propagation delays,
1/r distance attenuation, and overall scaling constants
(all fixed multiplicative factors at a chosen listener
position), the radiated pressure reduces to
p(t) = d/dt ∫S(t)
u(x, t) ·
n(x, t) dS
(equation 6), the time derivative of the velocity flux integrated
over the moving flame front. This is also proportional to the
time derivative of the total heat release rate.
The flame solver runs at hundreds of steps per second (360 Hz
for the paper's examples). After interpolating its pressure
output up to audio sample rate, the resulting signal is band-
limited to roughly the simulation's Nyquist (≈ 180 Hz):
a slow-envelope rumble that captures the flame's overall
rate of activity but contains essentially no audible structure
above that. Real fire is broadband (audible content out to
several kHz), so the simulated signal alone sounds dull rather
than crackling. The job of
spectral bandwidth extension (Algorithm 1,
§4 of the paper) is to fill in the missing mid and high
frequencies in a way that sounds plausible and stays time-
locked to the simulated bursts.
Why “partially” physically based
The low-frequency input is physical: it comes from the
§3 sound model derived above. The high-frequency content
added here is statistical: random-phase noise whose
power-spectral-density (PSD) slope and time-varying amplitude are
chosen to match measured fire spectra, not derived from a wave-
level model of small-scale flame turbulence. The empirical
justification is that real fire spectra are well approximated by
a power law
P(f) ∝ f−α
over a wide band:
-
Clavin & Siggia (1991) derived
P(f) ∝ f−5/2
(i.e. α = 2.5) for turbulent
premixed flames from theoretical considerations.
-
Rajaram & Lieuwen (2009) measured combustion-noise spectra
directly and reported exponents in the range
α ∈ [2.1, 3.4].
So the high-frequency content has the right spectral shape
and (because it is gain-modulated by the simulated envelope) the
right burst timing, but the individual high-frequency
samples are random draws, not predictions of a finer-resolution
simulation. Two flame simulations with very different
small-scale combustion structure but the same envelope will sound
similar at high frequency.
How the algorithm works
One noise field is built up front for the whole
clip; a windowed loop then matches and blends it
into the input.
-
Envelope extraction. Apply a zero-phase order-2
Butterworth lowpass at
fc (default 180 Hz) to
the input p(t), producing
pL(t). The absolute value
|pL(t)| is the slow
amplitude envelope that will gain-modulate the high-frequency
noise.
-
Power-law noise field (built once). Construct a
complex spectrum
N(f) = f−α/2 ·
eiφ(f)
with phases
φ(f) drawn uniformly from
[0, 2π). The exponent
−α/2 on
magnitude gives the desired
−α on
PSD. Multiply N(f) by the
high-pass blend filter
blend2(f) to remove its
low-frequency content (which would compete with the input), then
inverse-FFT to get a time-domain noise signal
n(t). The same
n(t) is reused across all windows;
this is what makes overlap-added windows blend without phase
seams.
-
Per-window match. For each triangular analysis
window wi, compute the
FFT of the windowed signal
Y(f) = FFT(p · wi)
and the FFT of the envelope-shaped noise
YN(f) = FFT(|pL| · wi · n).
Then split:
YL(f) = blend1(f) Y(f)
is the lowpass branch and
YN′(f) = blend2(f) YN(f)
is the highpass branch. Solve for the positive scalar
β that makes the
Gaussian-weighted energy of
YL + β YN′
equal to that of the unfiltered windowed signal
Y:
∫ |YL(f) + β YN′(f)|2
g(f) df
=
∫ |Y(f)|2 g(f) df
where g(f) is a Gaussian weighting
centered on fc with
standard deviation fit_width / 3. Expanding gives a
quadratic aβ2 + bβ + c = 0
whose coefficients are integrals of bilinear forms in
YL,
YN′, and
Y; the implementation evaluates each
integral with the trapezoidal rule and picks the positive root via
Vieta's identity.
-
Blend & overlap-add. Inverse-FFT the blended
spectrum
YL + β YN′,
take its real part, and overlap-add into the running output. The
triangular window with stride
halfWidth satisfies the
constant-overlap-add identity
(sums to 1), so the in-band content is reconstructed from window
to window without amplitude modulation (modulo the half-amplitude
effect noted below).
What the controls do
-
α: the power-law PSD exponent.
Larger values produce a darker high-frequency tail (more rolloff,
less crackle); smaller values brighten it. Default 2.5; the
physically motivated range from the cited measurements is
roughly [2.0, 3.5].
-
cutoff fc: the centre of the
crossover. Above fc the
output increasingly comes from noise; below
fc it increasingly comes
from the simulated input. Slide up if you trust your simulated
signal further into the audible band; slide down to replace more
of it with noise.
-
window half-width (samples): the analysis
window covers 2×halfWidth+1 samples and the loop
strides by halfWidth (50 % overlap). Smaller windows
let β track fast envelope
changes (more time resolution) at the cost of a coarser
per-window frequency resolution.
-
Gaussian fit width (Hz): the full width of
the matching kernel; its standard deviation is one-third of this
value. A narrow kernel matches the spectra essentially at
fc only; a wider kernel
averages over a broader band of the spectrum.
-
blend half-width (Hz): the linear ramp
between blend1 and
blend2 spans
2×blend_half_width, centred on
fc. Smaller values give
a sharper crossover; larger values give a gentler one.
-
noise amplitude: an extra global multiplier
on the noise contribution after the
β match. 0 disables the
high-frequency content entirely (you hear only the lowpassed
input, at half amplitude); 1 is the paper's default; values above
1 push the high frequencies louder than the matching solver chose
on its own.
-
RNG seed: integer seed for the shared
PCG32 random-phase generator. The same seed produces the same
phase array in both this JavaScript port and the Python reference,
so the two implementations agree on the extended waveform up to
FFT round-off (~10−6 on the synthetic
validate.html test).
-
match Matlab reference (checkbox): if
ticked, every per-window FFT runs at the full signal length
(NFFT = next_pow2(L)),
bit-for-bit matching the published Matlab code. Off (the default)
uses smaller per-window FFTs (next_pow2(8×halfWidth),
typically 4096–8192) for ~100× speedup at the cost of
evaluating the matching integral on a coarser frequency grid.
Things to notice
-
With the Synthetic burst preset, set
noise amplitude = 0 and play the extended audio:
you hear the in-band tones at half their original amplitude (a
consequence of the algorithm's one-sided spectrum filtering; see
Caveats). Set it back to 1 and the tones gain a fire-like crackle
whose loudness tracks the burst envelope.
-
Drag α from 1.5 to 4.0 and watch the
amber dashed line in the spectrum plot tilt steeper. The audible
change is from crispy / hissy (small α, brighter spectrum)
to dull / muffled (large α, darker spectrum).
-
Drag cutoff fc from 60 Hz up to 500 Hz
and watch the green “blended target” line in the
spectrum plot rotate around the moving yellow cutoff line. At low
cutoffs almost the entire output is noise; at high cutoffs you
keep more of the simulated content.
-
Pick the Flame Jet example and watch the
per-window β trace below the
playback panel: it spikes upward during loud combustion bursts
and falls back between them, exactly the behaviour that ties the
synthesised high-frequency content to the simulated envelope.
-
Toggle match Matlab reference on and off after
loading one of the bundled signals. The audible difference is
usually below the noise floor; the perceptual identity is good
evidence that the small-FFT approximation is safe.
Caveats and things this model is not
-
The Matlab implementation applies a one-sided lowpass to the
two-sided FFT. The negative-frequency mirror is zeroed, so taking
the real part of the IFFT halves the in-band magnitude (a 6.02 dB
drop). The released code masks this by peak-normalising the
output before saving WAV; this port preserves the underlying
behaviour, and the Tier 3 test suite asserts the
−6.02 dB drop. The Download buttons here also peak-normalise
the WAV they save.
-
The high-frequency content is a statistical model, not a
prediction of small-scale physics. Comparing two simulations'
high-frequency content tells you whether their envelopes differ,
not whether their flame structures differ. For research that
cares about the high-end physics (e.g. analysis of
turbulent flame noise), use a higher-resolution simulation or a
wave-level combustion-noise model rather than this synthesis
step.
-
The matching β is a single
scalar per window. If the input has a strong feature very close
to fc in some windows
and not others, β may swing
sharply between adjacent windows; the per-window
β trace lets you see that
directly.
-
Defaults match the paper's example call:
fc = 180 Hz,
α = 2.5,
halfWidth = 500 samples (about 11.3 ms at 44.1 kHz),
Gaussian fit width = 30 Hz, blend half-width = 15 Hz,
noise amplitude = 1, seed = 42. The Reset button restores these.
References
-
Chadwick, J. N., and James, D. L. (2011). Animating Fire with Sound.
ACM Transactions on Graphics (SIGGRAPH 2011), 30(4).
cs.cornell.edu/projects/Sound/fire
·
paper PDF
-
Clavin, P., and Siggia, E. D. (1991). Turbulent premixed flames
and sound generation. Combustion Science and Technology, 78(1–3),
147–155. (theoretical f−5/2 derivation)
-
Rajaram, R., and Lieuwen, T. (2009). Acoustic radiation from
turbulent premixed flames. Journal of Fluid Mechanics, 637,
357–385.
(empirical exponents α ∈ [2.1, 3.4])
-
Zölzer, U., and Amatriain, X., eds. (2002). DAFX: Digital
Audio Effects. Wiley. (general reference for
frequency-domain noise synthesis and the spectral-blend approach.)