
Impulse responses (IR)

From Fractal Audio Wiki

About Impulse Responses (IRs)

To recreate the tonal characteristics of (mainly) speaker cabs, the Cab block and IR Player block rely on impulse responses (IRs).

An impulse response or IR is a collection of data that represents sound measurements taken from a speaker cabinet or system. A test signal is played through the actual speaker, recorded, and used to generate a profile which is stored as a file. The profile (IR) can then be used by an IR loader, such as the Cab block, and IR Player block (Axe-Fx only), to produce the sound of the speaker.

The terms "cab", "user cab" and "IR" are synonyms.

The Axe-Fx II also used IRs to reproduce the characteristics of specific microphones. This feature has been dropped in later firmware.

Most speaker IRs reproduce the tone of a speaker that was recorded with the microphone close to the speaker, also known as "near-field" or "close-miked". "Far-field" IRs represent the sound of a speaker that was captured at a greater distance. These IRs represent (to a certain extent) the "in-the-room" sound of a traditional guitar speaker; see below.

Wikipedia: impulse response

From the Owners Manuals:


Fractal Audio Systems speaker simulation technology is incredibly accurate, yet some listeners find the sound of IRs unfamiliar at first. This is because impulse responses typically recreate the sound of “close miking.” When you mic a guitar speaker, the mic “hears” something very different to what you might hear. Our ears are by definition “neutral” whereas a mic has distinct “color.” We typically listen at a distance (and speaker tone is very different as we move around) while a mic is inches away and stationary, focusing on the desirable sound at a specific spot. As guitarists, we are accustomed to the sound of a speaker “in the room,” but this is not what our audiences hear. For recording and performing, the close mic’d sound is essentially a universal standard. THIS is the sound that the Cab block is designed to reproduce, and this explains why not only guitarists, but recording and front-of-house engineers have embraced its use. (Of course our Amp models can also be used with a traditional guitar speaker as demonstrated in many of the rig designs in Section 1: Setting Up). If you are new to using mics on a guitar amp, you will find the Cab block is a fantastic way to learn more. To get started, listen to single IRs, or explore the factory presets which combine several at once. For almost a century, artists, producers and engineers have honed the craft of placing or blending mics to achieve a desired tone. Many classic techniques are easy to recreate. Try a tried-and-true “recipe” blending one “bottomy” and one “bright” mic, or try something totally original. The factory content includes dozens of speakers with multiple different mics in different positions. You may also enjoy “Mix” IRs by Fractal Audio or 3rd-parties, which bring a producer’s experience to you in a single IR. In any case, recognize that the sound of IRs is the very sound of the speakers and mics they capture.

FRACTAL AUDIO QUOTES


An IR stands for "Impulse Response". In mathematical terms it is the time response of a system to a Dirac delta function (also known as an impulse). An IR can be used directly as the coefficients for an FIR (Finite Impulse Response) filter. In the modeling world IRs are obtained from real speakers and when processed using an FIR filter produce extremely accurate results. In essence an IR is a "sample" of the speaker and microphone and uses very similar principles. However the quality of any IR is subject to the talents of the individual(s) capturing the IR. Mic placement, preamp choice, etc., etc. are important as you are essentially recording the speaker. In the old days modelers used EQ to emulate speaker response but I don't think there are many left that still use that technique. So the quality of the IR is really the issue here. The original Axe-Fx pioneered this technology which has since become almost ubiquitous.

[1] It's analogous to a sampler. Consider the sample of, say, a kick drum. You make a short recording of that and then "trigger" that recording. If you want the kick drum louder you make the trigger louder. Now assume you trigger that recording thousands of times a second at varying amplitudes. That's essentially how IRs and convolution work. You trigger the recording at the sample rate and each playback is weighted by the sample amplitude.
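The sampler analogy above can be sketched in a few lines of Python. This is only an illustration with made-up toy values (the names `convolve_by_triggering`, `ir` and `signal` are hypothetical, not anything from Fractal Audio's firmware): every input sample "triggers" a copy of the IR scaled by that sample's amplitude, and the sum of all copies is the same as a textbook convolution.

```python
import numpy as np

def convolve_by_triggering(signal, ir):
    """Sum one scaled, delayed copy of the IR per input sample."""
    out = np.zeros(len(signal) + len(ir) - 1)
    for n, amplitude in enumerate(signal):
        # "Trigger" the recording at time n, weighted by the input sample.
        out[n:n + len(ir)] += amplitude * ir
    return out

# Toy data (hypothetical values, not a real cabinet IR).
ir = np.array([1.0, 0.5, 0.25])
signal = np.array([1.0, 0.0, -2.0, 0.5])

triggered = convolve_by_triggering(signal, ir)
reference = np.convolve(signal, ir)  # standard convolution for comparison
print(np.allclose(triggered, reference))
```

A real-time convolution engine would not loop like this, but the result it has to produce is the same sum.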

[2] An IR is the coefficients for a Finite Impulse Response (FIR) filter. A classic EQ is an Infinite Impulse Response (IIR) filter. FIR and IIR are both filters which we call "equalization" in the audio realm. You can design an FIR using various techniques. These techniques yield the coefficients for the filter. Another way to obtain the coefficients is to "sample" them from a linear system. You can approximate an FIR filter with an IIR filter. There are techniques for creating a cascade of second-order sections (SOS) from an FIR filter (Prony's method is one).

[3] "EQ curve" is semantics. It's a filter. In this case usually an Infinite Impulse Response (IIR) filter. There are two types of filters: Infinite Impulse Response (IIR) and Finite Impulse Response (FIR). An "IR" as we use the term is an FIR. It's a filter and, hence, could be said to be an "EQ curve".

The name "filter" arose because it filters out certain frequencies and lets others pass. Both an IIR and FIR accomplish this. The difference is that an FIR has no poles, only zeros. An IIR can have both poles and zeros. An IIR can achieve similar performance to an FIR with a lower order thus making it more efficient to implement. IIRs, however, typically have nonlinear phase response whereas an FIR can be linear phase.
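As a rough illustration of the FIR/IIR distinction described above (a sketch with hypothetical coefficients, not a cabinet model): feeding a single impulse into an FIR filter returns its coefficients and then silence, while even the simplest IIR filter, which feeds its output back, keeps ringing at an ever smaller level.

```python
import numpy as np

def fir_filter(coeffs, x):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * x[n-k] (no feedback)."""
    return np.convolve(x, coeffs)[:len(x)]

def one_pole_iir(a, x):
    """Simplest IIR: y[n] = x[n] + a * y[n-1] (one pole, uses feedback)."""
    y = np.zeros(len(x))
    prev = 0.0
    for n, sample in enumerate(x):
        prev = sample + a * prev
        y[n] = prev
    return y

impulse = np.zeros(16)
impulse[0] = 1.0

coeffs = np.array([0.5, 0.3, 0.2])         # hypothetical 3-tap FIR
fir_response = fir_filter(coeffs, impulse)  # equals coeffs, then zeros
iir_response = one_pole_iir(0.5, impulse)   # 1, 0.5, 0.25, ... never exactly zero
```

This is why an IR file can be loaded directly as the coefficient set of an FIR filter.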

IR properties

File formats

Capturing IRs creates WAVE files (.wav). WAVE files can be converted into SYX (MIDI System Exclusive) files using Cab-Lab or Fractal Audio's current editors. When using Cab-Lab to convert WAV files, and when using IR Capture to create IRs, two files are created: an IR file (.ir) and a SYX file (.syx). The .ir file is raw IR data which can be imported into Cab-Lab for mixing purposes. Cab-Lab is the only application that can handle .ir files.

Sample rate

Impulse responses are tied to the sample rate of the processor. Fractal Audio's modelers are set to a fixed sample rate of 48kHz.

Read this: Sample rate.

Auto Trim

This removes superfluous leading silence from the start of the IR. Many commercial IRs do not require this. It might come in handy when shooting your own IRs using IR Capture.

FRACTAL AUDIO QUOTES


[4] There is no wrong place to trim. It's impossible to know where the data starts because of noise. So we find where the data starts to increase, back up a few samples and trim there.
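The approach in the quote (find where the data starts to rise, back up a few samples, trim there) could look roughly like the sketch below. The threshold and the number of back-off samples are illustrative guesses, and `auto_trim` is a hypothetical helper; the values Fractal Audio actually uses are not documented here.

```python
import numpy as np

def auto_trim(ir, threshold_ratio=0.01, backoff=4):
    """Drop leading near-silence, keeping `backoff` samples before the onset.

    threshold_ratio and backoff are illustrative assumptions.
    """
    ir = np.asarray(ir, dtype=float)
    peak = np.max(np.abs(ir))
    if peak == 0.0:
        return ir  # all silence, nothing to trim
    # First sample whose magnitude rises above a fraction of the peak.
    onset = int(np.argmax(np.abs(ir) >= threshold_ratio * peak))
    start = max(onset - backoff, 0)
    return ir[start:]

# Hypothetical IR: 100 samples of low-level noise, then the actual response.
rng = np.random.default_rng(0)
raw = np.concatenate([1e-5 * rng.standard_normal(100),
                      [0.2, 1.0, -0.6, 0.3, -0.1]])
trimmed = auto_trim(raw)
print(len(raw), "->", len(trimmed))
```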

Minimum Phase Transformation (MPT)

This ensures that the phase of the IR causes no issues when mixed with other (MPT) IRs, by time-aligning the IR. This is especially important when you mix multiple IRs and you don't want to align them manually in Cab-Lab, in the DAW or on the Axe-Fx III hardware. All legacy factory cabs are MPT (Minimum Phase Transformed) to make them mix-compatible. When neither Min Phase nor Trim has been applied, the impulse response is considered "raw", containing the original phase details.

Cab-Lab and the editors let you apply MPT or Auto Trim when importing files, manually or automatically.

Current firmware lets you align multiple IRs in the Cab block manually. Read this: Cab block: Align

FRACTAL AUDIO QUOTES


[5] If it's a cab IR the difference is basically nil because a speaker is a minimum-phase device. All minimum-phase does in this case is automatically remove the leading silence. A room IR is not minimum phase so you should not use MPT when processing a room IR.

[6] The factory cabs are minimum phase for precisely the reason that mixing non-minimum phase leads to phase problems.

All the factory cabs are minimum-phase transformed so they are, by definition, "in phase" with each other.

[7] So I've been doing a lot of critical listening the last couple months and have come to the conclusion I like non-minimum phase IRs better. The difference is subtle. They don't really sound that different but there are differences in the attack and in the feel. They just sound/feel a little more open and realistic. Another thing is that they mix very differently. It's less predictable but more natural. The caveat is that it's like mixing real mics, you need to experiment moving each mic in and out whereas with minimum phase you can usually just leave one mic at zero and move the other in and out. So here's a zip file of my favorite IR session, the Wellspring session, in non-min phase format for use with the Axe-Fx III. My suggestion is to put them in one of the user banks and compare with the factory min-phase versions. Note that names are a bit different but you should be able to figure it out.

[8] Minimum vs. non-minimum phase changes the "delay" of the individual frequency components of the waveform. In a minimum phase system the individual sine waves have the least phase possible which concentrates the energy near the start of the waveform. For example consider a sine wave with an isosceles triangle envelope. The energy is concentrated at the center of the waveform (at the apex). The Fourier transform of that is mostly the primary frequency with a bunch of other sine waves at various amplitudes added. We can phase shift the component sine waves and the magnitude (frequency response) will not change but the waveform will. If we make it minimum phase the sine waves will add up so that the energy is concentrated at the beginning of the waveform and the waveform will then look something like a sine wave with a right triangle (ramp down) envelope.
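To make the idea concrete, here is a minimal sketch of one textbook way to compute a minimum-phase version of an IR, the homomorphic (real cepstrum) method. It is an assumption that this resembles what MPT does internally; the source does not say which algorithm Fractal Audio uses, and `minimum_phase` is a hypothetical helper. The example takes a short sequence whose energy sits at the end, and shows that the transform moves the energy to the start while leaving the magnitude response unchanged.

```python
import numpy as np

def minimum_phase(ir, n_fft=None):
    """Minimum-phase sequence with the same magnitude response as `ir`.

    Homomorphic method; n_fft is assumed to be even.
    """
    ir = np.asarray(ir, dtype=float)
    n = len(ir)
    if n_fft is None:
        # Generous zero-padding limits time-aliasing of the cepstrum.
        n_fft = max(1024, 8 * 2 ** int(np.ceil(np.log2(n))))
    magnitude = np.abs(np.fft.fft(ir, n_fft))
    magnitude = np.maximum(magnitude, 1e-12 * magnitude.max())  # avoid log(0)
    cepstrum = np.fft.ifft(np.log(magnitude)).real
    # Fold the cepstrum onto positive time: keep index 0 and n_fft/2,
    # double the rest of the first half, zero the second half.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    spectrum = np.exp(np.fft.fft(fold * cepstrum))
    return np.fft.ifft(spectrum).real[:n]

decaying = 0.8 ** np.arange(8)   # energy concentrated at the start
late_energy = decaying[::-1]     # same magnitude response, energy at the end
recovered = minimum_phase(late_energy)
print(np.round(recovered, 3))
```

Because the phase information is discarded, the original time alignment and any reflections cannot be recovered afterwards, which matches the advice elsewhere on this page not to apply MPT to room or FullRes IRs.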

[9] I've long maintained that a guitar speaker is essentially a minimum-phase device and that the benefits of MPT'ing the data far outweigh any minuscule differences in the IR phase response (the magnitude response is identical).

[10] Raw is technically "better" but the difference between raw and MPT is minimal. The disadvantage of raw is that you have to manually align them if you are mixing IRs.

[11] FullRes IRs can be processed with minimum-phase or auto-trim, if desired. However, minimum-phase is not recommended as this will tend to destroy the reflection information.

DynaCabs are always time-aligned (but not minimum phase) without destroying phase information, allowing them to be mix-and-matched. You can mix-and-match IRs from different cabs/mics and they'll always be perfectly aligned. They have 0.3 ms of leading silence.

FRACTAL AUDIO QUOTES


[12] It's because they are not minimum-phase transformed.

[13] The files are initially aligned to the same reference point -- a universal standard across all DynaCapture IRs. In other words, IRs for mics that are farther away have been time compensated to be aligned with IRs for mics that are closer. You can use the controls on the Align page in the usual way, however, to add time to any individual IR.

IR resolution

Fractal Audio's amp modelers and software support impulse responses of various lengths. This is also referred to as: resolution. This is measured in samples or milliseconds. The number of (milli)seconds of an IR is calculated by dividing the length (in samples, aka points) by the sample rate. All Fractal Audio devices are set to a fixed sample rate of 48000 Hz. Read this: Sample rate.
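The conversion is simple arithmetic. The sketch below assumes that "1K", "2K" and "8K" mean 1024, 2048 and 8192 samples; the helper name `ir_length_ms` is made up for illustration. The results show that the 20, 40 and 170 ms figures used on this page are rounded.

```python
SAMPLE_RATE_HZ = 48_000  # fixed sample rate of Fractal Audio devices

def ir_length_ms(num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Length of an IR in milliseconds."""
    return 1000.0 * num_samples / sample_rate_hz

for n in (1024, 2048, 8192):  # assumed sample counts for 1K, 2K, 8K
    print(f"{n} samples = {ir_length_ms(n):.1f} ms")
```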

The Axe-Fx III, FM9 and FM3 let you edit the length of a legacy IR in the Cab block (hardware and editor). This affects CPU usage (longer = more CPU usage). The Axe-Fx III also lets you do this to DynaCabs (FM3/FM9: n/a).

Normal (Standard)

The number of samples is 1024 (1K), about 20 ms (21.3 ms at 48 kHz).

However, the Axe-Fx III refers to 2K IRs (40 ms) as Normal IRs.

1024 samples suffice to capture the essential sound of the speaker cabinet, without so-called room reflections. In general, you can use Normal Res without having to worry that the sound is not as good as HiRes or UltraRes. "Normal" is also referred to as: "Standard".

FRACTAL AUDIO QUOTES


[14] 99% of the information in a cabinet impulse response is in the first few ms. 1024 samples is more than enough to capture the essence of the cabinet and much of the room mics. You don't need a DAW, simply send the sysex files to your Axe-Fx.

HiRes

2048 samples (2K), about 40 ms.

Double the length of the "normal" IR, allowing it to store more information after the first 20 ms. HiRes IRs use more CPU than Normal Res, and also more than UltraRes. Because of this, HiRes as a label / format was discontinued, although the Axe-Fx III still supports it (the Axe-Fx III refers to 2K IRs (40 ms) as Normal / Standard IRs).

FRACTAL AUDIO QUOTES


[15] There are a couple reasons for 2048. Probably most important is that it allows 1024 in stereo mode. To be able to do stereo 1024 requires a 2048 convolution engine. Secondly, some IRs benefit from longer IRs. Better to have the ability and not need it than the converse.

UltraRes

Up to 8K samples, about 170 ms.

UltraRes speaker IR processing is a Fractal Audio proprietary technique which enhances the spectral resolution of an IR without adding CPU burden or storage requirements. UltraRes IRs allow more information to be captured than Normal or HiRes IRs, especially in the lower frequencies. UltraRes IRs require more CPU power than Normal Res but less than HiRes. UltraRes IRs are displayed in italics or in a different color in the software editors.

About UltraRes IRs:

FRACTAL AUDIO QUOTES


The problem with conventional IRs is that they are too short to capture the detail in the low frequencies. There are those that maintain 20 ms is the maximum length you need to fully replicate the speaker. This would be about 1000 samples at 48 kHz. I disagree with this as I have many IRs here that exhibit significant energy beyond 20 ms. I believe the room has some influence as the low-frequency modes of the room will impact the resulting sound. The amount of this impact depends on the room, the mics, distance, etc., etc. Or perhaps certain speakers have particularly high Qs in the low frequencies. Regardless, it is my opinion that you need IRs much longer than 20 ms to fully capture the "mic'd amp in the studio" sound. My tests show that IRs of 8000 samples are required to fully capture the low-frequency detail. Unfortunately, processing an 8K IR in real time requires copious processing power... Fortunately I have developed "Ultra-Res" cabinet modeling. Ultra-Res cabinet modeling provides the frequency detail of a very long IR with little or no added processing power requirements.

Existing IRs will still be processed as usual. Ultra-Res IRs will be tagged as such which will indicate to the processor to use the new processing algorithms. Note that Ultra-Res IR data is not conventional IR data.

The frequency resolution of an IR is the sample rate divided by the number of samples in the IR. The window function has nothing to do with frequency resolution (except for making it even less). So a 1K IR at 48 kHz sample rate has a frequency resolution of roughly 48 Hz. If a speaker has a resonance (formant) at, say 80 Hz with a Q of, say, 3.0, then 48 Hz is insufficient to capture that resonance accurately. You need a frequency resolution of several Hz to accurately recreate that resonance. I chose 80 Hz and a Q of 3 because that's what that response looks like. The Q could even be higher than that. It doesn't take much mental energy to realize that if you have a narrow formant at a low frequency then you need fine frequency resolution to reproduce that. An 80 Hz formant with a Q of 3 only spans about 25 Hz. Obviously a frequency resolution of 48 Hz is not going to be able to reproduce that. Windowing only smooths the response even more. This is basic FFT theory. The less time-domain information you have, the less frequency domain information you have and vice-versa. This is the uncertainty principle. I always window IRs with a Hann window.
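The numbers in the paragraph above can be checked with a short calculation. This is only a sketch: the function names are made up, 1024 and 8192 are the assumed sample counts for "1K" and "8K", and the width of a formant is approximated as center frequency divided by Q.

```python
def frequency_resolution_hz(num_samples, sample_rate_hz=48_000):
    """Spacing of the frequency bins of an IR: sample rate / length."""
    return sample_rate_hz / num_samples

def formant_bandwidth_hz(center_hz, q):
    """Approximate width of a resonance: center frequency / Q."""
    return center_hz / q

print(f"1K IR resolution:    {frequency_resolution_hz(1024):.1f} Hz")
print(f"8K IR resolution:    {frequency_resolution_hz(8192):.1f} Hz")
print(f"80 Hz, Q = 3 formant: {formant_bandwidth_hz(80.0, 3.0):.1f} Hz wide")
```

A 1K IR has bins roughly 47 Hz apart, wider than the formant of about 27 Hz, while an 8K IR has bins of about 6 Hz, which fits the "several Hz" requirement in the quote.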

Another way to look at it is to think in terms of formants. That particular speaker has a pronounced 80 Hz formant. It takes well over 100 ms for the energy of that formant to decay to the point of imperceptibility. Obviously a 20 ms IR can't reproduce an event that occurs for over 100 ms. Here is a zoom of the original non-minimum-phase IR (IOW raw time response)... (see thread for image). You can clearly see the 80 Hz formant. There are some room reflections but they are very small. The 80 Hz formant starts well before any reflections. It's obviously a high-Q resonance as it rings for quite a while. The higher the Q, the longer it takes to decay.

Here's another example. (see thread for image) This is one of the new OwnHammer IRs. The IR is OwnHammer_412_MAR-CB_D-120_SS_RBN-121. These IRs are 100 ms long (4800 samples). I windowed the original IR to 4K to prove a point. The blue trace is the original IR (windowed to 4K samples). The green trace is the "typical" 20 ms IR (windowed to 1K samples). The red trace is the Ultra-Res version.

The problem is that human perception is logarithmic and IRs are a linear process. 48 Hz resolution is way more than necessary at, say, a few kHz but not nearly enough at low frequencies. The brute force solution is to use very long IRs, 8K or more. Ultra-Res solves this in a novel way that uses little to no extra processing power and no additional latency.

Normalization is your friend. Rectangular windows are simply truncation and are generally regarded as bad practice due to extremely high sidelobe levels. The choice of window is subjective. I actually use my own custom window that is not really a Hann window but that's proprietary information. My window preserves more frequency detail while still suppressing Gibbs phenomenon. Windowing trades off frequency resolution for sidelobe suppression. My window is optimized for the unique statistics of IRs. For a random process I tend towards Bessel-Kaiser windows. IRs have unique statistics that aren't addressed by any of the standard textbook windows.

Let me state these points:

  1. We don't record guitar amps in airplane hangars or anechoic chambers. We record them in studios.
  2. When we record a guitar amp we carefully set the amp up in the studio to get the best sound "on tape". This involves moving the amp around, placing gobos, etc. When we collected the Producer's Packs IRs we spent hours arranging the amps/speakers, mics and gobos and playing through the amp and readjusting until we were satisfied. This also included adjusting the preamps and mixing board. In one studio we found that we got the best tone raising the cabs off the floor by a couple feet, orienting them towards a particular wall and placing gobos behind (this was the engineer's standard recording arrangement).
  3. At this point our objective of the IR is to capture the sound of that amp/speaker at that position in the room, with the gobos, mics, preamps, etc., etc. The goal is not to capture the raw sound of the amp/speaker in an airplane hangar or outside using a ground-plane measurement and measurement mics. That might be someone else's goal but it is not ours. IOW our goal is to treat the cab, mics, preamps, room, etc. as a whole, as a good engineer/producer would.
  4. Subsequent analysis of the data shows that there is significant energy out to 100ms and even beyond. However there is little energy beyond 200 ms or so (as it should be in a well-designed studio). This observation was the catalyst for the Ultra-Res algorithm. There are other observations about the statistics of the data that I cannot disclose.
  5. Some cabinets displayed noticeable resonances at low frequencies. Others did not. The frequencies of these resonances were not consistent and, not coincidentally, matched the measured resonance of the impedance sweep. It is a logical conclusion, therefore, that the resonance was NOT caused by the room but by the speaker/cabinet combination. Furthermore a plot of the group delay for the raw data showed that the delay of the resonance was too short to be a room mode. Regardless, whether the resonance is from the speaker or room or mics or preamps is irrelevant. All we care about is recreating the sound of that speaker as it would be recorded as accurately as possible.
  6. Truncating an IR destroys information by definition. We don't care where the information comes from, be it the speaker or the room or the mics or the preamps. We want all the information. If a plot of the frequency response of a truncated IR differs considerably from the non-truncated version then we have lost information and concomitant accuracy.
  7. NO ONE producing commercial IRs records them in an airplane hangar, for obvious reasons. The best ones are done in a studio using the same technique we used for the Producer's Packs: setting up the cab, adjusting the position, mics, preamps, etc. and playing through the amp/cab and readjusting until the best tone is achieved. The new OwnHammer IRs are an example of this. Many, if not all, of those IRs exhibit significant energy to 100 ms (and likely beyond but the data stops at 100 ms). Truncating them to 20 ms destroys vital information. You can argue the semantics all day long. I've compared truncated and non-truncated and the difference is clearly audible. It is especially noticeable when chugging power chords. You can hear the resonance. It goes "bonggggggg" as opposed to "thuk". Most importantly it sounds "better" IMO.
  8. Ultra-Res is an algorithm that markedly increases accuracy. It gives the frequency resolution of a 200ms IR without additional processing overhead and no added latency.
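The effect of truncation in point 6 can be illustrated with a synthetic resonance. The sketch below is not measured data: it models a single low-frequency formant as a decaying sine, with 80 Hz and a Q of 8 chosen as hypothetical values, and reports how much of its energy survives when the IR is cut to 1024 samples. The helper names are made up for this example.

```python
import numpy as np

FS = 48_000  # sample rate in Hz

def decaying_resonance(freq_hz, q, num_samples, fs=FS):
    """Simplified formant: sine whose envelope decays as exp(-pi*f*t/Q)."""
    t = np.arange(num_samples) / fs
    return np.exp(-np.pi * freq_hz * t / q) * np.sin(2 * np.pi * freq_hz * t)

def energy_retained(ir, keep_samples):
    """Fraction of the IR's energy contained in its first `keep_samples`."""
    ir = np.asarray(ir, dtype=float)
    return float(np.sum(ir[:keep_samples] ** 2) / np.sum(ir ** 2))

full = decaying_resonance(80.0, 8.0, 16_384)  # hypothetical 80 Hz, Q = 8
kept = energy_retained(full, 1024)
print(f"energy kept after truncating to 1024 samples: {kept:.0%}")
```

With a lower Q or a higher formant frequency the loss shrinks quickly, which is consistent with the remark that the difference is obvious on some IRs and inaudible on others.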

Ultra-Res is especially powerful in Tone Matching applications, particularly real-time matches and was another impetus behind the development.

The myopic only see the IR as a capture of the speaker's "unadulterated" response. As I stated before I believe the future is treating IRs as capturing the entire recording chain including mics, preamps, etc. and have been pushing in that direction. We have already seen the fruits of this labor in the Producer Pack and OwnHammer V2 IRs. We used mainly PP and OH IRs at Axe-Fest this weekend and the results were stellar. Andy Wood's tone was among the best guitar tones I've ever heard live and we dialed it up in 10 minutes under far less than ideal conditions. It consisted of the Two Rock amp model and the EV 12L Mix IR. When you include more than the speaker response in the IR you can have low-frequency resonances that persist for tens of milliseconds or more. Truncating an IR destroys this LF information. In many cases this LF information loss would probably not be perceptible. In other cases, from experience, it can be extremely noticeable. The bottom line is that you can always remove the information if you don't want it but you can't add back what isn't there.

Let me phrase this another way. An IR can consist of the "raw" speaker response plus none, one, some or all of the following: mic, preamp, room, power amp (e.g. you want to capture the response of a tube amp driving the speaker), etc. If you only care about the raw response then a short IR is all that is required. However if you want any of the other elements as part of the IR then a longer IR may be necessary. Ultra-Res gives you the OPTION of processing longer IRs.

[16] If the .wav is only 40ms long there is no sense in converting to Ultra-Res as you won't gain anything. Over 80 ms is desirable. The maximum length supported is 170 ms or so. Anything longer than that is truncated to 170 ms.

[17] To get the optimum results the length should be 170 ms or more. As the length gets shorter you'll lose information. However there may not be any information to lose. It all depends on the IR. I've seen long IRs where only the first 100 ms or so is actual information and the rest is silence. OTOH I've seen 100 ms IRs where there is obviously more information but it got truncated. You lose nothing with Ultra-Res except the ability to change the size of the cabinet. You gain better sound and less CPU.

You can't mix Ultra-Res IRs as the data is not compatible. However... we foresaw that and the UltraRes conversion process produces two files: a .ir file and a .syx file. The .ir file is the raw IR data that can be imported into CabLab for mixing purposes. So CabLab can take .wav, non-Ultra-Res .syx and .ir files as input to the mixer section and produce Ultra-Res .syx files.

The .ir files are included with our cabinet packs. We will not be offering .wav files. If you have the .wav file you don't need the .ir file. A .ir file can ONLY be used with CabLab. If you use the Axe-Fx II to capture IRs it will only generate .ir and/or .syx files. No .wav files are generated. The resulting data can only be used on Fractal Audio products.

[18] It depends on the IR. Ultra-Res improves low-frequency resolution. It is very apparent with some IRs and virtually inaudible with others. It all depends on the low-frequency formants in the original IR. If there are significant, high-Q formants Ultra-Res will preserve those whereas conventional, short IRs will not. Audibility also varies with the amp being used. The difference is more audible with high gain as this will excite the formants more. Low-frequency formants vary with the type of cabinet and speaker. Some cabinets have a smooth low frequency response. Others have prominent formants. The mic also has an impact. Some mics will accentuate the formants. The room also contributes if it has strong LF modes. Furthermore some people like to capture an IR using a tube power amp. In this case you WILL get a significant formant at the low-frequency resonance of the speaker. A conventional IR will not capture that as the Q of the formant will exceed the resolution of the IR. Ultra-Res will capture that formant as Ultra-Res has 8 times the low-frequency resolution. Those who claim they can't hear a difference are correct. They can't. It's nothing to be ashamed of. But because they can't doesn't mean others also cannot. I can clearly hear the difference but I've trained myself on what to listen for. I vastly prefer Ultra-Res and only use Ultra-Res IRs in my personal patches (aside from the TV Mix, which is just a magical IR).

[19] The length of the sweep only determines the signal-to-noise ratio. If the room is completely silent the sweep can be infinitely short (an impulse). To overcome ambient noise you need more energy in the applied stimulus. With an impulse you can only increase the power so much before the amplifier or the speaker or the mic or the preamp, etc. distort. However, if you spread that power out over a longer period of time you can increase the energy and therefore increase the SNR. Think of it this way: a 1 ms pulse at 1000W has the same energy as a 1 second pulse at 1W. Now you can't just put a 1 second pulse into a system because the pulse has little frequency content. A 1 second sweep over the band of interest allows the transfer function (IR) of the system to be obtained via deconvolution. There are other signals you can use like pseudo-noise and MLS sequences but a "chirp" has the best characteristics. In the early days of room IR capturing they used impulses generated by popping a balloon, firing a starting pistol or clapping two boards together. The results were poor due to low SNR. This led to the development of signals that have higher energy. To get the IR of a room long sweeps are typically used because there is a lot of ambient noise and the "returned signal" is weak (the reverb portion of the response is very low compared to the direct signal). When close mic'ing a speaker the ambient noise is low and the signal strength is very high so a short sweep is adequate. In fact you could probably get away with 100 ms or less in a studio environment.
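The sweep-and-deconvolve idea can be sketched as follows. This is a noise-free toy simulation, not the IR Capture implementation: the "speaker" is a made-up five-tap IR, the sweep is a plain linear chirp, and the deconvolution is a simple division in the frequency domain. All function names are hypothetical, and a real capture would additionally need to cope with noise and distortion.

```python
import numpy as np

FS = 48_000  # sample rate in Hz

def linear_sweep(f_start, f_end, duration_s, fs=FS):
    """Sine sweep whose frequency rises linearly from f_start to f_end."""
    t = np.arange(int(duration_s * fs)) / fs
    phase = 2 * np.pi * (f_start * t
                         + 0.5 * (f_end - f_start) / duration_s * t ** 2)
    return np.sin(phase)

def deconvolve(recorded, stimulus, ir_length):
    """Recover the IR by dividing the spectra (noise-free case only)."""
    n = len(recorded)
    spectrum = np.fft.rfft(recorded, n) / np.fft.rfft(stimulus, n)
    return np.fft.irfft(spectrum, n)[:ir_length]

true_ir = np.array([0.0, 1.0, -0.5, 0.25, -0.125])  # made-up "speaker"
sweep = linear_sweep(20.0, 20_000.0, 1.0)
recorded = np.convolve(sweep, true_ir)  # what the mic would pick up
estimated_ir = deconvolve(recorded, sweep, len(true_ir))
print(np.round(estimated_ir, 4))
```

The energy argument in the quote is plain arithmetic: 1000 W for 1 ms and 1 W for 1 s both deliver 1 joule, but the sweep does so without driving the amplifier, speaker or mic into distortion.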

[20] I've never seen a cabinet IR (and I've examined thousands) that has any significant content beyond 150 ms or so. Most cab IRs are under 40 ms. The exception to this would be a "room IR" where the mic is very far from the speaker and the room is significantly reverberant. But one wouldn't normally use that as the primary tone, instead to add a little ambience to the tone and the loss of information would be imperceptible in context. Modeling products typically use IR lengths of 1K samples as this covers 90% of IRs ever captured. We support 2K and Ultra-Res (which is equivalent to 8K) which covers 99% of IRs. The amount of CPU power required to process an IR is proportional to the length of the IR. To support a 500 ms IR (24,000 samples) would require over ten times the CPU power of a 2K sample IR. It also requires over ten times the memory for storage. Given that the vast majority of IRs do not have any information beyond 40 ms it is wasteful of CPU and memory resources to support IRs longer than 2K.
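The "over ten times" figure follows directly if the cost is taken to be proportional to IR length, as the quote states. The sketch below assumes a direct-form FIR with one multiply-accumulate per tap per output sample; how Fractal Audio's convolution engine is actually implemented is not specified here, and `macs_per_second` is a made-up name.

```python
def macs_per_second(ir_samples, sample_rate_hz=48_000):
    """Multiply-accumulates per second for a direct-form FIR (assumption)."""
    return ir_samples * sample_rate_hz

ratio = macs_per_second(24_000) / macs_per_second(2_048)
print(f"500 ms IR vs 2K IR: {ratio:.1f}x the work")
```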

[21] The length of time you hold a chord is irrelevant. The impulse response of a speaker cab is typically much less than 100 ms. Only when there are significant room reflections is the length greater. Then you get into the whole argument of whether the IR should contain any room information.

[22] (Tone Matching and Ultra-Res) In Realtime mode the raw internal IR length is 8K which you can dump.

[23] You can export the Tone Match to CabLab and create an Ultra-Res IR.

[24] I'm a huge advocate of longer IRs. In fact I think I was the first to advocate it despite all the naysayers. I pushed OwnHammer (and others) to increase their IR lengths and they were the only ones who acted on that advice (so far, maybe the other guys will start to follow suit). Ultra-Res was born out of the desire for longer IRs.

For recording you don't need to use the cab block in the Axe-Fx though. Record the raw amp sound and then "re-cab" it later. This way you can try different cabs. Cab-Lab is great for this. Cab-Lab does not do UltraRes processing. It creates UltraRes files for the Axe-Fx but it does all processing at the full IR length up to 8K samples. You can use other convolution plug-ins as well.

The reason for UltraRes is that long IRs have several drawbacks:

  1. They require lots of storage space. Not an issue on a computer but on a hardware product that means expensive non-volatile memory.
  2. They require lots of processing power if you don't want any latency. On a computer it doesn't matter since latency is a non-factor if you are processing prerecorded tracks. On a hardware product we must have zero latency.

So UltraRes was devised as a way to exploit the statistics of the data to give the benefits of longer IRs without the usual hardware drawbacks.

In my tests I've found that 8K samples (170 ms) is more than enough. I think 500 ms (24K samples) is overkill and if an IR has significant energy out that far then it has too much room in it. The speaker and cab itself are never more than 100 ms, usually much less. Anything beyond that is the room. I personally don't like IRs with lots of room in them. A little bit of early reflections are nice and make things sound less direct but too much room makes the sound get lost in the mix.

There's no meaningful data beyond 150 ms and if there is, it's the room and you don't want that much room.

[25] (Ultra-Res 2.0) No big deal, just some improved processing algorithms. The UltraRes cabs in Quantum 2.0 were all reprocessed with UltraRes.

[26] UltraRes 2.0 is the next level of evolution for our patent-pending speaker simulation technology, with even greater accuracy than the original version. UltraRes 2.0 cab files are backwards compatible with previous Axe-Fx and AX8 firmwares supporting UltraRes 1.0.

[27] 1" from the speaker is the near field. The response of a speaker in the near field is very different than the response in the far field. In the near field the response changes (drastically) across the face of the transducer. Even moving the mic a fraction of an inch will result in a very different sound. 10 ft. from the speaker is the far field and the response changes smoothly as you move across the field. If the near field were the same as the far field then the sound wouldn't change as you moved the microphone and you could place the microphone anywhere on the face of the speaker. Anyone who has mic'd a speaker knows that this isn't the case.

[28] The energy of the speaker itself is contained in less than 50 ms. Anything beyond that is room reflections. Therefore any differences that you may hear are room reflections. The question then becomes do you want room reflections. Some say yes, some say no. One approach is absolutely no room reflections and then you add them with room simulation. The other approach is to use longer IRs with reflections in them. Both are valid approaches. Close mic'ing minimizes room reflections as the direct path is much shorter than the reflections path and sound pressure decreases by the square of the distance.

[29] The length doesn't determine the amount of bass. It determines the resolution. If there is a sharp resonance in the bass response a short IR will smear that resonance causing it to be wider than it actually is.
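
As a rough illustration of the resolution point, the spacing between frequency points that an N-sample IR can resolve is about the sample rate divided by N. The sketch below assumes a 48 kHz sample rate (consistent with the "8K samples (170 ms)" figure quoted on this page); the function name is made up for the example.

```python
SAMPLE_RATE_HZ = 48_000  # assumption: 48 kHz processing rate


def frequency_resolution_hz(num_samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Approximate spacing between resolvable frequency points for an IR of this length."""
    return sample_rate_hz / num_samples


# Shorter IRs give coarser spacing, so a narrow bass resonance gets smeared wider.
for n in (512, 1024, 2048, 8192):
    print(f"{n:5d} samples -> about {frequency_resolution_hz(n):.1f} Hz between points")
```

Under these assumptions a 1024-sample IR resolves features roughly 47 Hz apart, which is coarse compared to a sharp low-frequency resonance.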

[30] It's complicated. Some people claim you don't need any more than 20 ms. For many cabs this is true. It depends on the "formants". A guitar speaker is approximately "constant Q" and minimum phase. The peaks and dips are caused by formants and the Q of those formants is roughly constant. This means low frequencies take much longer to decay than high frequencies. I've measured some cabs with a low frequency formant that takes much longer than 20 ms to fully decay. This is the whole reason for Ultra-Res. There are no absolutes. Many cabs can be captured just fine with 10-20 ms. Others need more. If you want some of the room in there then you need more.
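
To see why a roughly constant Q makes low frequencies ring longer, consider the textbook behavior of a single second-order resonance: its envelope decays as exp(-pi * f * t / Q), so the time to fall by 60 dB is about 2.2 * Q / f. The sketch below uses that idealization only. The Q value of 5 is an arbitrary illustration, not a measured speaker figure, and the function name is hypothetical.

```python
import math


def decay_time_ms(freq_hz: float, q: float, drop_db: float = 60.0) -> float:
    """Time for an ideal second-order resonance at freq_hz to decay by drop_db."""
    # The envelope is exp(-pi * f * t / Q); convert the dB drop to nepers and solve for t.
    nepers = drop_db / 20.0 * math.log(10.0)
    return 1000.0 * nepers * q / (math.pi * freq_hz)


Q = 5.0  # hypothetical value, for illustration only
for f in (100.0, 1000.0, 5000.0):
    print(f"{f:6.0f} Hz formant: about {decay_time_ms(f, Q):6.1f} ms to decay 60 dB")
```

With the same Q, a formant at one tenth of the frequency takes ten times as long to decay, which is why a short IR can be adequate for the highs yet truncate a low-frequency formant.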

[31] I've been experimenting with IR length lately and keep finding that I like a shorter length. So I gave some thought to it and I think the reason is that a shorter IR trims off the early reflections. A 1024 sample IR is over 20 ms. If there is a wall 5 ft. away that puts a reflection smack dab in the middle of the IR. The Redwirez IRs you can see (and hear) the ceiling reflection pretty clearly (ceiling was probably about 8 ft). Using a shorter IR removes that reflection. I've actually been turning down the IR length lately on my personal patches, typically 512 samples, as I find it makes the IR "clearer". While the push in the industry has been towards longer and longer IRs I'm not sure that's a good thing unless you are careful with your IR capture to ensure that you aren't capturing reflections. Some IRs, particularly ported bass speakers, may need the longer length to capture the low end with sufficient detail but the average guitar cab is probably fine at 512 or even less samples. Heck, prior to the original Axe-Fx some products were even using 128 samples.
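
The relationship between IR length and reflections can be sketched numerically. The example assumes a 48 kHz sample rate, a speed of sound of about 1125 ft/s, and a close-mic setup in which a surface 5 ft away adds roughly 10 ft of travel to the reflected sound. All names and the geometry are illustrative assumptions.

```python
SAMPLE_RATE_HZ = 48_000        # assumption: 48 kHz processing rate
SPEED_OF_SOUND_FT_S = 1125.0   # approximate, air at room temperature


def ir_length_ms(num_samples: int) -> float:
    """Duration of an IR with the given number of samples."""
    return 1000.0 * num_samples / SAMPLE_RATE_HZ


def reflection_delay_ms(extra_path_ft: float) -> float:
    """Delay of a reflection relative to the direct sound, given its extra path length."""
    return 1000.0 * extra_path_ft / SPEED_OF_SOUND_FT_S


def is_trimmed(extra_path_ft: float, num_samples: int) -> bool:
    """True if the reflection arrives after the end of the IR and is therefore cut off."""
    return reflection_delay_ms(extra_path_ft) > ir_length_ms(num_samples)


# Hypothetical case: wall 5 ft away, so roughly 10 ft of extra travel.
extra_ft = 10.0
print(f"reflection arrives after about {reflection_delay_ms(extra_ft):.1f} ms")
for n in (1024, 512, 256):
    state = "trimmed" if is_trimmed(extra_ft, n) else "still inside the IR"
    print(f"{n:4d} samples = {ir_length_ms(n):5.1f} ms -> reflection {state}")
```

Whether a given length removes a particular reflection depends on the actual geometry of the capture, so the same check can be repeated with the extra path length of the room in question.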

[32] SOMETIMES shorter is better. It depends on the environment they were captured in.

[33] FWIW, I almost always reduce IR length on my personal presets. Usually 512 or 1024. I like the lack of reflections.

[34] The takeaways from this are simple:

  1. Minimize reflections as much as possible when capturing IRs. Shoot them in the largest room you can find. Elevate the speaker off the floor or angle it back to minimize the floor reflection. Make sure there is ample distance behind an open-back cabinet.
  2. For existing IRs that may have prominent room reflections try different IR lengths to trim out the reflections.

About Far-field IRs:

"Far-field" IRs represent the sound of a speaker that was captured at a long(er) distance. These IRs represent (to a certain extent) the "in-the-room" sound of a traditional guitar speaker. There are a couple of far-field IRs among the legacy stock cabs, created by Jay Mitchell ("JM"). Also, Fractal Audio has released far-field IR libraries ("Far-Field Sessions").

"FullRes" IRs also capture longer distances, but these are intended to capture the room, where "far-field" IRs are aimed at capturing the sound coming directly from the speaker at a longer distance. More about FullRes below.

FRACTAL AUDIO QUOTES


[35] Far-field IRs are not the panacea some are making them out to be. Some things need clarification:

  1. A far-field IR will still not sound exactly like "amp in the room". The reason for this is that the dispersion of a guitar cabinet is very different than that of a FRFR speaker. An FRFR speaker has far wider dispersion at high frequencies, by design. With a guitar cabinet the low frequencies are less directional than the highs. This causes the cab to interact with the room differently. So even if you capture a far-field IR it will not sound the same through a FRFR speaker.
  2. Most of the time we are not in the far-field of a guitar cabinet. At 10 kHz the far-field of a 12" speaker is about 18 ft. So usually we're in the far-field at some frequencies but in the Fresnel zone at others. At a typical distance of, say, 5 ft. we are only in the far-field at frequencies below roughly 3 kHz. Above that we are in the Fresnel zone.
  3. Because of #2 the sound at each ear can be quite a bit different. That six inches or so between our ears makes a big difference. When using a far-field IR the same sound will be presented to each ear. Even when in the far-field the sound changes pretty dramatically vs. angle because the dispersion is a function of frequency. One ear will hear more highs than the other.
  4. A cab with more than one speaker creates significant challenges. For example, a 4x12 has a far-field at 10 kHz that's roughly 100 feet! If you capture an IR of that cab at, say, 10 feet you are nowhere near the far-field. At anything other than nadir (aka boresight, 0 degrees) the individual speakers will contribute with different times of arrival. This results in extremely phasey sound (we were able to get some 4x12 IRs by using a special trick but in general you need to be very far away).
  5. We don't hear this phasiness when listening to the real cab though because of #2. We get very different signals at each ear and our brain processes these. When using a Fresnel-zone IR of a 4x12 the same signal goes to both ears.
  6. Many guitar cabs are open back. A far-field IR of an open back cab through an FRFR monitor will sound very different because you're not reproducing the sound coming out of the back of the cab and bouncing off the walls.
  7. The sound of recorded guitar is near-field. This is what most people are used to hearing. So if you're trying to get the sound of your favorite record you won't get that with far-field IRs.

The takeaway from all this is that if you truly want the sound of amp in the room the best way to get that is to use an actual guitar cab. This isn't to say that far-field IRs are useless. They will give you a roughly similar sound to a guitar cab but it's just not the same.
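
The distances quoted in points 2 and 4 are consistent with the standard Fraunhofer (far-field) distance, 2 * D^2 / wavelength, where D is the size of the radiator. The sketch below assumes that this is the formula behind the figures, a speed of sound of 343 m/s, and an effective size of about 30" for a 4x12. Those values and the function names are assumptions for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 degrees C
M_PER_FT = 0.3048
M_PER_IN = 0.0254


def far_field_distance_ft(aperture_in: float, freq_hz: float) -> float:
    """Fraunhofer distance 2*D^2/wavelength, returned in feet."""
    d = aperture_in * M_PER_IN
    wavelength = SPEED_OF_SOUND_M_S / freq_hz
    return (2.0 * d * d / wavelength) / M_PER_FT


def far_field_upper_freq_hz(aperture_in: float, distance_ft: float) -> float:
    """Highest frequency for which a listener at distance_ft is in the far field."""
    d = aperture_in * M_PER_IN
    return SPEED_OF_SOUND_M_S * (distance_ft * M_PER_FT) / (2.0 * d * d)


print(f'12" speaker at 10 kHz: {far_field_distance_ft(12, 10_000):.0f} ft')
print(f'12" speaker heard from 5 ft: far field below {far_field_upper_freq_hz(12, 5):.0f} Hz')
# 30" is a hypothetical effective size for a 4x12 baffle.
print(f'4x12 (about 30") at 10 kHz: {far_field_distance_ft(30, 10_000):.0f} ft')
```

Because the distance grows with the square of the radiator size, going from a single 12" driver to a 4x12 pushes the far field out by a factor of about six.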

[36] One of the things I've found really useful about these (far-field IRs) is they are a good starting point for dialing the amp block in. Near-field IRs can have excessive bass and/or treble. To compensate we might end up doing strange things in the amp block which throws off the distortion character and feel. When using a far-field IR it's very similar to how the amp sounds through a conventional cab. So what I'm doing is using one of the far-field IRs to start, dial in the amp block and then choose a near-field IR. I then adjust the low/high cuts in the cab block rather than adjusting the amp block.

[37] A far-field measurement is only the response of the transducer. There's a couple ways to do far-field measurements:

  1. Suspend the speaker and mic in air far enough above the ground so that the ground reflection arrives after the direct signal.
  2. Use a ground plane measurement technique outdoors or in a space large enough that any reflections arrive after the direct signal.

A room mic is completely different and it will have the room reflections, which are desirable and give the mix "space". It will also have the response of the mic "baked in".

[38]

  1. Close-micing a cabinet is not problematic. It's been done for years and solves a lot of technical issues. It may sound different than listening to that cabinet at distance but it's no more right or wrong than any other micing technique.
  2. Far-field has nothing to do with picking up the sound from multiple speakers and "hearing the sound of the cab as a whole". Many cabinets have only one speaker. The far field of an acoustic radiator (i.e. speaker) is the point where the sound waves coming from the speaker behave as though the radiator is a point source. In the far field the intensity falls off by the inverse square of the distance. At distances less than the Fraunhofer distance the field is characterized by widely varying intensity due to interference. To calculate near field beam patterns you can treat the transducer as a lot of smaller point sources and find the contribution of each point source at a given point. As you move around in the near field each point source has a different phase and intensity due to distance and angle. This interference causes the intensity to vary widely as you move around.
  3. You can measure the far field response of a speaker in a number of ways. One way is a free field measurement. The speaker and mic are suspended far above the ground so that the multipath from the ground occurs after the direct path IR has decayed fully. This is obviously difficult. Another way is using a ground plane measurement. The microphone is placed on a smooth hard surface either outside or in a large enough room so that any reflections occur after the direct path IR has decayed. Placing the mic on the floor effectively removes the floor reflection as the direct path and reflected path are the same.
  4. There is no such thing as a "short impulse". An impulse is, by definition, infinitely short. Regardless, impulses are almost never used to measure IRs. Almost everyone uses sine sweeps or other wideband waveforms (PRN sequences, etc.). The length of the sweep does not need to be short and, in fact, can be quite long. The longer the sweep the better the SNR.
  5. A far-field IR is not a "truer" representation. It is simply the response of the cabinet in the far field. A near field IR is equally "true", it just sounds different. Far field IRs are not a panacea either. They're difficult to obtain for speakers with multiple drivers, i.e. 2x12, 4x12 because the far field is extremely far away. At 10 kHz the far field for a 4x12 is something like 100 ft. (too lazy to do the math right now).

When we listen to multiple driver speakers we are typically in the Fresnel zone. If you take an IR of a multiple driver speaker in the Fresnel zone there will be deep notches in the spectrum due to the different path length of each driver. We don't hear this though because we have two ears and our aural processing averages things out.
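
Point 4 above notes that a longer sweep gives a better SNR. A common rule of thumb, used here as an assumption rather than something stated in the quote, is that with steady, uncorrelated background noise the SNR of the measured IR improves by about 3 dB for each doubling of sweep length. The function name is hypothetical.

```python
import math


def snr_gain_db(sweep_s: float, reference_sweep_s: float) -> float:
    """Approximate SNR change from lengthening a sweep, assuming steady, uncorrelated noise."""
    return 10.0 * math.log10(sweep_s / reference_sweep_s)


for seconds in (1.0, 2.0, 5.0, 10.0):
    print(f"{seconds:4.0f} s sweep vs 1 s: about {snr_gain_db(seconds, 1.0):+.1f} dB")
```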

[39] A reflection free IR will NEVER sound the same as a real cab because the directivity of a monitor is very different.

Even a single 12" guitar cab will sound markedly different than a RFIR because the high frequency beam pattern is markedly different. Cabs with multiple speakers exacerbate the problem.

You can capture an IR of a cabinet at one point in space but it's just that: one point in space. Good monitors are designed to have smooth beam patterns with wide dispersion. A guitar cab has poor dispersion as the frequency increases. This is what causes the infamous "beaming" of high frequencies.

This causes two things. The first is a psychoacoustic effect because the sound changes rapidly with angle and our ears are a finite distance apart. One ear hears something different than the other. The other is the interaction with the room. A monitor, with its broad dispersion will send a wide range of frequencies to the various surfaces in the room. A guitar cab will send more low frequencies to the surfaces where high frequencies will be beamed. The "reverb excitation" system function is therefore different.

As a simple example consider sitting off to the side of a guitar cab and there's a wall on that side. The lower frequencies will hit that wall and reflect back to you but the higher frequencies are beamed and don't hit that wall. Now repeat that with a monitor. By design the monitor has much greater dispersion so the high frequencies hit that wall and reflect back.

The result is that guitar cabs sound "warm" in a room environment when listening off-axis (which is nearly always).

So in theory you could capture an IR at the same distance and angle as your listening point but using that IR through an FRFR monitor will simply not sound the same because of the aforementioned reasons. It may be close enough for some people but IME it's not close enough and does not offer the same experience.

And this is just the tip of the iceberg. You also have to consider near-field vs. far-field. For a guitar cab we are typically in the near-field for the higher frequencies at typical listening distances. This exacerbates the change in sound vs. angle and the resulting psychoacoustic effect.

The irony in all of this is that great pains are taken to obtain a reflection-free IR. That IR is then used in an FRFR monitor that generates reflections and the reflections generated are very different than the reflections generated by the guitar cab. That fact alone makes the whole exercise futile.

The best way to get AITR sound from something that's not a guitar cabinet is to use something that's *almost* a guitar cabinet, like a Celestion F12-X200 and then apply EQ to morph the sound. Or just use a guitar cab and a power amp and stop trying to use a hammer when you really need a screwdriver.

More information:

About Fractal Audio's Far-Field Session 2 IRs:

  • The pack is free. The IRs can only be used with the Axe-Fx III.
  • The IRs are mostly reflection-free in the first 20 ms area. Although the IRs themselves are longer, it is recommended to set IR length in the Cabinet block to 1024.
  • There are "A" and "B" IRs. They indicate different distances. The number which follows is the number of degrees that the cab has been rotated away from the microphone.

FRACTAL AUDIO QUOTES


[40] Some minor reflections from some immovable objects within the "zone of silence". Nothing severe.

[41] We did the best we could given the building and circumstances. There are steel posts that support the roof that were likely the source of the minor reflections. Also the environment was a bit noisy. There's a transformer that was humming. We surrounded it with bags full of foam peanuts in an attempt to reduce the noise. Statistically they aren't perfect but when we listened to them we were quite pleased. It's not difficult to obtain a far-field IR. What is difficult is finding a good space. Since we have the building the only cost to us was our time. Since no studio costs were involved we can offer these as free. If they work for you great, if they don't, nothing lost.

[42] The gap at the beginning is because they aren't min-phase, they are auto-trimmed. The distance from the mic to the cab would be much greater but that's automatically removed (and we manually reduce it before-hand in the IR Capture menu).

[43] The magnitude of the reflections is very low. Comb filtering occurs when you add two signals where one is delayed vs. the other. If the magnitude of the two signals is equal the notch depth is infinite. As the magnitude of one decreases the depth of the notch decreases. Once you get -20 dB down or so the notch is insignificant. The amplitude of the reflections in these IRs is -30 dB down. For example if you have two equal signals and one is delayed by, say, 10 ms there will be infinite notches at 50, 150, 250, ... Hz as the delayed signal will destructively interfere with the non-delayed signal at those frequencies (x - x = 0 => -inf dB). If the amplitude of the delayed signal drops to 1/2 the depth of that notch is now only 6 dB. I.e. x - 0.5x = 0.5x => -6 dB. If the amplitude drops to 0.1 (-20 dB) then the notch is very small: x - 0.1x = 0.9x => -0.9 dB. At -30 dB the notch depth is x - 0.03x = 0.97x => -0.26 dB. This means ~1/2 dB amplitude variation over the spectrum. Our ears can't hear that.
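
The arithmetic in this quote can be reproduced with a few lines. The sketch models the simplest case described there, a direct signal plus a single delayed copy at a given relative level; the function names are made up for the example.

```python
import math


def notch_depth_db(reflection_db: float) -> float:
    """Level at the deepest notch when a reflection below 0 dB is summed with the direct signal."""
    a = 10.0 ** (reflection_db / 20.0)   # reflection amplitude relative to the direct signal
    return 20.0 * math.log10(1.0 - a)    # full cancellation only when a == 1


def ripple_db(reflection_db: float) -> float:
    """Peak-to-notch variation across the spectrum caused by the reflection."""
    a = 10.0 ** (reflection_db / 20.0)
    return 20.0 * math.log10((1.0 + a) / (1.0 - a))


def notch_frequencies_hz(delay_ms: float, count: int = 3) -> list:
    """First few notch frequencies for a reflection delayed by delay_ms."""
    return [(2 * k + 1) * 1000.0 / (2.0 * delay_ms) for k in range(count)]


print(notch_frequencies_hz(10.0))  # notches for a 10 ms delay
for level in (-6.0, -20.0, -30.0):
    print(f"{level:5.0f} dB reflection: notch {notch_depth_db(level):6.2f} dB, "
          f"ripple {ripple_db(level):5.2f} dB")
```

At -30 dB the peaks rise by about as much as the notches dip, which gives the roughly half a dB of total variation mentioned in the quote.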

FullRes

Up to 64K (65,536) samples, 1.37 sec.

FullRes IRs require more CPU power than Normal and UltraRes.

Firmware 17.xx release notes:


Version 17 introduces FullRes™ Impulse Response processing. FullRes processes IRs up to 64K points with zero latency using a novel technique. This provides up to 1.37 seconds of response time. Seasoned producers and engineers often mix in “Room Mics” during recording to increase the depth and liveliness of recordings. However, the typical live room has a reverb time of 500-700 milliseconds, well beyond the 20-40 ms afforded by typical IR processing. FullRes allows capturing the full response of a typical live room and even the response of small-to-medium halls and clubs. FullRes can also be used for convolution reverb applications for reverb times less than 1.37 seconds.

The IR Player block and the Cabinet block both support FullRes IRs. The last two slots of the Cabinet block support FullRes. This is sufficient to provide two room mics, a left and a right, along with two direct mics within a single Cabinet block.

The new FullRes User IR bank (Axe-Fx III Mark II and Turbo only) supports up to 64 FullRes IRs. When capturing an IR selecting the USER FR bank will automatically set the IR Type to FullRes. Likewise, when setting the IR Type to FullRes the bank will automatically be set to USER FR. FullRes IRs can be processed with minimum-phase or auto-trim, if desired. However, minimum-phase is not recommended as this will tend to destroy the reflection information.

The Scratchpad bank has been updated to support FullRes IRs (Axe-Fx III Mark II and Turbo only).

The original Axe-Fx III has less non-volatile memory and therefore does not have the necessary resources to store the IRs. The Scratchpad bank supports FullRes IRs but the data will be lost when the unit is powered off.

Added 10 FullRes IRs to the Legacy bank provided by Valhallir and York Audio (Axe-Fx III Mark II and Turbo only). These are at the end of the bank. These can be loaded into the IR Player blocks or into Slots 3 and 4 of the Cabinet blocks. Note that Slots 1 and 2 of the Cabinet blocks do not support FullRes.

FullRes IRs are intended to capture "room mics" and reproduce these when recording or when playing through headphones. Room mics are sometimes mixed in to increase the depth and liveliness of recordings. They allow capturing the full response of a typical live room and even the response of small-to-medium halls and clubs. [44]

The expected approach is to combine a regular short (close-mic'd) IR with two (longer) room mics for ambience. This separation allows separate control over the direct sound and the (stereo) acoustic space.

Recording example

FRACTAL AUDIO QUOTES


[45] Normal IRs are 2K. FullRes IRs are 32 times larger so an entire bank would provide 32 FullRes slots.
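
The sizes in this quote line up with the durations given in the release notes. The sketch assumes a 48 kHz sample rate and that "2K" and "64K" mean 2,048 and 65,536 samples. The bank size of 1,024 normal IRs is inferred from the "32 FullRes slots" figure and is an assumption.

```python
SAMPLE_RATE_HZ = 48_000          # assumption: 48 kHz processing rate
NORMAL_IR_SAMPLES = 2 * 1024     # "Normal IRs are 2K"
FULLRES_FACTOR = 32              # "FullRes IRs are 32 times larger"
BANK_NORMAL_SLOTS = 1024         # assumption inferred from the quote


def duration_ms(num_samples: int) -> float:
    """Duration of an IR of num_samples at the assumed sample rate."""
    return 1000.0 * num_samples / SAMPLE_RATE_HZ


fullres_samples = NORMAL_IR_SAMPLES * FULLRES_FACTOR
fullres_slots = BANK_NORMAL_SLOTS // FULLRES_FACTOR

print(f"normal IR:  {NORMAL_IR_SAMPLES} samples, about {duration_ms(NORMAL_IR_SAMPLES):.1f} ms")
print(f"FullRes IR: {fullres_samples} samples, about {duration_ms(fullres_samples) / 1000.0:.2f} s")
print(f"FullRes slots in one normal-sized bank: {fullres_slots}")
```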

[46] So I came up with a solution for Mark I owners: the Scratchpad bank supports both standard and FullRes IRs. Obviously you'll lose your IR when you power off.

[47] (Axe-Fx III Mk I) It has plenty of memory for firmware updates. It has limited NV memory for user data storage because at the time it was designed the FLASH chips used were the largest capacity available. The Mark II has larger capacity FLASH chips because they're available now.

[48] There are two cab blocks and each block supports two FullRes IRs so you could, in theory, do stereo 2.66 seconds. We went back and forth on how much time to support. At first I was thinking 2.66 seconds but that would double the CPU use and cut the number of available IRs in half. A survey of the literature showed that the reverb time of most acoustic spaces where recordings are done is less than a second so 1.33 seemed the best balance. I plotted the IRs of a bunch of the room mic IRs I have and the reverb time was typically around 500 ms which is consistent with my experience and the literature.

[49] The point of FullRes IRs is to add some room sound to your recordings and/or headphone playing. To do this you would mix conventional IRs with FullRes IRs.

FullRes IRs require more CPU power than Normal and UltraRes IRs. Also, they are relatively large and require some time to load. Therefore, gapless switching (channels, scenes, presets) is not supported when using FullRes IRs.

FRACTAL AUDIO QUOTES


[50] Full-Res (room) IRs are large and take a long time to load. Gapless switching is not supported when using Full-Res IRs.

To support FullRes, several adjustments have been made:

  • Slots 3 and 4 of the Cab block support FullRes IRs for stereo reproduction (left+right room mics).
  • So does the IR Player block.
  • IR Capture supports capturing FullRes IRs.
  • The Axe-Fx III Mk II gets a dedicated user bank for FullRes IRs.
  • FullRes IRs can be processed with minimum-phase or auto-trim, if desired. However, minimum-phase is not recommended as this will tend to destroy the reflection information.
  • A FullRes Scratchpad is available for auditioning IRs during the capture process.
  • Cab-Lab will be updated to support FullRes IR captures and management. [51]

About FullRes and Far-field IRs: not the same thing. They both capture the sound at longer distances, but where far-field IRs capture the sound coming directly from the speaker, FullRes IRs are intended for capturing the room (ambience) only. Both types of IRs are aimed at reproducing the sound of a guitar speaker as we hear it "in the room", as opposed to close-micing and recording a speaker. Where far-field IRs should have the least possible "room reflections", FullRes relies on them. [52]

FRACTAL AUDIO QUOTES


[53] A far-field measurement is only the response of the transducer. There's a couple ways to do far-field measurements:

  1. Suspend the speaker and mic in air far enough above the ground so that the ground reflection arrives after the direct signal.
  2. Use a ground plane measurement technique outdoors or in a space large enough that any reflections arrive after the direct signal.

A room mic is completely different and it will have the room reflections, which are desirable and give the mix "space". It will also have the response of the mic "baked in".

FullRes also allows the use of convolution reverbs with reverb times shorter than 1.37 seconds. [54]

FRACTAL AUDIO QUOTES


[55] You can ALSO use FullRes IRs to do short-to-medium convolution reverbs. If you have some reverb IRs you can convert them to FullRes IR format and load them in the IR Player block and use that as a reverb.

[56] Algorithmic reverb is considered to be superior to convolution reverb because algorithmic reverb can be perceptually tuned. From a perceptual standpoint the ideal reverb is not found in typical reverberant spaces. See the work done by Griesinger et al. on perceptual reverberation.

The factory FullRes IRs in the Legacy bank are:

  • 190 — 4x12 V2 Viper Room L (Val)
  • 191 — 4x12 V2 Viper Room R (Val)
  • 192 — 2x10 Vibrato Lux Room L (YA)
  • 193 — 2x10 Vibrato Lux Room R (YA)
  • 194 — 2x12 Class A 30W Room L (YA)
  • 195 — 2x12 Class A 30W Room R (YA)
  • 196 — 4x12 Brit Greenback Room L (YA)
  • 197 — 4x12 Brit Greenback Room R (YA)
  • 198 — 4x12 Recto Room L (YA)
  • 199 — 4x12 Recto Room R (YA)

DynaCab

2048 samples.

Part of DynaCab cabinet modeling, available in firmware 22 and later for the Axe-Fx III (and in the corresponding FM3 and FM9 firmware).

Read this: DynaCabs

Far-field IRs and close-miked IRs

See above.

Interesting IRs

Commercial and free IRs

The processors contain a lot of factory cabs, aka stock cabs. Cabinet models list

You can also create your own IRs with IR Capture or software, acquire IRs from sources such as Axe-Change, or from commercial vendors, including Fractal Audio.

Fractal Audio's free IR libraries:

Some commercial manufacturers provide free impulse responses. There are also quite a few popular impulse responses available in the public domain. Some examples:

Acoustic instruments

When recreating the sound of acoustic instruments (acoustic guitar, cello, violin etc.), an IR of an acoustic body may improve the sound. You can find some on Axe-Change.

Acoustic sounds benefit from long IRs with some room ambience, so UltraRes or even FullRes IRs are preferred.

Flat IR

The Factory 2 bank has a "totally flat" IR.

FRACTAL AUDIO QUOTES


It allows you to individually adjust Mic Distance for phase adjustment of the "DI", while still enjoying the Preamp modeling and Mix controls of the Cab block without complex routings.

The totally flat cab has no filtering and sounds absolutely wretched. It's only there for diagnostic and special effect purposes.

IR Capture

Read this: IR Capture

Tone Matching

Read this: Tone Match block

Use Tone Match to capture an IR

FRACTAL AUDIO QUOTES


[57] The IR is vastly more important. Tone Matching is a nifty feature and certainly useful but you'll get far more satisfaction by concentrating on capturing good IRs. The single most important aspect of recording guitar amps is micing the amp. Therefore the single most important aspect of using your Axe-Fx is the IR. People are too hung up on "matching" or "profiling" an amp but fail to realize that when you are doing that you are basically capturing an IR. If you capture the IR separately now you have an IR that is fully separated from the amp and therefore can be used with all models. Matching and profiling cannot mathematically separate the amp's frequency response from the cabinet frequency response. Once you do this you'll be surprised at how accurate the amp models are. I do this all the time and find Tone Matching is unnecessary now (in fact many of the amp models have had their built-in matching data removed in the latest firmware). Any differences between the model and the real amp are so minuscule as to be immaterial. A little tweak of the tone stack or EQ is usually enough to remove any differences. Besides, once you get into mixing you'll realize that you'll be applying EQ anyways so tiny differences in EQ are irrelevant. Moving the mic just a small amount drastically changes the sound. The best producers have mastered micing. You can only fix so much via EQ since EQ is essentially painting with a broad brush where mic technique is akin to using a fine-point brush.

There is NO substitute for shooting an IR of the cab. IMO, this is the single most important thing you can do. Everything else is attempting to learn the cab IR through an indirect method and then you have inseparability. If you shoot the IR then do a Tone Match you can change the cab after or do another IR with a different mic or mic position and your matching data is still valid.

Volterra kernels

FRACTAL AUDIO QUOTES


[58] The Axe-Fx III (and II) actually capture the Volterra kernels when doing an IR capture (it's hidden in the firmware for possible future use). I've studied dozens upon dozens of them and the kernels above first order (the first order kernel is the linear IR) are so small as to be inaudible. The distortion from an amp is orders of magnitude greater even when using a clean amp. The only significant nonlinear thing I've measured that speakers do is thermal compression (that we model already) and "cone cry" which sounds like sh*t. Jay Mitchell is probably the leading authority on speaker design and he has stated pretty much the same thing. I'm all for improvements but they need to be real improvements. I've sat here countless times comparing an IR to the actual speaker with a mic on it doing blind A/B tests and can NEVER tell the difference and I think my ears are pretty good. I dug through my Matlab stuff and found this. (graph) This is an IR of a speaker taken twice. The first time the drive level is around 1W (in red). The second time the amp was turned way up, I would estimate at least 50W (in blue). As you can see the difference is extremely small. There's a small difference from 10 Hz and down which is way below the reproduction range of any system and a difference way up at Nyquist (24 kHz) but that's 100 dB down (!). Furthermore we don't know if the tiny differences are from the speaker or from the amp or the mic or the mic preamp. I should add that speakers can and do distort (when Xmax is exceeded) but it's not a pleasant sound. Since the displacement of the cone is the inverse of the frequency the low frequencies are distorted which is the opposite of what you want when creating "pleasing distortion". Speaker distortion is flubby, flabby and farty. The Axe-Fx II and III can simulate that, if desired, using the Speaker Drive parameter in the Amp block. I always set it to zero. 
There are probably some other modes that cause distortion but, again, these are dwarfed by the distortion of the amp. The only other significant one I've experienced is cone cry. Manufacturers go to great lengths to prevent it from happening. I have a speaker here that does it. Whenever I play a high F it cries and it's annoying.

[59] If you want "nonlinear IRs" you need to use something like Volterra kernels. I've experimented with this and, in fact, the Speaker Drive and Speaker Thump parameters essentially create higher order Volterra kernels based on various amp parameters. The harsh reality is that speakers are pretty darn linear. Things like cone cry are design/manufacturing defects and shouldn't be modeled IMO. There is some low frequency distortion and we model that but a classic, linear IR is the gold standard for replicating the sound of a speaker. Armchair pundits like to pontificate about "IRs are the weak link, blah, blah" but they're typical internet loudmouths who have little to no expertise and would fail a double-blind test every time. Now, thermal compression is definitely a factor and we model that quite extensively. As you play the voice coil heats up and its resistance increases. The radiation resistance does NOT change, however, so the resulting acoustic output decreases. There's a lot more involved when simulating a tube amp though and I can't disclose that stuff because it's proprietary. Another factor is the actual speaker impedance. It's a function of displacement. Again, we model this extensively. As the speaker moves in/out of the magnet the Bl product changes and therefore the inductance and various other electrical parameters change.

Videos