Designing the Axis Voicebox S – An Interview with John Reilly

John Reilly (standing) & Marc Phillips (seated, Colleen Cardas Imports)

At the end of the 1990s John Reilly put Axis Loudspeakers on ice after a successful run that included him becoming Australia’s most awarded loudspeaker designer. The Asian financial crisis had hit hard, and John himself had health problems.

Fast forward some 15 years to 2014. John revives Axis Loudspeakers and quietly releases the astonishing Axis Voicebox S. Since then the Voicebox S has consistently stunned everybody who has listened to it. The reaction here in Singapore was the same, judging by the overwhelmingly positive feedback we got from attendees at our Axis/REDGUM product launch event held in August.

This little bookshelf monitor punches so far above its weight class that even jaded industry veterans are amazed. One pro-audio veteran who builds turnkey recording studios said that the Voicebox S was an “eye opener”. A consumer audio veteran suggested that the Voicebox S has more than a good chance of replacing the legendary ProAc Tablette as the definitive bookshelf speaker.

Before and during the two-day launch event John briefly told us about what prompted him to come out of ‘retirement’ after such a long absence. It was tantalising stuff. So we invited him to take a remote interview with us and expand on the snippets that we heard in April.

The Interview

AR (AudioRev): John, thanks for agreeing to take this interview at short notice.

JR (John Reilly): No problem mate. Happy to do so!

AR: Two questions. First, why? You walked away from the audio industry 16 to 17 years ago and then suddenly return with the Voicebox S in hand.

Second, how? I’m sure that anyone who listens to the speakers will quickly have one thought running through their mind: “How did he do it? How is it possible for two little boxes to sound so spectacular?” Did you perhaps, you know, discover ancient speaker design scrolls while exploring a lost temple?

JR: (Laughs) Nah, nothing of the sort mate. I’ll tackle the ‘why’ first, then the how. How much time do you have? (laughs)

I spent many years in Australia deeply involved in the audio industry doing distribution and retail. And of course I was also designing and producing speakers. When I took that extended leave of absence I relocated to China and helped a friend start up a wholly-owned foreign company doing distribution for a commodity in the mainland.

Audio has been my real passion and hobby. After I moved to China the audio bug was always sort of, you know, sort of nagging at me at the back of my mind. I got restless, and decided to do something which I really wanted to do in the early days, but could not because there were always commercial necessities with the past products.

So that answers the ‘why’ question.

AR: Wait, you may not have been doing what you wanted before moving to China, but you must have been doing something right. I mean, your speakers were good enough to win awards! And if I’m not wrong you have won more awards than any other Australian speaker designer.

JR: (Laughs) Well of course I liked my designs, otherwise I wouldn’t have put them into production. They were the best I could do given the limited resources and constraints that I and all designers work under. But you know, speaker design is a craft, where you learn a little more with each completed design. You get better the more you do it. It’s also kind of like obsessive-compulsive behaviour. After a design has gone out the door, you reflect on what you’ve done then ask yourself what-if types of questions. And then you start on the next design, then the next, and so on. Each successive design tries to answer some of those questions.

So yeah, my best designs at the time I quit the audio industry were good, but they weren’t quite what I was looking for. I had a gut feel for what I wanted but those commercial necessities prevented me from exploring ideas that might have helped me give some shape to that gut feel. You know, pin it down.

Anyway, after putting Axis on hold I couldn’t justify the time – and to some extent the cost – it would take to make a one-off pair of speakers just for my own use. So I spent vast sums of money buying state-of-the-art speakers that I thought might satisfy my wants. And when they didn’t I would dispose of them, sometimes pretty quickly. Like the Rolling Stones, I couldn’t get no satisfaction.

AR: (Laughs) Must have been frustrating to have an itch you couldn’t scratch. Sort of like a guy needing a car to get around. Any car will do to meet the need, but he wants a high-performance car. Except that none of the high-performance cars he buys is able to satisfy the want because the guy isn’t able to clearly define what he wants.

JR: (Laughs) Yeah, yeah. You should talk to Ian (Robinson) about fast cars mate.

Eventually I figured out that what I really wanted were two speakers. One speaker would be a small, articulate speaker that had all the virtues I believed the best speakers must have. The second speaker would be a larger one that had all the qualities of the small speaker, but would also be able to produce more low frequency detail. These two speakers would give me what I had been striving for in all my designs. My life-time quest.

AR: John, you’ve missed out one important requirement. In one of our chats you told me that the rocker and drummer in you demanded that these two speakers be able to play loud! Like, rock concert level loud.

JR: Oh yeah. That goes without saying doesn’t it (laughs). And by the way, it means that in most circumstances you’d have to use a solid-state power amp. The pre-amp can be anything, including valves. But you’ve got to have the solid-state muscle because there is no other way to enjoy music at…ah…good levels (laughs).

Well, anyway, after blowing all that money it was obvious that I would have to design and build these speakers myself. One-off. Just for my own use at home. Absolutely no intention to manufacture them as a commercial product. It would be a personal statement about what I had learned about the craft of speaker design during my decades in the industry.

AR: And so the search began.

JR: Yes, the search began.

I was working in Shanghai at the time. And because of my personal interest in audio I eventually got to meet Max Ding. He’s the designer and manufacturer of the Fountek ribbon tweeter I ended up using. He had just gone out on his own after leaving Aurum Cantus. They used his original ribbon tweeter designs in their speaker line-up.

Then a bell went off, because up to the point of meeting with Max I didn’t know of any ribbon tweeters that were reasonably priced. But Max mentioned that his tweeters were reasonably priced. I really didn’t know if they were any good, but I could see that he was a dedicated audiophile, and that we had things in common. So I bought some tweeters from him and brought them with me on one of my trips back to Australia.

I sat on these tweeters for a while. And I mentioned them to some friends, one of whom was Brad (Serhan). Brad was keen to take a closer look at them, so I asked him to design a simple crossover and get a smallish woofer to integrate with the tweeters. He did some measurements and he suggested it would be pretty straightforward as the tweeters measured very flat with a very basic cap to protect the tweeter.

Brad also mentioned that he was toying around with a woofer with a NOMEX paper cone, so I asked him to do his best to match the woofer with the ribbon.

AR: You prefer NOMEX paper cones to plastic ones? Wait let me guess. They are light and stiff, yet relatively cheap because they are not made from expensive exotic materials. And they don’t have the mid-range glare that is characteristic of plastic cones.

JR: Right! Except for the fact that paper cones are not necessarily always cheap! Look, the thing is that paper cones hardly add any sonic character to the sound…

AR: …which means one less issue to have to deal with!

JR: Too right, mate! There are more than enough to deal with as it is. So, anyway, the simple crossover didn’t work. And the second one didn’t work either. The two drivers weren’t integrating well. I knew what I wanted to hear and it clearly wasn’t there. So we had to do more work.

AR: John, how did you decide on the dimensions and volume of the enclosures?

JR: The same way all speaker designers do nowadays. We used a speaker design program to get the dimensions for the enclosure. It also suggested the length and diameter of a basic reflex port. And this is where we made another design decision. Instead of using a cheap plastic port I decided that we must have the port tube fabricated from costly treated cardboard.

AR: Because a plastic port would have resonances?

JR: Not so much resonances, but they can vibrate and I didn’t want to take the chance of this happening.

I was now ready to tune the woofer to the cabinet. This is not an easy process. The port length suggested by the speaker design program is just a guide. The exact port length is then usually arrived at by testing in an anechoic chamber. But there aren’t many anechoic chambers available for rent in Australia, and they aren’t cheap to rent!
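For readers curious about where such a starting figure comes from, here is a minimal sketch of the textbook Helmholtz-resonator estimate for a reflex port. It is not the program or the numbers John used. The function names, the 0.732 end-correction factor (one flanged end, one free end) and the example box are our own assumptions for illustration only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 degrees C (assumed)


def port_area(port_diameter_m):
    """Cross-sectional area of a round port tube, in square metres."""
    return math.pi * (port_diameter_m / 2.0) ** 2


def port_length(box_volume_m3, tuning_hz, port_diameter_m, end_correction=0.732):
    """Estimate the physical port length (m) for a target tuning frequency.

    Uses the Helmholtz resonator relation
        f = c / (2*pi) * sqrt(A / (V * L_eff))
    solved for L_eff, then subtracts an end correction proportional to
    the port diameter. The 0.732 factor is an assumed textbook value.
    """
    area = port_area(port_diameter_m)
    effective = (SPEED_OF_SOUND ** 2 * area) / (
        4.0 * math.pi ** 2 * tuning_hz ** 2 * box_volume_m3
    )
    return effective - end_correction * port_diameter_m


def tuning_frequency(box_volume_m3, port_length_m, port_diameter_m, end_correction=0.732):
    """Inverse of port_length: tuning frequency (Hz) for a given port."""
    area = port_area(port_diameter_m)
    effective = port_length_m + end_correction * port_diameter_m
    return SPEED_OF_SOUND / (2.0 * math.pi) * math.sqrt(area / (box_volume_m3 * effective))


if __name__ == "__main__":
    # Hypothetical example: 10 litre box, 50 Hz tuning, 50 mm port.
    length = port_length(0.010, 50.0, 0.05)
    print(f"Estimated port length: {length * 100:.1f} cm")
```

As John says, a figure like this is only a guide. The final length still has to be settled by measurement or, in his case, by ear.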

So I tuned the port length by ear in the best anechoic chamber I know — outside in an open field on a windless day. This is not easy, I can tell you now. But I prefer it anyway because it guarantees that I hear only the driver. Not the box, and not reflections from nearby surfaces. If I heard what I was listening for then I knew for sure that it was the real thing and not an artefact caused by the test environment.

AR: You must have good friends in the weather bureau!

JR: (Laughs) Well let’s just say that scheduling test sessions was sometimes a hit-or-miss affair.

So, with the port properly tuned I now knew that the bass driver in the enclosure was performing the way I wanted it to.

In the meantime, Brad had finished building a new crossover network. My initial reaction was ‘Blimey! This is BIG!’ But there was immediately an improvement. The stitch was there. Not yet exactly what I wanted, but OK for the time being. Brad and I had many trial-and-error listening sessions over the next six months.

Eventually we both agreed that there were certain resonances that needed to be tamed. Great! We now had specific targets to shoot at. Brad came up with some Zobel networks that I could play around with by changing components. This went on for another six months or so. Brad and I would have listening sessions that sometimes stretched into the early morning hours.
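The interview does not describe the networks Brad built, so the following is only a sketch of the simplest textbook case: a series resistor and capacitor placed across a driver to cancel the rising impedance caused by voice-coil inductance. The driver is modelled as a plain resistance in series with an inductance, which ignores its fundamental resonance, and every component value here is hypothetical.

```python
import math


def zobel_values(re_ohms, le_henries):
    """Return (R, C) for a series R-C Zobel across the driver terminals.

    Textbook choice (an assumption, not the Voicebox S values):
    R equals the voice-coil resistance and C = Le / Re**2.
    """
    r_z = re_ohms
    c_z = le_henries / re_ohms ** 2
    return r_z, c_z


def load_impedance(freq_hz, re_ohms, le_henries, r_z=None, c_z=None):
    """Complex impedance seen by the crossover at freq_hz.

    Without r_z and c_z this is the bare driver model (Re in series
    with Le). With them, the Zobel branch is placed in parallel.
    """
    omega = 2.0 * math.pi * freq_hz
    z_driver = complex(re_ohms, omega * le_henries)
    if r_z is None or c_z is None:
        return z_driver
    z_zobel = complex(r_z, -1.0 / (omega * c_z))
    return z_driver * z_zobel / (z_driver + z_zobel)


if __name__ == "__main__":
    # Hypothetical woofer: 6 ohm voice coil, 0.5 mH inductance.
    r_z, c_z = zobel_values(6.0, 0.5e-3)
    print(f"Zobel: R = {r_z:.1f} ohm, C = {c_z * 1e6:.2f} uF")
    for f in (100.0, 1000.0, 10000.0):
        bare = abs(load_impedance(f, 6.0, 0.5e-3))
        fixed = abs(load_impedance(f, 6.0, 0.5e-3, r_z, c_z))
        print(f"{f:>7.0f} Hz  bare {bare:5.1f} ohm  with Zobel {fixed:5.1f} ohm")
```

In this idealised model the compensated load stays at the voice-coil resistance across the band, which gives the crossover a steadier load to work into. Real drivers depart from the model, which is one reason the component values end up being adjusted by listening.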

You see, what I wanted was for the Voicebox S to be precise with detail. It had to be able to convert the tiniest wiggle in the signal into a sound wave…

AR: …with no overhang or ringing.

JR: No, mate. That would require the drivers to be perfect, and these don’t exist. All you can aim for is the least possible overhang or ringing. Do it right and the distortions won’t be audible.

I also wanted the Voicebox S to be able to construct the phantom scene between the speakers so well that the speakers disappear completely. The listener must not be able to localise any part of the sound to either one of the speakers. This means the speaker must have linear phase. Broadband, no less.

My personal belief is that tuning for linear phase can only be done by ear and according to what I needed to hear. Computer programs can’t measure phase group delays over the entire frequency range at the same time. It’s just not possible, unless you use a DSP. All they can do is measure phase errors one frequency at a time. And then give you an average phase error at the individual frequencies. It can’t give you the phase error between, say, any two frequencies let alone the entire audio frequency spectrum.

AR: Fascinating! John, can you describe what it is your ears are listening for?

JR: It’s really hard to describe mate. Best that I can do is say that I’m aiming for reproduction of voices and acoustic instruments that you can’t tell apart from the real thing. For this I’m drawing on my decades of attending live performances in all kinds of venues. Recording studios, small clubs, large auditoriums, and so on.

The Voicebox S is a really fast speaker because of its broadband phase correctness and the fact that the bass has been tuned as finely as possible so that there is no slowing down of what is being reproduced in the bottom frequencies. When I got this right both Brad and I agreed that we had something special. And that I had taken a somewhat different approach to designing a very accurate loudspeaker. Some might say that my methods are unorthodox. But you know, who cares! I got what I wanted and that’s what matters.

AR: Some audiophiles and critics would say that highly accurate speakers like the Voicebox S are too clinical. I’ve never been sure what ‘clinical’ means, but if it means that the loudspeaker reproduces the recorded audio warts and all, then surely this is what we should be expecting and embracing? Or is it an issue of personal preferences?

JR: Difficult question to answer, mate. Maybe it has something to do with personal preferences. An audiophile who likes to attend live performances is likely to want an accurate speaker. Speaker accuracy may matter less to those who listen mainly or only to studio recordings where the product is dependent on the tastes of the artiste and to some extent the recording engineers.

AR: Could you expand on this, please?

JR: OK. Well, my preferred listening experience is for the music to be as close as possible to the real performance. You can do this in two ways. You can smoke some…ah…relaxants, and then even a TV speaker sounds like live music! The second way is to attend only live performances so that you don’t have to imagine what it would be like.

AR: (Laughs) The first way isn’t an option here that’s for sure. The second way isn’t practical unless you have limitless time and money on your hands.

JR: OK, so let’s agree that the reference is ‘live’ music. The alternative is recorded music. In recorded music you have two kinds: Live recordings, could be acoustic, could be amplified. And studio recordings, which could also be acoustic, or could be amplified.

Then we start to run into issues. No matter what is recorded in a live performance, what the microphone captures will include reflections of the venue where the recording takes place. But that’s good, because that’s what you want to hear. Now play this back in your listening room. You now have the reflections captured in the recording, but you also have the reflections from your room. If you want to hear what was recorded, then you need to control the room reflections and reduce them to a level which your brain can then filter out. Keep in mind all the time that the Voicebox S is only transferring what is being put into it, as this was the only criterion for the design.

In a studio recording there are other issues. Currently fancy electronics are available to do all sorts of things to voices and also instruments and room reverb. So making a decision on a loudspeaker is virtually impossible because there is no real reference. You have the artificial acoustics which the recording engineers add to the recording. Then you add the acoustics of your listening room. What’s the result? The sounds become a big jumble. So how do you choose a speaker? It’s impossible!

So, get the speakers in position first and do basic room correction. Then fine tune to your taste.

AR: So what you are saying in a nutshell is that you want to be hearing more of the direct sound from the speakers and less, not nothing, from the room. Some people mistakenly think that the listening room should have the same or similar acoustic to that of a recording studio. Which is actually a bad acoustic for listening to music. Visitors to our demo room are always surprised at how live it is.

JR: Yes, and I hope that you are telling them that the typical home listening room has most problems in the 80Hz to 120Hz range, where your broadband bass traps are designed to be most effective. Your diffusors are great at controlling reflections from the mids up.

AR: It is surprisingly difficult to convince people to spend money on fixing the room first before dropping a few thousand dollars on tweaks, or even more for new electronics. We have a high-end customer in Germany who says that the bass traps and diffusors he bought from us were the best money he ever spent on hifi. Says it was like upgrading by buying a whole new system. Sorry, I didn’t mean for this to sound like a plug.

John, anything else to add before we wind up?

JR: Oh yeah. I wanted to say one more thing. About harshness in recordings.

When you listen to a live rock band, which is amplified and therefore loud, there is never an occasion where you would say it sounds harsh. Same for a live classical concert. Violins, trumpets or vocals never sound harsh. The problems are all in the reproduction chain for recorded music. From the microphone to the speakers to the room to the electronics.

Most speaker designers are designing speakers to compensate for these shortfalls. So you end up with speakers that are mild and polite so as to not reveal the harshness in these recordings. This is a big compromise in the pursuit of high-fidelity audio.

I don’t believe in covering up for the sins of others. So I ended up with the Voicebox S. Good recordings must sound good, bad recordings must sound bad. What I want to be able to do is create a loud amplified rock performance in your listening room, but without the harshness.

AR: Wow! Imagine that. Led Zeppelin live in your living room!

John, thank you very much for taking this interview. We could never have imagined that you’d share all of this with us. To put a stake in the ground and say “This is John Reilly.”

JR: No problem mate! Thanks for listening to me talk for so long!
