Matan Arazi on Music Servers

Written by: Jeff Fritz; Category: General Interest & Interviews; Created: 01 June 2011

If you’ve kept up with the audio trade shows of late, such as the annual Consumer Electronics Show, and perhaps even wandered into a Magico demonstration to hear their speakers, you’ve probably heard a pretty ambitious music server that was also demonstrated in June 2009 at the Computer Audiophile Symposium in Berkeley, California. The fellow who developed this unique audio component did so because Magico wanted the best possible source with which to demonstrate their loudspeakers. Since music servers have been an area of increasing interest to audiophiles worldwide, we wanted to learn more about them from Matan Arazi, the designer of the Audeeva Conbrio music server.

Jeff Fritz: Can you tell us a bit about your background?

Matan Arazi: I grew up all over the world because my dad was a diplomat. My family was stationed in Tokyo during the late ’80s, which was the peak of Japan’s economic bubble, and when it was the mecca of technology. So I spent my teenage years running around Akihabara and living the tech scene at the time. I played classical piano from a young age, and the high-end audio stores in Tokyo were ever interesting because they were such a great fusion of music and technology. Some systems sounded better than others . . . and there we go -- I contracted audiophilia. Five years later we returned to Israel, where I finished high school and was drafted into the compulsory military service there. Fortunately, instead of wielding a gun, I qualified for an elite unit of the military intelligence corps, where for three years I worked with mathematical algorithms, signal processing, and various communications equipment. It was actually a lot of fun, and in many ways was a direct continuation of my interests prior to the army. It was also when I met Yair Tammam, who introduced me to the world of boutique audio designs, and who later went on to become Magico’s chief technical officer.

After the army, I moved to Los Angeles and was fortunate enough to be part of a few great companies that were eventually acquired by some of the world’s largest technology companies. Around the same time, Yair introduced me to Alon [Wolf], just as Magico started developing the Magico Ultimate speaker. I ordered a pair in 2003.

JF: What are your general thoughts on music servers? Do you think they offer superior performance to disc-based players?

MA: Music servers are to the audiophile what the iPod is to a teenager these days. They bring an evolution of performance and convenience, and can be optimized for both without compromising either. From the "convenience" point of view, they offer direct access to your entire music library and the ability to easily search for any track, album, composer, etc., with Web browsers, remotes, and tablets. Streaming to multiple zones and integration with smart-home systems are a big plus, too. With a (properly designed) music server you’re only moving electrons around from storage to processing to DAC. In contrast, a disc-based player has a much harder job to do: It has to mechanically spin a disc at variable RPMs, fire and focus a laser, then read its reflection, all while moving a mechanism that tracks a moving spiral of pits whose turns are only 1.6 microns apart -- plus having to move electrons around from the sensor output to processing to the DAC. Quoth Einstein: "Everything should be made as simple as possible, but not simpler." So the fewer the moving parts (and a music server can have none), the cleaner the environment, and the simpler and more direct the flow of data, the better the outcome. I’ll take electrons over mechanics any day. So, yes, I do believe that music servers can deliver superior performance to disc-based players.

JF: What are your technical priorities when designing a music server?

MA: Two priorities: The first is that design and improvement must be based on objective, empirical scientific criteria and theory, and the second is an obsessive underlying philosophy that "less is more." The design of both the software and the hardware of my music server reflects this throughout. We try to do the minimum and do it -- and just it -- as best we can. This means streaming low-jitter, bit-perfect audio to the DAC while eliminating all sources of interference, be they electrical, mechanical, or programmatic. "Add nothing, subtract nothing" is the goal. In the real world, this means using a low-latency, highly deterministic operating system and real-time, memory-buffered playback software running on low-power, low-heat hardware and lots of isolation, damping, and insulation.

I want to take a second here to explain my views on the concept of "best": I believe the world is relative, and that there is no absolute best. Whenever I say "best," it means the best I could do. Other people may have different opinions, and I respect those -- or they may have different experiences than I do. It’s analogous to wine -- while good wine is clearly better than bad, there can never be a universal "best."

JF: How do you ensure bit-perfect output from a music server?

MA: "Bit-perfect" is merely a stepping-stone to achieving good sound. In the simplest terms, it is also the default manner in which a computer treats data -- when you copy a file from your hard disk to a USB stick, it is "bit-perfect" in the sense that each and every bit of the destination file is identical to the corresponding bit of the original. This can be validated using mathematical algorithms that have an error rate of about one in 2^80 (that is, the odds of a mistake are approximately one in 1.2 trillion trillion). To validate bit-perfectness, we created a set of test files with bit patterns that are statistically the most vulnerable to jitter and other corruption. We stream these files and use a multichannel, high-speed logic analyzer to measure the bitstream at the output of the server, at the input of the DAC, and (when using the Pacific Microsonics Model 2) at the digital output of the DAC. The resultant plots show us the signals and any skew between them, and the logic analyzer also captures the datastream, which we then compare to the original file using those same mathematical algorithms. Also, on HDCD-capable DACs, confirming that the HDCD light turns on and stays on is a good (but not definitive) indication. Human listening tests follow. But, as mentioned, bit-perfect is just the start; unlike conventional computer files, audio data are very timing-dependent, so we have to make sure that we move the bits to the AES interface (inside its subenclosure) and then to the DAC at the most precise intervals, and without any variance. Less is more: Fewer interruptions and fewer irregularities result in better audio.
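
The file comparison described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the interview does not name the "mathematical algorithms" involved, so SHA-256 from the standard library stands in for them, and the function names `file_digest` and `is_bit_identical` are hypothetical.

```python
import hashlib

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_bit_identical(original_path, captured_path):
    """True when the captured datastream hashes to the same digest as the source."""
    return file_digest(original_path) == file_digest(captured_path)

# The odds quoted in the interview, written out as an exact integer.
ODDS = 2 ** 80  # 1208925819614629174706176, roughly 1.2e24
```

A matching digest shows only that the bits arrived intact; as the answer goes on to say, it reveals nothing about when they arrived.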

JF: How important is noise -- electrical, mechanical, etc. -- in a music server? Is EMI or RFI an issue?

MA: Extremely important. Once we’ve established that a music server is bit-accurate (and virtually all good ones are), and after we’ve reduced jitter to the minimum possible, we have to start looking at the analog aspects of the digital connection to the DAC. Let me explain this for a second: Even though the connection between the music server and the DAC is a digital one, that digital signal still passes through a cable, which is an analog medium. Transmitting a pure digital signal through any analog medium requires infinite bandwidth, which can be achieved only theoretically. Thus, the digital signals are modulated (the exact modulation type depends on the type of connection and cable) so they can be transmitted through a cable, and the modulation/demodulation process can add undesired effects to the signal. In most computer-based scenarios this isn’t a problem, but with a DAC it matters, because DACs bridge the digital and analog domains and are highly susceptible to analog noise. Furthermore, since the computer is an environment full of electromagnetic noise, some of this noise can be transmitted by the cable and work its way into the DAC. Based on our experiments, I believe that differences in the analog parameters of the connection between the music server and the DAC account for the majority of the differences in how different music servers sound, even when using the same DAC. It is this radiated noise (along with differences in grounding) which is also the reason why some people notice differences in the sound when different USB cables are used, or why music servers with solid-state disk drives generally sound better than those with mechanical hard drives. As before, less vibration and less noise mean better quality.
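
The bandwidth point can be made concrete with the Fourier series of an ideal square wave. This is a standard textbook result, not something taken from the interview, and the symbols $A$ (amplitude) and $f_0$ (fundamental frequency) are introduced here purely for illustration:

```latex
x(t) = \frac{4A}{\pi} \sum_{k = 1, 3, 5, \dots}^{\infty} \frac{1}{k} \sin\left(2\pi k f_0 t\right)
```

The odd harmonics fall off only as $1/k$ and never stop, so a perfectly square edge needs every one of them. Any real cable passes a finite band, which rounds the edges and leaves the receiver to decide where each transition occurred.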

JF: Are there any intrinsic advantages to either Mac- or Windows-based computers? What other options are there?

MA: The design of any modern desktop operating system is not ideal for a music server or other real-time devices. This is the reason why many medical, aviation, and aerospace computers use dedicated, minimalistic, real-time operating systems. In a desktop computer, the operating system must handle the screen, the keyboard, the mouse, various applications, etc., all while maintaining the stability of the entire system. This is very useful for a general-purpose computer such as a Windows PC or Mac, but comes at a price.

In contrast, real-time operating systems such as VxWorks, or specialized versions of Linux, can afford to shed many of these safeguards and functionality because they serve dedicated purposes and operate within tightly controlled parameters, and thus can push the envelope. Both Windows 7 and Mac OS X are very good operating systems, and, with some careful tuning -- such as reducing the running processes and drivers to the bare minimum, optimum memory configuration, and so forth -- can certainly sound decent. But one has to understand that there is only so much that can be done. Contemporary desktop and laptop computers are a total overkill for music playback, as any processor can handle the computational requirements for audio playback in its sleep! Making matters worse, all monitors contain inverters, high-frequency drivers, and other electronic circuitry that emits lots of EMI/RFI, as does the graphics card that drives the monitor.

I like the Linux-based players, such as Jesus’ VortexBox and Demian’s Auraliti, because they don’t employ a local graphical user interface, yet bring a streamlined Linux-based platform to the realm of the nongeek audiophile at reasonable cost and deliver great performance. Just to give you an example, a typical general-purpose operating system such as Windows 7 or Mac OS X might have close to a hundred discrete processes running at the same time in its default, out-of-the-box configuration. Careful tuning and optimization might halve that; our proprietary operating system has a total of five processes that run concurrently, with three of them dedicated to the real-time music-playback application. It’s simply impossible to achieve that with anything that is generic.

JF: Have you discovered anything about playback systems relying on Windows or Mac OS X that might help readers optimize their existing computer-based playback system?

MA: Certainly. Continuing the preceding comment, and beating the poor ol’ horse to death: less is more. Proper optimization of either Windows or OS X comprises two main objectives: reducing the number of processes/services/applications/drivers running to the absolute minimum, and configuring the audio path to be as direct and as simple as possible. Reducing interruptions to the CPU (fittingly called "interrupts" in geekspeak) and CPU context switches is also very important, but very hard to achieve. I believe that a dedicated music-playback computer is a better choice than a general-purpose one because it can be optimized for the business of serving music. In addition, using dedicated playback applications, such as the excellent Amarra or XXHighEnd programs, is beneficial because these programs are carefully designed to optimize audio quality by minimizing various parameters inside the computer that can interfere with the playback or increase noise or jitter.

JF: If you were speccing out your ideal home playback system today, using hardware, software, and operating systems readily available to our readers, what would you use?

MA: It’s hard to come up with a one-size-fits-all answer. If choosing convenience over sound quality, the [Meridian] Sooloos system is the one to beat for its integrated functionality and user interface, but at a steep price. Better sound quality and value can be attained with VortexBox and Auraliti, but they can’t match Sooloos’s user experience. Finally, XXHighEnd or Amarra can make a desktop or laptop into a pretty good music server, but each requires some tweaking of the host computer and operating system for optimal results. The Lynx AES-16 is probably the best interface available today, with the RME HDSPe AES close behind; I favor either of these over a USB or FireWire interface. DAC choice is paramount as well, but unfortunately I haven’t heard many that I like. If going the PC or Mac route, I would definitely use a desktop computer with a high-grade power supply, underclocked CPU and memory, minimal videocard, and as few peripherals as possible, over a laptop. I believe that laptops, by design, are suboptimal for music playback because of their inherent power-saving design and the fact that multiple power supplies, LCD inverters, cooling fans, wireless adapters, etc., are all crammed into the smallest possible space. It’s a lot easier to tweak the hardware of a desktop computer, and you can yank the monitor cable when the screen isn’t needed.

JF: Do audiophiles overestimate the value of any hardware or software in computer playback systems?

MA: Yes, to a degree. The basic premise is that less is more . . . as always, the hardware and software simply need to get out of the way. This means that the simpler the audio path, the more streamlined the host operating system, the lower the latency, and the fewer noise-generating components, the better the sound. This also explains why different hard drives and cables sound different -- they each differently generate, attenuate, or transmit patterns of electromagnetic noise in the computer, or between the computer and DAC. For example, SSDs typically sound better because they draw less current than mechanical hard drives, because their current draw from the power supply varies less, and because they have no head assembly whose movement creates electrical interference and mechanical vibrations. This holds true even for memory-playback software, as other processes on the host computer may access the hard drive (or other computer resources) while music is playing, creating interference that can be defeated only by the use of a custom operating system. Grounding, too, plays a significant role.

JF: There are general comments about using, for instance, FireWire for the hard drive and USB for the DAC, due to the use of separate buses for each transmission interface. Any thoughts?

MA: The fewer cables and the fewer layers, the better. I would use the onboard controller for the hard drive and a PCI AES interface to the DAC. FireWire is better than USB in theory, but it’s practically dead as far as consumers are concerned. Since the FireWire and USB interfaces are actually PCI devices inside the computer, I would bypass them if possible, as Peter does in his Phasure DAC and its PCI Express umbilical. PCI, and even more so PCI Express, handle the bus arbitration properly, so there is no advantage to using separate standards; quite the contrary -- the added complexity can only, possibly, degrade the sound.

JF: Is there any definitive reason why software-based memory playback might be superior to real-time streaming?

MA: Yes, absolutely. Memory players reduce EMI inside the computer because there are fewer electrons to juggle around while music is playing, as they separately buffer the transfer of data between the file on the hard drive and memory in the first phase, and then memory to the DAC in the second phase. The fewer data that are transferred or processed inside the computer, the less EMI is created by the memory controller, the DMA controller, and the CPU, thereby reducing noise. Memory players need less processing and fewer interrupts while playing music, fewer DMA transfers, and fewer CPU context switches, all of which can help lower latency and collectively contribute to better sound.
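
The two phases described here can be sketched as follows. This is a minimal illustration, not the Conbrio's software: `load_into_memory`, `play_from_memory`, and the `write_to_dac` callback are hypothetical names, and the callback stands in for whatever driver call would actually hand samples to the audio interface.

```python
def load_into_memory(path):
    """Phase 1: read the whole file from storage into RAM before playback starts."""
    with open(path, "rb") as f:
        return f.read()

def play_from_memory(samples, write_to_dac, block_size=4096):
    """Phase 2: hand fixed-size blocks from RAM to the output, with no disk access.

    write_to_dac is a caller-supplied function (an assumption for this sketch).
    Returns the number of blocks written.
    """
    view = memoryview(samples)  # slicing a memoryview does not copy the data
    blocks = 0
    for offset in range(0, len(view), block_size):
        write_to_dac(view[offset:offset + block_size])
        blocks += 1
    return blocks
```

Because the storage device is touched only in the first phase, nothing in the second phase waits on it, which is the separation the answer credits with reducing activity inside the computer during playback.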

JF: What interface do you consider the best for communication with the DAC?

MA: That’s a tough question. I like AES, not because it’s a technically superior interface, but because it has the largest body of solid implementation experience behind it. Implementation is far more important than choice of interface, as a properly implemented "inferior" interface will sound better than a poorly implemented "superior" one. Another reason I like AES is, again, due to the "less is more" paradigm -- it’s a simpler, unidirectional interface, with fewer layers, and one that requires less (and less complex) componentry to decode it right next to the DAC. Properly implemented and properly clocked, AES is as good today as it was 15 years ago, despite the emergence of FireWire and asynchronous USB. Ethernet streaming is almost getting to the cusp of commercial viability as well, and to my mind will be the emerging leader.

JF: Is there anything you can do to minimize jitter coming from the music-server side? What about internal jitter generated by the music server itself?

MA: The same practices that were discussed above apply here. There is no inherent jitter inside the computer in the digital domain, because as long as data are moved around inside the computer, the timing doesn’t matter. Jitter becomes a problem when timing starts to matter, and that happens close to the DAC, either on the interface that carries PCM audio to the DAC, or inside the DAC, where a digital protocol such as USB is decoded into the PCM samples that are then fed into the DAC. We’ve gone to great lengths to isolate the AES interface from the EMI horror that is the modern computer, and also to provide the most deterministic environment, devoid of interrupts and context switches, for the AES interface to take data from the computer and send them to the DAC with minimum jitter and latency. Low latency is required to ensure that the audio playback process is optimally predictable and deterministic in its access to the AES interface. Slaving the entire playback chain to a solid clock source as close to the DAC as possible will almost always provide a sonic benefit, as the DAC is the interface with the outside (i.e., analog) world, and everything must therefore march to its rhythm.
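
To give a sense of scale for why timing matters at the DAC, a widely used textbook approximation relates random clock jitter to the best signal-to-noise ratio a converter can reach on a full-scale sine wave. It is not taken from the interview and says nothing about this particular server; the function name and the example figures are assumptions chosen for illustration.

```python
import math

def jitter_limited_snr_db(signal_hz, rms_jitter_s):
    """Approximate SNR ceiling (dB) for a full-scale sine at signal_hz
    when the conversion clock has rms_jitter_s seconds of random jitter."""
    return -20.0 * math.log10(2.0 * math.pi * signal_hz * rms_jitter_s)

# Illustrative values only: a 20kHz tone with 1ns and 100ps of RMS jitter.
snr_1ns = jitter_limited_snr_db(20_000, 1e-9)      # roughly 78 dB
snr_100ps = jitter_limited_snr_db(20_000, 100e-12)  # roughly 98 dB
```

Under this approximation, each tenfold reduction in jitter raises the ceiling by 20dB, and about 100 picoseconds is needed before a 20kHz tone reaches the level of ideal 16-bit quantization (about 98dB).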

JF: I understand you’ve designed and built your own custom music server, which is used by Magico. What can you tell us about it?

MA: Alon, Yair, and I worked together on the crossover for the [Magico] Ultimate, which is an active, five-way, four-horn, one-cone hybrid speaker that presents quite an interesting challenge to the crossover designer. Coming from a computer background, and after being exposed to military-grade data storage and DSP technology -- as well as being an avid believer in active speakers -- I was confident that a computer-based playback and DSP system would fit the bill. So, together with a few engineers who used to work for me, we set off to build the ultimate playback and crossover system -- a "matched pair" to the Ultimate speaker, if you will.

It wasn’t easy -- we started out with a clean slate -- but we were lucky enough to be able to optimize for everything except cost. We set out to create a music server from scratch, designing the hardware and the software with as few compromises as we could. We started out by developing an integrated music player and crossover, and a few years later, around 2006, Alon and I figured we could use the same system as a source for Magico’s passive speakers. We added a bypass to allow audio playback without any DSP elements, and the first prototype, based on the Windows platform, was used together with the Magico V3s in January 2008 at CES, where SoundStage! awarded Magico its Golden Jimmy award for best sound in show. Alas, the sound quality with Windows still wasn’t good enough, and even after we’d rewritten significant parts of the audio path, it didn’t sound the way we wanted.

As part of the relentless pursuit of perfection shared with Magico, we ended up reimplementing our prototype algorithms into a streamlined, deterministic system running Linux, one that also optically isolated the music-file storage and processing from the playback components and permitted bit-perfect, interference-free, ultra-low-jitter output at sample rates from "Red Book" [CD] to DSD, with automatic sample-rate switching. This system was used with the Magico M5s at CES 2009, again to great accolades.

As we were finally happy with the quality of the algorithms, we started focusing on the mechanical aspects, so we worked with a company that builds the "black boxes" that fly aboard American ICBMs and other aerospace equipment, to design enclosures that optimally isolate the electronics from the environment. These guys mean business, because aboard an ICBM in flight their electronics must work reliably in very hostile environments, some as high as 180dB SPL, hundreds of Gs, very powerful electromagnetic forces, and wild temperatures. We subjected our designs to a battery of tests, including mechanical shaking, baking, and freezing, multi-Tesla electromagnetic fields, and various programmatic stress tests, all while carefully analyzing the output signal and optimizing for resiliency. Thousands of man-hours plus hundreds of CNC hours later, a prototype of this enclosure was used by Magico at CES 2010 as a source for the Q5s.

We continued the development, adapting the playback software to a custom real-time operating system that was designed by software engineers who developed the operating-system software for the B-2 stealth bomber, and who built massive data-storage systems for Boeing, Goldman Sachs, and many others. This allowed us to rewrite the lowest layers of the operating system, including the PCI bus drivers, the audio path, and the scheduler logic, in order to optimize for sound quality and avoid the overhead and electromagnetic noise that traditional operating systems like Windows or Mac OS X impose on the digital audio interface. Our real-time, low-latency memory-playback software is embedded in firmware, takes up less than 8 megabytes (including the operating system!), and runs completely within the onboard cache of an ultra-low-power CPU.

Around October 2010, we ended up with a system comprising three airtight, nonresonant enclosures, each weighing approximately 50 pounds and CNC-machined from aircraft aluminum, with lots of additional internal isolation against electronic, mechanical, and thermal interference. The first enclosure houses a few terabytes of storage for all the music files, and it also handles the iPad, remote control, and network interfaces. The second box runs our proprietary real-time operating system and playback application, contains the AES interface to the DAC, and is powered by ultraprecise, linear power supplies made by the company that builds the power supplies for the Space Shuttle and the International Space Station. These power supplies are housed in the third enclosure. An optical interconnect connects the playback enclosure to the storage enclosure, and we stream the data from the latter to the former using a custom-developed, asynchronous, proprietary protocol using raw Ethernet layer 2 packets -- again, to reduce overhead and interference to an absolute minimum, while enabling true galvanic and mechanical isolation between the two enclosures. We also replace the air in the playback enclosure with an inert gas that reduces oxidation and increases thermal stability -- to minimize, yet again, any variance and interference. Needless to say, there are no moving parts inside the playback enclosure whatsoever: no fans, no drives, no USB sticks. There are a few patents pending on the various techniques we’ve developed for the music-server and DSP algorithms.

JF: Can you give a few more details about your server -- for instance, file formats and resolutions supported, and what type of user interface you employ?

MA: As described, we decouple the management of the audio files on the hard drives from the playback components, which means that we decode all file formats on the CPU inside the storage enclosure, then optically transmit the raw, native audio samples to the playback enclosure, which streams them to the DAC using AES. As such, pretty much any non-DRM’d file format can be used, at any bit depth, sample rate, and wire speed/format accepted by the DAC, including WAV, AIFF, Apple Lossless, DSD, FLAC, and even MP3 and MP4 audio. We provide a native, graphical iPad/iPhone interface as well as a browser-based one, and also support a dedicated remote control with hard buttons, as opposed to a touchscreen. All of these can be used together, and they all update concurrently. Playlists and collections are also supported, and we’re working on a mood-based music-recommendation feature, as well as integration with smart-home systems.

Editor's note: It has not been determined whether the Audeeva Conbrio music server will become a consumer product. If it does, keep an eye out on Ultra Audio for more information.

. . . Jeff Fritz
jeff@soundstagenetwork.com