The circuit described here is an SPDIF output with a USB interface. It allows personal computers, tablets or smartphones to be connected to audio equipment such as AV receivers, high-end stereo amplifiers or stand-alone audio DACs. It also has an infrared remote control receiver for controlling media playback and audio volume.
The SPDIF bitstream is generated in software on the same microcontroller that also provides USB connectivity. The SPDIF output is thus a single-chip solution that can be implemented with very little hardware but is, compared to special USB audio ICs, still flexible and “hackable”. It is even possible to build it on a small breadboard (see the figure below).
The PIC32MX270 microcontroller has been chosen for this project. It has the peripherals needed for USB audio applications and enough RAM to store encoded SPDIF frames. In addition, there are low pin count variants of these chips that allow for simple PCBs.
The source code can be obtained from https://github.com/kiffie/usb-spdif and used under the terms of the GPL.
A small PCB has been developed that includes, besides the microcontroller, supply circuitry for powering over the USB cable, optical and electrical SPDIF outputs, a remote control receiver and LEDs that show whether the SPDIF output is active and indicate the audio sampling frequency. The PCB has been designed to fit into a small enclosure.
The audio part includes a USB interface according to the USB audio class specifications, the SPDIF encoder implemented in software, and the SPDIF output using an SPI of the microcontroller. A DMA channel continuously copies the SPDIF frames from a circular buffer to the SPI. The diagram below shows the software and hardware components of the audio part.
Because the standard USB audio specification is implemented, the SPDIF interface does not need any custom device drivers to be installed on the host. The device class definition for audio devices is quite complex. Therefore, the USB-IF has released the USB Audio Device Class Specification for Basic Audio Devices, which covers a small subset of the original Audio Device Class Specification. The USB audio interface implemented here is quite similar to the headphone device described in that specification. However, additional sampling rates have been added, the volume control has been removed and the USB descriptor has been adapted so that it describes an SPDIF output rather than a speaker. The USB audio interface has the following features:
- three possible sampling frequencies: 44.1 kHz, 48 kHz and 96 kHz
- one audio format: 2 channels, 3 bytes (24 bits) per channel (S24_3LE)
- a mute control to mute the audio output
These features are specified in the USB configuration descriptor (file usb_descriptors.c of the software). Details on how to set or modify these descriptors can be found in the USB audio class documents listed in the references.
Because this project does not aim at a formal USB compliance certification, we chose a USB vendor ID and a product ID that can be used without the risk of causing conflicts with official USB products.
SPDIF encoder and serial output
The USB interface uses the isochronous USB endpoint 1 to transfer audio samples to the SPDIF interface. That is, one USB packet is transferred in each 1 ms USB frame. When the USB hardware of the MCU has received a new isochronous packet, an ISR is called that converts the individual PCM frames contained in the USB packet into the corresponding SPDIF frames. Details about SPDIF can be found in the AES3 standard and in Crystal application note AN22 (see the references).
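The first step of that conversion, splitting a received packet into individual 24-bit samples, can be sketched as follows. This is only an illustration under stated assumptions, not the code from the repository: the function names s24_3le_read and pcm_unpack are hypothetical, and the left channel is assumed to come first in each 6-byte PCM frame.

```c
#include <stddef.h>
#include <stdint.h>

#define PCM_FRAME_BYTES 6u  /* 2 channels x 3 bytes (S24_3LE) */

/* Read one 24-bit little-endian sample (least significant byte first). */
uint32_t s24_3le_read(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Split an isochronous packet into left and right samples.
 * Returns the number of complete PCM frames that were unpacked;
 * trailing bytes that do not form a complete frame are ignored. */
size_t pcm_unpack(const uint8_t *packet, size_t len,
                  uint32_t *left, uint32_t *right, size_t max_frames)
{
    size_t frames = len / PCM_FRAME_BYTES;
    if (frames > max_frames)
        frames = max_frames;
    for (size_t i = 0; i < frames; i++) {
        const uint8_t *p = packet + i * PCM_FRAME_BYTES;
        left[i]  = s24_3le_read(p);      /* assumed: left channel first */
        right[i] = s24_3le_read(p + 3);
    }
    return frames;
}
```

At 48 kHz a packet would carry 48 such frames, i.e. 288 bytes per 1 ms USB frame.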
The following diagram illustrates how the software SPDIF encoder works.
At the top of the diagram, the 48-bit PCM frames are shown, each frame containing a left-channel and a right-channel audio sample. The samples are converted into an intermediate 32-bit representation of SPDIF subframes by adding a four-bit preamble tag, which specifies the type of SPDIF preamble (X, Y or Z) to be used when encoding the subframe, and by appending the four bits V, U, C and P. The bits V and U are a validity indicator and a user data bit, respectively. They are always 0 in this implementation. The C bits carry the channel status data and the P bits are parity bits for the individual subframes. The parity bit P is calculated for every subframe using the optimized algorithm described in the section “Compute parity in parallel” of Sean Eron Anderson’s “Bit Twiddling Hacks”.
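A minimal sketch of how such an intermediate subframe word could be assembled is given here. Several details are assumptions made for illustration: the tag values TAG_X, TAG_Y and TAG_Z, the function names, and the bit layout (tag in bits 0 to 3, sample in bits 4 to 27, then V, U, C and P in bits 28 to 31, mirroring the SPDIF time slots). The parity routine follows the "compute parity in parallel" idea, and P is chosen so that time slots 4 to 31 carry an even number of ones.

```c
#include <stdint.h>

/* Hypothetical preamble tags; the values used in the firmware may differ. */
enum { TAG_X = 0x1, TAG_Y = 0x2, TAG_Z = 0x3 };

/* Parity of a 32-bit word: returns 1 if the number of set bits is odd.
 * The word is folded down to 4 bits, then 0x6996 serves as a 16-entry
 * parity table. */
uint32_t parity32(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v &= 0xfu;
    return (0x6996u >> v) & 1u;
}

/* Build the 32-bit intermediate subframe (assumed layout, see above). */
uint32_t spdif_subframe(uint32_t tag, uint32_t sample24, uint32_t c_bit)
{
    uint32_t w = (sample24 & 0xffffffu) << 4;  /* time slots 4..27 */
    w |= (c_bit & 1u) << 30;                   /* V = 0, U = 0, C in bit 30 */
    w |= parity32(w) << 31;                    /* P: even parity over 4..31 */
    return w | (tag & 0xfu);                   /* preamble tag, slots 0..3 */
}
```

The tag is added after the parity has been computed, so it does not influence P.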
Two consecutive SPDIF subframes form an SPDIF frame, and 192 consecutive SPDIF frames constitute one SPDIF block. The sequences of U and C bits repeat in each block. Accordingly, the channel status structure obtained by concatenating the C bits of one block has 192 bits (24 bytes). Here, the channel status bits are used to indicate the current sampling frequency to a receiver connected to the SPDIF output.
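The relation between the 24-byte channel status structure and the C bit of each frame can be sketched as shown here. The function names are hypothetical. The sampling frequency codes placed in bits 24 to 27 are assumed values for the consumer channel status format and should be verified against the standard before use.

```c
#include <stdint.h>
#include <string.h>

#define CS_BYTES 24u   /* 192 channel status bits per block */

/* Sampling frequency codes for channel status bits 24..27
 * (assumed consumer-format values, check against the standard). */
enum { CS_FS_44100 = 0x0, CS_FS_48000 = 0x2, CS_FS_96000 = 0xA };

/* Fill a channel status block that only announces the sampling rate.
 * Unknown rates fall back to the 44.1 kHz code. */
void channel_status_init(uint8_t cs[CS_BYTES], uint32_t fs)
{
    memset(cs, 0, CS_BYTES);
    switch (fs) {
    case 48000u: cs[3] = CS_FS_48000; break;
    case 96000u: cs[3] = CS_FS_96000; break;
    default:     cs[3] = CS_FS_44100; break;
    }
}

/* C bit to send in the given frame of a block, byte 0 bit 0 first.
 * The frame index wraps around after 192 frames. */
unsigned channel_status_bit(const uint8_t cs[CS_BYTES], unsigned frame)
{
    frame %= 192u;
    return (cs[frame / 8u] >> (frame % 8u)) & 1u;
}
```

The encoder would call channel_status_bit once per frame and place the result into the C position of both subframes.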
SPDIF applies biphase-mark coding for transmitting the data and special sequences of eight code bits that contain code violations for encoding the preambles X, Y and Z. Because two code bits correspond to one data bit (the code rate is thus ½), an encoded SPDIF subframe has 64 bits.
A simple implementation of biphase-mark encoding would require iterating over every bit of an SPDIF subframe, which is very time-consuming. Therefore, two different look-up tables, LUTF and LUT, are used to encode one byte at a time. Look-up table LUTF is used to encode the first byte of an SPDIF subframe and look-up table LUT is used to encode the remaining three bytes of the subframe. The special look-up table LUTF contains the preamble bit patterns. Because look-up table LUT is used three times to encode one SPDIF subframe, it is shown three times in the figure, but the software maintains only a single instance of it. The look-up tables are generated by an initialization function executed once after a reset of the microcontroller.
The biphase-mark code requires a transition between successive pairs of code bits, each pair representing one data bit. To make sure that these transitions also occur at the boundaries between code words, the 16-bit code words obtained from the look-up tables are inverted depending on the value of the last bit of the previous 16-bit code word.
Another function of the look-up tables is bit order reversal. According to the SPDIF standard, the subframes are transmitted LSB first, but the SPI transmits MSB first. The look-up tables thus return the code bits in reverse order so that they can be correctly transmitted by the SPI hardware.
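The following sketch shows how a table like LUT could be generated and used. It is not taken from the firmware: the names bmc_lut, bmc_lut_init and bmc_encode are hypothetical, and the preamble table LUTF is left out. Each entry is computed for a line that was low before the byte; the inverted word covers the other case.

```c
#include <stdint.h>

uint16_t bmc_lut[256];

/* Generate the table once after reset. */
void bmc_lut_init(void)
{
    for (unsigned b = 0; b < 256; b++) {
        uint16_t code = 0;
        unsigned level = 0;                /* assume the line was low before */
        for (unsigned i = 0; i < 8; i++) { /* data bits are sent LSB first */
            unsigned bit = (b >> i) & 1u;
            unsigned first = level ^ 1u;   /* transition at each bit boundary */
            unsigned second = first ^ bit; /* extra mid-bit transition for 1 */
            code = (uint16_t)((code << 2) | (first << 1) | second);
            level = second;
        }
        bmc_lut[b] = code;  /* first code bit ends up in the MSB for the SPI */
    }
}

/* Encode one data byte; invert the code word if the previous word ended high. */
uint16_t bmc_encode(uint8_t data, uint16_t prev_code)
{
    uint16_t code = bmc_lut[data];
    return (prev_code & 1u) ? (uint16_t)~code : code;
}
```

For example, the data byte 0x01 yields the code word 0xB333, which ends high, so a following 0x00 byte is sent as 0x3333 instead of 0xCCCC.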
The resulting 64-bit SPDIF code word is then inserted into a ring buffer maintained in the SRAM of the microcontroller and output by the SPI via DMA. The data output of the SPI is connected directly to the electrical and optical SPDIF transmitters.
SPDIF bit clock
Nominal bit clock selection based on USB packet length
A programmable fractional frequency divider integrated into the microcontroller generates the SPDIF bit clock from an internal 96 MHz clock signal. The required SPDIF clock frequency is
f_SPDIF = 128 f_s
where f_s is the audio sampling frequency.
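As a worked example, the divider settings for the three sampling frequencies can be estimated as sketched here. The sketch assumes that the reference clock output follows f_out = f_in / (2 (N + M/512)), with an integer part N and a 9-bit fractional part M; this relation should be checked against the microcontroller data sheet. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t div;   /* integer part N of the divider */
    uint32_t trim;  /* fractional part M, in units of 1/512 */
} refclk_div_t;

/* Required SPDIF bit clock: 64 code bits per subframe, 2 subframes per frame. */
uint32_t spdif_bit_clock(uint32_t fs)
{
    return 128u * fs;
}

/* Assumed relation: f_out = f_in / (2 * (div + trim / 512)).
 * The ratio is computed scaled by 512 and rounded to the nearest step. */
refclk_div_t refclk_divider(uint32_t f_in, uint32_t f_out)
{
    uint64_t scaled = ((uint64_t)f_in * 256u + f_out / 2u) / f_out;
    refclk_div_t r = { (uint32_t)(scaled / 512u), (uint32_t)(scaled % 512u) };
    return r;
}
```

Under this assumption, a 96 MHz input gives N = 7, M = 416 for 48 kHz and N = 3, M = 464 for 96 kHz, both exact. For 44.1 kHz the nearest setting is N = 8, M = 258, which is slightly off the nominal frequency.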
The USB audio specification does not include explicit signaling of the sampling frequency f_s chosen by the host to the USB device. Consequently, the software running on the microcontroller detects the sampling frequency based on the number of PCM frames contained in the USB isochronous packets. Note that the USB configuration descriptor contains a list of the three allowed sampling rates. A compliant host should thus not use any sampling frequency other than 44.1 kHz, 48 kHz or 96 kHz. As a consequence, the microcontroller can easily guess the sampling rate chosen by the host from the number of PCM frames contained in a USB packet.
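A possible form of this detection is sketched here. The function name and the tolerance of one frame around each nominal count are assumptions for illustration; the firmware may use different thresholds. At 44.1 kHz a host typically alternates between packets of 44 and 45 frames.

```c
#include <stdint.h>

#define PCM_FRAME_BYTES 6u  /* 2 channels x 3 bytes (S24_3LE) */

/* Guess the sampling rate in Hz from the length of one isochronous packet.
 * Returns 0 if the length matches none of the three supported rates. */
uint32_t guess_sample_rate(uint32_t packet_bytes)
{
    uint32_t frames = packet_bytes / PCM_FRAME_BYTES;
    if (frames >= 43u && frames <= 45u)
        return 44100u;
    if (frames >= 47u && frames <= 49u)
        return 48000u;
    if (frames >= 95u && frames <= 97u)
        return 96000u;
    return 0u;
}
```

A packet of 288 bytes holds 48 frames and is therefore taken as 48 kHz, while 576 bytes indicate 96 kHz.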
Clock frequency control based on circular buffer fill level
In accordance with the USB basic audio devices specification, the “synchronous” synchronization type has been chosen for the isochronous audio endpoint. In other words, the sampling rate of the audio stream depends on the USB clock domain, i.e. the frequency of the SOF tokens sent by the host. The SPDIF bit clock, however, is derived from the local crystal oscillator. To avoid overruns or underruns of the circular buffer, the software regularly adjusts the fractional part of the frequency division ratio based on the average fill level of the circular buffer. Currently, a simple three-step controller is used for that purpose.
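The idea of a three-step controller can be sketched as follows. The function name, the target fill level and the dead band are hypothetical; the sketch only shows the principle of nudging the fractional divider up, down or not at all.

```c
#include <stdint.h>

/* Three-step controller with a dead band around the target fill level.
 * Returns the change to apply to the fractional divider value:
 *   -1: buffer too full, make the SPDIF clock slightly faster
 *   +1: buffer too empty, make the SPDIF clock slightly slower
 *    0: fill level within the dead band, leave the divider alone */
int three_step(uint32_t avg_fill, uint32_t target, uint32_t dead_band)
{
    if (avg_fill > target + dead_band)
        return -1;  /* smaller divider, higher output clock */
    if (avg_fill + dead_band < target)
        return +1;  /* larger divider, lower output clock */
    return 0;
}
```

With a target of 512 entries and a dead band of 32, an average fill level of 600 would return -1, and a level of 520 would leave the divider unchanged.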
Although Microchip mentions in its Application Note AN1422, section “Tuning Reference Clock Output”, that the reference clock can be “tuned on-the-fly”, we experienced problems when adjusting the fractional part of the divider at low clock division ratios, i.e. when the 96 kHz sampling frequency was selected. Each time the division ratio was adjusted, a slight interruption occurred in the clock signal, which led to audible artifacts.
To work around this problem, we added an external connection between the output of the reference clock generator REFCLKO (pin 6) and the clock input SCK (pin 26) of the SPI, thereby bypassing the baud rate generator of the SPI, which has a minimum clock division factor of two. The external connection between REFCLKO and SCK thus allows for higher clock division factors within the reference clock module of the microcontroller.
Infrared remote-control receiver
The infrared remote-control receiver (RC receiver) is completely independent from the audio part, that is, events related to the RC receiver are just sent to the host and do not influence the audio part directly.
To decode the infrared signal, the open-source library IRMP is used. This library can decode many different infrared remote-control signals. The microcontroller software includes a USB HID implementation that sends codes from the Consumer Page as specified in the USB HID Usage Tables document.
A keymap table forms the link between IRMP and the HID implementation, defining a mapping between codes detected by IRMP and the usage IDs specified in the HID Usage Tables. We decided to add codes according to the RC5 standard. As RC5 is not used very often anymore but should still be available on many universal remote-control transmitters, we believe that it will work in many cases without disturbing other devices such as TV sets. Some codes used in RC5 are listed in the Elektor article “IR Remote Control Codes (1)”. In addition, codes for a Denon remote control transmitter have been included in the keymap.
The keymap can be found in the source file irhid.c under the name irhid_keymap and can easily be customized by adding or removing lines and then recompiling the software. To add a key, you need to find out the IRMP codes for the protocol, the address and the command. You can get these codes by connecting a terminal emulator (115200 bits/s) to the SPDIF output device and pressing the key you want to add. You should now see the codes in the terminal emulator. If not, support for the respective RC code may need to be activated by changing the file irmpconfig.h. In addition, you need the USB HID usage ID to which the key is to be mapped. The usage ID can be looked up in the HID Usage Tables.
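To illustrate the principle, a keymap of this kind could look like the sketch here. It does not reproduce the actual layout of irhid_keymap: the struct, the names and the RC5 address are hypothetical, and the protocol number 7 is assumed to match the RC5 protocol constant of IRMP. The usage IDs are the Consumer Page values for Volume Increment (0xE9), Volume Decrement (0xEA) and Mute (0xE2).

```c
#include <stddef.h>
#include <stdint.h>

#define PROTO_RC5 7u   /* assumed to match the IRMP protocol number for RC5 */

typedef struct {
    uint8_t  protocol;  /* protocol number reported by the decoder */
    uint16_t address;   /* device address sent by the remote control */
    uint16_t command;   /* key code sent by the remote control */
    uint16_t usage_id;  /* HID Consumer Page usage ID to report */
} keymap_entry_t;

/* Illustrative entries only; address 0x14 is an arbitrary example. */
static const keymap_entry_t keymap[] = {
    { PROTO_RC5, 0x14, 0x10, 0x00E9 },  /* volume up   -> Volume Increment */
    { PROTO_RC5, 0x14, 0x11, 0x00EA },  /* volume down -> Volume Decrement */
    { PROTO_RC5, 0x14, 0x0D, 0x00E2 },  /* mute        -> Mute */
};

/* Return the usage ID mapped to a received code, or 0 if there is none. */
uint16_t keymap_lookup(uint8_t protocol, uint16_t address, uint16_t command)
{
    for (size_t i = 0; i < sizeof keymap / sizeof keymap[0]; i++) {
        const keymap_entry_t *e = &keymap[i];
        if (e->protocol == protocol && e->address == address &&
            e->command == command)
            return e->usage_id;
    }
    return 0;
}
```

Adding a key then amounts to adding one line with the protocol, address and command seen on the terminal and the desired usage ID.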
References
[1] Universal Serial Bus Specification Revision 2.0, https://www.usb.org/document-library/usb-20-specification
[2] Universal Serial Bus Device Class Definition for Audio Devices, Release 1.0, https://www.usb.org/document-library/audio-device-document-10
[3] Universal Serial Bus Audio Device Class Specification for Basic Audio Devices, Release 1.0, https://www.usb.org/document-library/audio-device-class-spec-basic-audio-devices-v10-and-adopters-agreement
[4] Universal Serial Bus Device Class Definition for Audio Data Formats, Release 1.0, https://www.usb.org/document-library/audio-data-formats-10
[5] Universal Serial Bus Device Class Definition for Terminal Types, Release 1.0, https://www.usb.org/document-library/audio-terminal-types-10
[6] AES3-2003: AES standard for digital audio – Digital input-output interfacing – Serial transmission format for two-channel linearly represented digital audio data
[7] Crystal Application Note AN22: “Overview Of Digital Audio Interface Data Structures”
[8] Sean Eron Anderson: “Bit Twiddling Hacks”, section “Compute parity in parallel”, http://graphics.stanford.edu/~seander/bithacks.html#ParityParallel
[9] Microchip AN1422: “High-Quality Audio Applications Using the PIC32”, http://ww1.microchip.com/downloads/en/AppNotes/01422A.pdf
[10] Infrared Multi-Protocol decoder (IRMP), https://www.mikrocontroller.net/articles/IRMP
[11] USB Device Class Definition for Human Interface Devices (HID), Version 1.11, https://www.usb.org/document-library/device-class-definition-hid-111
[12] USB HID Usage Tables, Version 1.12, https://www.usb.org/document-library/hid-usage-tables-112
[13] A.N. Other: “IR Remote Control Codes (1)”, in Elektor Electronics issue 3/2001, https://www.elektormagazine.com/magazine/elektor-200103/16978
From the lab
This project is based on an earlier project here on Elektor Labs called ‘SPDIF audio output for Android’ (https://www.elektormagazine.com/labs/spdif-audio-output-for-android). As the title reveals, that circuit was specifically designed for Android devices. In principle the original firmware worked with various tablets and phones, but not with all of them, not even with all devices from the same manufacturer. So there was a risk that the circuit would not work with every phone or tablet. The author was asked whether the firmware could be adapted to make the device a generic USB audio SPDIF interface, and so he did. With a few changes to the schematic and new firmware, the interface now also supports 24 bits and three sampling frequencies: 44.1 kHz, 48 kHz and 96 kHz. The latter is the maximum possible with a full-speed USB interface. An Android device must now be connected through an OTG cable, but that shouldn’t be a problem since a cable is needed anyway to connect the interface. The interface was successfully tested with the following operating systems: Windows 7, Windows 10, Linux (Lubuntu, Kubuntu), Android and even Raspbian on a Raspberry Pi 3 Model B+ using 2019-09-26-raspbian-buster-full.img.
Windows recognizes the interface automatically and the sample rate can be set in the Control Panel under Sound. Select the properties of ‘SPDIF Interface’; in the Advanced tab, three formats can be selected, as shown in the following screenshot.
In Linux (Lubuntu was used) the sampling frequency depends on the file. With the simple GNOME MPlayer, a 44.1 kHz file is played at 44.1 kHz at the SPDIF output. But ‘USB_SPDIF Digital Stereo (IEC958) (PulseAudio)’ must be selected as Audio Output in Preferences every time, even if it was selected already. Very annoying. To avoid this, turn all other audio devices off (here: Sound & Video – PulseAudio Volume Control – Configuration). A 48 kHz file and a 96 kHz file are both played at 48 kHz. Another way is to use ALSA directly, without PulseAudio, from a terminal.
If there’s no sound, first start alsamixer in a terminal (Ctrl+Alt+T and enter ‘alsamixer’), press F6 to select the sound card USB_SPDIF (only one small control will be shown, displaying 00 and PCM, since there’s no volume control in the interface) and close the mixer. Then enter:
aplay -D plughw:CARD=USBSPDIF //…/
This will only play uncompressed wav files. To play mp3 files from the command line, consider using mpg123, but there are numerous other ways.
Sadly, even though USB audio DACs have existed for many years, the default handling of USB audio in Linux still needs improvement. Setting the output sampling frequency is also possible by changing the configuration file of PulseAudio (/etc/pulse/daemon.conf). A lot of information about how to do this can be found online. If you try, always make a backup of the original file first, just in case something goes wrong.
On Android, connect the interface through an OTG cable. Perhaps the best way to use the USB-SPDIF interface is to install an application called USB Audio Player Pro. It bypasses the audio limits of Android. All three sampling frequencies of the USB-SPDIF interface are supported. After starting the app, the interface is recognized immediately. Playing three files, one at each of the three sampling frequencies, worked without any problem, except for a stutter at the beginning of playback.
Finally, the USB-SPDIF interface was also tested with Raspbian. Like Lubuntu, Raspbian is a free operating system based on Debian, so it’s no surprise that testing was a repetition. To play an uncompressed wav file:
aplay -D plughw:CARD=USBSPDIF //…/
To end playback early, press Ctrl+C.
To play mp3 files from the command line in the freshly installed Raspbian image (2019-09-26-raspbian-buster-full.img), mpg123 (if not installed: sudo apt-get install mpg123) only worked after installing PulseAudio and its volume control. To install PulseAudio:
sudo apt-get install pulseaudio
Install PulseAudio Volume Control:
sudo apt-get install pavucontrol paprefs
Reboot after installing has finished.
As in Lubuntu, disable the onboard audio device in Sound & Video – PulseAudio Volume Control – Configuration.
The pre-installed VLC Media Player only produced sound after installing PulseAudio as well.
A 32 kHz sampled file is played at 96 kHz.
Measurements were mainly performed in the digital domain, since the output signal is digital. Of course, a DAC was also connected to have a look at the analog signal. Testing was done using various audio players in the aforementioned operating systems. Some players produce artifacts when the volume is set to 0 dB (others call this setting 100 %). The volume control is in software only, as this interface doesn’t have a volume control built in. This means that something is different (or simply wrong?) in the way these players process computer-generated full-scale files. As an example, playing such a full-scale file in Raspbian with VLC media player at 100 % shows a spectrum that is reminiscent of clipping, and this is also audible. At lower volume settings the spectrum is much cleaner. A 96 kHz sampled file is resampled to 48 kHz and shows numerous harmonics at -100 dB.
Plot 1 shows an FFT of a computer-generated 1 kHz full-scale sine wave measured at the SPDIF output with a sampling frequency of 96 kHz in Raspbian, using the aplay command in a command line interface. A clean spectrum is visible, as it should be.
Plot 2 shows an FFT of a computer-generated 1 kHz full-scale sine wave measured at the SPDIF output with a sampling frequency of 96 kHz in Raspbian using VLC Media Player at a volume setting of 100 %. The numerous harmonics are reminiscent of serious clipping, but they may also be a result of the original 96 kHz sampling frequency being resampled to 48 kHz.
Plot 3 shows an FFT of a computer-generated 1 kHz full-scale sine wave measured at the SPDIF output with a sampling frequency of 96 kHz in Raspbian using VLC Media Player at a volume setting of 95 %. The spectrum is much cleaner, but it should be free of harmonics. This is not as it should be and is most likely caused by resampling to 48 kHz.
Plot 4 shows an FFT of a computer-generated 1 kHz full-scale sine wave measured at the SPDIF output with a sampling frequency of 48 kHz in Raspbian using VLC Media Player at a volume setting of 95 %. The spectrum is almost clean because no resampling is necessary.
Plot 5 shows an FFT of a computer-generated 1 kHz full-scale sine wave measured at the SPDIF output with a sampling frequency of 48 kHz in Raspbian using VLC Media Player at a volume setting of 100 %. The spectrum shows numerous harmonics, maybe caused by a rounding problem in the data?
Similar measurements can be made with other players in different operating systems.
Power supply current depends on whether both LEDs are on or off: the minimum is about 36 mA and the maximum is about 43 mA.
The PCB is designed to fit in the aluminum enclosure 1455D601 from Hammond Manufacturing. The dimensions of the enclosure are 60 x 42.5 x 23 mm. The PCB can be slid into the enclosure. Depending on the manufacturing tolerances of PCB and enclosure, the two sides may need some filing to fit. Appropriate holes must be made in the two side panels. It’s better not to use the plastic bezels, especially because of the micro-USB connector, which would otherwise sit too deep inside the enclosure for a mating connector to fit properly.
R5 should not be mounted. USBID is not used in the current software.
For the coaxial output, a transformer is used to prevent LF earth loops, not to create galvanic isolation. To minimize RF noise the output must be decoupled, hence the need for C12. The advantage of a full bridge (IC2) is a larger primary voltage and better coverage of the core by the primary winding, and thus better coupling between the primary and secondary windings. The advantage of using EXOR gates as drivers is that they can be used as inverting or non-inverting buffers by connecting one of the inputs to a high or low level, respectively. In both cases the propagation delay is the same, which makes this circuit an almost ideal full-bridge driver. A 74AC86 is used for speed. With the winding ratio of 13:2, the output voltage is almost exactly 0.5 V across a 75 Ω load. The transformer takes less than 30 cm of 0.5 mm enameled copper wire and is very easy to reproduce. If the four connections are not made too long, more than 3 cm of wire is left. Splitting the primary into two halves puts the connections of the two windings on opposite sides and reduces common-mode coupling.
Bill of materials
R1 = 10 kΩ, 0.1 W, 1 %, SMD 0603
R2 = 1 kΩ, 0.1 W, 1 %, SMD 0603
R3,R4 = 470 Ω, 0.1 W, 1 %, SMD 0603
R5 = 0 Ω, 63 mW, SMD 0603, do not mount
R6 = 270 Ω, 0.1 W, 1 %, SMD 0603
R7 = 75 Ω, 0.1 W, 1 %, SMD 0603
R8 = 47 Ω, 0.1 W, 1 %, SMD 0603
C1,C2 = 33 pF, 50 V, 5 %, C0G/NP0, SMD 0603
C3,C9,C14 = 10 µF, 16 V, 20 %, X5R, SMD 0603
C4-C7,C10,C13 = 100 nF, 50 V, 10 %, X7R, SMD 0603
C8 = 2.2 µF, 10 V, 10 %, X7R, SMD 0603
C11,C15,C16 = 1 µF, 50 V, 10 %, X5R, SMD 0603
C12 = 47 nF, 100 V, 10 %, X7R, SMD 0603
L1 = 600 Ω @ 100 MHz, 0.15 Ω, 1.3 A, SMD 0603 (Murata, BLM18KG601SN1D)
TR1 = Toroid, 12.5x5mm, material T38, Epcos B64290L0044X038
LED1,LED2 = LED, green, 3 mm, through hole
D1 = PMEG2010ER, SMD SOD-123
IC1 = PIC32MX274F256B-I/SO, SMD SO-28
IC2 = 74AC86, SMD SO-14
IC3 = FCR684208T, Toslink (Cliff Electronic Components)
IC4 = TSOP4136
IC5 = MIC5504-3.3YM5-TR, SMD SOT-23-5
K1 = RCA, socket, PCB mounting, THT (Pro Signal, PSG01545)
K2 = Micro USB type B, receptacle, SMD (Molex, 47346-0001)
K3 = 1x6 pin header, vertical, pitch 2.54 mm, THT
K4,K5 = 1x2 pin header, vertical, pitch 2.54 mm, THT
X1 = 8 MHz, 5x3.2mm, SMD (Abracon, ABM3B-8.000MHZ-B2-T)
TR1 = 0.3 m enameled copper wire, diam. 0.5 mm
Enclosure = Hammond 1455D601, 60x42.5x23mm
PCB 180027-1 v2.0