Transmit Internet Radio to AM with an ESP32-S3

ESP32-S3 firmware for streaming HLS/AAC internet radio to AM radio via sigma-delta encoding. Features: eight selectable radio stations, Wi-Fi credential management, real-time AAC audio decoding, automatic audio compression, and sigma-delta encoding for AM radio transmission.

Streaming internet radio to a vintage AM receiver combines modern digital signal processing with classic analog radio technology in a fascinating way. In this article, I present a complete AM transmitter based on the ESP32-S3 microcontroller that fetches HLS/AAC audio streams over Wi-Fi, decodes them in real-time, and transmits them via a small ferrite antenna on the AM broadcast band. The entire modulation process is performed digitally using a sigma-delta encoding algorithm, with the final RF output generated directly from an I2S peripheral.

The motivation behind this project was to breathe new life into old AM transistor radios while exploring the capabilities of the ESP32-S3 dual-core architecture for real-time audio streaming and RF generation. Additionally, the device includes a built-in test signal generator with selectable audio frequencies, useful for performance testing and possibly AM receiver alignment.

A short demonstration video showing the device in operation is available as an attachment to this article.

Hardware Overview

The hardware is remarkably simple, centered around an ESP32-S3-WROOM-1 module with 16 MB flash and 8 MB PSRAM (schematic in Figure 1). The PSRAM is essential for buffering the decoded audio stream and ensuring continuous playback without dropouts. The main user interface consists of an 8-position binary-encoded rotary switch connected to GPIOs 40, 41, and 42, which selects one of eight radio stations (or, when combined with a press of the BOOT button, one of eight sets of Wi-Fi credentials). A WS2812B RGB LED provides visual feedback for station changes, Wi-Fi credential saves, and timeout events. A jumper connected to GPIO 3 enables the sinusoidal test modulator, whose frequency is selected with the rotary switch.

The RF output section deserves special attention. The I2S peripheral outputs a digital bitstream at 1.92 Mbit/s on GPIO 17, twice the default 960 kHz RF frequency. This signal is level-shifted from 3.3 V to 5 V, buffered by a totem-pole output stage, and injected into a resonant ferrite antenna. Level shifting and buffering are performed by a gate-driver IC (I had a few available...). Other solutions are possible, but a totem-pole driver is highly recommended: I initially tested a simple single-ended driver using a 2N2222A BJT, but this configuration exhibited significant nonlinearity, with approximately 10% harmonic distortion at 80% modulation depth. Even with the current design, I measured a second-harmonic distortion of about 3% with the resonator perfectly tuned, yet the distortion disappeared when the ferrite antenna resonator was slightly detuned. My conjecture is that ferrite nonlinearity is responsible: at exact resonance, the high Q factor of the resonator can drive the ferrite core to high flux densities. Results may therefore differ on other prototypes.

The antenna is a ferrite rod assembly salvaged from an old AM transistor radio (Photo 1). The ferrite rod measures 8 mm in diameter and 110 mm in length. The resonating winding (secondary) consists of approximately 100 turns of enameled copper wire wound over a 24 mm section, while the excitation primary is just 2 turns. The resonating capacitor, also recovered from an AM radio, uses both sections in parallel. Resonance tuning is straightforward: I simply adjusted the variable capacitor while monitoring signal strength on a receiver placed 2-3 meters away, maximizing the received signal level.

The radiated power is intentionally kept very modest: reception is reliable up to a distance of several meters. This low power keeps the device well within legal limits for unlicensed operation in most jurisdictions. Should anyone wish to amplify the output for greater range, appropriate filtering would be essential to avoid interference with other services, because the sigma-delta encoding pushes quantization noise far from the carrier. This is visible in the spectrum analyzer measurements, taken by placing a single loop of copper wire close to the ferrite antenna. Figure 2 shows the close-in spectrum over a 10 kHz span with a 562.5 Hz sinusoidal modulation. Figures 3 and 4 show wider spans: 100 kHz with audio streaming, and 1 MHz with 4.7 kHz modulation. Note that the carrier level varies from one measurement to the next because the probe loop was not held in a stable position. Out-of-band emission is clearly not an issue at the minimal power radiated by this device, but more filtering would be required with higher power and a larger antenna.

Software Architecture

The firmware is written in C using the ESP-IDF 5.5.1 framework and leverages the ESP32-S3's dual-core architecture for real-time performance. Core 1 handles the time-critical audio processing and sigma-delta encoding, while Core 0 manages HTTP downloads and AAC decoding. This task distribution, combined with carefully chosen FreeRTOS priorities, eliminates audio dropouts even during intensive network operations, provided that a good Wi-Fi connection to the internet is available.

The audio pipeline follows this sequence: an M3U8 playlist parser fetches HLS segment URLs over HTTPS, a downloader task retrieves AAC audio segments (typically 410 KB files containing ~10 seconds of audio), an AAC decoder converts these to 48 kHz stereo PCM using Espressif's esp_audio_codec library, and finally, the audio processing task performs dynamic range compression and sigma-delta AM modulation before outputting the result via I2S.

One particularly effective optimization worth mentioning is the M3U8 bandwidth reduction strategy. Since HLS playlists are typically ~70 KB and change every ~10 seconds without notifications, the software regularly polls the file by reading just the first 4 KB chunk to check the sequence number. If the sequence hasn't changed, it stops immediately—saving approximately 94% of the bandwidth that would be consumed by repeatedly downloading the full playlist. This optimization also contributes to faster station switching, which now completes in just 1-2 seconds.
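The sequence check can be illustrated with a short sketch. This is not the repository's code: the function names m3u8_media_sequence and m3u8_needs_refresh are hypothetical, and the sketch assumes that the first chunk has been copied into a NUL-terminated buffer and that the playlist carries the standard HLS #EXT-X-MEDIA-SEQUENCE tag near its top. The quoted saving follows from the sizes given above: 1 - 4/70 is roughly 94%.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract the HLS media sequence number from the
 * first chunk of a playlist. `chunk` must be NUL-terminated.
 * Returns the sequence number, or -1 if the tag is missing or malformed. */
long m3u8_media_sequence(const char *chunk)
{
    static const char tag[] = "#EXT-X-MEDIA-SEQUENCE:";
    const char *p = strstr(chunk, tag);
    if (p == NULL)
        return -1;
    p += sizeof(tag) - 1;              /* skip past the tag itself */

    char *end;
    long seq = strtol(p, &end, 10);
    if (end == p || seq < 0)           /* no digits, or a negative value */
        return -1;
    return seq;
}

/* Returns 1 when the rest of the playlist should be downloaded.
 * If the tag cannot be found in the chunk, the function falls back to
 * requesting a full download (an assumed policy, chosen to fail safe). */
int m3u8_needs_refresh(const char *chunk, long *last_seq)
{
    long seq = m3u8_media_sequence(chunk);
    if (seq < 0)
        return 1;                      /* cannot tell: download everything */
    if (seq == *last_seq)
        return 0;                      /* unchanged: stop after this chunk */
    *last_seq = seq;
    return 1;
}
```

A polling task would call m3u8_needs_refresh on each 4 KB chunk and close the HTTP connection as soon as it returns 0.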

Station changing is handled gracefully: when the rotary switch position changes, the download task immediately aborts the current segment and clears all buffers, preventing any mixing of audio from different stations. The M3U8 task detects the change within one second and fetches the new playlist.

Complete source code, detailed documentation, and build instructions are available on GitHub at https://github.com/rvisent/stream2AM. The repository includes comprehensive technical documentation in the CLAUDE.md file, covering task priorities, buffer sizing, and optimization strategies.

Sigma-Delta AM Modulation Theory

The heart of this project is the sigma-delta encoding algorithm that generates the AM modulated carrier directly in the digital domain. This technique, which I adapted and improved from my earlier Tesla coil audio modulator project (see https://github.com/rvisent/TeslaCoil_BT), relies on several clever tricks:

1. Audio preprocessing: The stereo 16-bit PCM audio at 48 kHz is converted to mono and processed through an automatic compressor with a peak detector. The compressor uses fast attack and very slow decay (~680 ms time constant) to maintain consistent AM modulation depth across varying program material while preserving musical dynamics.

2. DC offset and noise injection: A DC component equal to half the allowed peak range is added to the compressed audio, creating a waveform that varies from 0 to double the DC value—corresponding to the envelope of the desired AM modulation (0% to 100%). Additionally, a small amount of high-pass filtered pseudo-random noise (implemented as a derivative filter) is added to improve the sigma-delta encoder's performance during silence periods, preventing limit-cycle oscillations.

3. Sample rate conversion: The mono audio samples are linearly interpolated to the RF carrier frequency (default: 960 kHz). This is done efficiently using fixed-point Q15 arithmetic. A fractional counter tracks the audio sample phase, providing both the interpolation weight and determining when to advance to the next audio sample.

4. Sigma-delta encoding: The interpolated RF samples are processed through a second-order sigma-delta modulator, which converts them to a 1-bit representation. Every time 16 one-bit samples are ready, they are expanded to 32 bits by pairing each bit with an inserted zero, so that each sigma-delta bit occupies one half of a carrier period and the other half is always zero. This is the crucial step: during 100% modulation peaks, the output becomes a pure square wave at the carrier frequency. At intermediate modulation levels, pairs of zero bits appear pseudo-randomly in a pattern computed by the sigma-delta algorithm. The spectral content around the carrier becomes the desired AM modulation sidebands.

5. I2S output: The 32-bit sequences are collected in a buffer and sent to the I2S peripheral, which is configured for 32-bit operation with no synchronization bits and uses a single GPIO line. Each 32-bit word carries 16 carrier periods, so the I2S sample (word) rate is fRF×2/32 = 960 kHz×2/32 = 60 kHz in the default configuration. The I2S DMA engine handles the continuous streaming without CPU intervention.
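Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the firmware's implementation: the 48 kHz rate and the ~680 ms decay come from the description above, while the floating-point arithmetic, the instant attack, the PEAK_FLOOR gain limit, the first-order approximation of the decay factor, and all names are assumptions. The noise injection of step 2 is omitted for brevity.

```c
#include <stdint.h>

#define FS_AUDIO   48000.0f  /* audio sample rate (from the article) */
#define DECAY_TAU  0.680f    /* ~680 ms decay time constant (from the article) */
#define ENV_DC     0.5f      /* carrier level: half of the 0..1 envelope range */
#define PEAK_FLOOR 0.05f     /* assumed: caps the gain applied to quiet passages */

typedef struct {
    float peak;   /* peak detector state, in [PEAK_FLOOR, 1] */
    float decay;  /* per-sample decay factor, slightly below 1 */
} compressor_t;

void compressor_init(compressor_t *c)
{
    c->peak = PEAK_FLOOR;
    /* First-order approximation of exp(-1 / (tau * fs)). */
    c->decay = 1.0f - 1.0f / (DECAY_TAU * FS_AUDIO);
}

/* Convert one stereo sample into an AM envelope value in [0, 1]:
 * 0.5 is the unmodulated carrier, 0 and 1 are the 100 % modulation peaks. */
float envelope_from_stereo(compressor_t *c, int16_t left, int16_t right)
{
    float mono = ((float)left + (float)right) / 65536.0f;  /* in [-1, 1) */
    float mag = mono < 0.0f ? -mono : mono;

    /* Slow exponential release first, then instant attack, so that the
     * detected peak is never below the current sample magnitude. */
    c->peak *= c->decay;
    if (c->peak < PEAK_FLOOR)
        c->peak = PEAK_FLOOR;
    if (mag > c->peak)
        c->peak = mag;

    /* Normalize to the detected peak and add the DC (carrier) component. */
    return ENV_DC + ENV_DC * (mono / c->peak);
}
```

Because the peak is updated before the division, the normalized sample never exceeds ±1, so the envelope stays within 0 to twice the DC value, as step 2 requires.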

The mathematics behind the sigma-delta process ensures that, when viewed in the frequency domain, the carrier and its modulation sidebands appear correctly while pushing quantization noise to higher frequency offsets, where it is naturally filtered by the resonant antenna circuit.
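Steps 3 and 4 can be illustrated with the sketch below. It uses the textbook second-order loop (two integrators with 1-bit feedback), whose noise transfer function is (1 - z^-1)^2; the firmware's actual modulator topology may differ. The names, the integrator clamp limits, the circular envelope buffer, the MSB-first bit order, and the placement of the inserted zero after each bit are assumptions. The envelope is expected in Q15 format, from 0 to 32768.

```c
#include <stddef.h>
#include <stdint.h>

#define F_AUDIO 48000u    /* audio sample rate */
#define F_RF    960000u   /* carrier frequency = sigma-delta sample rate */
#define FULL    32768     /* Q15 full scale: envelope at a 100 % peak */

typedef struct {
    size_t   pos;     /* index of the current envelope sample */
    uint32_t phase;   /* fractional position, in [0, F_RF) */
    int32_t  i1, i2;  /* sigma-delta integrators */
    int32_t  fb;      /* previous quantizer output: 0 or FULL */
} am_mod_t;

static int32_t clamp32(int32_t v, int32_t lim)
{
    return v > lim ? lim : (v < -lim ? -lim : v);
}

/* One second-order sigma-delta step: envelope x in [0, FULL] -> one bit.
 * The clamps (assumed limits) keep the integrators bounded at full scale. */
static unsigned sd2_step(am_mod_t *m, int32_t x)
{
    m->i1 = clamp32(m->i1 + x - m->fb, 8 * FULL);
    m->i2 = clamp32(m->i2 + m->i1 - m->fb, 8 * FULL);
    unsigned bit = m->i2 > 0;
    m->fb = bit ? FULL : 0;
    return bit;
}

/* Produce the next 32-bit output word (16 carrier periods) from a
 * circular buffer of Q15 envelope samples. */
uint32_t am_next_word(am_mod_t *m, const uint16_t *env, size_t n_env)
{
    uint32_t word = 0;
    for (int k = 0; k < 16; k++) {
        int32_t s0 = env[m->pos];
        int32_t s1 = env[(m->pos + 1) % n_env];

        /* Q15 interpolation weight derived from the phase counter. */
        int32_t w = (int32_t)(((uint64_t)m->phase << 15) / F_RF);
        int32_t x = s0 + ((s1 - s0) * w) / FULL;

        /* Each sigma-delta bit b is emitted as the pair "b0". */
        word = (word << 2) | (sd2_step(m, x) << 1);

        /* Advance the audio phase; step to the next sample on wrap. */
        m->phase += F_AUDIO;
        if (m->phase >= F_RF) {
            m->phase -= F_RF;
            m->pos = (m->pos + 1) % n_env;
        }
    }
    return word;
}
```

With a constant full-scale envelope the function returns 0xAAAAAAAA, the 1010... pattern that forms a square wave at the carrier frequency; with a zero envelope it returns 0. The phase counter is kept in integer units of 1/F_RF so that the 20:1 rate ratio is tracked without rounding drift; a Q15 phase step would be cheaper but is not exact for this ratio.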

For testing and alignment purposes, the firmware includes an optional SIN_TEST mode activated by closing a jumper on GPIO 3. In this mode, the device generates pure sinusoidal test tones at 80% modulation depth, with the rotary switch selecting among eight frequencies ranging from 46.875 Hz to 4.6875 kHz. This feature proved invaluable for characterizing distortion and frequency response.

Performance and Results

The prototype performs remarkably well. Audio quality is very good with the chosen stream sources (256 kbps AAC-LC encoding). The automatic compressor maintains consistent modulation depth without audible pumping artifacts, and the slow decay time constant preserves the natural dynamics of music and speech.

Measured distortion in the final prototype (with MOS driver and slightly detuned resonator) is negligible across the audio frequency range. The sigma-delta encoding itself contributes imperceptible distortion, as confirmed using the built-in test tone generator and a spectrum analyzer on the demodulated audio. As can be seen in Figure 2, modulation sidebands at harmonics of the desired signal are very low, and the background noise level is lower still across the whole ±4.5 kHz AM channel width.

Station switching is fast: turning the rotary switch triggers a complete pipeline reset and new stream acquisition within a few seconds. The M3U8 bandwidth optimization and abort-on-change logic make this seamless user experience possible.

One design choice deserves a note: TLS/SSL certificate validation was deliberately disabled in the HTTPS client to reduce CPU overhead. Initially, I attempted to validate certificates, but the cryptographic overhead occasionally caused brief audio dropouts. Since the device connects only to well-known streaming services, I consider the security risk minimal, and the performance improvement is significant. Moreover, because this is an embedded device used purely for entertainment, even a successful attack would do little harm.

CPU utilization remains within limits. The dual-core architecture with priority-based task scheduling ensures that audio processing always takes precedence. Heap monitoring shows stable memory usage after initial buffer allocations. The 128 KB ring buffers for PCM and AAC frames provide adequate buffering against network jitter—red LED timeout flashes occur only occasionally during poor Wi-Fi conditions and recovery is automatic.
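To put the buffer size in perspective, the short calculation below converts a buffer size into seconds of audio. It assumes that each ring buffer is 128 KB with 1 KB = 1024 bytes, that the PCM buffer holds 48 kHz 16-bit stereo samples, and that the AAC stream runs at 256 kbit/s; the function name buffer_ms is hypothetical.

```c
#include <stdint.h>

/* Milliseconds of audio held by `bytes` of data at `bytes_per_sec`. */
uint32_t buffer_ms(uint32_t bytes, uint32_t bytes_per_sec)
{
    return (uint32_t)(((uint64_t)bytes * 1000u) / bytes_per_sec);
}

/* PCM: 48000 samples/s * 2 channels * 2 bytes = 192000 bytes/s.
 * AAC: 256000 bit/s / 8 = 32000 bytes/s. */
```

Under these assumptions, buffer_ms(128 * 1024, 192000) gives 682 ms of decoded PCM and buffer_ms(128 * 1024, 32000) gives 4096 ms of compressed AAC, so most of the protection against network jitter would come from the compressed side of the pipeline.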

Conclusion and Future Possibilities

This project demonstrates that modern microcontrollers like the ESP32-S3 can handle sophisticated real-time audio streaming and RF generation tasks that would have required dedicated DSP hardware just a few years ago. The combination of dual cores, PSRAM, Espressif's software AAC decoder, and flexible I2S peripherals makes the ESP32-S3 an ideal platform for this kind of experimentation.

The device serves its intended purpose well: vintage AM radios can now receive internet radio streams with excellent audio quality. It is also a starting point for future enhancements, such as connecting to different streaming sources, adding a graphical user interface, or improving the electronics to extend the range, to name a few.

For those interested in building this project or adapting it for their own applications, complete documentation, schematics (Figure 1), source code, and build instructions are available on GitHub at https://github.com/rvisent/stream2AM. An extensive CLAUDE.md documentation file is included, which makes it easy to modify the project with the help of the Claude AI.
The project is released under the Apache 2.0 license, encouraging both hobbyist and educational use.

Note: When operating any radio transmitter, even low-power unlicensed devices, please verify compliance with local regulations regarding permitted frequencies, power levels, and emission standards.