The ESP32-ChatGPT terminal is talking now: it speaks its output through a 4 ohm speaker!

Talking ESP32 – ChatGPT terminal

Prelude: For those who have seen my ESP32 ChatGPT terminal, this project will be a cakewalk. Just add an I2S sound module, typically a MAX98357A module, which is inexpensive, and you are ready to rock and roll! Two varieties of the MAX98357A module are available in the market [robu.in, amazon.in], priced from INR 500 [USD $7] to INR 700 [USD $9]. Both work well on a 3.3 V to 5 V supply, with output power varying from 1 W to 3.2 W into a 4 ohm loudspeaker and with remarkably low THD. The output is loud and free of audible distortion; it has to be heard to be believed. The two varieties are shown here.

The project: Making ChatGPT talk with an ESP32 had been in the queue for some time. Only the breakthrough of using Google TTS [Text-To-Speech] with the ESP32 made the task simple. Earlier I tried a few freely available hardware-controlled sound libraries, such as ESP8266SAM.h, AudioOutputI2S.h and AudioGeneratorRTTTL.h. All of them are available on GitHub and work on the ESP32, but the sound quality of ESP8266SAM.h is very disappointing: in a long speech the words are hard to make out, as if it were speaking through a long metal pipe. AudioGeneratorRTTTL.h produces beautiful ringtones, but it cannot speak. Therefore, I had to fall back on Google TTS!

Only three GPIO pins and an Internet connection are required to make the I2S amplifier work with Google TTS. Four header files [also available on GitHub] are required as well to kick the amplifier to life! These header files have been put into a zip file so that you can work on this project. Google TTS will speak 1 million characters per month free on your account. Speaking “EFY” costs 3 characters, but speaking “Electronics for You” costs 17 characters, or 19 if the two spaces are counted! Also, Google will speak only 200 characters at a time. Going beyond 200 characters at a time requires a paid Google account, which I certainly want to avoid!
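The quota arithmetic above can be sketched in a few lines of C++. This is only an illustration under the figures quoted in the text (1 million free characters per month, 200 characters per request); the names `kFreeCharsPerMonth`, `kMaxCharsPerRequest`, `spokenChars` and `fullRequestsPerMonth` are hypothetical and are not part of the project sketch. It counts every byte of the string, spaces included.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Figures quoted in the text; treat them as assumptions about your own account.
constexpr std::size_t kFreeCharsPerMonth = 1000000;
constexpr std::size_t kMaxCharsPerRequest = 200;

// Characters actually spoken for one utterance: every byte counts,
// spaces included, capped at the per-request limit.
std::size_t spokenChars(const std::string& text) {
    return std::min<std::size_t>(text.size(), kMaxCharsPerRequest);
}

// Number of full-length (200-character) utterances that fit in the free quota.
constexpr std::size_t fullRequestsPerMonth() {
    return kFreeCharsPerMonth / kMaxCharsPerRequest;
}
```

With these figures, the free quota covers 5000 full-length utterances a month, which is plenty for a desk-top terminal.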

Talking ESP32-ChatGPT terminal: You ask a question, seek an answer, or ask for a snippet of code. You type the question on a PS/2 keyboard connected to the ESP32, and it appears on the TFT screen. The ESP32 then sends the question to ChatGPT, and the answer obtained from ChatGPT is shown on the same TFT. At the same time, the loudspeaker connected to the I2S amplifier [MAX98357A] speaks the answer aloud! You don’t need a multimedia computer for this work! Of course, you need a Wi-Fi Internet connection and a secret key to access the ChatGPT API. This key is the single point of access to the openai.com API; no other login or password is required.
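Whatever is typed on the keyboard has to be wrapped in a JSON body before it is posted to the API, and quotes or line breaks in the question must be escaped first. The following is a minimal sketch of that step in plain C++, not the code from the project sketch. The chat-style body layout and the model name `gpt-3.5-turbo` are assumptions, and `jsonEscape` and `buildChatRequest` are hypothetical helper names.

```cpp
#include <cstdio>
#include <string>

// Escapes characters that may not appear raw inside a JSON string.
std::string jsonEscape(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters become \u00XX escapes.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Builds a chat-style request body. The model name is an assumption;
// use whichever model your own account and sketch are set up for.
std::string buildChatRequest(const std::string& question,
                             const std::string& model = "gpt-3.5-turbo") {
    return "{\"model\":\"" + jsonEscape(model) +
           "\",\"messages\":[{\"role\":\"user\",\"content\":\"" +
           jsonEscape(question) + "\"}]}";
}
```

On the ESP32 the resulting string would be sent in an HTTPS POST with the secret key in the authorization header; that part depends on the HTTP client library used and is left out here.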

You may ask why I’m using an old PS/2 keyboard instead of a USB keyboard. The short answer is that I tried a sleek, small USB keyboard but could not get it to talk to the ESP32 at all. The same goes for the sleekest Bluetooth keyboard.

Personal / Secret Key: Here’s how to get a secret key for your openai.com access.

BOM:

| Item | Price / Source
| ESP32 MCU | INR 550 / USD $7 / robu.in, aliexpress.com, amazon.in
| PS/2 keyboard |
| 3.5” TFT | INR 1500 / USD $20 / robu.in, aliexpress.com, amazon.in
| Wires, PCB, connectors | robu.in, amazon.in, aliexpress.com
| 7805 5 V regulator IC | INR 20 / USD $0.20 / robu.in, aliexpress.com, amazon.in
| MAX98357A | INR 500 to 700 / USD $7 to $9

Connections for 3.5” TFT:


| TFT Pin | ESP32 Pin | TFT Pin | ESP32 Pin
| CS | 33 | D2 | 26
| DC | 15 | D3 | 25
| RST | 32 | D4 | 16 [RX2]
| WR | 4 | D5 | 17 [TX2]
| RD | 2 | D6 | 27
| D0 | 12 | D7 | 14
| D1 | 13 | VCC / 5 V | 5 V regulator output
| Gnd | Gnd | |

Note: connect the ESP32 Gnd to the 5 V regulator Gnd, and the ESP32 Vin to the 5 V regulator output.

| PS/2 Keyboard Pin | ESP32 Pin
| Data pin | 35
| Clock pin [IRQ pin] | 34
| 5 V | 5 V regulator output
| Ground | Ground


Connections for PS/2 keyboard: A PS/2 keyboard, like a USB keyboard, needs only 4 pins to connect to an MCU: 5 V, ground, a data pin and a clock pin [or IRQ pin]. Since the keyboard is used here purely as an input device, GPIO 34 and 35 of the ESP32 can be connected to it. GPIO 34 and 35 are input-only pins; they cannot drive an output. Never use them for TFTs, LEDs, relays and similar devices, which need a signal out, unlike the keyboard, which only sends data in. Of course, you can connect a PS/2 keyboard to other GPIO pins as well.
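To see what actually travels over the data and clock pins: in the standard PS/2 protocol each byte from the keyboard arrives as an 11-bit frame, made up of a start bit (0), eight data bits with the least significant bit first, an odd parity bit and a stop bit (1). The project itself relies on a keyboard library for this, so the following is only an illustrative sketch of decoding one frame; `decodePs2Frame` is a hypothetical name, and the frame is assumed to have been collected already, one bit per clock edge.

```cpp
#include <cstdint>

// Decodes one 11-bit PS/2 frame. `frame` holds the bits in arrival order:
// bit 0 = start, bits 1..8 = data (LSB first), bit 9 = parity, bit 10 = stop.
// Returns true and writes the scan code if the frame is valid.
bool decodePs2Frame(std::uint16_t frame, std::uint8_t& scanCode) {
    const bool startBit = (frame & 0x1) != 0;
    const bool stopBit = ((frame >> 10) & 0x1) != 0;
    if (startBit || !stopBit) return false;  // start must be 0, stop must be 1

    const std::uint8_t data = static_cast<std::uint8_t>((frame >> 1) & 0xFF);
    const bool parityBit = ((frame >> 9) & 0x1) != 0;

    // Odd parity: data bits plus the parity bit must contain an odd number of ones.
    int ones = parityBit ? 1 : 0;
    for (int i = 0; i < 8; ++i) ones += (data >> i) & 1;
    if (ones % 2 == 0) return false;

    scanCode = data;
    return true;
}
```

For example, the frame 0x438 carries the byte 0x1C, which is the make code for the A key in scan code set 2. Because every bit flows from the keyboard to the MCU in this direction, input-only pins are sufficient for reading keystrokes.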

Project testing: The delays that I’ve used inside the loops are chosen with care. You may change them, but start with my values first, and adjust them only once you have a feel for the response times. I started with a silly question: ‘who r u?’ ChatGPT meticulously produced its self-introduction on the screen, and the speaker spoke it out nicely.

Then I raised the level of the questions: I asked for ‘5 sentences about Elektor magazine’ and ‘the distance between earth and sun’. Every time it understood the question clearly and answered it meticulously, and the speaker reproduced the answer clearly and loudly!

The hardest questions that I used were ‘Write 5 sentences about NTPC’, ‘Write 5 sentences about Google.com’ and so on.

In all the tests, ChatGPT came through with flying colors.

However, these answers being longer than 200 characters, Google TTS refused to speak them. Therefore, for Google TTS the answer string is trimmed to 200 characters: the full answer is shown on the screen, but only the first 200 characters are spoken!
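The trimming step can be sketched as follows. This is not the code from the project sketch, just one plausible way to do it in plain C++: `trimForTts` is a hypothetical helper that cuts the answer at the last space at or before the limit, so that the speech does not stop in the middle of a word. It counts bytes, which matches characters only for plain ASCII text.

```cpp
#include <cstddef>
#include <string>

// Returns at most `limit` bytes of `text` for the TTS request.
// The cut is moved back to the last space so that no word is split;
// if there is no usable space, the text is cut hard at `limit`.
std::string trimForTts(const std::string& text, std::size_t limit = 200) {
    if (text.size() <= limit) return text;       // short answers pass unchanged
    std::size_t cut = text.rfind(' ', limit);    // last space at or before `limit`
    if (cut == std::string::npos || cut == 0) cut = limit;
    return text.substr(0, cut);
}
```

The full string still goes to the TFT; only the value returned by the helper would be handed to the speech call.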

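Project Schematic: Note that the ‘gain’ pin of the MAX98357A is connected to ground to increase the output power. Removing this connection will reduce the output power slightly (per the MAX98357A datasheet, the gain is 12 dB with the pin grounded and 9 dB with it left unconnected).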

Software: The Arduino sketch is included, and the Python code is also included. For those who have tried my ESP32 ChatGPT terminal, this program is far easier: I have just added a few lines to the ESP32 ChatGPT terminal sketch. See them in the Arduino sketch.

#include "Audio.h"

// Map the GPIO pins
#define I2S_DOUT  21
#define I2S_BCLK  22
#define I2S_LRC   23
// setup() must hand these pins to the library, e.g.
// audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);

// Create an instance of the Audio class
Audio audio;

// Optional callback: prints the library's status and error messages.
void audio_info(const char *info) {
  Serial.print("audio_info: ");
  Serial.println(info);
}

void loop() {
  // Keep the audio stream running while waiting for input
  while (!Serial.available() && !keyboard.available()) audio.loop();
  // Speak the text held in 'response' in English
  audio.connecttospeech(response.c_str(), "en");
  ……
}

Project Picture: attached

Video:  attached

Aftermath: The next modification this project warrants is making it fully voice interactive, so that it listens to our questions and then answers like an obedient robot! That is left to the lab readers for now!


Bye bye

S. Bera / Kolkata 06/03/2024