Experimenting with Speech Recognition API

Published: 3/25/2025

Introduction

Welcome back, or welcome if it's your first time here. You know, it's not like I'm tracking who actually reads this. I'm not. Because I don't know how D: Now, that's sidetracking, and I just began writing, so we're in for something. So, why are we here again? Oh yeah, the app. PronounceIt.

Let me tell you, this was done mostly for me to practice my TypeScript and, once more, for the sake of learning more React. However, I'd like to share a bit of the process, as usual.

What is it and why?

PronounceIt was made to assist in learning how to (can you guess?) pronounce certain words. Right on. It's actually really simple in scope. You select or type a word of choice, press Start, and it will speak it back to you. I've seen it somewhere before... Google Translate, cough cough.

Well, mine allows you to record yourself and hear it back, HA! But mine does not translate, so take that, Google.

In short, it is pretty much a clone of the Google Translate speech feature, without the translation. It seems like I'm downplaying my own app, but that's not the case. The truth is, I think cloning existing features and applications is a great way to learn. You probably won't implement it the same way, so there's definitely a chance to learn. It makes you think of possible ways and tools. You are forced to search (maybe not as much nowadays, with AI) for ways to do something new, or maybe you find out that a library you read about a month ago can actually solve that problem, or you even discover a new API altogether.

What tools, what code? Tell us!

As with most of my projects, and I think it's fair to point this out here: hey, I used React and Tailwind CSS, ok? Like every other time since the good ol' days of the HTML/CSS/JS stack. Wild. And again, for a lot of these projects I'm skipping a lot of the necessary early work. I know that. I'm not doing any research or market study, nor am I doing branding strategy, or even the necessary thought process for UI/UX. Dead honest, for these projects my mindset is: what do I need in terms of frontend, backend and database? And the occasional deployment. I think this helps me focus on just learning the fundamentals of the tools I'm using. Then, once I'm comfortable, I move to other areas of need, like planning, designing, testing, DevOps and whatnot. I do want to bring more in-depth content eventually and touch on those topics, but that's for later.

Besides React and Tailwind CSS, the other main component of this project is TypeScript. I honestly don't have enough experience to talk about it, but I will say that it is clear why programmers adopted it so quickly. The benefits of statically typed languages are just too good. And as someone who started coding in C, I can tell you I actually miss it.

Again, I like to structure my code in an organized manner that makes sense to me. I made use of Layouts again, and I strongly recommend them. I might be wrong because I haven't worked with React for that long, and maybe someone can point out a better way, but this has been doing wonders for me. The code is actually simple: I have the Home page and four other components that make up the different parts of the app: WordSelector, Recorder, Feedback and Player. Home is where the states live! And the way it is set up makes so much sense to me. Home hands each piece only the prop it needs. This is the flow: WordSelector receives the setWord setter from Home. Recorder receives the setAudio setter from Home. Player receives the word state from Home. Feedback receives the audio from Home. That's it.

By doing these exercises I can really start to understand React. What it renders. How and when the component mounts. There are all these nuances to React that can make it really hard to grasp at first, but when it clicks, it clicks.
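To make that prop flow concrete, here is a framework-free TypeScript sketch of it. This is not the actual PronounceIt code: `createHome`, the `*Props` interfaces, and the choice of a plain string for the word and a URL string for the audio are all assumptions for illustration. In the real app, Home would hold these values with React state and pass them down as props.

```typescript
// Assumed value types: the word is a string, the recording is a URL string (or null).
type Word = string;
type AudioUrl = string | null;

// What each child receives, mirroring the flow described above.
interface WordSelectorProps { setWord: (word: Word) => void; }
interface RecorderProps { setAudio: (audio: AudioUrl) => void; }
interface PlayerProps { word: Word; }
interface FeedbackProps { audio: AudioUrl; }

interface HomeState { word: Word; audio: AudioUrl; }

// Hypothetical stand-in for Home: it owns the state and hands each
// child only the piece it needs (a setter or a value, never both).
function createHome(initial: HomeState) {
  let state: HomeState = { ...initial };
  return {
    wordSelectorProps: (): WordSelectorProps => ({
      setWord: (word) => { state = { ...state, word }; },
    }),
    recorderProps: (): RecorderProps => ({
      setAudio: (audio) => { state = { ...state, audio }; },
    }),
    playerProps: (): PlayerProps => ({ word: state.word }),
    feedbackProps: (): FeedbackProps => ({ audio: state.audio }),
  };
}

// Usage: the writers (WordSelector, Recorder) update Home's state,
// and the readers (Player, Feedback) see the new values.
const home = createHome({ word: "", audio: null });
home.wordSelectorProps().setWord("through");
home.recorderProps().setAudio("blob:demo-recording");
console.log(home.playerProps().word);    // "through"
console.log(home.feedbackProps().audio); // "blob:demo-recording"
```

The point of the split is that the components that write state never read it, and the ones that read it never write it, so Home stays the single source of truth.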

Make it Talk!

For text-to-speech I used a package that I found while in research mode on how to make it speak! I could have just used the browser API, and in hindsight I should have, but I did not. In all fairness, I'm happy with this package: it does what I wanted and it was really simple to implement. Use their custom hook and you are good to go!

You can try it out: react-text-to-speech
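As for the browser route I mentioned, it would go through the Web Speech API. Here is a minimal sketch of that alternative, not what PronounceIt ships: the `speakWord` helper and its default `lang` and `rate` values are made up for illustration, and it only does anything where the browser exposes `speechSynthesis`.

```typescript
// Hypothetical helper: speaks a word using the browser's built-in Web Speech API.
// Returns false when speech synthesis is unavailable (e.g. outside a browser).
function speakWord(word: string, lang: string = "en-US", rate: number = 0.9): boolean {
  if (typeof window === "undefined" || !("speechSynthesis" in window)) {
    return false;
  }

  const utterance = new SpeechSynthesisUtterance(word);
  utterance.lang = lang; // BCP 47 language tag; "en-US" is just an assumed default
  utterance.rate = rate; // slightly slower than normal to make pronunciation clearer

  // Drop anything still queued so repeated presses don't pile up.
  window.speechSynthesis.cancel();
  window.speechSynthesis.speak(utterance);
  return true;
}
```

In a component, this could be called from the Start button's click handler with the currently selected word.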

How was it and what did I learn?

Honestly, maybe out of the need to finish something different, I've cut a lot of corners on this project. I simplified a lot because, again, the goal was and is to keep experimenting with React and better understand it fundamentally. I think if I were to do this again, I would definitely spend a little more time refining some features. I would also build the text-to-speech using the browser API. But I didn't, and so this little app actually took me about 5 days to finish. I did learn how to better organize states in React, and why it matters so much to think about the different render conditions: when a component renders and what causes it to re-render. I learnt more about how React uses the Virtual DOM before actually committing to the real DOM. This was, by all means, a very short and simple project with a single objective: have someone pick a word and play it out to demonstrate pronunciation. And as of now, that's what it does, so I'm happy.

To the World!

Last but not least, I did deploy it. Of course I had to. Such a magical piece of technology; how could I ever keep it from you guys!? And to make sure my experimentations were on par, I deployed it to GitHub Pages. It helps that this is just a single-page application, because routing React applications on GitHub Pages can be downright annoying (for a beginner). To deploy to GitHub Pages you have to do a little configuring. First, install the gh-pages package. Next, you have to configure two files. The vite.config.ts:
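A minimal sketch of what that file can look like, assuming the standard `@vitejs/plugin-react` plugin (the plugin choice is an assumption; the part that matters for GitHub Pages is `base`):

```typescript
// vite.config.ts (sketch, assuming a Vite + React project)
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";

export default defineConfig({
  plugins: [react()],
  // Public base path the app is served from on GitHub Pages.
  // Replace <repo-name> with the repository name, keeping both slashes.
  base: "/<repo-name>/",
});
```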

Notice that you have to specify the exact name of your GitHub repository (<repo-name>).

Also, package.json, which needs a deploy script that publishes the build folder with gh-pages.

Finally, once I had all the necessary configurations, the app was ready to be deployed. I just had to run `npm run build` and `npm run deploy`. But wait, where do you find it? To actually publish it, I had to go to GitHub, where the Settings tab has the Pages configuration. There I had to select the right branch for GitHub to deploy from. By default, gh-pages publishes to the gh-pages branch, which was created when I ran the build and deploy commands.

Conclusion

The more you practice something, the easier it becomes. There's no way around it. Everyone knows. But no one wants to know, I guess. This project gave me a little boost of confidence because I was so much more comfortable using the tools and concepts. I think that's one of the best feelings, better even than finishing something: understanding something you were struggling with just a few months ago is tremendously empowering. So my conclusion is to actually be more forgiving with myself, and I strongly advise you to do the same. Take your time, spend time with your preferred coding language, build cool stuff, clone projects, build for you and for your friends, and do it because you can, not because you have to. This was really a more personal project, and I still, of course, have a lot to learn, but man, am I ready!

Take care.

0xd