Shreyas Reddy

Work place: International Institute of Information Technology, Bhubaneswar, Odisha, India

E-mail: b419056@iiit-bh.ac.in

Research Interests: Deep Learning, Machine Learning

Biography

Shreyas Reddy is currently pursuing a bachelor's degree in Information Technology at the International Institute of Information Technology, Bhubaneswar, India. His research interests include designing efficient machine learning and deep learning models for various business cases. He also has experience working on projects related to upper arm prosthetics and gait analysis. He will graduate with a bachelor's degree from IIIT in 2023.

Author Articles
An Integrated Pipeline with Internal Image Processing for Efficient Image to Text to Speech Conversion

By Shreyas Reddy, Rashmi Ranjan Das, Anjali Mohapatra

DOI: https://doi.org/10.5815/ijem.2023.06.01, Pub. Date: 8 Dec. 2023

Optical Character Recognition (OCR) systems are tools that help computers read text from images of documents. They make it easier for machines to understand what the words say without needing a person to read them aloud. OCR allows for easy digitization of historical documents, archival material, and medical records, thereby reducing their retrieval times. However, the accuracy of OCR systems relies heavily on the quality of the input images. To reduce the effect of input image quality on OCR accuracy, in this paper we propose an image pre-processing pipeline integrated with the OCR system that enhances the quality of input images for efficient image-to-text conversion. This method produces easily understandable text output with a lower Character Error Rate (CER) than current methods. In addition, we explore a technique for converting text from a document or image into machine-readable form and then converting it to audio output using gTTS, a Python library that interfaces with Google Translate's text-to-speech API. We assess the effectiveness of this approach and show that it substantially enhances OCR precision compared to other existing methods. This paper presents a clear overview of the development phases and significant obstacles, accompanied by comparisons of results achieved through various methods.
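
For readers unfamiliar with the Character Error Rate mentioned in the abstract, the following is a minimal sketch of how the metric is commonly computed: the character-level edit distance between the OCR output and a reference transcription, divided by the reference length. The function names `edit_distance` and `character_error_rate` are illustrative assumptions, and this is not the authors' evaluation code; the paper may apply normalization steps that are not shown here.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion from reference
                curr[j - 1] + 1,     # insertion into reference
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one substituted character in an 11-character reference.
    print(round(character_error_rate("hello world", "hallo world"), 4))  # 0.0909
```

A lower value means the OCR output is closer to the reference text; a value of 0 means the two strings are identical.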

Other Articles