An Improved Text Extraction Approach With Auto Encoder for Creating Your Own Audiobook

Angelin Gladston, Shakkthi Rajkumar, Shruthi Muthukumar, Aparna S. S.

OAI: oai:igi-global.com:289570 • DOI: 10.4018/IJIRR.289570

Published by: IGI Global Scientific Publishing

Abstract

As we all know, listening makes learning easier and more interesting than reading. An audiobook is software that converts text to speech. Although this sounds good, the audiobooks available on the market are neither free nor feasible for everyone. In addition, these audiobooks are meant only for fictional stories, novels, or comics. A comprehensive review of the available literature shows that very little intensive work has been done on image-to-speech conversion. In this paper, we employ various strategies across the entire process. As an initial step, deep learning techniques are used to denoise the images that are fed to the system. This is followed by text extraction with the help of OCR engines. To further improve the quality of the extracted text, a post-processing spell-check mechanism is incorporated. Our result analysis demonstrates that, with denoising and spell checking, our model achieves an accuracy of 98.11%, compared to 84.02% without any denoising or spell-check mechanism.
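
The spell-check and evaluation stages mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the spell checker or define the accuracy metric, so the fuzzy matching via the standard-library `difflib` module, the tiny `VOCAB` list, the helper names `correct_word`, `spell_check`, and `word_accuracy`, and the position-wise word accuracy are all assumptions made for illustration. The denoising and OCR stages are omitted, and the OCR output is a hand-written stand-in.

```python
import difflib

# Hypothetical vocabulary; a real system would load a full dictionary.
VOCAB = ["listening", "makes", "learning", "easier", "than", "reading"]


def correct_word(word, vocab=VOCAB, cutoff=0.75):
    """Return the closest vocabulary entry, or the word itself if none is close."""
    lowered = word.lower()
    if lowered in vocab:
        return word
    matches = difflib.get_close_matches(lowered, vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def spell_check(text, vocab=VOCAB):
    """Apply word-by-word correction to whitespace-separated text."""
    return " ".join(correct_word(w, vocab) for w in text.split())


def word_accuracy(predicted, reference):
    """Fraction of reference words reproduced at the same position (assumed metric)."""
    pred, ref = predicted.split(), reference.split()
    hits = sum(p == r for p, r in zip(pred, ref))
    return hits / len(ref) if ref else 1.0


# Hand-written stand-in for noisy OCR output; no OCR engine is called here.
reference = "listening makes learning easier than reading"
ocr_output = "listenlng makes leaming easier than readinq"
corrected = spell_check(ocr_output)

print(word_accuracy(ocr_output, reference))
print(word_accuracy(corrected, reference))
```

In this toy example the three misrecognized words are restored, so accuracy against the reference rises after correction. The figures it prints are unrelated to the 98.11% and 84.02% results reported in the abstract.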

Information Retrieval • Library and Information Science