Urban Scene Understanding with Text Recognition of Street View
Keywords:
Image Processing, Text Recognition and Extraction, Canny Edge Detection Technique, Contours, Optical Character Recognition, pyttsx3
Abstract
In developing regions and cities, driving conditions are often unpredictable and challenging because the surrounding urban scene is hard to interpret. Intelligent Transportation Systems (ITS) can benefit from the ability to understand the urban environment, particularly from the information carried by shop signboards and other scene text. We present a system that extracts such text from urban scene images for use in ITS applications. The system first converts each image to grayscale, which simplifies the subsequent processing steps. It then applies contour detection to identify candidate text regions, and uses the Canny Edge Detection technique to filter out non-text regions. Next, Optical Character Recognition (OCR) digitizes the textual content of the remaining regions; the accuracy of this step determines how reliable the extracted information is for ITS applications. Finally, the system uses pyttsx3, a text-to-speech synthesis library, to convert the recognized text into audible speech. This lets users listen to the extracted text and keep their attention on the road while still receiving the information. By combining image preprocessing, contour detection, OCR, and text-to-speech synthesis, the developed system improves urban scene understanding in developing regions and cities.
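The stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the stages are wired together, so the sketch assumes one common arrangement (Canny edges first, contours extracted from the edge map, then a geometric filter on bounding boxes). The function names, the Canny thresholds, and the area and aspect-ratio limits are hypothetical choices made for illustration, and the use of `pytesseract` as the OCR engine is an assumption, since the abstract names only pyttsx3.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def looks_like_text(box: Box, image_area: int,
                    min_area_frac: float = 0.001,
                    max_area_frac: float = 0.5,
                    min_aspect: float = 1.0,
                    max_aspect: float = 15.0) -> bool:
    """Heuristic filter: keep boxes that are wider than tall and neither
    tiny nor image-sized. All thresholds are illustrative assumptions."""
    _, _, w, h = box
    if w <= 0 or h <= 0:
        return False
    area_frac = (w * h) / image_area
    aspect = w / h
    return (min_area_frac <= area_frac <= max_area_frac
            and min_aspect <= aspect <= max_aspect)


def filter_boxes(boxes: Iterable[Box], image_area: int) -> List[Box]:
    """Return only the boxes that pass the text-likeness heuristic."""
    return [b for b in boxes if looks_like_text(b, image_area)]


def read_signboard_text(image_path: str) -> List[str]:
    """Grayscale -> Canny -> contours -> box filter -> OCR."""
    # Imported here so the pure-Python filter above works without OpenCV.
    import cv2
    import pytesseract  # assumed OCR engine; needs the Tesseract binary

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)  # thresholds are assumptions

    # [-2] selects the contour list in both OpenCV 3.x and 4.x.
    contours = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    boxes = [cv2.boundingRect(c) for c in contours]

    height, width = gray.shape
    texts = []
    for x, y, w, h in filter_boxes(boxes, width * height):
        text = pytesseract.image_to_string(gray[y:y + h, x:x + w]).strip()
        if text:
            texts.append(text)
    return texts


def speak(texts: Iterable[str]) -> None:
    """Read the recognized strings aloud with pyttsx3."""
    import pyttsx3

    engine = pyttsx3.init()
    for text in texts:
        engine.say(text)
    engine.runAndWait()
```

Calling `speak(read_signboard_text("street.jpg"))` on a hypothetical image file would run the whole chain. The box filter is kept as a separate pure function so that its thresholds can be tuned and checked without loading an image.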
License
Copyright (c) 2023 Sagar S. Udgiri, Pratik P. Mahajan, Onkar V. Narwade, Snehal V. Lokhande
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.