Shachar Elisha, Texas A&M University – Commerce

A Computer Vision Based Driving Environment Descriptor: Voice Alerts to Drivers

Abstract: In this study, we address two problems: distracted driving and fatigued driving. Our approach is to equip drivers with an additional layer of intelligence that tells them to stop at stop signs, to keep within speed limits at discrete intervals, and to heed many other traffic-related warnings. We introduce a computer vision based driving environment descriptor that interprets the contents of images or videos from a live stream and recognizes the various entities in the environment. The descriptor is trained on the BDD100K dataset, which contains images and videos together with 5 textual descriptions for each set of driving environments, so that it learns to recognize traffic signals, stop signs, etc., and then generates voice alerts for the driver. To reach this goal, we apply a deep Convolutional Neural Network (CNN) designed by, and named after, the Visual Geometry Group (VGG). The VGG architecture is used to extract image features and feature maps that can identify the objects of interest in the image. The CNN passes the extracted features as a sequence of information to a Recurrent Neural Network (RNN), which uses the sequence to generate a text description of the image. To do so, the RNN applies a probabilistic approach that generates a sequence of words. In the next stage of the algorithm, the text description is converted into a spoken alert by gTTS (Google Text-to-Speech). To validate the capabilities of our driving environment descriptor system, we conduct and demonstrate a set of experiments using real data.
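
The last two stages of the pipeline, probabilistic word generation followed by text-to-speech, can be illustrated with a minimal sketch. The abstract does not specify the decoding strategy, so the greedy choice below is an assumption. The table NEXT_WORD_PROBS is a hypothetical toy stand-in for the RNN: in the described system the probabilities would depend on the VGG image features and on all previously generated words, not only on the previous word. The names greedy_decode and speak_alert are likewise illustrative and not taken from the authors' code.

```python
# Hypothetical toy table of next-word probabilities, standing in for the
# RNN decoder. A real decoder would condition on the VGG image features.
NEXT_WORD_PROBS = {
    "<start>": {"stop": 0.7, "speed": 0.3},
    "stop": {"sign": 0.8, "<end>": 0.2},
    "sign": {"ahead": 0.9, "<end>": 0.1},
    "speed": {"limit": 1.0},
    "limit": {"ahead": 0.6, "<end>": 0.4},
    "ahead": {"<end>": 1.0},
}


def greedy_decode(probs, max_len=10):
    """Pick the most probable next word at each step until <end> (assumed strategy)."""
    words = []
    prev = "<start>"
    for _ in range(max_len):
        candidates = probs[prev]
        prev = max(candidates, key=candidates.get)
        if prev == "<end>":
            break
        words.append(prev)
    return " ".join(words)


def speak_alert(text, path="alert.mp3"):
    """Save the caption as speech with gTTS (needs the gtts package and network access)."""
    from gtts import gTTS  # imported lazily so the decoding part runs without it

    gTTS(text=text, lang="en").save(path)
    return path


caption = greedy_decode(NEXT_WORD_PROBS)
print(caption)
```

Calling speak_alert(caption) would write the spoken alert to an MP3 file; playing that file back to the driver is outside the scope of this sketch.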

Presentation Author(s):
Shachar Elisha* and Kathiravan Natarajan
