Text Detection and Recognition for Robot Localization



Authors	Raisi Z., Zelek J.
Source	Journal of Electrical and Computer Engineering Innovations - 2024 - Volume: 12 - Issue: 1 - Pages: 163-174
Abstract	Background and Objectives: Signage is everywhere, and a robot should be able to take advantage of signs to help it localize (including visual place recognition (VPR)) and map. Robust text detection and recognition in the wild is challenging because of pose, irregular text instances, illumination variations, viewpoint changes, and occlusion. Methods: This paper proposes an end-to-end scene text spotting model that simultaneously outputs the text string and its bounding boxes. The proposed model combines a pre-trained vision transformer (ViT) backbone with a multi-task transformer-based text detector that is better suited to the VPR task. Our central contribution is an end-to-end scene text spotting framework that adequately captures irregular and occluded text regions in a variety of challenging places. To address the occlusion problem, we first equip the ViT backbone with a masked autoencoder (MAE) so that it can capture partially occluded characters. We then use a multi-task prediction head so that the model can handle arbitrarily shaped text instances with polygon bounding boxes. Results: The performance of the proposed architecture for VPR was evaluated in several experiments on the challenging Self-Collected Text Place (SCTP) benchmark dataset, using the well-known precision-recall metrics. On this benchmark, the final model achieved recall = 0.93 and precision = 0.8. Conclusion: The initial experimental results show that the proposed model outperforms state-of-the-art (SOTA) methods on the SCTP dataset, which confirms the robustness of the proposed end-to-end scene text detection and recognition model.
Keywords	text detection, text recognition, robotics localization, deep learning, visual place recognition
Address	University of Waterloo, Canada. Chabahar Maritime University, Iran, University of Waterloo, Systems Design Engineering Department, Canada
Email	j.zelek@uwaterloo.ca
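
The abstract reports precision and recall on SCTP but no combined score. As a minimal sketch, the snippet below shows how the two metrics are defined from true-positive, false-positive, and false-negative counts, and derives an F1 score from the reported values. The function names and the example counts are illustrative assumptions, and the F1 value is computed here for convenience; it is not a figure reported by the authors.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from match counts.

    tp: predictions that match a ground-truth text instance
    fp: predictions with no matching ground truth
    fn: ground-truth instances that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the definitions.
p, r = precision_recall(tp=8, fp=2, fn=2)
print(f"example precision={p:.2f}, recall={r:.2f}")

# F1 derived from the values stated in the abstract (not author-reported).
derived_f1 = f1_score(0.8, 0.93)
print(f"derived F1 = {derived_f1:.2f}")
```

With precision = 0.8 and recall = 0.93, the harmonic mean comes out at roughly 0.86.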


