Empirical Evaluation of Well-Known Farsi OCR Engines on the IDPL-PFOD Dataset
   
Authors: Hosseini Fatemeh-Sadat, Shabaninia Elham, Nezamabadi-pour Hossein
Source: Power, Control, and Data Processing Systems - 2025 - Volume: 2 - Issue: 1 - Pages: 35-47
Abstract: Optical character recognition (OCR), also referred to as text recognition, extracts text from scanned documents, camera images, and similar sources. OCR has numerous applications, including reading forms and cheques, converting archived documents to digital files, and reading books and papers. An accurate OCR system speeds up these processes by removing time-consuming manual tasks. However, OCR is challenging, especially for languages such as Farsi, because of the intrinsic characteristics of the language and the scarcity of resources such as suitable datasets for evaluating the effectiveness and efficiency of proposed methods. IDPL-PFOD is a new synthetic dataset of printed Farsi text that offers a wide range of variations, including different backgrounds, font types, distortions, and blurs. In this paper, two OCR engines, Tesseract and EasyOCR, are evaluated on the IDPL-PFOD dataset to show the limitations of existing OCR engines for Farsi. Evaluations using standard metrics reveal that Tesseract and EasyOCR achieve overall accuracies of 84.41% and 73.28%, respectively, on this dataset. Furthermore, the robustness of the two engines is evaluated against different variations such as textured backgrounds, salt-and-pepper noise, Gaussian blur, and distortions. This paper provides insights to the community by reviewing the current challenges of deep learning methods for Farsi OCR, and it serves as a foundation for further research.
Keywords: IDPL-PFOD dataset, optical character recognition (OCR), EasyOCR, Tesseract, OCR engines, Farsi OCR
Affiliations: Shahid Bahonar University of Kerman, Department of Electrical Engineering, Iran; Graduate University of Advanced Technology, Department of Applied Mathematics, Iran; Shahid Bahonar University of Kerman, Department of Electrical Engineering, Iran
Email: nezam_h@yahoo.com
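
The abstract reports overall accuracy under "standard metrics" without naming them. As an illustration only, the sketch below computes a character-level accuracy from the edit distance between ground-truth and recognized strings, which is one common way to score OCR output. The function names, the metric definition, and the sample strings are assumptions made for this example and are not taken from the paper.

```python
from typing import Iterable, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference and a hypothesis string."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Assumed metric: 1 - (total edit errors / total reference characters)."""
    pairs = list(pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    # Clamp at zero: a very long hypothesis can produce more errors than characters.
    return max(0.0, 1.0 - total_errors / total_chars)


# Hypothetical (ground truth, OCR output) pairs, not drawn from IDPL-PFOD.
samples = [("سلام", "سلام"), ("کتاب", "کناب")]
score = character_accuracy(samples)
```

In practice the hypothesis strings would come from running an engine such as Tesseract or EasyOCR on each image; that step is omitted here so the sketch stays dependency-free.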
 
     
   
Copyright 2023
Islamic World Science Citation Center
All Rights Reserved