Digital Repository

 2021

 Convolutional Neural Network Model for Recognizing Machine-Printed/Handwritten Arabic Text

 Alghamdi, Fatimah Abdulrahman



Convolutional Neural Network Model for Recognizing Machine-Printed/Handwritten Arabic Text

Other titles: Convolutional Neural Network Model for Arabic Machine-Printed/Handwritten Script Identification
Publisher: Umm Al-Qura University
Place of publication: Makkah
Publication date: 2021 (1442 AH)
Description: 114 leaves.
Item type: Master's thesis
Language: English
Source: King Abdullah bin Abdulaziz University Library
Appears in collections: Updated Academic Theses

A large portion of scanned documents, such as application forms, question papers, bank checks and historical documents, contain both machine-printed and handwritten text. As a result, text classification techniques have become essential in document image analysis and recognition. Text classification is the process of assigning categories to text according to its content. This research used deep-learning techniques to distinguish Arabic handwritten text from machine-printed text. Machine learning offers a wide range of tools, techniques and frameworks for this task, including convolutional neural networks (CNNs). The work had three objectives: first, to prepare Arabic text document images for the classification stage; second, to propose CNN models that classify Arabic machine-printed and handwritten scripts; third, to determine the best of the proposed models through performance evaluation. The proposed approach consists of two stages: pre-processing and classification. Two datasets were used: the Khatt dataset for handwritten text and a self-collected dataset for machine-printed text. In the pre-processing stage, several methods were applied to both datasets to prepare them for classification. Next, three CNN models with different architectures were designed, built, trained and tested on the prepared dataset. The prepared dataset was split into a training portion (66%) and a testing set (34%), and the training portion was then divided equally into a training set and a validation set. Finally, the performance of the three models was evaluated and compared to identify the best architecture, i.e. the one with the highest accuracy and the lowest loss.
The best-performing model, with two convolution layers and two pooling layers, achieved 95.57% accuracy on the prepared document images of Arabic machine-printed and handwritten text, compared with 93.61% and 94.70% for the first and second models, respectively.
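
The split described in the abstract (66% for training and 34% for testing, with the training portion then halved into training and validation sets) can be sketched as follows. This is a minimal illustration, not the thesis code: the function name `split_dataset`, the random unstratified shuffle, the seed, and the placeholder file names are all assumptions, since the abstract does not say how samples were assigned to each portion.

```python
import random


def split_dataset(samples, seed=0, train_val_percent=66):
    """Split samples into (train, validation, test) lists.

    The percentages follow the abstract: 66% of the data forms the
    training portion and the remaining 34% the test set; the training
    portion is then divided equally into training and validation sets.
    """
    shuffled = list(samples)
    # Assumption: a seeded, unstratified random shuffle. The abstract does
    # not state how the samples were ordered or balanced before splitting.
    random.Random(seed).shuffle(shuffled)

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train_val = len(shuffled) * train_val_percent // 100
    train_val, test = shuffled[:n_train_val], shuffled[n_train_val:]

    # Divide the training portion equally; an odd leftover sample
    # goes to the validation set.
    n_train = len(train_val) // 2
    train, validation = train_val[:n_train], train_val[n_train:]
    return train, validation, test


if __name__ == "__main__":
    # Hypothetical identifiers standing in for the prepared document images.
    images = [f"image_{i:03d}.png" for i in range(100)]
    train, validation, test = split_dataset(images)
    print(len(train), len(validation), len(test))  # 33 33 34
```

With 100 samples this yields 33 training, 33 validation and 34 test items, so under this reading the training and validation sets each hold roughly a third of the full dataset.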

Title: Convolutional Neural Network Model for Recognizing Machine-Printed/Handwritten Arabic Text
Other titles: Convolutional Neural Network Model for Arabic Machine-Printed/Handwritten Script Identification
Authors: Almotairi, Khalid; Alghamdi, Fatimah Abdulrahman
Subjects: Computer science Research Methodology
Publication date: 2021
Publisher: Umm Al-Qura University
Description: 114 leaves.
Link: http://dorar.uqu.edu.sa//uquui/handle/20.500.12248/131057
Appears in collections: Updated Academic Theses

Files in this item:
File | Description | Size | Format | Access
24795.pdf | Full thesis | 2.29 MB | Adobe PDF | Restricted access
absa24795.pdf | Thesis abstract (Arabic) | 120.53 kB | Adobe PDF | Restricted access
abse24795.pdf | Thesis abstract (English) | 79.19 kB | Adobe PDF | Restricted access
cont24795.pdf | Table of contents | 101.4 kB | Adobe PDF | Restricted access
indu24795.pdf | Introduction | 223.71 kB | Adobe PDF | Restricted access



All items in the digital library are protected by copyright, unless otherwise stated.