Rule-based Approach for Arabic Root Extraction: New Rules to Directly Extract Roots of Arabic Words

Fatma Abu Hawas, Keith E. Emmert

Abstract


Extracting word roots in Arabic is difficult because of the language's specific morphological and structural changes. Several techniques have been proposed to address this problem. This paper continues the work begun in [1] on identifying and exploiting relationships among Arabic letters for root extraction. Eight rules that detect root letters from the other letters in a word are proposed and tested; four of them draw on the idea of morphological substitution (mutation). The approach is evaluated on the words of the Holy Quran, and the results indicate that the root extraction algorithm is promising.


Keywords


rule-based stemmer, word root, suffixes, prefixes, word patterns



DOI: https://doi.org/10.2498/cit.1002174

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
