The Impact Of Stop Words Processing For Improving Extractive Graph-Based Arabic Text Summarization

21-04-2021 02:04

Text summarization is one application of natural language processing. It reduces the input text to its most important information, saving the user's time. Text summarization is composed of four stages: preprocessing, feature extraction, building the summarization model, and finally applying the summarization algorithms to extract the summary. This paper uses a graph-based algorithm, an approach used in many earlier studies. In the summarization of Arabic text, however, graph-based procedures still have low performance because of the complexity of the Arabic language. In the preprocessing stage, stop words are deleted according to a predefined list. This paper investigates the impact of stop words on summarization performance. The research is therefore carried out in two phases: the first with stop words and the second without them. The summarization ranking algorithms are then applied, and the summary is extracted according to a predefined compression ratio with redundancy removal. The following features are used: nouns, term frequency, and inverse document frequency. The system is evaluated on the EASC corpus. Summarization performance increases when stop words are removed.
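
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The stop-word list, the smoothed IDF formula, the cosine-similarity sentence graph, the PageRank-style ranking, and the redundancy threshold are all assumptions chosen for illustration, and the noun feature is omitted because it would require an Arabic part-of-speech tagger. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"في", "من", "على", "إلى", "عن", "و"}


def tokenize(sentence, remove_stop_words):
    """Split a sentence into word tokens, optionally dropping stop words."""
    tokens = re.findall(r"\w+", sentence)
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector per sentence (smoothed IDF, assumed)."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return [
        {t: c * math.log(1 + n / df[t]) for t, c in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(sim, damping=0.85, iterations=50):
    """PageRank-style scoring over a weighted sentence graph (assumed ranker)."""
    n = len(sim)
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in sim]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, compression_ratio=0.3, remove_stop_words=True,
              redundancy_threshold=0.8):
    """Return the top-ranked sentences, in document order, within the ratio."""
    if not sentences:
        return []
    n = len(sentences)
    vectors = tfidf_vectors([tokenize(s, remove_stop_words) for s in sentences])
    sim = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = rank(sim)
    budget = max(1, round(compression_ratio * n))
    chosen = []
    for i in sorted(range(n), key=lambda k: scores[k], reverse=True):
        if len(chosen) == budget:
            break
        # Redundancy removal: skip sentences too similar to one already chosen.
        if all(sim[i][j] < redundancy_threshold for j in chosen):
            chosen.append(i)
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize` once with `remove_stop_words=False` and once with `remove_stop_words=True` on the same document mirrors the two experimental phases. Comparing the outputs against reference summaries would require a separate evaluation step that is not shown here.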