Building a summarization model for the Georgian language using deep learning
Author: Ana Shvelidze
Keywords: text summarization, Georgian datasets, PEGASUS, extractive text summarization, agglutinative languages
Abstract:
This paper concerns the generation of summaries of Georgian-language articles, research papers, and other types of documents using deep learning methods. Summarizing a text involves extracting the important parts of the document and combining them without losing the main information it presents; the result is the original text in abbreviated form. Georgian is an agglutinative language with a rather complex structure, in which word formation and inflection rely heavily on affixation. Because of this feature, existing methods and algorithms must be modified or new approaches developed. Implementing the task presented in this paper therefore required adapting existing models to Georgian, which in turn demanded additional resources and experience. Notably, there are currently no pretrained models for the Georgian language that perform this task. The model has produced results that provide a basis for further refining and developing the system, with the aim of producing high-quality summaries of any type of text written in Georgian. The paper also discusses various text summarization systems, methodologies, and tools that help explain how short summaries are generated from long texts using data science methods, and it describes the process of building a concise, coherent, and well-organized summary of a long document that emphasizes the text's important points.