Document Details

Document Type : Thesis 
Document Title : Agent Based Information Retrieval System
نظام ارجاع المعلومات المعتمد على الوكيل البرمجي
Subject : Faculty of Computing and Information Technology-Computing Sciences 
Document Language : Arabic 
Abstract : The Web has become the largest readily available repository of data. It is therefore natural to extract information from it, and Web search engines have become one of the most widely used tools on the Internet. Search engines let the user search for and retrieve information simply and easily using terms such as a phrase or keyword. A search engine retrieves from its database the web pages that match the search terms entered by the searcher. However, while search engines are good for certain search tasks, such as finding the home page of an organization, they may be less effective at satisfying broad or ambiguous queries. Results on different subtopics or meanings of a query are mixed together in the list, so the user may have to sift through a large number of irrelevant items to locate those of interest. In addition, search results contain many duplicate or near-duplicate web pages. This thesis addresses and solves two main problems: near duplication and word sense ambiguity (multiple meanings). Word ambiguity and a large number of near duplicates may lead to poor performance in Information Retrieval (IR) systems. Near Duplicate Document (NDD) detection is used to solve the duplication problem. Removing NDDs has several advantages: it shortens the search result list, which reduces search time and lets users find what they need as quickly as possible. Search results clustering is an attempt to solve the multiple-meaning problem by automatically organizing the linear list of document references returned by a search engine into a set of meaningful thematic categories. Designing a Web search clustering algorithm is a big challenge because readable and unambiguous labels for the thematic groups are an important factor in the overall quality of clustering; labels should also describe the search result cluster well.
A multi-agent based information retrieval system is proposed to enhance the search process by adding clustering and filtering components. A cluster agent based component was added in which agents process the web search engine results by clustering them into the relevant synonym categories for a given query. WordNet was employed to classify the results on the search engine result page into the appropriate synonym category according to WordNet synsets. A filtering agent based component was embedded to eliminate near-duplicate data references. The experiments show that the proposed filtering component achieves a precision of over 96% and a recall of over 97%, and that the proposed clustering component achieves a cluster title quality of over 92%.
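The abstract does not say which near-duplicate detection method the filtering component uses. As a rough illustration only, the sketch below assumes one common approach: word shingling with a Jaccard similarity threshold. The function names, the shingle size `k`, the `threshold` value, and the sample snippets are all hypothetical and are not taken from the thesis.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text (assumed representation)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Short texts are represented by a single shingle holding all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_near_duplicates(snippets, threshold=0.8, k=3):
    """Keep a snippet only if it is not too similar to any snippet already kept."""
    kept = []
    kept_shingles = []
    for snippet in snippets:
        current = shingles(snippet, k)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(snippet)
            kept_shingles.append(current)
    return kept


# Hypothetical search result snippets for an ambiguous query.
results = [
    "Jaguar is a large cat species native to the Americas.",
    "Jaguar is a large cat species native to the Americas!",
    "Jaguar Cars is a British manufacturer of luxury vehicles.",
]
unique = filter_near_duplicates(results)
for snippet in unique:
    print(snippet)
```

In this sketch the first result of each near-duplicate group is kept and later ones are dropped, which preserves the search engine's original ranking order. The threshold trades precision against recall and would need tuning on real data.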
Supervisor : Dr. Fathy A. Eassa 
Thesis Type : Master Thesis 
Publishing Year : 1433 AH / 2012 AD
Co-Supervisor : Dr. Maysoon F. Abulkhair 
Added Date : Sunday, November 18, 2012 

Researchers

Researcher Name (Arabic) : بسمة صالح السلمي
Researcher Name (English) : Alsulami, Bassma Saleh
Researcher Type : Researcher
Dr Grade : Master

Files

File Name : 34322.pdf
Type : pdf
