Clustering of Personalized Documents from the Web by Personal Name Aliases
Sucheta Kokate1, D. G. Chougule2, Manjiri Kokate3

1Sucheta Kokate, Department of Computer Science & Engineering, BSCOER, Pune, India
2D. G. Chougule, Department of Computer Science & Engineering, TKIET, Warananagar, India
3Manjiri Kokate, Department of Computer Science & Engineering, JSPM, Pune, India
Manuscript received on June 03, 2013. | Revised Manuscript received on June 28, 2013. | Manuscript published on July 05, 2013. | PP: 242-243 | Volume-3 Issue-3, July 2013. | Retrieval Number: C1710073313/2013©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The web is a huge resource for people who use search engines to find documents related to a specific person. The traditional approach organizes search results into groups, one for each meaning of the query. These groups are usually constructed according to the topical similarity of the retrieved documents, but documents can be totally dissimilar in topic and still correspond to the same person. To overcome this problem, this paper implements a technique for finding all documents that carry information about a person within a short period of time. We propose clustering personalized documents from the web by personal name aliases. Given a personal name, the proposed method first extracts a set of candidate aliases and then clusters the documents by these aliases, with the aim of achieving high accuracy and reducing complexity.
Keywords: Web mining, information retrieval, web text analysis, searching, surfing.
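
The two-step idea in the abstract (obtain candidate aliases, then cluster documents by them) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how aliases are extracted or how clusters are formed, so the sketch assumes the candidate aliases are already available and simply groups documents by which surface form they mention. The function names `cluster_by_aliases` and `documents_for_person`, the sample documents, and the example name and alias are all hypothetical.

```python
import re
from collections import defaultdict


def cluster_by_aliases(documents, name, aliases):
    """Group document ids by the surface form (name or alias) they mention.

    documents: dict mapping a document id to its text.
    name: the queried personal name.
    aliases: candidate aliases, ASSUMED to be extracted beforehand.
    """
    clusters = defaultdict(list)
    for form in [name, *aliases]:
        # Whole-phrase, case-insensitive match for this surface form.
        pattern = re.compile(r"\b" + re.escape(form) + r"\b", re.IGNORECASE)
        for doc_id, text in documents.items():
            if pattern.search(text):
                clusters[form].append(doc_id)
    return dict(clusters)


def documents_for_person(clusters):
    """Union of all alias clusters: every document tied to the person."""
    return sorted({doc_id for ids in clusters.values() for doc_id in ids})


# Illustrative data only; not taken from the paper.
docs = {
    "d1": "Will Smith stars in a new film.",
    "d2": "The Fresh Prince released an album.",
    "d3": "Stock markets fell today.",
}
clusters = cluster_by_aliases(docs, "Will Smith", ["Fresh Prince"])
person_docs = documents_for_person(clusters)
```

Here `d1` and `d2` share no topical vocabulary, yet both end up associated with the same person because one mentions the name and the other an alias. A topical-similarity grouping would be likely to separate them, which is the limitation the abstract describes.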