ENCADEAr: ENCADEAmento automático de notícias (ENCADEAr: automatic news chaining)
DOI:
https://doi.org/10.5617/osla.1457
Abstract
This work defines and evaluates different techniques for automatically building temporal news sequences. The proposed approach consists of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. It is based on Natural Language Processing, Information Extraction, Named Entity Recognition, and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% in creating news chain sequences.
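As an illustration of step (i) only, the following is a minimal sketch of near-duplicate detection using word shingles and Jaccard similarity. The abstract does not state which technique the authors used, so the shingling approach, the function names, the shingle size `k`, and the `threshold` value are all assumptions made for this example.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased, punctuation dropped)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Texts shorter than k words yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def is_near_duplicate(doc_a, doc_b, threshold=0.8, k=3):
    """Flag two documents as near duplicates when shingle overlap reaches the threshold.

    The threshold of 0.8 is an illustrative assumption, not a value from the paper.
    """
    return jaccard(shingles(doc_a, k), shingles(doc_b, k)) >= threshold
```

Under these assumptions, two news items that differ only in punctuation or casing would be flagged as near duplicates, while items sharing no three-word sequence would score 0.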
Published
2015-03-31
Section
Articles