ENCADEAr: ENCADEAmento automático de notícias

Authors

  • Carla Abreu
  • Jorge Teixeira
  • Eugénio Oliveira

DOI:

https://doi.org/10.5617/osla.1457

Abstract

This work defines and evaluates different techniques for automatically building temporal news sequences. The proposed approach consists of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. It is based on Natural Language Processing, Information Extraction, Named Entity Recognition, and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% for the creation of news chain sequences.
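
The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the algorithms, so the sketch assumes Jaccard similarity over word sets for near-duplicate detection, plain term frequency for keyword extraction, and a keyword-overlap rule in place of the supervised learner for chaining. The names `News`, `drop_near_duplicates`, `keywords`, and `build_chains`, the stopword list, and all thresholds are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is", "after", "as"}
PUNCTUATION = ".,;:!?\"'()"


@dataclass
class News:
    id: str
    published: date
    text: str


def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    stripped = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in stripped if w]


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def drop_near_duplicates(items, threshold=0.8):
    """Step (i): keep the earliest item of each group of near-identical texts."""
    kept = []
    for item in sorted(items, key=lambda n: n.published):
        words = set(tokens(item.text))
        if all(jaccard(words, set(tokens(k.text))) < threshold for k in kept):
            kept.append(item)
    return kept


def keywords(item, top_k=5):
    """Step (ii): the top_k most frequent non-stopword tokens."""
    counts = Counter(w for w in tokens(item.text) if w not in STOPWORDS)
    return {w for w, _ in counts.most_common(top_k)}


def build_chains(items, min_shared=2):
    """Step (iii): append each item, in date order, to the first chain whose
    latest item shares at least min_shared keywords; otherwise start a chain."""
    chains = []
    for item in sorted(items, key=lambda n: n.published):
        kws = keywords(item)
        for chain in chains:
            if len(kws & keywords(chain[-1])) >= min_shared:
                chain.append(item)
                break
        else:
            chains.append([item])
    return chains
```

In this sketch a chain is simply a date-ordered list of items, and each new item is compared only against the latest item of every existing chain. A supervised classifier, as mentioned in the abstract, would replace the fixed `min_shared` rule with a learned decision.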

Published

2015-03-31