ENCADEAr: ENCADEAmento automático de notícias (Automatic News Chaining)

  • Carla Abreu
  • Jorge Teixeira
  • Eugénio Oliveira

Abstract

This work defines and evaluates different techniques for automatically building temporal news sequences. The proposed approach is composed of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. It is based on Natural Language Processing, Information Extraction, Named Entity Recognition, and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% in the creation of news chains.
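
The three steps can be illustrated with a minimal sketch. The abstract does not say which algorithms are used at each step, so everything below is an assumption made for illustration: near-duplicates are detected with Jaccard similarity over word shingles, keywords are simply the most frequent non-stopword terms, and an article joins the first chain with which it shares enough keywords. These plain heuristics stand in for the Named Entity Recognition and supervised learning components mentioned above; the function names, thresholds, and the article format (a dict with "id", "date", and "text") are hypothetical.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "and", "after", "for", "with", "from", "that", "this"}


def shingles(text, k=3):
    """Return the set of k-word shingles of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def drop_near_duplicates(articles, threshold=0.7):
    """Step (i): keep an article only if it is not too similar to a kept one."""
    kept = []
    for art in articles:
        s = shingles(art["text"])
        if all(jaccard(s, shingles(k["text"])) < threshold for k in kept):
            kept.append(art)
    return kept


def keywords(text, top_n=10):
    """Step (ii): the most frequent non-stopword terms longer than two letters."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return {w for w, _ in counts.most_common(top_n)}


def build_chains(articles, min_shared=2):
    """Step (iii): in date order, attach each article to the first chain
    sharing at least min_shared keywords, else start a new chain."""
    chains = []
    for art in sorted(articles, key=lambda a: a["date"]):
        kw = keywords(art["text"])
        for chain in chains:
            if len(kw & chain["keywords"]) >= min_shared:
                chain["items"].append(art)
                chain["keywords"] |= kw
                break
        else:
            chains.append({"keywords": set(kw), "items": [art]})
    return [c["items"] for c in chains]


def pipeline(articles):
    """Run the three steps and return chains as lists of article ids."""
    unique = drop_near_duplicates(articles)
    return [[a["id"] for a in chain] for chain in build_chains(unique)]
```

Dates are assumed to be ISO 8601 strings so that plain string sorting gives chronological order. A real system would likely replace the keyword-overlap rule with a learned classifier that decides whether two articles belong to the same chain.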
Published
2015-03-31