BEST sports: a Portuguese collection of documents for semantics-concerned text mining research.

Date
2018-04
Authors
Rezende, Solange Oliveira
Abstract

The availability of labeled text collections is a common need in the text mining research community. These collections are used both to train and to evaluate text mining models. In this technical report, we present the BEST sports collection, a set of documents written in Portuguese that was collected, prepared, and made available as a benchmarking collection for text mining research. Considering real application scenarios, we created four datasets, which correspond to problems of different semantic complexity levels. Using different datasets drawn from the same collection allows text mining methods to be evaluated at different levels of semantic complexity.

Keywords
Data and text mining, Machine learning