Text-representing Centroid Terms

After reading only a few lines, human readers can determine which category of text and which abstract topic a given document belongs to. This demonstrates how well and how quickly the human brain, especially the cortex, can process and interpret data. It understands not only the meaning of single words, as representations of real-world entities, but also the meaning of certain compositions of them. In addition, the brain acts as a knowledge base when topically classifying content it has not seen before. It tries to match the terms (i.e. words carrying meaning) in such documents with previously learnt terminology and can, in doing so, instantly and unconsciously classify them, at least coarsely.

Text-representing centroid terms are a new method and technology, inspired by physics and by processes in the brain, designed to support these tasks better than conventional approaches, most of which are based on bag-of-words or term frequency – inverse document frequency (TF-IDF) representations.


12 February 2018