Empty word


Empty words, known in English as stop words, are words that carry little meaning on their own, such as articles, pronouns, and prepositions, and that are filtered out before or after the processing of natural language data (text). Hans Peter Luhn, one of the pioneers of information retrieval, is credited with coining the English phrase "stop words" and with using the concept in his design. The list of words to filter is controlled by human input rather than generated automatically.
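
As an illustration of the filtering step described above, here is a minimal sketch in Python. The word list, the tokenizer, and the function names are assumptions made for this example; they are not taken from any particular tool, and real lists are curated by hand and vary from tool to tool.

```python
import re

# Hypothetical, hand-picked list for illustration only.
EMPTY_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or", "it", "is"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def remove_empty_words(text, empty_words=EMPTY_WORDS):
    """Return the tokens of `text` that are not in `empty_words`."""
    return [token for token in tokenize(text) if token not in empty_words]


print(remove_empty_words("The cat is on the mat"))
```

Here the filter runs before any further processing. The same function could just as well be applied afterwards, for example to the terms of an index that has already been built.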

There is no single definitive list of empty words used by all natural language processing (NLP) tools, and not all NLP tools use such a list. Some tools deliberately avoid one in order to support phrase searches. Applying a stemming algorithm can also reduce part of the rationale for, or the dependence on, a list of empty words to filter.

Empty words can cause problems when a search engine is used to find phrases that include them, especially names such as 'The Truth' or 'Never Never'.
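
The problem can be reproduced with a small Python sketch. The word list below is hypothetical (whether a word such as "never" is filtered depends on the tool). The fallback in `search_terms` is only one possible workaround, assumed here for illustration, and is not a technique attributed to any specific search engine.

```python
# Hypothetical list for illustration; real lists differ between tools.
EMPTY_WORDS = {"the", "a", "of", "never"}


def filter_query(query, empty_words=EMPTY_WORDS):
    """Drop every word of the query that appears in `empty_words`."""
    return [word for word in query.lower().split() if word not in empty_words]


def search_terms(query, empty_words=EMPTY_WORDS):
    """Filter the query, but keep the original words if nothing would remain."""
    words = query.lower().split()
    kept = [word for word in words if word not in empty_words]
    return kept if kept else words


for name in ("The Truth", "Never Never"):
    print(name, filter_query(name), search_terms(name))
```

With this list, 'The Truth' is reduced to a single generic term, and 'Never Never' is filtered away entirely unless the fallback keeps the original words.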
