What Is the Difference between Stemming & Lemmatization?
It is a common misconception that stemming and lemmatization are the same. They differ in how they reduce a word and in what they produce:
Lemmatization is the process of reducing an inflected word to its lemma, the base form under which it is listed in a dictionary. A lemmatizer relies on a vocabulary and morphological analysis, and usually on context such as the word's part of speech, to choose the correct lemma. It is often used in corpus linguistics and statistical analysis, where the aim is to group the inflected forms of a word under one canonical form.
In contrast, stemming removes letters, typically suffixes, from a word according to fixed rules in order to approximate its root form. A stemmer does not consult a dictionary or consider the word's part of speech or context, so its output is not always a valid word, whereas a lemmatizer returns an actual dictionary form.
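Stemming and lemmatization are both morphological operations that can be applied to words in a text, but they do not use the same rules. A stemmer applies the same suffix-stripping rules to every word, regardless of its part of speech. A lemmatizer typically uses the part of speech (noun, verb, adjective, or adverb) to decide which dictionary form to return, so the same surface form can have different lemmas: ‘saw’ as a noun stays ‘saw’, while ‘saw’ as a verb becomes ‘see’.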
For example, a stemmer would strip the suffix from the word ‘stemming’ and reduce it to the root form ‘stem’.
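The contrast can be sketched in a few lines of Python. This is only a toy illustration, not how any particular library works: the suffix list, the mini-dictionary, and the function names toy_stem and toy_lemmatize are all hypothetical and kept deliberately small. Real stemmers (such as the Porter stemmer) use many more rules, and real lemmatizers consult a full dictionary.

```python
# Toy illustration only. SUFFIXES, LEMMAS, toy_stem and toy_lemmatize are
# hypothetical names invented for this sketch, not part of any library.

# Deliberately tiny rule set: suffixes are tried in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix using rules only (no dictionary)."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip if at least three letters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undouble a trailing consonant: "stemm" -> "stem".
            if (
                len(stem) >= 2
                and stem[-1] == stem[-2]
                and stem[-1] not in "aeioulsz"
            ):
                stem = stem[:-1]
            return stem
    return word


# Hypothetical mini-dictionary keyed by (word, part of speech).
LEMMAS = {
    ("stemming", "verb"): "stem",
    ("studies", "noun"): "study",
    ("studies", "verb"): "study",
    ("better", "adj"): "good",
    ("saw", "verb"): "see",
    ("saw", "noun"): "saw",
}


def toy_lemmatize(word: str, pos: str) -> str:
    """Look up the dictionary form; fall back to the word itself."""
    word = word.lower()
    return LEMMAS.get((word, pos), word)


if __name__ == "__main__":
    print(toy_stem("stemming"))            # stem
    print(toy_stem("studies"))             # studi (not a real word)
    print(toy_lemmatize("studies", "noun"))  # study
    print(toy_lemmatize("better", "adj"))  # good
    print(toy_lemmatize("saw", "verb"))    # see
    print(toy_lemmatize("saw", "noun"))    # saw
```

In this sketch the stemmer turns ‘studies’ into ‘studi’, which is not a dictionary word, while the lemmatizer returns ‘study’. The lemmatizer also needs the part of speech to tell the two readings of ‘saw’ apart, which the rule-based stemmer cannot do.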