MORPHOLOGICAL ANALYSIS


INTRODUCTION



Natural Language Processing (NLP) is one of the most rapidly growing areas of research. Morphological analysis and morphological generation are relevant to most NLP applications. Because morphological analysis recognises a word and its internal structure, its output can be used by later processing stages. With this in mind, this study explains why morphological analysis and generation are critical components of several NLP domains, such as spell checking and machine translation.

Morphology is the field of linguistics that studies the structure of words. Morphological analysis identifies how a word is built up from morphemes.

A morpheme is a basic unit of language: the smallest element of a word that has grammatical function or meaning. There are two types of morphemes, free and bound. A free morpheme can stand alone as a complete word, for instance bus or bicycle.

A bound morpheme, on the other hand, cannot stand alone and must be joined to a free morpheme to produce a word; -ing and un- are examples.

Inflectional morphology and derivational morphology are the two types of morphology. Each has its own significance in various areas of Natural Language Processing.

What is a morphological analyzer?

In inflected languages, words are formed through morphological processes such as affixation. For example, by adding the suffix ‘-s’ to the verb ‘to dance’, we form the third person singular ‘dances’.

A morphological analyzer assigns the attributes of a given word by evaluating what morphological processes the form has undergone. If you give it the word ‘bailaré’ in Spanish, it will tell you it is the first person, singular, simple future, indicative form of the verb ‘bailar’.
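The behaviour described above can be sketched as a toy analyzer. This is only an illustration, not how a production analyzer works: the two suffix rules, the feature names and the three-word verb list are all assumptions made for the example.

```python
# Toy morphological analyzer. VERBS, RULES and the feature names are
# hypothetical and cover only the two examples discussed in the text.
VERBS = {"dance", "walk", "bailar"}

# Each rule: (suffix found on the inflected form, features it signals).
RULES = [
    ("s", {"person": 3, "number": "singular", "tense": "present"}),
    ("é", {"person": 1, "number": "singular", "tense": "future", "mood": "indicative"}),
]


def analyze(word):
    """Return (lemma, features) for every rule whose stripped form is a known verb."""
    analyses = []
    for suffix, features in RULES:
        if word.endswith(suffix):
            lemma = word[: len(word) - len(suffix)]
            if lemma in VERBS:
                analyses.append((lemma, features))
    return analyses


print(analyze("dances"))
print(analyze("bailaré"))
```

A word that matches no rule, or whose stripped form is not in the verb list, simply gets an empty list of analyses.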


Morphological Parsing

Morphological parsing is the process of determining the morphemes from which a given word is constructed. Morphemes are the smallest meaningful units of a word and cannot be divided further. A morpheme can be a stem or an affix. The stem is the root of the word, whereas an affix can be a prefix, suffix or infix. For example:

unsuccessful -->    un            success       ful
                    (prefix)      (stem)        (suffix)

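The order of morphemes also matters to a morphological parser. To design a morphological parser we require three things: a lexicon (the list of stems and affixes), morphotactics (the rules for how morphemes may be ordered) and orthographic rules (the spelling changes that occur when morphemes combine).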
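The three ingredients can be illustrated with a small generate-and-test parser. Everything in it is assumed for the sake of the example: the miniature lexicon, the fixed prefix + stem + suffix ordering standing in for morphotactics, and a single simplified spelling rule (stem-final y becomes i before a suffix).

```python
# Hypothetical miniature lexicon; a real parser would use a far larger one.
LEXICON = {
    "prefix": {"un", "re"},
    "stem": {"success", "happy", "do"},
    "suffix": {"ful", "ness"},
}


def spell(stem, suffix):
    """Orthographic rule (simplified assumption): final 'y' -> 'i' before a suffix."""
    if suffix and stem.endswith("y"):
        return stem[:-1] + "i" + suffix
    return stem + suffix


def parse(word):
    """Return [(morpheme, label), ...] or None if the word cannot be parsed.

    Morphotactics are encoded by the fixed order tried here:
    optional prefix, exactly one stem, optional suffix.
    """
    for prefix in [""] + sorted(LEXICON["prefix"]):
        for stem in sorted(LEXICON["stem"]):
            for suffix in [""] + sorted(LEXICON["suffix"]):
                if prefix + spell(stem, suffix) == word:
                    parts = [(prefix, "prefix"), (stem, "stem"), (suffix, "suffix")]
                    return [(m, label) for m, label in parts if m]
    return None


print(parse("unsuccessful"))
print(parse("unhappiness"))
```

Trying every combination is wasteful for a large lexicon, but it keeps the roles of the lexicon, the ordering constraint and the spelling rule clearly separated.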


Types of Morphology: 

Inflectional Morphology: The modification of a word to express different grammatical categories. Inflectional morphology is the study of processes, including affixation and vowel change, that distinguish word forms in certain grammatical categories. It consists of at least five categories, as described in Language Typology and Syntactic Description: Grammatical Categories and the Lexicon. Derivational morphology cannot be so easily categorized, because derivation is not as predictable as inflection. Examples: cats (affixation), men (vowel change).

Derivational Morphology: Morphology that creates new lexemes, either by changing the syntactic category (part of speech) of a base, by adding substantial, nongrammatical meaning, or both. On the one hand, derivation may be distinguished from inflectional morphology, which typically does not change category but rather modifies lexemes to fit into various syntactic contexts; inflection typically expresses distinctions like number, case, tense, aspect and person, among others. On the other hand, derivation may be distinguished from compounding, which also creates new lexemes, but by combining two or more bases rather than by affixation, reduplication, subtraction, or internal modification of various sorts. Although the distinctions are generally useful, in practice applying them is not always easy.

APPROACHES TO MORPHOLOGY:


 • Morpheme Based Morphology: In this approach, words are analyzed as arrangements of morphemes (an item-and-arrangement approach).

 • Lexeme Based Morphology: Lexeme-based morphology usually takes what is called an “item-and-process” approach. Instead of analyzing a word form as a set of morphemes arranged in sequence, a word form is said to be the result of applying rules that alter a word form or stem in order to produce a new one.

 • Word Based Morphology: Word-based morphology is usually a word-and-paradigm approach. The theory takes paradigms as a central notion. Instead of stating rules to combine morphemes into word forms, or to generate word forms from stems, word-based morphology states generalizations that hold between the forms of inflectional paradigms.

Morphological Analysis using Paradigms: Most NLP systems use simple linguistic theories for morphological analysis. In the paradigm approach, words are related to each other by analogical rules and can be categorized based on the pattern they fit into. These patterns apply both to existing words and to new ones. Applying a pattern different from the one that has been used gives rise to a new word form; for example, older replacing elder.

Procedure and Algorithm: A language expert provides tables of word forms covering the words of the language. Each table lists the forms of one root, and the roots that follow the pattern implicit in the table use it to generate their own word forms.



Algorithm: Forming a paradigm table

Purpose: To form a paradigm table from the word forms table for a root

Input: Root r, word forms table WFT (with labels for rows and columns)

Output: Paradigm table PT, in which each entry is a pair (number of characters to delete from the end of r, suffix to add)

Algorithm:

1. Create an empty table PT of the same dimensionality, size and labels as the word forms table WFT.

2. For every entry w in WFT, do:
      if w = r then
         store (0, 0) in the corresponding position in PT
      else
         let i be the position of the first character at which w and r differ
         store (size(r) - i + 1, suffix(i, w)) in the corresponding position in PT

3. Return PT
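The steps above can be sketched in Python. This is a minimal illustration under a few assumptions: the word forms table is represented as a dictionary from label to word form, positions are 0-based (so the number of deleted characters is len(r) - i rather than size(r) - i + 1), and the entry for w = r comes out as (0, "") rather than (0, 0). The function names and the generate helper are not from the original algorithm.

```python
def common_prefix_length(a, b):
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_paradigm_table(root, word_forms):
    """Map each label to (characters to delete from the root's end, suffix to add)."""
    paradigm = {}
    for label, form in word_forms.items():
        i = common_prefix_length(root, form)
        paradigm[label] = (len(root) - i, form[i:])
    return paradigm


def generate(root, paradigm):
    """Apply a paradigm table to another root that follows the same pattern."""
    forms = {}
    for label, (n_delete, suffix) in paradigm.items():
        forms[label] = root[: len(root) - n_delete] + suffix
    return forms


table = build_paradigm_table("lady", {"singular": "lady", "plural": "ladies"})
print(table)
print(generate("baby", table))
```

For the root lady, the plural entry is (1, "ies"): delete one character and add ies. Applying the same table to baby yields baby and babies, which is how one table can serve every root that follows its pattern.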


APPLICATIONS OF MORPHOLOGICAL ANALYSIS:

Text to Speech Synthesis 

People today depend on technology such as computers and mobile phones for their daily needs. However, people who are unfamiliar with technology and people with disabilities face considerable difficulty in using these devices, which creates a need for text-to-speech synthesis. Morphological analysis can be used to reduce the size of the lexicon and also plays an important role in determining the pronunciation of a homograph. It also helps in deciding where schwa deletion applies. In the case of compound words, the analyser can split a compound into its basic forms, and these roots can later be recombined by a morphological generator to produce the result.

 

Machine Translation

Machine translation helps people from different language communities who want to work with data written in other languages; for them, machine translation is one of the prominent solutions. Machine translation systems have been developed for some languages, while for others the work is still in progress.

Without morphological analysis, we would need to store every word form, which increases the size of the database and the time needed to search it. A further benefit of the analyser is that it provides information about the word, such as number and gender. This information can be used in the target language to generate the correct form of the word.

 



Spell Checker

A spell checker is an application that identifies whether a word has been spelled correctly. Its functionality can be divided into two parts: error detection and error correction. The detection phase only detects an error, while the correction phase also offers suggestions for correcting the error that was detected. One more advantage of a morphology-based spell checker is that it can handle the named entity problem: any word that is not included in the lexicon can be added easily.
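The detection phase of a morphology-based spell checker might look like the sketch below. It assumes a small root lexicon and three inflectional suffixes, both invented for the example, and it covers only detection, not correction. A word is accepted if it is a known root or a known root plus one of the suffixes, so inflected forms do not need to be stored.

```python
# Hypothetical root lexicon and suffix list, chosen only for illustration.
LEXICON = {"walk", "house", "play"}
SUFFIXES = ["ing", "ed", "s"]


def is_known(word):
    """True if the word is a root or a root followed by one known suffix."""
    if word in LEXICON:
        return True
    return any(
        word.endswith(suffix) and word[: -len(suffix)] in LEXICON
        for suffix in SUFFIXES
    )


def detect_errors(text):
    """Return the lower-cased words in the text that are not recognised."""
    return [word for word in text.lower().split() if not is_known(word)]


def add_to_lexicon(word):
    """Add an unknown word, such as a named entity, to the lexicon."""
    LEXICON.add(word.lower())


print(detect_errors("walked houses plaing"))
```

Here walked and houses are accepted through their roots, while plaing is reported. A name that is flagged at first stops being flagged once it has been passed to add_to_lexicon.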

 

Search Engine

A search engine is an application program used to search for documents on the World Wide Web. The user provides input to the engine, which returns results based on that input. Morphological analysis and generation improve the results of a search engine.

If a word is provided as input but is not present in the lexicon, the output will be affected. In that case, morphological analysis of the word is performed.

Morphology and Syntax

In syntax, the different semantic elements are expressed as separate and independent words. During the 1970s, morphology and phonology were defined as autonomous components, while other work on syntax showed that syntactic systems could handle morphology. Researchers using a syntax in which syntactic and morphological structures can be derived from semantic representations must show the word formation without using ...

There are two types of morphemes, namely lexical morphemes and grammatical morphemes. English words are generally composed of a stem and an optional set of affixes.

Syntactic processes.

In machine translation between two languages, the database of words plays a significant role. ... Another important benefit is that morphological analysis provides information about the word, such as number

 

Morphological facts

We use morphology in day-to-day life, for example when forming plurals in English by adding the suffix -s to a word, and all of this depends on token processing.

English:         I eat one melon a day.

Indonesian:      I eat two fruit melon every day (word-for-word gloss)

Indonesian does not use morphological plurals: melon keeps the same form after "two".

Dependency trees


                                          
[Figure: dependency tree for a single sentence from John F. Kennedy's inaugural speech]

 

Morphological Analysis

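There are two approaches to morphology: analytic and synthetic. The analytic approach breaks words down into their parts, while the synthetic approach builds words up from those parts.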

Analytic principles

Point 1

Forms with the same meaning and the same sound shape in all their occurrences are instances of the same morpheme.

Point 2

Forms with the same meaning but different sound shapes may be instances of the same morpheme if their distributions do not overlap.

Point 3

Not all morphemes are segmental.

Ex:

run      ran
speak    spoke
eat      ate

In these pairs the verb and the past tense cannot be segmented. Rather, the main point is to look at both the present and past tense forms of these verbs, because it is the contrast between them that is important.

 

How to break words into morphemes

Example from Aztec, spoken in Mexico:

ikalwewe     ‘his big house’
ikalsosol    ‘his old house’
ikalmeh      ‘his houses’
ikalci·n     ‘his little house’

So here ikal- means ‘his house’ and -wewe means ‘big’.
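The comparison used in the Aztec example can be mimicked mechanically: find the material that all forms share and set aside what is left over. The sketch below does this with the longest common prefix, which is an assumption that only suits data like this, where the shared morpheme comes first. The function name is hypothetical.

```python
import os

# The four forms and glosses from the example above.
forms = {
    "ikalwewe": "his big house",
    "ikalsosol": "his old house",
    "ikalmeh": "his houses",
    "ikalci·n": "his little house",
}


def split_on_shared_prefix(words):
    """Return the longest prefix shared by all words and each word's remainder."""
    shared = os.path.commonprefix(list(words))
    return shared, {word: word[len(shared):] for word in words}


shared, residues = split_on_shared_prefix(forms)
print(shared)
for word, rest in residues.items():
    print(rest, "<-", forms[word])
```

The shared prefix is ikal, matching the part of every gloss that stays constant (‘his house’). Each remainder can then be compared with the part of its gloss that varies, as was done for -wewe and ‘big’.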

 

Summary

We have presented some of the basic beliefs that underlie this discussion: in our view, no language is perfectly regular, except Sanskrit, because words do not always have proper divisions. With the help of examples, we have also shown how morphological analysis helps natural language processing and machine learning systems to distinguish or translate a word using its existing vocabulary. Words combine to form phrases and sentences, which reflects their syntactic properties; words establish relationships with each other to form paradigms; and in English, prefixes are derivational.
