Unsupervised Morphological Segmentation Output
This page distributes unsupervised morphological segmentation output for Bengali.
The output was generated by the system introduced in the following paper:
High-Performance, Language-Independent Morphological Segmentation.
Sajib Dasgupta and Vincent Ng.
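In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), Rochester, New York, 2007.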
Here are the files:
Segmented Output: segmentations for 143.8K Bengali words.
Mapping: the transliteration scheme we used to map Bengali characters to English (Latin) characters.