A free, standalone, and open-source Khmer grapheme-to-phoneme (G2P) converter.
pip install khmerphonemizer
from khmerphonemizer import phonemize
text = "នៅលើលោកនេះមិនមានមនុស្សណាម្នាក់ចេះអស់ទេ"
result = phonemize(text)
print(result)
Output
(['នៅលើ',
'លោក',
'នេះ',
'មិន',
'មាន',
'មនុស្ស',
'ណា',
'ម្នាក់',
'ចេះ',
'អស់',
'ទេ'],
[['n', 'ɨ', 'w', 'l', 'əː'],
['l', 'oː', 'k'],
['n', 'i', 'h'],
['m', 'ɨ', 'n'],
['m', 'i', 'ə', 'n'],
['m', 'ɔ', 'n', 'u', 'h'],
['n', 'aː'],
['m', 'n', 'ĕ', 'ə', 'ʔ'],
['c', 'e', 'h'],
['ʔ', 'ɑ', 'h'],
['t', 'eː']])
See the examples/ directory for more examples.
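Since `phonemize` returns a tuple of tokens and per-token phoneme lists, a common next step is to pair each token with its transcription. The following is a minimal sketch of that step, not part of the library: `join_phonemes` is a hypothetical helper, and `sample` is a hand-copied excerpt of the output shown above so that the block runs without the package installed.

```python
from typing import List, Tuple

def join_phonemes(
    result: Tuple[List[str], List[List[str]]], sep: str = " "
) -> List[Tuple[str, str]]:
    """Pair each token with its phonemes joined into one string.

    Hypothetical helper: it only assumes the (tokens, phonemes) tuple
    shape shown in the example output.
    """
    tokens, phonemes = result
    if len(tokens) != len(phonemes):
        raise ValueError("tokens and phonemes must have the same length")
    return [(token, sep.join(phones)) for token, phones in zip(tokens, phonemes)]

# Excerpt of the example output above, copied by hand.
sample = (
    ["លោក", "នេះ", "ទេ"],
    [["l", "oː", "k"], ["n", "i", "h"], ["t", "eː"]],
)

pairs = join_phonemes(sample)
for token, transcription in pairs:
    print(f"{token}\t{transcription}")
```

With the package installed, `join_phonemes(phonemize(text))` should work the same way, assuming the return shape matches the output above.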
- phonemize: Tokenizes the input text into words, phonemizes each word, and returns a tuple of tokens and phonemes.
  - input_str: str. Text with multiple words.
  - beam: int = 500. Beam size for the beam search.
  - min_beam: int = 100. Minimum beam size for the beam search.
  - beam_score: float = 0.6. Beam search score.
  - use_lexicon: bool = True. Use the lexicon dictionary for known words.
- phonemize_single: Phonemizes a single word.
  - word: str. Text containing a single Khmer or English word only.
  - beam: int = 500. Beam size for the beam search.
  - min_beam: int = 100. Minimum beam size for the beam search.
  - beam_score: float = 0.6. Beam search score.
  - use_lexicon: bool = True. Use the lexicon dictionary for known words.
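The keyword parameters above can be bundled once and reused across many single-word calls. The sketch below illustrates that pattern under stated assumptions: `phonemize_words`, `DEFAULT_OPTIONS`, and `stub_g2p` are hypothetical names, the defaults are copied from the parameter list above, and the stub stands in for `phonemize_single` (whose return type is not documented here) so the block runs without the package.

```python
from typing import Callable, Dict, Iterable

# Defaults copied from the parameter list above.
DEFAULT_OPTIONS = {
    "beam": 500,
    "min_beam": 100,
    "beam_score": 0.6,
    "use_lexicon": True,
}

def phonemize_words(
    words: Iterable[str], g2p: Callable[..., object], **overrides
) -> Dict[str, object]:
    """Call a single-word phonemizer once per distinct word.

    Hypothetical helper: `g2p` is any callable that accepts a word plus
    the keyword options listed above.
    """
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        raise TypeError(f"unknown options: {sorted(unknown)}")
    options = {**DEFAULT_OPTIONS, **overrides}
    results: Dict[str, object] = {}
    for word in words:
        if word not in results:  # skip repeated words
            results[word] = g2p(word, **options)
    return results

def stub_g2p(word, beam, min_beam, beam_score, use_lexicon):
    """Stand-in for phonemize_single; it just splits the word into characters."""
    return list(word)

results = phonemize_words(["abc", "abc", "de"], stub_g2p, use_lexicon=False)
print(results)
```

Passing `phonemize_single` in place of `stub_g2p` should behave the same way, assuming it accepts these options as keyword arguments.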
License: MIT
Without these awesome projects from awesome people, this wouldn't be possible.
- Khmer Word Search: Challenges, Solutions, and Semantic-Aware Search (Rina Buoy, Nguonly Taing, and Sovisal Chenda)
- CUNY-CL/wikipron (Kyle Gorman, Jackson Lee, and contributors, 2019)
- rhasspy/gruut (Michael Hansen et al., 2020)
- OpenFst (Kyle Gorman et al.)
- AdolfVonKleist/Phonetisaurus (Josef Novak et al., 2017)