AI Research Project

Saw AI Engine

High-performance language model training and subword embedding research for the Myanmar language. Building the foundation for next-generation Burmese NLP.

Subword Tokenization

Custom byte-pair encoding (BPE) tokenizer, trained on 50M+ tokens of Myanmar text, that splits words into reusable subword units.
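For readers new to BPE, the idea can be sketched in a few lines of Python. This is a toy illustration of the general algorithm, not the project's tokenizer: the function names (`train_bpe`, `merge_pair`, `encode`) and the sample words are hypothetical, and it starts from Unicode code points, whereas a production Myanmar tokenizer would likely add normalization and syllable or byte-level pre-segmentation.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy sketch)."""
    # Each word starts as a tuple of single code points; counts act as frequencies.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself
        # so the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the new merge to every word in the working vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Segment `word` by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy usage with Latin-script stand-in data; Myanmar strings work the same way
# at the code-point level.
merges = train_bpe(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)
print(encode("lower", merges))
```

Each merge adds one entry to the vocabulary, so the number of merges is what controls the final vocabulary size.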

Large Corpus

Curated Myanmar news, literature, and social media text for broad linguistic coverage.

Efficient Architecture

Optimized transformer architecture designed for low-resource language settings.

Open Research

Sharing our findings and methodologies with the Myanmar AI research community.

Multilingual Support

Cross-lingual capabilities bridging Myanmar with English and other Southeast Asian languages.

NLP Benchmarks

Evaluated on sentiment analysis, named entity recognition, and text classification tasks.

50M+ Training Tokens
32K Vocabulary Size
99.2% Tokenization Coverage
2026 Research Year

Interested in collaboration?

We're open to research partnerships and contributions to Myanmar NLP.

Get in Touch