Vowels and diphthongs

English has some words (“diet”, “poem”, “dial”, etc.) in which two vowels appearing together are pronounced in separate syllables. Vietnamese has no equivalent: when two vowels appear together, the result is a single-syllable diphthong. This may be an evenly stressed diphthong (e.g. “nói” (to speak) is pronounced like the start of English “noise”), or it may sound almost exactly like one of the vowels (usually the first) on its own. In the latter case, the diphthong will generally be pronounced longer than the solitary vowel (e.g. “tiêm” (to inject) is pronounced like “tim” (heart), only slower).

Be aware that this is only a guide. These pronunciations are approximately correct and should be comprehensible to a native speaker of Vietnamese, but native pronunciation is often far more complicated and varies widely with factors such as geographical region. Take these pronunciations as your starting point, and adapt to however your Vietnamese friends and teachers speak.


Varies somewhat depending on context, between 'a' as in “ban” and 'u' as in “fun”. Usually 'a' as in “ban”.

ai, ay, ăy, ây

'ai' as in “main”.

ao, au, âu

'a' as in “cart” run into 'oo' as in “foot”. This combination doesn't seem to occur in most English dialects. Any other combination of 'a', 'ă' or 'â' with 'o' or 'u' should be pronounced the same way, if examples exist.


'u' as in “mutt”.


'a' as in “far”. This is technically the same shape as 'ă', but generally longer.


'e' as in “men”.


'ai' as in “hair”, 'e' as in “den” – varies only in length, except for some diphthongs.

i, y

'i' as in “bin”, usually. Sometimes pronounced like 'u' as in “fur” (but shorter); examples include “mình” (me) and “thích” (to like). When either letter appears alone after a consonant and has a lengthening tone on it, e.g. “Mỹ” (USA) or “tỉ” (billion), the resulting syllable is pronounced as if there were an 'a' between the consonant and the vowel.

o, ô

'o' as in “fog”.

oi, ôi

'oi' as in “boil”.


Somewhere between 'a' as in “far” and 'u' as in “fur”. Varies slightly within that range.


'oo' as in “foot”.


'u' as in “fur”, usually. Note that in some words lengthened by tones, e.g. “ngữ” (language), this letter becomes a diphthong – 'u' as in “fur” merging into a vowel somewhere between 'oo' as in “foot” and 'o' as in “moss”. The precise details of the second half aren't important, but it is important to pronounce a diphthong.


'u' as in “fur”.

