Last year, Azure Cognitive Services announced the release of Azure Cognitive Service for Language. With that release, multiple NLP services came together as a unified, state-of-the-art NLP service available under one resource and one user experience. Three new custom features were introduced at the time: custom text classification, custom named entity recognition, and conversational language understanding.
Today, the Azure Cognitive Service for Language team announces support for 96 languages across these custom features! The expanded language support makes it easier for customers of the Language Service to reach global markets.
The full list of supported languages is:
| Afrikaans | Dutch | Italian | Norwegian (Bokmal) | Swahili |
| Albanian | English | Japanese | Oriya | Swedish |
| Amharic | English (UK) | Javanese | Pashto | Tamil |
| Arabic | English (US) | Kannada | Persian (Farsi) | Telugu |
| Armenian | Esperanto | Kazakh | Polish | Thai |
| Assamese | Estonian | Khmer | Portuguese (Brazil) | Turkish |
| Azerbaijani | Filipino | Korean | Portuguese (Portugal) | Ukrainian |
| Basque | Finnish | Kurdish (Kurmanji) | Punjabi | Urdu |
| Belarusian | French | Kyrgyz | Romanian | Uyghur |
| Bengali | Galician | Lao | Russian | Uzbek |
| Bosnian | Georgian | Latin | Sanskrit | Vietnamese |
| Breton | German | Latvian | Scottish Gaelic | Welsh |
| Bulgarian | Greek | Lithuanian | Serbian | Western Frisian |
| Burmese | Gujarati | Macedonian | Sindhi | Xhosa |
| Catalan | Hausa | Malagasy | Sinhala | Yiddish |
| Chinese (Simplified) | Hebrew | Malay | Slovak | Zulu |
| Chinese (Traditional) | Hindi | Malayalam | Slovenian | |
| Croatian | Hungarian | Marathi | Somali | |
| Czech | Indonesian | Mongolian | Spanish | |
| Danish | Irish | Nepali | Sundanese | |
The most exciting feature of these services announced last year was their ability to train multilingual models. This means you can build projects and tag data in one language, then deploy and query them in all the other supported languages. Train in English, predict in French, German, Spanish, Japanese, and many others. This eases the burden on customers who previously relied on machine translation, or on costly data replication in other languages, to support different languages in their AI solutions. With this language expansion, enterprises can extend a single project to a worldwide audience within minutes.
The custom language services also allow you to add data in any language to a single project. Even with the built-in multilingual capabilities, there is always a path to add data for a specific language to further improve the quality of predictions in that language.
Let’s walk through a customer complaint classification scenario using conversational language understanding. After signing in to Language Studio with a Language resource and navigating to conversational language understanding, you can create a new project. During creation, you are asked whether you’d like to enable multiple languages in your project.
Once creation is complete, we can add three intents, one for each type of complaint customers may have:
- Refund Status
- Delivery Delay
- Broken Product
We can then add a few utterances associated with each intent, such as:
- “I haven’t gotten my refund and it’s been 4 days!”
- “My delivery was supposed to be here 3 days ago and it never showed up”
- “My new mugs that arrived yesterday were broken”
You’ll notice the Language column in the utterances allows you to select a different language for an utterance, in case you want to add complaints in any of the other supported languages. In practice, you should add more than just a few examples for each intent to get better quality models.
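To make the shape of this small project concrete, here is a minimal Python sketch that collects the three intents and the three example utterances into one dictionary, with a language tag on every utterance. The project name `CustomerComplaints`, the `en-us` language code, and the field names are assumptions made for illustration; they are loosely modeled on a project export and are not an authoritative schema.

```python
import json

# Intents from the walkthrough above.
INTENTS = ["Refund Status", "Delivery Delay", "Broken Product"]

# (text, language, intent) triples. The "en-us" code is an assumption.
UTTERANCES = [
    ("I haven't gotten my refund and it's been 4 days!", "en-us", "Refund Status"),
    ("My delivery was supposed to be here 3 days ago and it never showed up",
     "en-us", "Delivery Delay"),
    ("My new mugs that arrived yesterday were broken", "en-us", "Broken Product"),
]


def build_project(name, utterances, multilingual=True, default_language="en-us"):
    """Assemble a project description as a plain dict (hypothetical layout)."""
    known = set(INTENTS)
    items = []
    for text, language, intent in utterances:
        # Reject utterances labeled with an intent the project does not define.
        if intent not in known:
            raise ValueError(f"unknown intent: {intent}")
        items.append({"text": text, "language": language, "intent": intent})
    return {
        "metadata": {
            "projectName": name,
            "multilingual": multilingual,
            "language": default_language,
        },
        "assets": {
            "intents": [{"category": intent} for intent in INTENTS],
            "utterances": items,
        },
    }


# "CustomerComplaints" is a hypothetical project name.
project = build_project("CustomerComplaints", UTTERANCES)
print(json.dumps(project, indent=2, ensure_ascii=False))
```

Adding a complaint in another language would only mean appending one more triple with a different language code, which mirrors what the Language column does in the studio.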
After saving your changes, you can train a model. Click on Train model, then Start a training job, provide a model name such as “v1”, and press Train when you’re ready. Because this project has very few examples, you should disable evaluation.
Once training is completed, which may take a few minutes, you’re ready to go to Deploy model. Click on Add deployment, provide a deployment name such as “Test”, select the model you just trained, and then click Submit.
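The same train and deploy steps can also be scripted. The sketch below only builds the two requests and does not send them. The URL paths, the `api-version` value, and the body fields (`modelLabel`, `trainingMode`, `trainedModelLabel`) are written from memory of the authoring REST API and should be treated as assumptions to check against the official reference; the endpoint and project name are hypothetical. Evaluation settings are left out because this walkthrough does not cover how to configure them through the API.

```python
from urllib.parse import quote

API_VERSION = "2022-05-01"  # assumption: verify against the REST reference


def authoring_url(endpoint, project_name, suffix):
    """Build an authoring URL for a project (path layout is an assumption)."""
    base = endpoint.rstrip("/")
    return (
        f"{base}/language/authoring/analyze-conversations/projects/"
        f"{quote(project_name)}/{suffix}?api-version={API_VERSION}"
    )


def train_request(endpoint, project_name, model_label):
    """Describe the request that would start a training job."""
    return {
        "method": "POST",
        "url": authoring_url(endpoint, project_name, ":train"),
        "json": {"modelLabel": model_label, "trainingMode": "standard"},
    }


def deploy_request(endpoint, project_name, deployment_name, model_label):
    """Describe the request that would deploy a trained model."""
    return {
        "method": "PUT",
        "url": authoring_url(
            endpoint, project_name, f"deployments/{quote(deployment_name)}"
        ),
        "json": {"trainedModelLabel": model_label},
    }


# Hypothetical endpoint and project name; "v1" and "Test" come from the steps above.
ENDPOINT = "https://example.cognitiveservices.azure.com"
train = train_request(ENDPOINT, "CustomerComplaints", "v1")
deploy = deploy_request(ENDPOINT, "CustomerComplaints", "Test", "v1")
print(train["method"], train["url"])
print(deploy["method"], deploy["url"])
```

Each dictionary can be handed to an HTTP client of your choice together with the resource key in the request headers.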
Now comes the best part: testing this all out in Test model. We can try out a few test queries like:
- “Where is my delivery?” --> predicted as Delivery Delay
- “The money for my refund never showed up” --> predicted as Refund Status
- “My new phone arrived with a broken screen!” --> predicted as Broken Product
When we try those same queries in other languages, such as French, Spanish, and Chinese, we still get the right predictions! Even though we trained using only English utterances, the multilingual model handles those languages as well.
- “Où est ma livraison?” which is “Where is my delivery?” in French --> predicted as Delivery Delay
- “El dinero de mi reembolso nunca apareció” which is “The money for my refund never showed up” in Spanish --> predicted as Refund Status
- “我的新手机到货时屏幕坏了!” which is “My new phone arrived with a broken screen!” in Chinese --> predicted as Broken Product
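Outside of the studio, the same queries could be sent to the deployment programmatically. The sketch below builds one request payload per query and shows how the top intent might be read from a response. The payload and response shapes follow the conversation analysis request format as I recall it, so treat the field names as assumptions; the project name `CustomerComplaints` and the language codes are hypothetical, and the sample response is a hand-written stand-in rather than real service output.

```python
def build_query(text, language, project_name, deployment_name):
    """Build a conversation analysis payload (field names are assumptions)."""
    return {
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {
                "id": "1",
                "participantId": "1",
                "modality": "text",
                "language": language,
                "text": text,
            }
        },
        "parameters": {
            "projectName": project_name,
            "deploymentName": deployment_name,
        },
    }


def top_intent(response):
    """Read the top-scoring intent from a response (shape is an assumption)."""
    return response["result"]["prediction"]["topIntent"]


# (query, language code, intent reported in the walkthrough).
QUERIES = [
    ("Où est ma livraison?", "fr", "Delivery Delay"),
    ("El dinero de mi reembolso nunca apareció", "es", "Refund Status"),
    ("我的新手机到货时屏幕坏了!", "zh-hans", "Broken Product"),
]

# "Test" is the deployment name used above; the project name is hypothetical.
payloads = [
    build_query(text, language, "CustomerComplaints", "Test")
    for text, language, _ in QUERIES
]

# Hand-written stand-in for a service response, used only to exercise top_intent.
sample_response = {"result": {"prediction": {"topIntent": "Delivery Delay"}}}
print(top_intent(sample_response))
```

Keeping the expected intent next to each query makes it easy to turn the list into a small regression check once real responses are available.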
This is just a simple demonstration of how quickly you can make use of the multilingual capabilities provided by Azure Cognitive Service for Language. The same multilingual support applies to both custom text classification and custom named entity recognition, which are services more appropriate for classifying categories or extracting information from longer documents such as call transcriptions or legal contracts.
We’re excited to see your businesses benefit globally from these features.
Get started with the Language services today.
Posted at https://sl.advdat.com/3wnSW7d