Thursday, July 22, 2021

Summoning a dragon in Mixed Reality using TypeScript and your voice

dragon-screenshot.png

 

Dragons have fascinated the world for centuries. We find them in the folklore of many cultures, from Mesopotamian art and literature to nearly all Indo-European and Near Eastern traditions and, of course, Chinese mythology. Wouldn't it be cool if we were able to (safely) summon our own dragon in the comfort of our home? It turns out this is possible, and probably less complicated than you might think. In this article, we will discuss the high-level concepts and technologies needed, and I will share a demo for you to try in your browser (best on mobile) or your Mixed Reality setup (Augmented/Virtual Reality).

 

If, like me, you tend to be a little impatient, you can start the free "Create and deploy a voice activated WebXR app with Babylon.js and Azure Cognitive Services" Learn Module to learn how to create this application, or you can try the browser demo (click this link or scan the QR code below to open it on your phone) to get a feel for what you will be learning to make. (Hint: In the demo, after you summon your dragon, try telling the dragon to “go red” or “go blue” *wink wink*) If you are feeling adventurous, check out the source code of the demo.

 

JING1201_1-1626977179519.png      MicrosoftTeams-image.png    dragon-qr-code.png

 

XR Basics 

Let’s take a moment to talk about Extended Reality on the web. Mixed, Extended, or Cross Reality (often abbreviated XR) is an umbrella term for Virtual Reality (VR) and Augmented Reality (AR), used separately or together. The term Virtual Reality probably reminds you of headsets that allow gamers to be fully immersed in the world of games such as Beat Saber (which I am personally a huge fan of).

On the other hand, Augmented Reality is a technology you probably have seen or even used many times, especially from your mobile phone. You could have been reviewing a piece of furniture in your living room before ordering it online, playing the famous Pokémon Go mobile game, or even just applying TikTok/Snap/Instagram filters.  

 


 

Learn Module

The Microsoft Learn Module I wrote walks you through a concrete example: creating and deploying a voice-activated, web-based mixed reality app using a JavaScript library called Babylon.js and a speech-to-text cloud service provided by Azure. The theoretical use case explored in the Learn Module is showcasing the interactivity of a fictitious new amusement park attraction (which could potentially be real).

In short, the module will teach you how to code a web app that will listen for a vocal command and render a virtual, animated dragon that will float in the space in front of you. Finally, you will learn how to easily deploy such an app to preview it from the web (no app installation needed) from your mobile phone, headset, or even your desktop/laptop browser (though losing the cool Augmented Reality effect). 

 
Tech Concepts 

To summon a virtual dragon, we use four key technologies: speech-to-text, WebXR, Babylon.js, and Azure Blob Storage.

 

Speech-to-text: the process of transcribing spoken audio to text. 

In the Learn Module, we use a cloud service provided by Azure so that transcription is cross-platform, accurate, and performant. The demo linked above instead uses an experimental browser API, which shows that a cloud service is not strictly required, though such services are more accurate and work across all platforms. Transcription matters when building experiences without a mouse and keyboard: for an application to understand the user's spoken commands or answers, it must convert the audio signal into text that your code can easily process.
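Once the audio has been transcribed, handling a command is ordinary string processing. The snippet below is a minimal sketch of that last step, not the code from the demo or the Learn Module: the `DragonCommand` type, the `parseCommand` function, and the rule that any phrase containing the word "dragon" counts as a summon are all assumptions made for illustration. Only the "go red" and "go blue" phrases come from the demo hint above.

```typescript
// Hypothetical command model; the real demo may use different names and phrases.
type DragonCommand =
  | { kind: "summon" }
  | { kind: "color"; color: "red" | "blue" }
  | { kind: "unknown" };

// Turns a raw transcript (from any speech-to-text source) into a command.
function parseCommand(transcript: string): DragonCommand {
  // Normalize: lowercase, replace punctuation with spaces, collapse whitespace.
  const text = transcript
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();

  // "go red" / "go blue" are the phrases mentioned in the demo hint.
  const colorMatch = text.match(/\bgo (red|blue)\b/);
  if (colorMatch) {
    return { kind: "color", color: colorMatch[1] as "red" | "blue" };
  }

  // Assumption: any other phrase that mentions "dragon" is a summon request.
  if (/\bdragon\b/.test(text)) {
    return { kind: "summon" };
  }

  return { kind: "unknown" };
}
```

Keeping this step separate from the speech source means the same function can serve both a cloud transcription service and a browser API, since each one ultimately hands your code a string.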

 


 

WebXR: a low-level Extended Reality web API available in many modern browsers. As described in the MDN entry: "WebXR is an API for web content and apps to use to interface with mixed reality hardware such as VR headsets and glasses with integrated augmented reality features. This includes both managing the process of rendering the views needed to simulate the 3D experience and the ability to sense the movement of the headset (or other motion-sensing gear) and providing the needed data to update the imagery shown to the user." WebXR, however, is not a rendering technology, and most developers don't use WebXR directly. They instead rely on higher-level rendering frameworks built on WebGL or WebGPU. So while this is a key technology, it stays hidden behind the nicer APIs provided by the rendering framework.
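Even when a framework hides WebXR, it can be useful to know what the device supports before offering an AR or VR button. The following is a minimal sketch of such a check. `isSessionSupported` is a real method of the WebXR Device API (reached through `navigator.xr` in a browser); the `XRSystemLike` interface and the `pickXRMode` helper are hypothetical names introduced here so the logic can run and be tested outside a browser.

```typescript
// Local stand-in for the small part of the WebXR Device API used below.
// In a browser, `navigator.xr` provides this method when WebXR is available.
interface XRSystemLike {
  isSessionSupported(
    mode: "inline" | "immersive-vr" | "immersive-ar"
  ): Promise<boolean>;
}

type XRMode = "immersive-ar" | "immersive-vr" | "none";

// Hypothetical helper: prefer AR, fall back to VR, otherwise report "none".
async function pickXRMode(xr: XRSystemLike | undefined): Promise<XRMode> {
  if (!xr) {
    // The browser does not expose WebXR at all.
    return "none";
  }
  for (const mode of ["immersive-ar", "immersive-vr"] as const) {
    try {
      if (await xr.isSessionSupported(mode)) {
        return mode;
      }
    } catch {
      // A rejected check is treated the same as "not supported".
    }
  }
  return "none";
}
```

In a page, this would be called as `pickXRMode(navigator.xr)`, and a result of `"none"` is the cue to fall back to a regular 3D view on desktop or laptop browsers.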


 

Babylon.js: a free, open source, high level JavaScript rendering/game engine that can leverage WebXR to create great Extended Reality experiences on the web. You can create a 3D world for the browser using Babylon.js, then convert it into a Virtual Reality/Augmented Reality scene in a few lines of code. 
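To give a sense of what "a few lines of code" can look like, here is a minimal sketch built around Babylon.js's `createDefaultXRExperienceAsync` scene method with the `uiOptions.sessionMode` option. To keep the snippet self-contained, it does not import Babylon.js: `XRCapableScene` is a local stand-in interface and `enableAR` is a hypothetical helper name, so treat this as an illustration under those assumptions rather than the Learn Module's code.

```typescript
// Local stand-in describing only the scene method used here.
// A Babylon.js Scene is assumed to provide a compatible method.
interface XRCapableScene<TExperience> {
  createDefaultXRExperienceAsync(options: {
    uiOptions: { sessionMode: "immersive-ar" | "immersive-vr" };
  }): Promise<TExperience>;
}

// Hypothetical helper: ask the scene for a default XR experience in AR mode.
// The returned experience object is whatever the scene implementation provides.
async function enableAR<T>(scene: XRCapableScene<T>): Promise<T> {
  return scene.createDefaultXRExperienceAsync({
    uiOptions: { sessionMode: "immersive-ar" },
  });
}
```

In a real project the scene would come from the Babylon.js package, and the default experience adds the enter/exit XR button to the page, so the 3D world itself does not need to change.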


 

 

Azure Blob Storage: Microsoft's cloud solution that lets us easily deploy and serve our app and its assets. Azure Blob Storage is designed for storing massive amounts of unstructured data, such as text or binary data, which makes it great for storing and sharing images or videos for distributed access. Its Static Website feature also allows us to host a website in just two steps: (1) enabling the feature from the Azure portal and (2) uploading the files to the storage container.

 


 

As a developer, the coolest part is how Babylon.js offers a straightforward and efficient API that makes 3D and Extended Reality development feel close to traditional web development, while still offering advanced features. Furthermore, WebXR offers a great user experience because users don’t need to install anything to try a Virtual Reality/Augmented Reality app. Users simply open their browser and go to the site, and everything just works. This also means that native app developers can embed a web view and extend their app very quickly.

 

Conclusion 

Using this set of technologies, you can safely summon a pet dragon in your living room (or wherever you like). Whether it has been your dream to become a wizard, or the cool speech and extended reality technology just caught your attention, check out the Learn Module for creating this app to start your journey toward becoming a cool WebXR magician!

 
The dragon 3D model used in this project was created by RedCoreTimber, is licensed under the CC Attribution license, and can be found on Sketchfab.

Special thanks to Matt Aimonetti for his help in making this blog post and demo app come to life. 

Posted at https://sl.advdat.com/3BuExX1