Modern applications increasingly use artificial intelligence (AI). For example, you can build a healthcare application that lets doctors and other medical practitioners dictate a prescription; the application converts the spoken dictation into a text-based prescription that the patient can use to obtain the prescribed drugs. Building an AI-based application from scratch can be challenging because you need to train your own model on a huge amount of data. However, public cloud providers offer Platform-as-a-Service (PaaS)-based AI services that you can consume to build modern AI-based applications. The cloud provider takes care of the data and the model; you simply pay for what you use.
In this chapter, we will explore Azure Cognitive Services and learn how to use its Speech service to convert speech to text.
Structure
Introduction to Azure Cognitive Services
Provision the Speech service
Build a .NET-based desktop application to convert speech to text
Objectives
Understand the fundamentals of Azure Cognitive Services
Work with the Speech service
Introduction to Azure Cognitive Services
Azure Cognitive Services provides PaaS-based artificial intelligence capabilities for developing AI-based applications. You need neither to arrange a dataset nor to train a model; you simply consume these services for your AI use cases. Under the hood, Azure has done the heavy lifting of training the models and exposing them as services that you can consume without concern for the underlying infrastructure. All these services are exposed as REST APIs, which you can call directly or through SDKs available for popular languages and platforms such as .NET, Java, and Python.
The services are grouped into the following categories:
Vision
Speech
Language
Decision
Vision
Computer Vision service helps you process images and videos and extract insights from them.
Custom Vision helps you build your own custom image classifiers and deploy them on Azure. You can apply labels to images based on their specific characteristics.
Face service helps you perform face recognition.
Speech
Speech service helps you build intelligent applications that can convert speech audio to text and vice versa.
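Because the services are exposed as REST APIs, a speech-to-text call can be made with nothing more than an HTTP client. The following is a minimal sketch, not a definitive implementation, that posts a WAV file to the Speech service's REST endpoint for short audio. It assumes .NET 6 or later, a 16 kHz PCM WAV file, and the en-US language. The class name SpeechRestSketch and its method names are hypothetical, and the key and region are values you supply from your own Speech resource.

```csharp
#nullable enable
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class SpeechRestSketch
{
    // Builds the short-audio recognition URL for a region and language.
    public static Uri BuildEndpoint(string region, string language) =>
        new Uri($"https://{region}.stt.speech.microsoft.com/speech/recognition/" +
                $"conversation/cognitiveservices/v1?language={Uri.EscapeDataString(language)}");

    // Returns the recognized text from the JSON response body,
    // or null when RecognitionStatus is anything other than "Success".
    public static string? ExtractText(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        bool succeeded = root.TryGetProperty("RecognitionStatus", out JsonElement status)
                         && status.GetString() == "Success";
        return succeeded && root.TryGetProperty("DisplayText", out JsonElement text)
            ? text.GetString()
            : null;
    }

    // Sends the WAV file to the service and returns the recognized text.
    public static async Task<string?> RecognizeAsync(
        HttpClient http, string key, string region, string wavPath)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, BuildEndpoint(region, "en-US"));
        request.Headers.Add("Ocp-Apim-Subscription-Key", key);
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

        var content = new ByteArrayContent(await File.ReadAllBytesAsync(wavPath));
        // Added without validation because the codecs parameter contains a '/'.
        content.Headers.TryAddWithoutValidation(
            "Content-Type", "audio/wav; codecs=audio/pcm; samplerate=16000");
        request.Content = content;

        using HttpResponseMessage response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ExtractText(await response.Content.ReadAsStringAsync());
    }
}
```

A caller would invoke it as await SpeechRestSketch.RecognizeAsync(new HttpClient(), key, region, "sample.wav"). Keeping the URL construction and the JSON parsing in separate methods lets you check them without a network connection or a subscription key.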
Language
Language Understanding Intelligent Service (LUIS) helps you perform natural language processing so that your applications can understand human language.
Translator provides machine-based translation of text from one language to another.
Language Service helps you analyze text and derive insights such as sentiment and key phrases from it.
QnA Maker helps you build a question-and-answer database from your semi-structured data.
Decision
Anomaly Detector helps you detect anomalies in time-series data.
Content Moderator helps you build applications that can moderate content that may be offensive or risky.
Personalizer helps you capture users' preferences in real time so that you can understand user behavior.
Provision Speech Service
Build a .NET-Based Desktop Application to Convert Speech to Text
Form1.cs
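What follows is a minimal sketch of what such a Form1.cs might look like. It is an illustration, not the chapter's original listing. It assumes a Windows Forms project that references the Microsoft.CognitiveServices.Speech NuGet package, builds its controls in code rather than in the designer, and uses placeholder values for the key and region of your Speech resource. The namespace, control names, and method names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

namespace SpeechToTextSketch
{
    public class Form1 : Form
    {
        // Placeholders: replace with the key and region of your own Speech resource.
        private const string SpeechKey = "<your-speech-key>";
        private const string SpeechRegion = "<your-region>";

        private readonly Button convertButton =
            new Button { Text = "Convert WAV to text", Dock = DockStyle.Top };
        private readonly TextBox outputBox =
            new TextBox { Multiline = true, ReadOnly = true, Dock = DockStyle.Fill };

        public Form1()
        {
            Text = "Speech to Text";
            // Add the fill-docked control first so the button docks above it.
            Controls.Add(outputBox);
            Controls.Add(convertButton);
            convertButton.Click += OnConvertClicked;
        }

        // Lets the user pick a WAV file and shows the recognized text.
        private async void OnConvertClicked(object sender, EventArgs e)
        {
            using (var dialog = new OpenFileDialog { Filter = "WAV audio (*.wav)|*.wav" })
            {
                if (dialog.ShowDialog(this) != DialogResult.OK) return;
                convertButton.Enabled = false;
                try { outputBox.Text = await RecognizeFromWavAsync(dialog.FileName); }
                catch (Exception ex) { outputBox.Text = "Error: " + ex.Message; }
                finally { convertButton.Enabled = true; }
            }
        }

        // Runs single-shot recognition on the given WAV file.
        private static async Task<string> RecognizeFromWavAsync(string wavPath)
        {
            var config = SpeechConfig.FromSubscription(SpeechKey, SpeechRegion);
            using (var audio = AudioConfig.FromWavFileInput(wavPath))
            using (var recognizer = new SpeechRecognizer(config, audio))
            {
                // RecognizeOnceAsync returns after a single utterance.
                SpeechRecognitionResult result = await recognizer.RecognizeOnceAsync();
                switch (result.Reason)
                {
                    case ResultReason.RecognizedSpeech:
                        return result.Text;
                    case ResultReason.NoMatch:
                        return "No speech could be recognized.";
                    case ResultReason.Canceled:
                        var details = CancellationDetails.FromResult(result);
                        return "Canceled: " + details.Reason + " " + details.ErrorDetails;
                    default:
                        return "Unexpected result: " + result.Reason;
                }
            }
        }
    }

    internal static class Program
    {
        [STAThread]
        private static void Main()
        {
            Application.EnableVisualStyles();
            Application.Run(new Form1());
        }
    }
}
```

The sketch uses single-shot recognition, which suits a short dictation clip. For longer recordings, the SDK's continuous recognition mode would be the more appropriate choice. In a real application, the key should come from configuration rather than a constant in source code.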
Summary
In this chapter, we explored the basic concepts of Azure Cognitive Services. Then we provisioned a Speech service and invoked it from a .NET-based desktop application to convert the speech in a WAV audio file to text. In the next chapter, you will learn how to build a multilanguage text translator using Azure Cognitive Services and .NET.