June 11, 2019

Introduction to Deep Learning for Audio and Speech Processing

Are you an audio or speech processing engineer working on product development or DSP algorithms and looking to integrate AI capabilities into your projects? In this session you will learn the basics of deep learning for audio applications by walking through a detailed speech classification example written entirely in MATLAB code. We will cover creating and accessing labeled data, using time-frequency transformations, extracting features, designing and training deep neural network architectures, and testing prototypes on real-time audio. We will also discuss interoperability with other popular deep learning tools, including how to make use of available pre-trained networks.


  • Acquiring, segmenting and labeling audio recordings and ingesting existing datasets
  • Extracting standard speech and audio features and using 2D time-frequency representations
  • Designing and analyzing deep networks and exchanging models with other popular frameworks (e.g. via ONNX)
  • Accelerating computations using GPUs and prototyping trained models on real-world signals
