Build a low-powered Arm voice assistant with Google TensorFlow Lite

Uploaded On 2024-01-03



Presentation Transcript

1. Build a low-powered Arm voice assistant with Google TensorFlow Lite
   Karl Fezer, AI Evangelist, Arm
   @karlfezer

2. Agenda
   - Industry trends
   - How to do machine learning on Arm Cortex-M CPUs
   - How to use TensorFlow Lite for Microcontrollers
   - Hands-on workshop!

3. Trends

4. The next wave of computing
   [Timeline: 1990, 1993 onwards, today]

5. Flexible, Scalable ML Solutions: Arm enables ML everywhere
   [Figure: typical ML hardware choice (Mali GPUs, Arm NPUs, Cortex-M/A CPUs) shown by ML performance (ops/second) against ML capabilities: keyword detection, pattern training, advanced recognition, smart cameras, image enhancement, autonomous driving, data center]

6. AI and ML Terminology
   - Artificial Intelligence: umbrella term for machines acting as though they are ‘thinking’
   - Machine Learning: machines adapting algorithms based on experience (e.g. labelled pictures)
   - Deep Learning: machine learning using deep neural network approaches
   - Algorithms: CNNs, RNNs, etc.
   [Diagram, training: training data is fed to a neural network that is trained to recognise cats, producing a model]
   [Diagram, inference: new input is fed to the neural network with the model; example outputs "A Cat" and "Not a Cat" are shown with confidences of 97.4% and 96.4%]
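As a rough illustration of where an inference "confidence" figure can come from, the sketch below applies a softmax to a classifier's raw outputs. It assumes the network ends in a softmax layer (as the keyword spotting model on a later slide does); the class names and logit values are made up for illustration and are not taken from the presentation.

```python
import math

def softmax(logits):
    """Convert raw network outputs (logits) into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the largest logit keeps exp() numerically stable.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["cat", "not a cat"]  # hypothetical class names
logits = [3.6, 0.0]            # made-up raw outputs for one input image

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(f"{labels[best]}: {probs[best]:.1%} confidence")
```

The reported confidence is simply the largest softmax probability, so it reflects how the model ranks its own classes rather than a guarantee of correctness.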

7. ML Edge Use Cases
   - Voice (recognition and creation): keyword spotting, speech recognition, natural language processing, speech synthesis, etc.
   - Vibration (any ‘signal’): accelerometer, pressure, lidar/radar, speed, shock, vibration, pollution, density, viscosity, etc.
   - Vision (images and video): object detection, face unlock, defocus, beautification, etc.

8. Drivers for running ML on endpoint devices
   - Security and privacy
   - Power and cost
   - Reliability

9. Machine Learning on Cortex-M CPUs

10. TinyML
   Delivering on-device intelligence with “mW” of power
   - Body control unit application
   - Face/object detection for scene wake-up
   - Healthcare sensor module
   - Keyword spotting
   - Intelligent anomaly detection

11. Arm Cortex-M processor
   - Optimized for cost and power-efficient microcontrollers
   - Enhanced ML capabilities with Cortex-M4 (Armv7E-M), Cortex-M7 (Armv7E-M), Cortex-M33 (Armv8-M) and Cortex-M35P (Armv8-M), which support SIMD instructions
   - Arm Helium technology is a new M-profile Vector Extension (MVE) for the Armv8.1-M architecture. Helium will deliver up to 15x more ML performance and up to 5x more signal processing performance
   [Figure legend: * Processor with DSP extensions; future Cortex-M enabled by Arm Helium technology]

12. Training a Keyword Spotting Model
   Flow: Problem -> Trained NN model -> Deployable solution -> Optimized code on HW
   - Need compact models that fit within the Cortex-M system memory
   - Need models with fewer operations to achieve real-time performance
   - Model: depthwise separable convolution, fully-connected, softmax
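To make the "compact models, fewer operations" point concrete, the sketch below compares the weight count and multiply-accumulate (MAC) count of a standard convolution with a depthwise separable one. The layer dimensions are hypothetical, chosen only for illustration; they are not the dimensions of the model in the presentation, and biases are ignored.

```python
def standard_conv_cost(k, c_in, c_out, h_out, w_out):
    """Return (weights, MACs) for a standard k x k convolution, ignoring biases."""
    weights = k * k * c_in * c_out
    macs = weights * h_out * w_out  # every weight is used once per output position
    return weights, macs

def depthwise_separable_cost(k, c_in, c_out, h_out, w_out):
    """Return (weights, MACs) for a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    weights = depthwise + pointwise
    macs = weights * h_out * w_out
    return weights, macs

# Hypothetical layer: 3x3 kernel, 64 -> 64 channels, 25 x 5 output feature map.
shape = dict(k=3, c_in=64, c_out=64, h_out=25, w_out=5)
std = standard_conv_cost(**shape)
sep = depthwise_separable_cost(**shape)

print("standard  (weights, MACs):", std)
print("separable (weights, MACs):", sep)
print(f"reduction: {std[0] / sep[0]:.1f}x fewer weights and MACs")
```

For a 3 x 3 kernel the saving approaches 9x as the channel count grows, which is why this layer type suits memory- and compute-constrained microcontrollers.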

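13. Quantizing NN Solutions for Cortex-M processors
   Flow: Problem -> Trained NN model -> Deployable solution -> Optimized code on HW
   [Figure: normalized accuracy versus weight bit-width for VGG-16, GoogLeNet, AlexNet and SqueezeNet. Annotations: AlexNet > 7 bit, VGG-16 > 10 bit, GoogLeNet > 9 bit, SqueezeNet > 7 bit]

   Table: accuracy of 32-bit floating point models versus 8-bit quantized models

   | NN model   | FP32 train | FP32 val. | FP32 test | 8-bit train | 8-bit val. | 8-bit test |
   |------------|------------|-----------|-----------|-------------|------------|------------|
   | DNN        | 97.77%     | 88.04%    | 86.66%    | 97.99%      | 88.91%     | 87.60%     |
   | Basic LSTM | 98.38%     | 92.69%    | 93.41%    | 98.21%      | 92.53%     | 93.51%     |
   | GRU        | 99.23%     | 93.92%    | 94.68%    | 99.21%      | 93.66%     | 94.68%     |
   | CRNN       | 98.34%     | 93.99%    | 95.00%    | 98.43%      | 94.08%     | 95.03%     |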
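A minimal sketch of what 8-bit weight quantization involves is shown below. It assumes a symmetric, per-tensor scheme with a single scale factor, which is one common approach and not necessarily the scheme used to produce the accuracy figures above; the weight values are made up.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [q * scale for q in quantized]

weights = [0.82, -0.41, 0.05, -1.27, 0.33]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("int8 weights:", q)
print("scale:", scale)
print("largest rounding error:", max_error)
```

Each weight then occupies one byte instead of four, which is the main reason quantized models fit more easily into Cortex-M flash and SRAM.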

14. Arm Hardware for AI/ML
   [Diagram: Arm’s ML Computing Platform. Labels: Voice, Vision & Vibration Applications; ML Platform Direct Interface; Inference Engine; Arm NN; Android NN API; Custom API; Arm Compute Library; CMSIS-NN; CMSIS-Core; OpenCL; OpenVX; Hardware Abstraction Layer; CPU (Arm Cortex-A, Cortex-M, Neon, DynamIQ); GPU (Arm Mali); ML processors; Partner IPs]

15. Optimizing with CMSIS-NN
   Flow: Problem -> Trained NN model -> Deployable solution -> Optimized code on HW
   - CMSIS: Cortex Microcontroller Software Interface Standard
   - CMSIS-NN provides optimized low-level NN functions for Cortex-M CPUs
   - Functions implement popular NN layers: convolution, depthwise separable convolution, fully-connected, pooling, activation

   Table: throughput and energy efficiency improvements by layer type

   | Layer type  | Baseline runtime | New kernel runtime | Throughput improvement | Energy efficiency improvement |
   |-------------|------------------|--------------------|------------------------|-------------------------------|
   | Convolution | 443.4 ms         | 96.4 ms            | 4.6X                   | 4.9X                          |
   | Pooling     | 11.83 ms         | 2.2 ms             | 5.4X                   | 5.2X                          |
   | ReLU        | 1.06 ms          | 0.4 ms             | 2.6X                   | 2.6X                          |
   | Total       | 456.4 ms         | 99.1 ms            | 4.6X                   | 4.9X                          |
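The throughput improvements quoted above can be re-derived from the two runtime columns as baseline runtime divided by new kernel runtime. The short sketch below does that arithmetic using the runtimes from the slide; it also shows how much of the baseline time is spent in convolution, which is an observation drawn from the numbers rather than a claim made in the presentation.

```python
# (baseline runtime, new kernel runtime) in milliseconds, as quoted on the slide.
runtimes_ms = {
    "Convolution": (443.4, 96.4),
    "Pooling": (11.83, 2.2),
    "ReLU": (1.06, 0.4),
    "Total": (456.4, 99.1),
}

# Throughput improvement = baseline runtime / new kernel runtime.
speedups = {layer: base / new for layer, (base, new) in runtimes_ms.items()}
for layer, speedup in speedups.items():
    print(f"{layer:12s} {speedup:.2f}x")

# Fraction of the baseline total that is spent in convolution layers.
conv_share = runtimes_ms["Convolution"][0] / runtimes_ms["Total"][0]
print(f"Convolution share of baseline runtime: {conv_share:.1%}")
```

Because convolution dominates the baseline runtime, the overall speedup tracks the convolution speedup closely, even though pooling improves by a larger factor.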

16. OpenMV

17. TensorFlow Lite

18. TensorFlow Lite for Microcontrollers
   TensorFlow is Google’s open source machine learning framework
   TensorFlow Lite Micro is designed to run on embedded systems:
   - Less than 20 KB binary footprint
   - No dynamic memory allocation
   - No library dependencies (not even POSIX or standard C/C++)
   - Code is at github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/micro

19. TensorFlow Lite for Microcontrollers
   One of the key use cases is audio
   - Started with speech wakeword detection
   - Example is on GitHub
   - You will have it working in a few minutes!
   - Still a work-in-progress, but we’re improving through collaborations
   - https://aiyprojects.withgoogle.com/open_speech_recording
   - 100,000 utterances, but we need more

20. SparkFun Edge Development Board
   - 32-bit Arm Cortex-M4F processor with Direct Memory Access
   - 48 MHz CPU clock, 96 MHz with TurboSPOT™
   - Extremely low-power usage: 6 µA/MHz
   - 1 MB Flash
   - 384 KB SRAM
   - Dedicated Bluetooth processor with BLE 5
   - https://www.sparkfun.com/products/15170
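As a back-of-the-envelope reading of the 6 µA/MHz figure, the sketch below estimates the processor's active current at the two clock speeds listed for the board. It assumes current scales linearly with clock frequency and ignores everything else on the board (microphones, Bluetooth radio, LEDs, regulator losses). The 225 mAh battery capacity is a hypothetical value chosen for illustration, roughly that of a coin cell, and is not a figure from the presentation.

```python
UA_PER_MHZ = 6.0  # active current per MHz, as quoted for the board

def core_current_ua(clock_mhz):
    """Estimated processor current in microamps, assuming linear scaling with clock."""
    return UA_PER_MHZ * clock_mhz

def runtime_hours(battery_mah, clock_mhz):
    """Idealized runtime if the processor core were the only load on the battery."""
    current_ma = core_current_ua(clock_mhz) / 1000.0
    return battery_mah / current_ma

BATTERY_MAH = 225.0  # hypothetical battery capacity

for clock in (48, 96):
    print(f"{clock} MHz: {core_current_ua(clock):.0f} uA, "
          f"about {runtime_hours(BATTERY_MAH, clock):.0f} h on {BATTERY_MAH:.0f} mAh")
```

Real runtimes will be shorter once the rest of the board is powered, so this is best read as an upper bound on what the processor core alone permits.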

21. SparkFun Serial Basic Breakout - CH340C and USB-C
   https://www.sparkfun.com/products/15096

22. A Keyword Spotting Model

23. Let’s get started!
   codelabs.developers.google.com/codelabs/sparkfun-tensorflow/

24. How well did it do? Why?

25. How well did it do?
   - Model accuracy decreases (slightly) due to the compression
   - Limitations of hardware
   - Noisy environment
   - Bias in dataset

26. Summary
   - ML is moving to the IoT endpoint
   - Many ML use cases can be easily achieved with Arm Cortex-M CPUs
   - Ecosystem partnerships ensure ability to develop and deploy diverse use cases everywhere on the device
   [Figure labels: energy grid, smart city, robotics, automotive, wearables, sensor, environmental, farming, industrial, home automation, identity & tracking, IoT, healthcare, VR/AR, smart lighting, enterprise, building automation, smart watch, retail, connected clothing, space]

27. Next Steps!
   - Check out SparkFun: www.sparkfun.com
   - Check out TFLite: www.tensorflow.org/lite/microcontrollers/overview
   - Check out Arm Developer Pages: developer.arm.com/
   - Retrain the Model: developer.arm.com/armgooglevoice2019 or https://developer.arm.com/solutions/machine-learning-on-arm/developer-material/how-to-guides/build-arm-cortex-m-voice-assistant-with-google-tensorflow-lite/retrain-the-machine-learning-model

29. Backup

31. SparkFun Edge Development Board

32. Driver issues
   - Mac: https://stackoverflow.com/questions/55463159/sparkfun-edge-bootloader-problems
   - Linux, Mac, Windows: https://learn.sparkfun.com/tutorials/how-to-install-ch340-drivers