Karl Fezer, AI Evangelist, Arm (@karlfezer). Agenda: industry trends; how to do machine learning on Arm Cortex-M CPUs; how to use TensorFlow Lite for Microcontrollers; hands-on workshop.
1. Build a low-powered Arm voice assistant with Google TensorFlow Lite. Karl Fezer, AI Evangelist, Arm (@karlfezer)
2. Agenda: industry trends; how to do machine learning on Arm Cortex-M CPUs; how to use TensorFlow Lite for Microcontrollers; hands-on workshop!
3. Trends
4. The next wave of computing (timeline figure: 1990, 1993 onwards, today)
5. Arm enables ML everywhere: flexible, scalable ML solutions
   Figure: typical ML hardware choice (Cortex-M/A CPUs, Mali GPUs, Arm NPUs) plotted as ML performance (ops/second) against ML capabilities: keyword detection, pattern training, advanced recognition, smart cameras, image enhancement, autonomous driving, data center.
6. AI and ML Terminology
   Artificial Intelligence: umbrella term for machines acting as though they are ‘thinking’.
   Machine Learning: machines adapting algorithms based on experience (e.g. labelled pictures).
   Deep Learning: machine learning using deep neural network approaches. Algorithms: CNNs, RNNs, etc.
   Training: training data is fed to a neural network, producing a model trained to recognise cats.
   Inference: a new input is passed to the neural network with the model, which outputs a label with a confidence, e.g. "A cat" (97.4% confidence) or "Not a cat" (96.4% confidence).
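The confidence figures in the inference diagram can be illustrated with a small sketch. It assumes the network ends in a softmax layer (as the keyword spotting model later in the deck does); the label names and the logit values are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    """Turn raw network outputs (logits) into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels):
    """Return the most likely label and its probability ("confidence")."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical two-class output for one new input.
LABELS = ["a cat", "not a cat"]
label, confidence = classify([3.6, 0.0], LABELS)
print(f"{label}: {confidence:.1%} confidence")
```

The reported confidence is simply the largest softmax probability, so it describes how the model ranks its own classes rather than how likely the answer is to be correct.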
7. ML Edge Use Cases
   Voice (recognition and creation): keyword spotting, speech recognition, natural language processing, speech synthesis, etc.
   Vibration (any ‘signal’): accelerometer, pressure, lidar/radar, speed, shock, vibration, pollution, density, viscosity, etc.
   Vision (images and video): object detection, face unlock, defocus, beautification, etc.
8. Drivers for running ML on endpoint devices: security and privacy; power and cost; reliability
9. Machine Learning on Cortex-M CPUs
10. TinyML: delivering on-device intelligence with “mW” of power
   Example applications: body control unit application; face/object detection for scene wake-up; healthcare sensor module; keyword spotting; intelligent anomaly detection
11. Arm Cortex-M processor
   Optimized for cost and power-efficient microcontrollers.
   Enhanced ML capabilities with Cortex-M4 (Armv7E-M), Cortex-M7 (Armv7E-M), Cortex-M33 (Armv8-M), and Cortex-M35P (Armv8-M), which support SIMD instructions.
   Arm Helium technology is a new M-profile Vector Extension (MVE) for the Armv8.1-M architecture. Helium will deliver up to 15x more ML performance and up to 5x more signal processing performance.
   Figure label: future Cortex-M enabled by Arm Helium technology
   * Processor with DSP extensions
12. Training a Keyword Spotting Model
   Flow: problem -> trained NN model -> deployable solution -> optimized code on HW
   Need compact models that fit within the Cortex-M system memory.
   Need models with fewer operations to achieve real-time performance.
   Model: depthwise separable convolution, fully-connected, softmax
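To see why a depthwise separable convolution helps with the "fewer operations" requirement, the sketch below counts multiply-accumulate operations (MACs) for one layer in both forms. It assumes stride 1 and "same" padding so the output keeps the input's height and width; the layer dimensions in the example are hypothetical and not taken from the slides.

```python
def standard_conv_macs(height, width, c_in, c_out, kernel):
    """MACs for a standard kernel x kernel convolution over a height x width map."""
    return height * width * c_in * c_out * kernel * kernel

def depthwise_separable_macs(height, width, c_in, c_out, kernel):
    """MACs for a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = height * width * c_in * kernel * kernel  # one filter per input channel
    pointwise = height * width * c_in * c_out            # 1x1 conv mixes the channels
    return depthwise + pointwise

# Hypothetical layer: 25 x 5 feature map, 64 channels in and out, 3 x 3 kernel.
standard = standard_conv_macs(25, 5, 64, 64, 3)
separable = depthwise_separable_macs(25, 5, 64, 64, 3)
print(f"standard: {standard} MACs, separable: {separable} MACs, "
      f"ratio: {standard / separable:.2f}x")
```

The saving grows with the number of output channels and the kernel size, since the cost ratio is 1/c_out + 1/kernel^2.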
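13. Quantizing NN Solutions for Cortex-M processors
   Flow: problem -> trained NN model -> deployable solution -> optimized code on HW
   Figure: normalized accuracy (0 to 1.2) against weight bit-width (1 to 13) for AlexNet, VGG-16, GoogLeNet, and SqueezeNet. Weight bit-width needed: AlexNet >7 bit, VGG-16 >10 bit, GoogLeNet >9 bit, SqueezeNet >7 bit.
   Table: 32-bit floating point model accuracy vs. 8-bit quantized model accuracy

   | NN Model   | 32-bit float Train | 32-bit float Val. | 32-bit float Test | 8-bit Train | 8-bit Val. | 8-bit Test |
   |------------|--------------------|-------------------|-------------------|-------------|------------|------------|
   | DNN        | 97.77%             | 88.04%            | 86.66%            | 97.99%      | 88.91%     | 87.60%     |
   | Basic LSTM | 98.38%             | 92.69%            | 93.41%            | 98.21%      | 92.53%     | 93.51%     |
   | GRU        | 99.23%             | 93.92%            | 94.68%            | 99.21%      | 93.66%     | 94.68%     |
   | CRNN       | 98.34%             | 93.99%            | 95.00%            | 98.43%      | 94.08%     | 95.03%     |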
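As a rough illustration of what 8-bit weight quantization involves, here is a minimal sketch of symmetric, per-tensor linear quantization. The slide does not say which scheme was used to produce the accuracy figures, so the scheme, the function names, and the sample weights are all assumptions.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the 8-bit values."""
    return [q * scale for q in quantized]

# Hypothetical weights for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, f"scale={scale:.4f}", f"worst error={worst_error:.4f}")
```

Each weight then occupies one byte instead of four, and the rounding error per weight is bounded by half the scale, which is consistent with the small accuracy differences in the table.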
14. Arm Hardware for AI/ML: Arm’s ML Computing Platform
   Diagram labels, from applications down to hardware:
   Voice, vision & vibration applications
   ML platform: Arm NN, Android NN API, direct interface, inference engine
   Libraries and APIs: Arm Compute Library, CMSIS-NN, CMSIS-Core, OpenCL, OpenVX, custom API
   Hardware abstraction layer
   Hardware: CPU (Arm Cortex-A, Cortex-M, Neon, DynamIQ); GPU (Arm Mali); ML processors; partner IPs
15. Optimizing with CMSIS-NN
   Flow: problem -> trained NN model -> deployable solution -> optimized code on HW
   CMSIS: Cortex Microcontroller Software Interface Standard
   CMSIS-NN provides optimized low-level NN functions for Cortex-M CPUs.
   Functions implement popular NN layers: convolution, depthwise separable convolution, fully-connected, pooling, activation.
   Table: throughput and energy efficiency improvements by layer type

   | Layer type  | Baseline runtime | New kernel runtime | Throughput improvement | Energy efficiency improvement |
   |-------------|------------------|--------------------|------------------------|-------------------------------|
   | Convolution | 443.4 ms         | 96.4 ms            | 4.6X                   | 4.9X                          |
   | Pooling     | 11.83 ms         | 2.2 ms             | 5.4X                   | 5.2X                          |
   | ReLU        | 1.06 ms          | 0.4 ms             | 2.6X                   | 2.6X                          |
   | Total       | 456.4 ms         | 99.1 ms            | 4.6X                   | 4.9X                          |
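The throughput column can be approximately reproduced from the two runtime columns, as the sketch below shows. It assumes throughput improvement is simply baseline runtime divided by new-kernel runtime; the ratios land close to the published figures but need not match exactly, because the runtimes on the slide are already rounded.

```python
# Runtimes in milliseconds copied from the slide: (baseline, new kernel).
RUNTIMES_MS = {
    "Convolution": (443.4, 96.4),
    "Pooling": (11.83, 2.2),
    "ReLU": (1.06, 0.4),
    "Total": (456.4, 99.1),
}

def speedup(baseline_ms, optimized_ms):
    """Throughput improvement, assumed to be the ratio of the two runtimes."""
    return baseline_ms / optimized_ms

def share_of_baseline(layer):
    """Fraction of the total baseline runtime spent in the given layer type."""
    return RUNTIMES_MS[layer][0] / RUNTIMES_MS["Total"][0]

for layer, (baseline, optimized) in RUNTIMES_MS.items():
    print(f"{layer}: about {speedup(baseline, optimized):.2f}X faster")
print(f"Convolution share of baseline runtime: {share_of_baseline('Convolution'):.0%}")
```

Convolution accounts for nearly all of the baseline runtime, which is why the total improvement tracks the convolution improvement so closely.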
16. OpenMV
17. TensorFlow Lite
18. TensorFlow Lite for Microcontrollers
   TensorFlow is Google’s open source machine learning framework.
   TensorFlow Lite Micro is designed to run on embedded systems:
   Less than 20KB binary footprint
   No dynamic memory allocation
   No library dependencies (not even POSIX or the standard C/C++ libraries)
   Code is at github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/micro
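Running without dynamic memory allocation usually means carving every buffer out of one fixed block that is reserved up front; TensorFlow Lite Micro takes a caller-provided "tensor arena" for this purpose. The sketch below is a conceptual model of that idea only, written in Python for readability. The `Arena` class, its method names, and the 16-byte alignment are hypothetical and are not the library's API.

```python
class Arena:
    """Fixed-size buffer that hands out aligned slices and never frees them."""

    def __init__(self, size_bytes):
        self.buffer = bytearray(size_bytes)  # reserved once, up front
        self.used = 0

    def allocate(self, nbytes, align=16):
        """Return a view of nbytes from the arena, or fail if it does not fit."""
        start = (self.used + align - 1) // align * align  # round up to alignment
        if start + nbytes > len(self.buffer):
            raise MemoryError(f"arena too small: need {start + nbytes} bytes")
        self.used = start + nbytes
        return memoryview(self.buffer)[start:start + nbytes]

# Hypothetical usage: two small tensors fit, a third does not.
arena = Arena(64)
input_tensor = arena.allocate(10)
output_tensor = arena.allocate(10)
print(f"{arena.used} of {len(arena.buffer)} bytes used")
```

Because the arena size is fixed at build time, a model that does not fit fails immediately at setup rather than running out of memory unpredictably later.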
19. TensorFlow Lite for Microcontrollers
   One of the key use cases is audio.
   Started with speech wakeword detection.
   Example is on GitHub. You will have it working in a few minutes!
   Still a work-in-progress, but we’re improving through collaborations.
   Open speech recording: https://aiyprojects.withgoogle.com/open_speech_recording (100,000 utterances, but we need more)
20. SparkFun Edge Development Board
   32-bit ARM Cortex-M4F processor with Direct Memory Access
   48MHz CPU clock, 96MHz with TurboSPOT™
   Extremely low-power usage: 6uA/MHz
   1MB Flash
   384KB SRAM
   Dedicated Bluetooth processor with BLE 5
   https://www.sparkfun.com/products/15170
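The 6uA/MHz figure can be turned into a rough current estimate with simple arithmetic. The sketch below assumes the figure scales linearly with clock speed and covers the processor core only, ignoring the microphones, radio, and other board components. The 225 mAh battery capacity is a hypothetical value used only to show the calculation.

```python
def active_current_ua(ua_per_mhz, clock_mhz):
    """Estimated active core current in microamps, assuming linear scaling."""
    return ua_per_mhz * clock_mhz

def battery_life_hours(capacity_mah, current_ua):
    """Idealised runtime: capacity divided by a constant current draw."""
    return capacity_mah * 1000 / current_ua

normal = active_current_ua(6, 48)   # 48MHz CPU clock
turbo = active_current_ua(6, 96)    # 96MHz with TurboSPOT
hours = battery_life_hours(225, normal)  # hypothetical 225 mAh cell
print(f"48MHz: {normal} uA, 96MHz: {turbo} uA, about {hours:.0f} h at 48MHz")
```

Real battery life would be shorter once the rest of the board is included, so this is best read as an upper bound for the core alone.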
21. SparkFun Serial Basic Breakout - CH340C and USB-C: https://www.sparkfun.com/products/15096
22. A Keyword Spotting Model
23. Let’s get started! codelabs.developers.google.com/codelabs/sparkfun-tensorflow/
24. How well did it do? Why?
25. How well did it do?
   Model accuracy decreases (slightly) due to the compression
   Limitations of hardware
   Noisy environment
   Bias in dataset
26. Summary
   ML is moving to the IoT endpoint.
   Many ML use cases can be easily achieved with Arm Cortex-M CPUs.
   Ecosystem partnerships ensure ability to develop and deploy diverse use cases everywhere on the device.
   Use cases shown: energy grid, smart city, robotics, automotive, wearables, sensor, environmental, farming, industrial, home automation, identity & tracking, IoT, healthcare, VR/AR, smart lighting, enterprise, building automation, smart watch, retail, connected clothing, space
27. Next Steps!
   Check out SparkFun: www.sparkfun.com
   Check out TFLite: www.tensorflow.org/lite/microcontrollers/overview
   Check out Arm Developer Pages: developer.arm.com/
   Retrain the Model: developer.arm.com/armgooglevoice2019 or https://developer.arm.com/solutions/machine-learning-on-arm/developer-material/how-to-guides/build-arm-cortex-m-voice-assistant-with-google-tensorflow-lite/retrain-the-machine-learning-model
29. Backup
31. SparkFun Edge Development Board
32. Driver issues
   Mac: https://stackoverflow.com/questions/55463159/sparkfun-edge-bootloader-problems
   Linux, Mac, Windows: https://learn.sparkfun.com/tutorials/how-to-install-ch340-drivers