linux.conf.au 2021 | Presentation: Transparent Open Source AI Video Analytics with Panfrost

Presented by

Aaron Boxer

Aaron is a mathematician and developer, and has been building open-source systems for over 10 years. He enjoys the simple pleasure of squeezing every last ounce of performance from both hardware and software. Aaron currently works for Collabora and is based out of Toronto, Canada.

Marcus Edel
@marcusedel
https://kurg.org

Marcus Edel is a core contributor to the mlpack machine learning library and a software engineer at Collabora. He has worked in several areas, including function optimization, computer vision, and automated theorem proving. He holds a Ph.D. in Computer Science from the Free University of Berlin, Germany. Previously he has spoken at NeurIPS, IPIN, and GCAI.

Abstract

AI-powered video analytics is one of the most challenging applications of AI on edge devices, given the edge's power, compute, and memory limitations. This area is currently dominated by NVIDIA DeepStream, which suffers from two problems: (1) vendor lock-in to the CUDA language and NVIDIA hardware, and (2) a lack of transparency into low-level tensor operations and algorithms, due to closed source drivers and libraries.

Can we give freedom of choice back to AI multimedia developers? Can we build a pure open source stack, running from application to ML framework down to GPU driver, which allows complete transparency into the ML inference workflow?

The new Panfrost open source driver for Mali GPUs is solving this problem on the edge by enabling a fast and efficient machine learning stack that is purely open source. Combining this with TensorFlow Lite and GStreamer, we get a powerful open source AI stack for video analytics. And because the stack is open from top to bottom, we get visibility into the complete inference process, allowing us to better understand and explain how an analytic model makes its predictions.

The ability to explain how a model infers its results (explainability) is an increasingly desirable ML feature, particularly in applications that have an impact on privacy, such as video facial recognition. Explainability allows us to build ethical and trustworthy ML systems known to be free from bias. Closed source blobs and libraries interfere with explainability by hiding crucial computations from view.

In this talk, we will walk through the process of building an AI-driven multimedia pipeline on top of a completely open source inference stack: open source GPU driver, machine learning framework, and machine learning models. We will share what we have learned about optimizing these models to run fast on resource-constrained hardware such as the Rockchip RK3399. And we will discuss how this completely open stack is a critical component of ethical and trustworthy video analytics.

Here is a link to the PDF of the slides: https://gitlab.collabora.com/boxerab/conference_slides/-/raw/a4c867fbbf0f271e876ae0378a7c6cd61434068c/TransparentVideoAnalytics.pdf?inline=false