At XDF China in December 2019, Mr. Jiansong Zhang, Staff Engineer at Alibaba, gave a great talk about AI Platforms and Heterogeneous Computing.
Alibaba developed a deep learning stack on top of Xilinx FPGAs including IP, shell, runtime, driver, compiler and models to enable various AI workloads.
This solution features:
- Optimized and customized hardware design
- On-chip streaming structure
- 3D systolic-array convolution engine running at 600 MHz, which makes full use of the DSP carry chain and a supertile design
- Configurable dimensions of parallelism
- Resource allocation & task scheduling using a runtime
- Software-hardware co-optimization in compiler
- Model parsers for ONNX and TensorFlow
Mr. Zhang also introduced the following four key use cases to demonstrate their excellent results.
Case 1: OCR (Optical Character Recognition) in Public Cloud Services
Case 2: Edge solution for Smart Retail
Case 3: Private Cloud Service
About 7x TCO savings achieved by replacing CPU servers
Case 4: Speech Synthesis
Speech synthesis is an iterative task in which 16,000 iterations are needed to generate one second of audio. NN-based TTS (text-to-speech) can be indistinguishable from human speech. Alibaba developed a Xilinx FPGA-based solution for real-time WaveNet, a state-of-the-art NN model for TTS. With a customized autoregressive low-latency IP in hardware and a customized on-chip loop implemented in the compiler, they achieved a 150x speed-up compared to a GPU implementation!
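To see why latency matters so much here, the sketch below works out the time budget per iteration for real-time output and shows the general shape of an autoregressive loop. It assumes each of the 16,000 iterations produces one audio sample; `toy_step` is a hypothetical stand-in for a neural network forward pass and is not WaveNet or Alibaba's implementation.

```python
SAMPLE_RATE_HZ = 16_000  # iterations per second of audio, as stated in the talk summary


def per_sample_budget_us(sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Maximum time one iteration may take for real-time synthesis, in microseconds."""
    return 1_000_000 / sample_rate_hz


def generate(num_samples, step, first_sample=0.0):
    """Generic autoregressive loop: every new sample depends on the earlier ones,
    so the iterations cannot run in parallel across time."""
    samples = [first_sample]
    for _ in range(num_samples - 1):
        samples.append(step(samples))
    return samples


def toy_step(history):
    # Hypothetical placeholder for a model forward pass (assumption, not WaveNet).
    return 0.5 * history[-1] + 0.1


if __name__ == "__main__":
    print(f"Budget per iteration: {per_sample_budget_us()} microseconds")
    one_second = generate(SAMPLE_RATE_HZ, toy_step)
    print(f"Generated {len(one_second)} samples for one second of audio")
```

Because each step must wait for the previous one, throughput is bounded by single-step latency rather than by batch parallelism, which is consistent with the emphasis on a low-latency on-chip loop.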
This is another great example of adaptable AI inference powered by Xilinx!