MODEL OF AUTOMATED SYNTHESIS TOOL FOR HARDWARE ACCELERATORS OF CONVOLUTIONAL NEURAL NETWORKS FOR PROGRAMMABLE LOGIC DEVICES
Journal: Scientific and Technical Journal of Information Technologies, Mechanics and Optics (Vol. 20, No. 4)
Publication Date: 2020-08-10
Authors: Egiazarian V.A., Bykovskii S.V.
Pages: 560–567
Keywords: convolutional neural networks; neural network accelerators; hardware accelerators; FPGA; CAD; high-level synthesis
Abstract
Currently, more and more image processing and analysis tasks are being solved using convolutional neural networks. Neural networks implemented with high-level programming languages, libraries, and frameworks cannot be used in real-time systems, for example, for processing streaming video in cars, due to the low speed and energy efficiency of such implementations. These tasks require specialized hardware accelerators of neural networks. The design of such accelerators is a complex iterative process requiring highly specialized knowledge and qualifications, which makes the creation of tools for the automated high-level synthesis of such computing devices a relevant issue. The purpose of this research is to develop a tool for the automated synthesis of neural network accelerators for programmable logic devices (FPGAs) from a high-level specification, reducing development time. A network description, which can be obtained with the TensorFlow framework, serves as the high-level specification. Several strategies for optimizing the structure of convolutional networks were studied, along with methods for organizing the computational process and formats for representing data in neural networks, and their effect on the characteristics of the resulting computing device. It was shown, on the example of handwritten digit recognition on the MNIST dataset, that structural optimization of the fully connected layers of a neural network reduces the number of network parameters by 95 % with an accuracy loss of 0.43 %; pipelining speeds up the computation by a factor of 1.7; and parallelization of individual parts of the computing process provides almost 20-fold acceleration, although it requires 4–6 times more FPGA resources. Using fixed-point instead of floating-point numbers in the calculations reduces the FPGA resources used by a factor of 1.7–2.8.
Based on an analysis of the obtained results, a model of an automated synthesis tool is proposed that performs these optimizations automatically in order to meet the speed and resource requirements of neural network accelerator implementations on FPGA.
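To illustrate the fixed-point representation mentioned in the abstract, the sketch below quantizes floating-point weights into a signed 16-bit fixed-point format and converts them back. This is not code from the paper: the particular Q4.12 split (4 integer bits, 12 fractional bits) and the function names are assumptions chosen for illustration; an actual accelerator would pick bit widths per layer based on the value ranges observed in the trained network.

```python
# Illustrative sketch (not from the paper): signed fixed-point quantization
# of neural network weights, of the kind that trades FPGA resources against
# numerical precision. The Q4.12 format below is an assumed example.

def to_fixed(x, frac_bits=12, total_bits=16):
    """Quantize a float to a signed fixed-point integer (round to nearest,
    saturate to the representable range)."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = round(x * scale)
    return max(lo, min(hi, q))

def to_float(q, frac_bits=12):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)

# Example: quantize a few weights and measure the worst-case rounding error.
weights = [0.731, -0.042, 1.250]
quantized = [to_fixed(w) for w in weights]
restored = [to_float(q) for q in quantized]
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In hardware, the quantized integers are stored and multiplied directly; the maximum rounding error of a round-to-nearest scheme is half of one least significant bit, here 1/8192, which is one reason modest fractional widths can preserve accuracy while substantially reducing resource usage.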