Title: A full featured configurable accelerator for object detection with YOLO
Authors: Pestana, Daniel; Miranda, Pedro R.; Lopes, João D.; Duarte, Rui; Véstias, Mário; Neto, Horácio C; De Sousa, Jose
Dates: 2021-09-07; 2021-09-07; 2021-05-19
Citation: PESTANA, Daniel; [et al] – A full featured configurable accelerator for object detection with YOLO. IEEE Access. ISSN 2169-3536. Vol. 9 (2021), pp. 75864-75877
ISSN: 2169-3536
Handle: http://hdl.handle.net/10400.21/13689
DOI: 10.1109/ACCESS.2021.3081818
Language: eng
Type: journal article
Keywords: Object detection; Convolutional neural network; FPGA; Lightweight YOLO

Abstract: Object detection and classification is an essential task of computer vision. A very efficient algorithm for detection and classification is YOLO (You Only Look Once). We consider hardware architectures to run YOLO in real-time on embedded platforms. Designing a new dedicated accelerator for each new version of YOLO is not feasible given how quickly new versions are released. This work's primary goal is to design a configurable and scalable core for creating specific object detection and classification systems based on YOLO, targeting embedded platforms. The core accelerates the execution of all the algorithm steps, including pre-processing, model inference and post-processing. It uses a fixed-point format, linearised activation functions, batch-normalisation folding, and a hardware structure that exploits most of the available parallelism in CNN processing. The proposed core is configured for real-time execution of YOLOv3-Tiny and YOLOv4-Tiny, integrated into a RISC-V-based system-on-chip architecture and prototyped in an UltraScale XCKU040 FPGA (Field Programmable Gate Array). The solution achieves a performance of 32 and 31 frames per second for YOLOv3-Tiny and YOLOv4-Tiny, respectively, with a 16-bit fixed-point format. Compared to previous proposals, it improves the frame rate at a higher performance efficiency. The performance, area efficiency and configurability of the proposed core enable the fast development of real-time YOLO-based object detectors on embedded systems.
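
The abstract names batch-normalisation folding and a 16-bit fixed-point format as parts of the design. The following is a minimal Python sketch of what those two steps generally look like, not the authors' implementation: the function names, the per-channel list layout, and the split of the 16 bits into 8 integer and 8 fractional bits are all assumptions made for illustration (the record states only that the format is 16-bit).

```python
import math

def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into convolution weights and bias.

    weights: one flat list of weights per output channel (hypothetical layout).
    After folding, conv(x) with the new parameters equals batch_norm(conv(x)).
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)          # per-channel BN scale
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - mu) * scale + bt)
    return folded_w, folded_b

def to_fixed(x, frac_bits=8, total_bits=16):
    """Quantise a real value to signed fixed point, saturating on overflow.

    frac_bits=8 is an assumed split, not a value taken from the source.
    """
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = int(round(x * (1 << frac_bits)))        # Python rounds halves to even
    return max(lo, min(hi, q))

def from_fixed(q, frac_bits=8):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)
```

Folding removes the batch-normalisation layer at inference time, so only a multiply-accumulate with the adjusted weights and bias remains; quantising those folded values is what a fixed-point datapath would then consume.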