Size | Format |
---|---|
2.78 MB | Adobe PDF |
Abstract(s)
Web Archiving (WA) deals with preserving portions of the World Wide Web (WWW) so that they remain available in the future. Arquivo.pt is a WA initiative holding a huge amount of content, including image files. However, some of these images contain nudity and pornography, which can be offensive to users, and are therefore Not Suitable For Work (NSFW). This work proposes a solution for classifying NSFW images found at Arquivo.pt using deep neural network approaches. A large dataset of images is built from Arquivo.pt data, and two pre-trained neural network models, ResNet and SqueezeNet, are evaluated and improved for the NSFW classification task using this dataset. The evaluation of these models reported accuracies of 93% and 72%, respectively. After a fine-tuning stage, their accuracies improved to 94% and 89%, respectively. The proposed solution is integrated into the Arquivo.pt Image Search System, available at https://arquivo.pt/images.jsp.
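As a rough illustration of how accuracy figures like those above are typically obtained, the sketch below thresholds per-image NSFW scores and compares the resulting labels with ground truth. It assumes the standard definition of accuracy (correct predictions divided by total images). The function names, the 0.5 threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def classify(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map NSFW scores to labels: 1 = NSFW, 0 = safe.

    The threshold of 0.5 is an assumption made for this sketch.
    """
    return [1 if score >= threshold else 0 for score in scores]


def accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Return the fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical toy data: model scores and ground-truth labels for five images.
scores = [0.91, 0.12, 0.67, 0.40, 0.05]
labels = [1, 0, 1, 1, 0]

predictions = classify(scores)
print(accuracy(predictions, labels))
```

In practice the scores would come from the classifier's output for each image in a held-out test set, and the same computation would be repeated before and after fine-tuning to compare the two models.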
Keywords
Deep learning; Deep neural networks; Image classification; Not suitable for work images; ResNet; SqueezeNet
Citation
BICHO, Daniel; FERREIRA, Artur; DATIA, Nuno – A deep learning approach to identify not suitable for work images. i-ETC: ISEL Academic Journal of Electronics, Telecommunications and Computers. ISSN 2182-4010. Vol. 6, N.º 1 (2020) ID-3, pp. 1-11