Convolutional Neural Networks (CNNs) are among the most powerful methods for image recognition and have been applied in many fields, including civil and structural health monitoring in infrastructure asset management. Current state-of-the-art CNN models are open-source and available on several Artificial Intelligence (AI) platforms, of which TensorFlow is widely used. Alongside CNN models, Vision Transformers (ViTs) have recently emerged as a competitive alternative. Several demonstrations have indicated that ViT models, in many instances, outperform current CNNs by a factor of almost four in terms of computational efficiency and accuracy. This paper presents an investigation into defect detection for civil and structural components using CNN and ViT models available on TensorFlow. An empirical study was conducted using a database of crack images. Crack condition is classified into two states: “with crack” and “without crack”. The results confirm that the accuracies of both the CNN and ViT models exceed 95% after 100 epochs of training, with no significant difference observed between them for binary classification. Notably, the cost of this AI-based approach, using images taken by lightweight, low-cost drones, is considerably lower than that of high-speed inspection cars, while still delivering the expected level of predictive accuracy.