HERI3D: A Comparative Analysis of Traditional and Deep Learning-Based 3D Reconstruction Techniques Using UAV Imagery for Cultural Heritage

Risse, Benjamin; Catricheo, Constanza Andrea Molina; Oliver, Sergio Trilles; Guo, Ting-Jia
2026-03-23; 2026-03-23; 2026-03-05
http://hdl.handle.net/10362/201742
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

This thesis presents a comprehensive comparative analysis of traditional and deep learning-based 3D reconstruction techniques using Unmanned Aerial Vehicle imagery for cultural heritage documentation. The research evaluates four distinct reconstruction paradigms: a traditional Structure-from-Motion and Multi-View Stereo pipeline implemented in COLMAP, neural implicit surface reconstruction using Neuralangelo, radiance field representation via 3D Gaussian Splatting, and a feed-forward geometry-grounded Transformer model known as VGGT. The study uses datasets from three architecturally diverse castle sites in North Rhine-Westphalia, Germany: Schloss Münster, Burg Lüdinghausen, and Schloss Raesfeld. To ensure a fair comparison, a unified evaluation framework was established. This framework incorporates standardized image preprocessing, point cloud refinement, and geometric registration against airborne LiDAR reference data. The performance of each method was assessed through visual qualitative analysis and quantitative evaluation metrics, including Root Mean Square error for accuracy, Cloud-to-Cloud distance for completeness, and local geometric feature descriptors. The results demonstrate that traditional photogrammetry implemented in COLMAP remains the most reliable method for geometric accuracy. Among the learning-based approaches, VGGT with a moderate image count of 24 images consistently achieved the highest completeness and a balanced trade-off between accuracy and geometric stability across all sites. While 3D Gaussian Splatting provides superior visual continuity and color consistency, increasing the number of training iterations primarily refines surface appearance.
Neuralangelo maintained global shape continuity but tended to smooth or underrepresent fine-scale architectural details. The findings highlight that reconstruction performance is strongly mediated by site-specific factors, including architectural complexity, UAV flight constraints, and the availability of reference data. This study contributes a reproducible comparative framework that serves as a structured reference for future digital heritage preservation efforts. The results emphasize that no single method dominates across all evaluation criteria and that method selection should align with specific documentation objectives.

eng
3D Reconstruction; UAV Photogrammetry; Cultural Heritage; Deep Learning; Comparative Framework
master thesis
204231760
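
The abstract names Root Mean Square error and Cloud-to-Cloud distance as evaluation metrics but does not give their exact definitions. As a rough illustration only, the sketch below computes nearest-neighbour distances between a reconstructed point cloud and a reference cloud, then derives an RMSE and a threshold-based completeness ratio. The function names, the brute-force search, the completeness definition, and the toy coordinates are all assumptions made for this example and are not taken from the thesis.

```python
import math


def nearest_distances(source, reference):
    """Distance from each source point to its nearest reference point.

    Brute-force search, adequate only for small illustrative clouds.
    """
    return [min(math.dist(p, q) for q in reference) for p in source]


def rmse(distances):
    """Root mean square of a list of distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def completeness(reference, reconstruction, threshold):
    """Fraction of reference points that have a reconstructed point
    within `threshold` (an assumed definition of completeness)."""
    d = nearest_distances(reference, reconstruction)
    return sum(1 for x in d if x <= threshold) / len(d)


# Hypothetical toy data: a flat reference patch and a reconstruction
# that is slightly noisy and misses one corner.
reference = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
reconstruction = [(0, 0, 0.1), (1, 0, -0.1), (0, 1, 0.1)]

accuracy_rmse = rmse(nearest_distances(reconstruction, reference))
coverage = completeness(reference, reconstruction, threshold=0.2)
print(f"RMSE: {accuracy_rmse:.3f}, completeness: {coverage:.2f}")
```

In this toy case, the reconstructed points lie close to the reference, so the RMSE is small, while the missing corner lowers the completeness ratio. This mirrors the abstract's point that accuracy and completeness can rank methods differently. For real clouds with millions of points, a spatial index such as a k-d tree would replace the brute-force search.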