This preprint has been published as a journal article.
DOI: https://doi.org/10.3390/s26041237
This is the author's accepted manuscript of the following article:
Bibliographic information: Sensors 26, no. 4: 1237
Preprint / Version 1

Geometry-Aware Human Noise Removal from TLS Point Clouds via 2D Segmentation Projection

Authors

  • Fuga Komura, Graduate School of Informatics, Osaka Metropolitan University
  • Daisuke Yoshida, Osaka Metropolitan University
  • Ryosei Ueda, College of Sustainable System Sciences, Osaka Metropolitan University

DOI:

https://doi.org/10.51094/jxiv.4204

Keywords:

3D point cloud, noise removal, image recognition, deep learning, principal component analysis, TLS, DBSCAN

Abstract

Large-scale terrestrial laser scanning (TLS) point clouds are increasingly used for applications such as digital twins and cultural heritage documentation; however, removing unwanted human points captured during acquisition remains a largely manual and time-consuming process. This study proposes a geometry-aware framework for automatically removing human noise from TLS point clouds by projecting 2D instance segmentation masks (obtained using You Only Look Once (YOLO) v8 with an instance segmentation head) into 3D space and validating candidates through multi-stage geometric filtering. To suppress false positives induced by reprojection misalignment and planar background structures (e.g., walls and ground), we introduce projection-followed geometric validation (or "geometric gating") using Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and principal component analysis (PCA)-based planarity analysis, followed by cluster-level plausibility checks. Experiments were conducted on two real-world outdoor TLS datasets: (i) Osaka Metropolitan University Sugimoto Campus (OMU) (82 scenes) and (ii) the Jinaimachi historic district in Tondabayashi (JM) (68 scenes). The results demonstrate that the proposed method achieves high noise removal accuracy, obtaining precision/recall/intersection over union (IoU) of 0.9502/0.9014/0.8607 on OMU and 0.8912/0.9028/0.8132 on JM. Additional experiments on mobile mapping system (MMS) data from the Waymo Open Dataset demonstrate stable performance without parameter recalibration. Furthermore, quantitative and qualitative comparisons with representative time-series geometric dynamic object removal methods, including DUFOMap and BeautyMap, show that the proposed approach maintains competitive recall under a human-only ground-truth definition while reducing over-removal of static structures in TLS scenes, particularly when humans are observed in only one or a few scans due to limited revisit frequency.
The end-to-end processing time with YOLOv8 was 935.62 s for 82 scenes (11.4 s/scene) on OMU and 571.58 s for 68 scenes (8.4 s/scene) on JM, supporting practical efficiency on high-resolution TLS imagery. Ablation studies further clarify the role of each stage and indicate stable performance under the observed reprojection errors. The annotated human point cloud dataset used in this study has been publicly released to facilitate reproducibility and further research on human noise removal in large-scale TLS scenes.
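The PCA-based planarity gate described in the abstract can be illustrated with a minimal sketch. The idea (following the eigenvalue features of Weinmann et al.) is that a DBSCAN cluster lying on a wall or the ground has one near-zero covariance eigenvalue, so its planarity score is high and the cluster is rejected as background rather than removed as a human. The function names, the specific planarity formula variant, and the 0.7 threshold below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def pca_planarity(points: np.ndarray) -> float:
    """Planarity feature from the sorted eigenvalues l1 >= l2 >= l3
    of a cluster's 3x3 covariance matrix: (l2 - l3) / l1.
    Near 1 for flat patches (walls, ground), near 0 otherwise."""
    cov = np.cov(points.T)                       # 3x3 covariance
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]
    l1, l2, l3 = evals
    return float((l2 - l3) / l1) if l1 > 0 else 0.0

def is_planar_background(points: np.ndarray, thresh: float = 0.7) -> bool:
    """Gate a candidate cluster: True means it looks like a planar
    background structure and should not be removed as human noise."""
    return pca_planarity(points) > thresh

# Toy data: a thin wall-like patch vs. an elongated, non-planar blob.
rng = np.random.default_rng(0)
wall = rng.normal(size=(500, 3)) * np.array([1.0, 1.0, 0.01])
blob = rng.normal(size=(500, 3)) * np.array([0.3, 0.3, 0.9])
```

In the full pipeline, this check would run per DBSCAN cluster of the mask-projected candidate points; only clusters that pass both the density and the non-planarity tests would be deleted.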

Conflict of Interest Disclosure

The authors declare no conflicts of interest related to this manuscript.


References

Remondino, F. Heritage Recording and 3D Modeling with Photogrammetry and 3D Scanning. Remote Sens. 2011, 3, 1104–1138. https://doi.org/10.3390/rs3061104.

Yang, S.; Hou, M.; Li, S. Three-Dimensional Point Cloud Semantic Segmentation for Cultural Heritage: A Comprehensive Review. Remote Sens. 2023, 15, 548. https://doi.org/10.3390/rs15030548.

Zhang, C.; Zhang, X.; Lao, M.; Jiang, T.; Xu, X.; Li, W.; Zhang, F.; Chen, L. Deep Learning for Point Cloud Denoising: A Survey. arXiv 2025, arXiv:2508.11932. https://doi.org/10.48550/arXiv.2508.11932.

Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2014; Volume 8693, pp. 740–755, ISBN 978-3-319-10601-4.

Schubert, E.; Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN. ACM Trans. Database Syst. 2017, 42, 19:1–19:21. https://doi.org/10.1145/3068335.

Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 4338–4364. https://doi.org/10.1109/TPAMI.2020.3005434.

Charles, R.Q.; Su, H.; Kaichun, M.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 77–85.

Hu, Q.; Yang, B.; Xie, L.; Rosa, S.; Guo, Y.; Wang, Z.; Trigoni, N.; Markham, A. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11105–11114.

Thomas, H.; Qi, C.R.; Deschaud, J.-E.; Marcotegui, B.; Goulette, F.; Guibas, L. KPConv: Flexible and Deformable Convolution for Point Clouds. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6410–6419.

Zhao, H.; Jiang, L.; Jia, J.; Torr, P.; Koltun, V. Point Transformer. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 16239–16248.

Lai, X.; Liu, J.; Jiang, L.; Wang, L.; Zhao, H.; Liu, S.; Qi, X.; Jia, J. Stratified Transformer for 3D Point Cloud Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 8490–8499.

Choy, C.; Gwak, J.; Savarese, S. 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3070–3079.

Qian, G.; Li, Y.; Peng, H.; Mai, J.; Al Kader Hammoud, H.A.; Elhoseiny, M.; Ghanem, B. PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies. In Proceedings of the 36th International Conference on Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2022; pp. 23192–23204.

Wu, X.; Jiang, L.; Wang, P.-S.; Liu, Z.; Liu, X.; Qiao, Y.; Ouyang, W.; He, T.; Zhao, H. Point Transformer V3: Simpler, Faster, Stronger. arXiv 2023, arXiv:2312.10035. https://doi.org/10.48550/ARXIV.2312.10035.

He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.

Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.

Ultralytics YOLOv8—Documentation. Available online: https://docs.ultralytics.com/models/yolov8/ (accessed on 28 December 2025).

Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Computer Vision—ECCV 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 11211, pp. 833–851, ISBN 978-3-030-01233-5.

Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Computer Vision—ECCV 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2020; Volume 12346, pp. 213–229, ISBN 978-3-030-58451-1.

Cheng, B.; Misra, I.; Schwing, A.G.; Kirillov, A.; Girdhar, R. Masked-Attention Mask Transformer for Universal Image Segmentation. arXiv 2021, arXiv:2112.01527. https://doi.org/10.48550/ARXIV.2112.01527.

Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment Anything. arXiv 2023, arXiv:2304.02643. https://doi.org/10.48550/arXiv.2304.02643.

Qi, C.R.; Liu, W.; Wu, C.; Su, H.; Guibas, L.J. Frustum PointNets for 3D Object Detection from RGB-D Data. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 918–927.

Shin, K.; Kwon, Y.P.; Tomizuka, M. RoarNet: A Robust 3D Object Detection Based on RegiOn Approximation Refinement. In Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France, 9–12 June 2019; pp. 2510–2515.

Vora, S.; Lang, A.H.; Helou, B.; Beijbom, O. PointPainting: Sequential Fusion for 3D Object Detection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4603–4611.

Weinmann, M.; Jutzi, B.; Mallet, C. Geometric Features and Their Relevance for 3D Point Cloud Classification. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, IV-1/W1, 157–164. https://doi.org/10.5194/isprs-annals-IV-1-W1-157-2017.

Sander, J.; Ester, M.; Kriegel, H.-P.; Xu, X. Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications. Data Min. Knowl. Discov. 1998, 2, 169–194. https://doi.org/10.1023/A:1009745219419.

Yang, C.-K.; Chen, M.-H.; Chuang, Y.-Y.; Lin, Y.-Y. 2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision. arXiv 2023, arXiv:2310.12817. https://doi.org/10.48550/ARXIV.2310.12817.

Peng, S.; Genova, K.; Jiang, C. “Max”; Tagliasacchi, A.; Pollefeys, M.; Funkhouser, T. OpenScene: 3D Scene Understanding with Open Vocabularies. arXiv 2022, arXiv:2211.15654. https://doi.org/10.48550/ARXIV.2211.15654.

Yue, H.; Wang, Q.; Zhang, M.; Xue, Y.; Lu, L. 2D–3D Fusion Approach for Improved Point Cloud Segmentation. Autom. Constr. 2025, 177, 106336. https://doi.org/10.1016/j.autcon.2025.106336.

Habibiroudkenar, P.; Ojala, R.; Tammi, K. DynaHull: Density-Centric Dynamic Point Filtering in Point Clouds. J. Intell. Robot. Syst. 2024, 110, 165. https://doi.org/10.1007/s10846-024-02203-2.

Duberg, D.; Zhang, Q.; Jia, M.; Jensfelt, P. DUFOMap: Efficient Dynamic Awareness Mapping. IEEE Robot. Autom. Lett. 2024, 9, 5038–5045. https://doi.org/10.1109/LRA.2024.3387658.

Jia, M.; Zhang, Q.; Yang, B.; Wu, J.; Liu, M.; Jensfelt, P. BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps. IEEE Robot. Autom. Lett. 2024, 9, 6256–6263. https://doi.org/10.1109/LRA.2024.3402625.

Fischler, M.A.; Bolles, R.C. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Commun. ACM 1981, 24, 381–395. https://doi.org/10.1145/358669.358692.

Pauly, M.; Keiser, R.; Kobbelt, L.P.; Gross, M. Shape Modeling with Point-Sampled Geometry. ACM Trans. Graph. 2003, 22, 641–650. https://doi.org/10.1145/882262.882319.

Sun, P.; Kretzschmar, H.; Dotiwalla, X.; Chouard, A.; Patnaik, V.; Tsui, P.; Guo, J.; Zhou, Y.; Chai, Y.; Caine, B.; et al. Scalability in Perception for Autonomous Driving: Waymo Open Dataset. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 2443–2451.

Liu, Y.; Wang, X.; Hu, E.; Wang, A.; Shiri, B.; Lin, W. VNDHR: Variational Single Nighttime Image Dehazing for Enhancing Visibility in Intelligent Transportation Systems via Hybrid Regularization. IEEE Trans. Intell. Transp. Syst. 2025, 26, 10189–10203. https://doi.org/10.1109/TITS.2025.3550267.

Li, Q.; Du, Q.; Tian, L.; Liao, W.; Lu, G. Enhanced Semantic Segmentation of LiDAR Point Clouds Using Projection-Based Deep Learning Networks. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–15. https://doi.org/10.1109/TGRS.2025.3627917.


Submitted: 2026-05-04 17:54:52 UTC

Published: 2026-05-14 01:06:39 UTC
Research field
Information science