Fusion Algorithm of Face Detection and Head Pose Estimation Based on YOLOv3 Model
李永杰; 周桂红; 刘博
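Abstract:
Head pose estimation faces two problems: the size of the face detection box is hard to learn, and models that split face detection and head pose estimation into two stages suffer from complex pipelines, tight coupling, and severe error accumulation. To address both, this paper proposes a fusion algorithm for face detection and head pose estimation based on the YOLOv3 model. K-means clustering is applied to the sizes of the face regions in the training set, yielding 9 clusters that approximate the sizes and aspect ratios of face regions under real conditions. The YOLOv3 model is extended so that face detection and head pose estimation run simultaneously on feature maps at 3 different levels, which gives multi-scale detection and makes full use of the information in the feature maps. The model is trained end to end, which simplifies the processing pipeline of the head pose estimation task. The method achieves a prediction accuracy of 99.23% on the pose subset of CAS-PEAL-R1, and mean absolute errors of 3.79° and 4.24° in the pitch and yaw directions, respectively, on the Pointing'04 dataset. The results show that, while meeting real-time requirements, the model performs well on both face region detection and head pose estimation, which supports the reliability and practicality of the proposed method.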
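The anchor-clustering step described in the abstract can be illustrated with a minimal sketch. The abstract only states that K-means is run on face-region sizes to obtain 9 clusters; the distance used here (1 - IoU between width/height pairs, a common choice for YOLO-style anchors), the function names `iou_wh` and `kmeans_anchors`, the synthetic box sizes, and the split into three anchors per scale are all assumptions for illustration, not the authors' implementation.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(sizes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchor sizes, sorted by area.

    Assumption: distance is 1 - IoU; the abstract does not name the metric.
    """
    rng = random.Random(seed)
    centroids = rng.sample(sizes, k)
    for _ in range(iters):
        # Assign each box to the centroid with the highest IoU.
        groups = [[] for _ in range(k)]
        for s in sizes:
            best = max(range(k), key=lambda j: iou_wh(s, centroids[j]))
            groups[best].append(s)
        # Move each centroid to the mean size of its group; keep empty ones.
        updated = []
        for j, g in enumerate(groups):
            if g:
                mean_w = sum(w for w, _ in g) / len(g)
                mean_h = sum(h for _, h in g) / len(g)
                updated.append((mean_w, mean_h))
            else:
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])

# Hypothetical face-box sizes in pixels (synthetic, not the paper's data).
data_rng = random.Random(1)
sizes = [(data_rng.uniform(20, 300), data_rng.uniform(20, 300)) for _ in range(500)]

anchors = kmeans_anchors(sizes, k=9)
# Assumed layout: three anchors for each of the 3 feature-map levels.
scales = [anchors[0:3], anchors[3:6], anchors[6:9]]
```

Sorting by area makes it easy to hand the smallest anchors to the finest feature map and the largest to the coarsest, which is how 9 clusters would map onto 3 detection levels under this assumed layout.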
Keywords: head pose estimation; YOLOv3 model; K-means; multi-scale detection; deep learning
Foundation: National Natural Science Foundation of China (61972132)
DOI: 10.16088/j.issn.1001-6600.2021070911