Real-Time Human Pose Detection About 3D Shape Estimation For Virtual Fitting of Cloths

Rahman, Md. Hafizur, ID: 10-17953-3
American International University, Bangladesh

Abstract. The main objective of this research is to generate an accurate, dynamic 3D shape of a human subject in real time without processing huge data sets. The main obstacles are missing data and pose ambiguities. The goal is therefore to find an efficient way to acquire the body shape and detect the posture of a human subject. The subject's pose and shape are scanned with multiple Kinect cameras, and the combined data is used to reconstruct the body shape and to provide an interactive, real-time solution for virtual fitting of clothing.

Keywords: Human pose detection, Shape estimation, human shape modeling, virtual try-on, Virtual Fitting, Kinect, Reconstruct, Real time

1 Introduction

Virtual try-out is a system in which shoppers try on clothes to check size, fit, or style virtually rather than physically. Trying on clothes in stores is one of the most time-consuming parts of shopping. In a typical clothing store, you pick a garment and try it on [1,2]. Someone else may have tried it on just before you, and thinking about that is uncomfortable. Likewise, when you buy from an e-commerce site, you can never be sure whether the item will fit. Some virtual try-out solutions already exist, but they have limitations. In the proposed solution, multiple Kinect cameras scan the subject; technically speaking, the fitting room is based on the Microsoft Kinect. The skeleton is extracted from the Kinect depth image, and the position and orientation of the garment are adapted to the joint positions and body measurements.
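As a rough illustration of how a garment's position and orientation might be adapted to joint positions, the sketch below places a 2D garment image using only the two shoulder joints. The joint coordinates, the function name `place_garment`, the `Placement` type, and the `garment_shoulder_width` reference value are hypothetical and are not taken from the Kinect SDK; a real system would use the full skeleton and the body measurements.

```python
import math
from dataclasses import dataclass


@dataclass
class Placement:
    center: tuple      # (x, y) anchor point for the garment, in image pixels
    angle_deg: float   # in-plane rotation of the garment
    scale: float       # uniform scale factor applied to the garment image


def place_garment(left_shoulder, right_shoulder, garment_shoulder_width):
    """Derive garment position, rotation, and scale from two shoulder joints.

    Joints are (x, y) pixel coordinates (a hypothetical input format).
    garment_shoulder_width is the shoulder width of the unscaled garment
    image in pixels (an assumed property of the garment asset).
    """
    lx, ly = left_shoulder
    rx, ry = right_shoulder
    dx, dy = rx - lx, ry - ly
    width = math.hypot(dx, dy)
    if width == 0 or garment_shoulder_width <= 0:
        raise ValueError("shoulder joints must differ and width must be positive")
    # Anchor the garment at the midpoint between the shoulders.
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)
    # Rotate the garment to follow the shoulder line.
    angle_deg = math.degrees(math.atan2(dy, dx))
    # Scale the garment so its shoulders span the detected shoulder width.
    return Placement(center, angle_deg, width / garment_shoulder_width)


if __name__ == "__main__":
    print(place_garment((200, 150), (300, 150), garment_shoulder_width=50))
```

Only the shoulder line is used here to keep the example short; the same idea extends to other joint pairs, for example hips for trousers.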
Finally, the subject is projected on a screen in 3D, and the subject's motion can be synchronized in real time [3–9].

2 Literature review

Research on human pose and shape estimation has been ongoing since 2001. Work has been done on estimating human body configuration using shape context matching and on locating body joints [16]. Human pose estimation based on silhouette shape analysis addresses the problem of identifying pose when multiple subjects are present, and it also reduces dependency on external data sources [15]. Another study [14] uses three cameras and automatically models a virtual human subject. Capturing body motion using visual tags has also been attempted [13]. SCAPE [10], a promising data-driven method for building human models, was introduced in 2005 and was later used in various studies [?, 7, 10]. A study from 2008 shows that gender classification can be performed with an accuracy of 90.6%. Another study [24] shows that human upper-body pose can be estimated from depth sequences by first performing coarse body-part labeling and then, in a second phase, more precise joint position estimation [11]. The problem in this case is that the user has to take a specific, unnatural pose to calibrate the model. Other work [9] requires that the human body be segmented from the image, or relies on background subtraction that assumes a fixed camera and a static background. The latest research shows promising results in using the Kinect camera to obtain human pose and shape in three dimensions [2, 12–16]. In addition, another study determines body joints from single depth images in real time [14, 17].

3 About the Kinect-Camera

The Kinect camera has had a big influence not only on gaming but on the whole IT environment. James L.
McQuivey, Vice President, Principal Analyst, Forrester, describes it as follows: “Kinect is to the next decade what the operating system was to the 1980s, what the mouse was to the 1990s, and what the Internet has been ever since. It is the thing that will change everything” [3]. Although this statement is non-scientific and tries to predict the future, it reflects the excitement about this device. The Kinect camera was developed by Microsoft in cooperation with PrimeSense. The Kinect itself consists of three different systems that work together on PrimeSense's Carmine chip (PS1080), as described in [3].

4 System Setup

Here the Microsoft Kinect camera [1] was used to scan the human body. The Kinect camera has a viewing angle of 43° in the vertical direction and 57° in the horizontal direction. The camera setup is shown in the image above. The first part of the image is the vertical view of the camera setup, and the second part shows why two cameras are needed to get the full image of the subject.

5 Human Pose detection and 3D Shape Estimation

Using multiple cameras, a complete 3D model of a human subject can be built [3, 10, 18]. Corrections can then be made using per-pixel classification [20] and graph-based real-time pose estimation [19]. The error caused by casual clothing can be reduced using silhouette-based shape analysis [5]. Finally, the data can be used to generate a model of the subject using [6, 17, 20].
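The viewing angles quoted in Section 4 constrain how much of the subject a camera sees at a given distance. The sketch below uses a simple pinhole model to estimate the visible extent and the minimum camera distance for a subject of a given height. The subject height, the even and non-overlapping split between vertically stacked cameras, and the function names are assumptions for illustration, not measurements from this setup.

```python
import math

VERTICAL_FOV_DEG = 43.0    # Kinect vertical viewing angle (Section 4)
HORIZONTAL_FOV_DEG = 57.0  # Kinect horizontal viewing angle (Section 4)


def visible_extent(distance_m, fov_deg):
    """Length covered at a given distance by a camera with the given FOV."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)


def min_distance(subject_height_m, fov_deg=VERTICAL_FOV_DEG, cameras=1):
    """Smallest distance at which stacked cameras together span the subject.

    Assumes each camera covers an equal, non-overlapping slice of the
    subject's height (an idealization; real setups need some overlap).
    """
    if cameras < 1:
        raise ValueError("need at least one camera")
    slice_height = subject_height_m / cameras
    return slice_height / (2.0 * math.tan(math.radians(fov_deg) / 2.0))


if __name__ == "__main__":
    height = 1.75  # assumed subject height in metres
    for n in (1, 2):
        d = min_distance(height, cameras=n)
        print(f"{n} camera(s): at least {d:.2f} m ({d / 0.3048:.1f} ft)")
```

Under these assumptions a two-camera stack needs half the distance of a single camera, about 1.1 m for a 1.75 m subject. That is close to the 3.6 feet quoted under Limitations, although the paper does not state how that figure was derived.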
First, an image of the user is captured by the Kinect camera. The body is then masked to obtain its boundary. After that, I use the pose tracking pipeline algorithm [3] to determine the person's pose, from which I can obtain the 3D shape of the body. This body shape is matched directly against the dress size and dress choice entered by the person. The chosen dress is looked up in an image database created in advance by the administrator. If the dress size matches the person's body shape, the dress is shown on the display for virtual try-out; otherwise, the system reports that the dress is unavailable.

6 Limitations

Human body shape estimation is often ambiguous; this ambiguity can be reduced by using data sets. Silhouette-based shape analysis can provide the shape of casually clothed subjects, but it still needs development. In addition, because of the Kinect's narrow viewing angle, each camera must be at least 3.6 feet away from the subject [8, 9]. The distance could be reduced by developing a better camera. Finally, the physics of real life, such as smooth motion, gravity, movement of cloth in wind, stretching of elastic, and smooth folding, has not yet been modeled. A physics engine could solve this problem [?, 21].

7 Conclusion

This research is based on assumptions and theoretical knowledge. All claims in this paper need to be verified by experiment. The research suggests that a fully functional virtual try-out system can be developed if the present limitations are overcome. With the proposed approach, a better solution can be achieved. The next steps are to test the theory experimentally, develop the clothing system for the model, and build the complete virtual fitting system.

Acknowledgments

I would like to thank my teacher, Dr. Tabin Hasan, for his guidance and full cooperation in this research.

References

1. P. Decaudin, D. Julius, J. Wither, L. Boissieux, A.
Sheffer, and M.-P. Cani, “Virtual Garments: A Fully Geometric Approach for Clothing Design,” Computer Graphics Forum, vol. 25, no. 3, pp. 625–634, Sep. 2006. [Online]. Available: http://doi.wiley.com/10.1111/j.1467-8659.2006.00982.x
2. P. Presle, “A Virtual Dressing Room based on Depth Data,” vol. 2012, 2012.
3. M. Ye, H. Wang, and N. Deng, “Real-Time Human Pose and Shape Estimation for Virtual Try-On Using a Single Commodity Depth Camera,” no. January, 2014.
4. M. Ye and R. Yang, “Real-Time Simultaneous Pose and Shape Estimation for Articulated Objects Using a Single Depth Camera,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2353–2360, Jun. 2014. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6909698
5. J. Ren, M. Rahman, N. Kehtarnavaz, and L. Estevez, “Real-Time Head Pose Estimation on Mobile Platforms.”
6. G. Mori and J. Malik, “Estimating Human Body Configurations using Shape Context Matching,” pp. 1–8.
7. V. Gulshan, V. Lempitsky, and A. Zisserman, “Humanising GrabCut: Learning to segment humans using the Kinect,” 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 1127–1133, Nov. 2011. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6130376
8. J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake, “Real-time human pose recognition in parts from single depth images,” CVPR 2011, pp. 1297–1304, Jun. 2011. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5995316
9. M. V. D. Bergh, E. Koller-Meier, and R. Kehl, “Real-Time 3D Body Pose Estimation,” pp. 1–28, 2009.
10. D. Anguelov, S. Thrun, J. Rodgers, and J. Davis, “SCAPE: Shape Completion and Animation of People,” pp. 408–416.
11. S. Z. Masood, C. Ellis, M. F. Tappen, and J. J. L. Jr, “Measuring and Reducing Observational Latency when Recognizing Actions.”
12. S. Izadi, D. Kim, O. Hilliges, D.
Molyneaux, R. Newcombe, P. Kohli, J. Shotton, S. Hodges, D. Freeman, A. Davison, and A. Fitzgibbon, “KinectFusion: Real-time 3D Reconstruction and Interaction Using a Moving Depth Camera.”
13. K. Jahrmann, “3D Reconstruction with the,” vol. 2013, 2013.
14. K. Kjærside, K. J. Kortbek, and H. Hedegaard, “ARDressCode: Augmented Dressing Room with Tag-based Motion Tracking and Real-Time Clothes Simulation,” 2005.
15. N. Magnenat-Thalmann, H. Seo, and F. Cordier, “Automatic modeling of virtual humans and body clothing,” Journal of Computer Science and Technology, vol. 19, no. 5, pp. 575–584, Sep. 2004. [Online]. Available: http://link.springer.com/10.1007/BF02945583
16. A. Mittal and L. Davis, “Human body pose estimation using silhouette shape analysis,” Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, 2003, pp. 263–270. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1217930
17. M. V. Steinkirch, “Introduction to the Microsoft Kinect for Computational Photography and Vision,” pp. 1–4, 2013.
18. J. Tong, J. Zhou, L. Liu, Z. Pan, and H. Yan, “Scanning 3D full human bodies using Kinects,” IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 4, pp. 643–50, Apr. 2012. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/22402692
19. W. R. Schwartz, “Human Detection Based on Large Feature Sets Using Graphics Processing Units,” vol. 35, pp. 473–479, 2011.
20. G. Mori and J. Malik, “Recovering 3D Human Body Configurations Using Shape Contexts,” vol. 28, no. 7, pp. 1052–1062, 2006.
21. D. Droeschel and S. Behnke, “3D Body Pose Estimation using an Adaptive Person Model for Articulated ICP,” no. December, 2011.