1. Introduction
In recent decades, light detection and ranging (LiDAR) has been used to obtain three-dimensional (3D) geometric representations of indoor environments in the architectural, engineering, and construction (AEC) industry [1,2,3]. A LiDAR-based geometric representation describes the shape of indoor building components in the form of point cloud data (PCD) [4,5]. Given the high speed and accuracy of LiDAR, methods for deriving information on building components (e.g., structural members, non-structural members, and furniture) from indoor PCD have been developed to make facility management more effective [6,7,8]. This information includes the location and size of mechanical, electrical, and plumbing components [9,10,11], building components [12,13,14], and furniture in 3D space [15,16,17]. In an indoor environment, a 3D scanning method that captures PCD of as many objects as possible is selected to minimize the occlusion caused by numerous objects [18,19]. The mobile laser scanner (MLS) has been widely used for scanning indoor environments because moving the LiDAR minimizes this interference.
The existing methods for deriving information on building components from indoor PCD detect and segment the components based on the trajectory information of the MLS. Considering the non-rigid shape of the indoor environment, the existence of openings (e.g., doors and windows), together with the MLS trajectory, is used to separate adjacent rooms properly [20,21,22,23]. When deriving building components, properly separating rooms decreases the error in segmenting the PCD of inner walls [4,24,25]. Because adjacent rooms share an inner wall that is thinner than the vertical structural members, the existing methods use the LiDAR-oriented normal vector to minimize this error. Normal vector-based segmentation divides an under-segmented inner wall into two clusters, each belonging to a different room [26].
However, the existing methods are difficult to apply in environments that restrict the acquisition of MLS trajectory information. In particular, indoor environments contain building components made of concrete, which makes it hard to use navigation systems that provide a real-time trajectory. Thus, the trajectory of an MLS inside a building is estimated by back-calculating it from the moving direction and distance obtained by the inertial measurement unit [27,28,29]. This approximate trajectory causes errors in detecting building components [30,31]. As the number of scanning positions increases, the error accumulated in the PCD makes it difficult to detect building components with the existing methods, which are suited to high-quality PCD; this limits their application to PCD captured without additional sensors [32,33,34].
In this regard, this study proposes a building component detection algorithm that does not need MLS trajectory information; it applies random sample consensus (RANSAC)-based region growing to low-quality indoor PCD captured through an MLS. This study is organized as follows. In Section 2, a literature review is conducted to derive the limitations of existing object detection methods for building components in indoor PCD. In Section 3, a building component detection algorithm that addresses these limitations is proposed. In Section 4, a case study is conducted to verify the algorithm, and in Section 5, the proposed method is verified by analyzing the results of the case study. In Section 6, our conclusions are presented.
2. Related Work
2.1. Object Detection on Indoor PCD
Object detection on indoor PCD is conducted by localizing objects based on their feature descriptors (e.g., planar, cylindrical, and conical shapes) and then segmenting the clutter that belongs to the corresponding descriptor. This detection is used in many fields, especially the AEC industry, to recognize obstacles (e.g., building components, doors, furniture, and stairs). For example, path optimization for pedestrians and the assessment of structural members are conducted using recognized objects of indoor environments [35,36]. To accurately detect objects from indoor PCD that includes various objects, highly accurate PCD is needed because it is difficult to detect closely spaced clutters such as an inner wall shared by adjacent rooms [21]. Furthermore, the density of points on the detected object must be uniform, because the object is segmented by comparing it with a model of constant uniformity to ensure the accuracy of the geometric representation [37]. However, indoor PCD acquired through an MLS has low accuracy and uniformity because of the LiDAR movement, so the PCD must be pre-processed (e.g., segmentation, rotation, translation, and labeling) to suit object detection [38]. Accordingly, this study analyzes the limitations of existing object detection caused by the low accuracy and uniformity of MLS data, and derives why applying existing object detection methods is difficult. Finally, this study presents a measure to solve these difficulties.
2.2. Limitation on the Accuracy of MLS
LiDAR captures PCD by converting the flight time of light from the laser to the object surface into a distance (Figure 1a). In indoor 3D scanning, the PCDs captured in one scan are aligned according to the LiDAR's origin (Figure 1b). The aligned indoor PCD captured by an MLS contains an error because the movement of the origin is ignored (Figure 1c) [39]. Furthermore, continuous 3D scanning of a large-scale environment accumulates this error in the integrated PCD.
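As a worked form of the time-of-flight conversion described above, the range can be written in terms of the travel time of the laser pulse. The symbols d (range to the surface), c (speed of light), and Δt (measured travel time) are introduced here for illustration only, and the expression assumes that Δt covers the round trip from the sensor to the surface and back:

```latex
d = \frac{c \, \Delta t}{2}
```

Under this assumption, a round-trip time of about 66.7 ns corresponds to a range of roughly 10 m.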
The accumulated error gives planar objects an apparent thickness, which decreases the accuracy of object detection that compares clutters with predefined models. Because these models require uniformly distributed points on their surfaces, the thickness of the object lowers the detection accuracy. The existing methods transform the raw PCD into a format suitable for object detection to ensure accuracy [40]. However, it is hard to provide a proper format for PCD that has a non-rigid shape, such as that of an indoor environment. Thus, this study reviews the literature on the Manhattan-world assumption to transform raw indoor PCD into a format suitable for object detection. The literature on RANSAC is also surveyed to address the accuracy limitation of an MLS caused by clutter with irregular thickness.
2.2.1. Manhattan-World Assumption
The Manhattan-world assumption is a hypothesis that defines the locational relationships between objects in a building [41]. Applied to indoor PCD, the assumption is that indoor building components are mostly composed of horizontal components (i.e., floors and ceilings) and vertical components (i.e., walls and columns) [42]. This assumption enables effective analysis because it uses this topological feature to simplify PCD acquired in the AEC industry, which is larger than conventional data from small objects (e.g., rabbits and statues).
However, the Manhattan-world assumption can classify building components in indoor PCD, based on the angle between the normal vector of each component and the ground, only after an object has been recognized relative to the ground in the PCD. In other words, the Manhattan-world assumption is difficult to use for object detection on indoor PCD that has only geometric data. Thus, this study segments points into components by aligning the components with the coordinate planes (i.e., the XY-, XZ-, or YZ-plane) based on the normal vector. In this way, the raw indoor PCD is transformed into a suitable data format using the Manhattan-world assumption.
2.2.2. RANSAC
RANSAC is a method for detecting unstructured clutters in PCD by comparing the clutters with a model given in advance [43]. It selects, out of all points in the PCD, the points that are consistent with the model's shape as the RANSAC result, and all unselected points are classified as outliers. To apply RANSAC, the number of points (N) and the shape of the model for comparison (e.g., lines, curves, planar surfaces, and curved surfaces) are set. RANSAC then repeatedly changes the location of the model and derives the placement with the least gross distance error between the points of the clutter and the model.
The RANSAC result is similar to that of regression. The difference is that regression finds the optimal line for all points in the data, whereas RANSAC sets N for the detection of objects and removes unnecessary outliers to optimize the result (Figure 2). In particular, if more than two objects exist in the PCD, a threshold of N is set to ensure the efficiency and accuracy of object detection. Because RANSAC derives the points with the least gross error based on the feature descriptors of the model regardless of the PCD density, it is highly applicable when the type and number of objects are limited [44,45,46]. For example, the building components in indoor PCD are limited to simple shapes such as planes, cylinders, and boxes, which makes RANSAC highly applicable [47,48].
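To make the procedure described above concrete, the following is a minimal sketch of RANSAC plane fitting in plain Python. It is not the implementation used in this study. It assumes the common formulation in which a candidate plane is built from three randomly sampled points and scored by the number of points within a distance tolerance; the function names and the parameters `dist_tol`, `iterations`, and `seed` are hypothetical.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal, offset d) of the plane through p, q, r, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, dist_tol=0.02, iterations=200, seed=0):
    """Return (best plane model, indices of its inliers) for a list of 3D points.

    dist_tol, iterations and seed are illustrative values, not values from the study.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate sample, draw again
        n, d = model
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(n[0] * x + n[1] * y + n[2] * z + d) <= dist_tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Points whose indices are not returned in the inlier list play the role of the outliers mentioned in the text.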
However, when more than two objects exist and RANSAC is applied to the whole PCD, the number of objects that satisfy the N set in advance for RANSAC-based detection can be larger than the actual number of objects. In particular, the curved part where planes connect causes points located in that part to be detected as wrong objects. Figure 3a,b show examples of objects in indoor PCD to which RANSAC was applied; gray points are raw PCD, and blue points are points segmented by RANSAC. Figure 3a shows PCD that contains one object. Although the actual object is recognized by RANSAC based on the preset N, many points are classified as outliers. Figure 3b shows fewer segmented blue points than Figure 3a because the objects share a portion at the edge [49]. To solve this problem, this study applies RANSAC to segmented PCD, which is separated into the planes of the coordinate space based on the Manhattan-world assumption, to minimize the error of building component detection.
2.3. Limitation on the Uniformity of MLS
The performance of a LiDAR is determined by the number of lasers inside it. As the number of lasers increases, the scannable angle in the vertical direction (see Figure 4a) and the scannable area (Figure 4b) increase. However, some parts of the object surfaces remain unscanned, as shown in Figure 4b, and the number of points may differ even between planar surfaces of the same area. In addition, the spacing between points increases as the distance between the LiDAR and the object increases, which raises the ratio of unscanned to scanned surface. Thus, the density of points on the object surface varies in an environment where the distance between the LiDAR and the object changes with the LiDAR location.
This characteristic degrades the uniformity of indoor PCD acquired through an MLS, reducing the accuracy of object detection that uses feature descriptors suited to PCD of constant density [50]. To mitigate this problem, 3D scanning is conducted while moving the scanner along the x-, y-, and z-axes. However, it is difficult to acquire points on the surfaces of all objects at the same density. Thus, this study reviews the literature on space decomposition and region growing to address the problem of uneven PCD.
2.3.1. Space Decomposition
Space decomposition simplifies 3D PCD (w) into boxes of regular size (Figure 5b), in which each box that contains points is called a voxel (Figure 5c). PCD acquired from large objects such as indoor environments has numerous points and requires excessive processing time if object detection is conducted on all points [51,52]. Because PCD with many planar objects (e.g., walls, floors, and ceilings), as is the case for indoor environments, loses little accuracy through space decomposition, voxels are mainly used in object detection [53,54,55]. However, if voxels are produced only according to whether points exist, a voxel with many points and a voxel created by an outlier are treated as equivalent. For example, space decomposition on indoor PCD, where outliers frequently occur because of openings (e.g., windows, doors, and mirrors), degrades the accuracy of object detection [56]. Thus, this study aims to minimize the effect of outliers by processing the data of each voxel created through space decomposition individually.
2.3.2. Region Growing
Region growing aggregates points into larger regions to detect planar objects in the PCD. As shown in Figure 6, points are connected based on preset criteria (e.g., distance and normal vector), starting from seed points or seed regions in the PCD [56]. Region growing in 2D PCD merges points that lie closer than a preset distance to points belonging to the seed region [57]. In contrast, region growing in 3D PCD is conducted based on both the distance and the normal vector of each point because many object shapes in 3D PCD are not planes but rather 3D curves [58,59]. This approach can be applied whenever the objects' surfaces are smooth and the distance between points does not exceed the distance threshold.
However, because points that share the same object surface are merged based on the normal vector, accurate normal vectors are required for individual points. In particular, because an MLS cannot provide accurate normal estimation when there is no trajectory information, it is difficult to use region growing based on normal vectors [60,61]. Accordingly, the present study aims to minimize the loss of object detection accuracy by applying region growing directly to voxels.
3. Proposed Algorithm
This study reviewed the literature on object detection in indoor PCD acquired through an MLS and confirmed that existing methods are difficult to apply to such PCD. Moreover, alternatives for indoor PCD were derived through a literature review of MLS trajectory-based building component detection in order to propose an algorithm suitable for indoor PCD for which trajectory information is not available. The algorithm in this study applies RANSAC to voxels obtained by space decomposition of the indoor PCD to remove the thickness of planar objects. In addition, region growing is applied to detect building components after the lack of uniformity has been resolved by RANSAC.
3.1. Overview
The algorithm has four steps: pre-processing, seed region generation, region growing, and building component detection. Each step processes indoor PCD captured through an MLS. The pre-processing step has the following substeps: alignment of the raw PCD with the coordinate planes, normal estimation based on the k-nearest neighbor (k-NN) algorithm, PCD segmentation based on the normal vectors of points, and voxel generation by applying space decomposition to the PCD. In the region growing step, component candidates are formed from connected regions by checking the connectivity between the seed region and the surrounding voxels. In the pre-processing and region growing steps, outliers are detected by applying RANSAC to voxels based on the input number of points. Finally, in the building component detection step, outliers are removed based on the number of points, and building components are detected from the normal vectors of the outlier-free regions.
3.2. Pre-Processing
As described, the pre-processing step has four substeps: alignment, normal estimation, PCD segmentation, and space decomposition. Raw indoor PCD captured through an MLS without additional sensors that provide the location of the MLS needs manual alignment with the coordinate planes (i.e., the XY-, YZ-, and XZ-planes). In this study, the raw PCD is rotated by manual alignment in CloudCompare to maximize the orthogonality of the indoor environment as defined in the Manhattan-world assumption: the indoor environment has planar objects that are perpendicular or parallel to one another [41]. In the normal estimation, the normal vectors of points are calculated based on the k-NN algorithm. Normal estimation based on the k-NN algorithm derives the normal vector as the mean of the cross products computed from the k nearest points, which are assumed to lie on a planar surface [62]. In fact, the planar surface derived from k-NN differs from the real object. However, the PCD used in this study has no LiDAR trajectory; thus, the normal vector indirectly estimated by the k-NN algorithm is used. To create the planar surface to which a clutter belongs, the k points closest to the target point are grouped, and a planar surface is derived whose normal vector is the mean of the cross products among the target point and the k points. After this, the PCD is segmented based on the normal vector of each point. Please note that the components of the normal vector calculated through k-NN have many decimal places; thus, the normal vector is rounded to two decimal places to identify its tendency. This exploits the characteristic that most points in indoor PCD belong to a building component, as shown in Figure 7 [42]. Candidates of possible building components are segmented by the normal vector, as shown in Figure 7a–c, using the fact that indoor building components are either parallel or perpendicular to each other under the Manhattan-world assumption [41]. The PCDs in Figure 7a–c consist of points whose normal vectors are perpendicular to the YZ-plane, XZ-plane, and XY-plane, respectively. This process is applied to minimize the error when applying RANSAC to many building components.
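The normal estimation and normal-based segmentation described above can be sketched as follows. This is an illustrative reading of the text, not the MATLAB code of the study. It uses a brute-force neighbor search, averages the cross products of the vectors from the target point to its k nearest neighbors, and rounds the result to two decimal places. The sign-alignment step and the `min_abs` threshold used to assign a normal to a coordinate axis are assumptions introduced here.

```python
from math import dist, sqrt


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def estimate_normal(points, idx, k):
    """Estimate the normal at points[idx] from its k nearest neighbours.

    Returns a unit vector rounded to two decimals, or None if all
    neighbours are collinear with the target point.
    """
    target = points[idx]
    # Brute-force k-NN; a spatial index would be needed for real scans.
    others = sorted((p for j, p in enumerate(points) if j != idx),
                    key=lambda p: dist(target, p))[:k]
    edges = [_sub(p, target) for p in others]
    acc = (0.0, 0.0, 0.0)
    for a in range(len(edges)):
        for b in range(a + 1, len(edges)):
            c = _cross(edges[a], edges[b])
            # Assumption: flip each cross product into the half-space of
            # the running sum so that opposite signs do not cancel out.
            if _dot(c, acc) < 0:
                c = (-c[0], -c[1], -c[2])
            acc = (acc[0] + c[0], acc[1] + c[1], acc[2] + c[2])
    norm = sqrt(_dot(acc, acc))
    if norm == 0.0:
        return None
    # The normalised sum has the same direction as the mean cross product.
    return tuple(round(x / norm, 2) for x in acc)


def dominant_axis(normal, min_abs=0.9):
    """Return 'x', 'y' or 'z' when the normal is nearly parallel to that axis, else None."""
    comps = [abs(c) for c in normal]
    i = comps.index(max(comps))
    return "xyz"[i] if comps[i] >= min_abs else None
```

In this sketch, points labeled 'x', 'y', and 'z' would correspond to normals perpendicular to the YZ-, XZ-, and XY-planes, respectively; points that receive no label are left out of the three segments.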
After the raw PCD has been segmented based on the normal vector, space decomposition is conducted to build the voxels to which RANSAC will be applied. This process creates voxels that include the minimum number of points required for RANSAC, which prevents RANSAC from being applied to outliers that are included in objects. The variables required for space decomposition in this study are the edge length of a voxel (E in Figure 8a) and the minimum number of points inside a voxel used to distinguish outliers (N in Figure 8b). The edge length of the voxel determines the rate of PCD simplification: the longer the edge, the simpler the PCD. However, a long edge may cause indoor features to be lost. Thus, this study sets the edge length between the minimum and maximum values of the LiDAR error. In addition, the minimum number of points inside a voxel is set to distinguish between points on the planar surface of the actual object and outliers. The number is set to three in this study, which is the minimum number of points that can define a planar surface. If the number of points selected through RANSAC based on the angle for filtering outliers, shown as the selected points in Figure 8b, is more than three, the voxel is recognized as a region.
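A minimal sketch of the space decomposition step is given below. The parameter `edge` stands for the edge length E and `min_points` for the minimum number of points N; the function names are hypothetical. Because the text does not make clear whether the comparison with N is strict, the sketch assumes that a voxel with at least `min_points` points is kept.

```python
from collections import defaultdict
from math import floor


def voxelize(points, edge):
    """Group 3D points into cubic voxels of side `edge`, keyed by integer (i, j, k) indices."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / edge), floor(y / edge), floor(z / edge))
        voxels[key].append((x, y, z))
    return dict(voxels)


def split_by_count(voxels, min_points=3):
    """Separate voxels holding enough points for a plane from sparsely filled ones."""
    dense = {k: v for k, v in voxels.items() if len(v) >= min_points}
    sparse = {k: v for k, v in voxels.items() if len(v) < min_points}
    return dense, sparse
```

Keeping the point list per voxel, rather than only an occupancy flag, is what allows a voxel created by a single outlier to be told apart from a voxel that lies on an object surface.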
3.3. Seed Region Generation
Seed region generation is the process of selecting a voxel without outliers and assigning the selected points in Figure 8b to a seed region. A voxel without outliers is a voxel whose N is greater than three. This process filters out points whose normal vectors differ greatly from the normal vector of the plane in which the voxel lies (Figure 8b). Based on the parameter for the normal vector difference, the algorithm calculates the included angle between the mean normal vector of the points in the voxel and the normal vector of each point, and removes the points whose angle is larger than the parameter. For comparison, model fitting was conducted on all points, including outliers (Figure 9a). The result obtained by the voxel-based RANSAC proposed in this study was similar to the surface of the object in which the voxel was present, as shown in Figure 9c, in contrast with the previous RANSAC, which produced an unsuitable plane fitting, as shown in Figure 9b. The seed regions of the raw PCD were derived by applying the voxel-based RANSAC, and outliers were removed to improve the accuracy of region growing.
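The angular filtering inside a voxel can be sketched as follows. The function names are hypothetical, and the default values (an angle of two degrees and a minimum of three points) are taken from the parameters reported for the case study. The sketch assumes that the per-point normals are consistently oriented, so that their mean is not a zero vector, and that a voxel with at least `min_points` remaining points becomes a seed region.

```python
from math import acos, degrees, sqrt


def _unit(v):
    norm = sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    return (v[0] / norm, v[1] / norm, v[2] / norm)


def angle_deg(u, v):
    """Included angle between two non-zero vectors, in degrees."""
    u, v = _unit(u), _unit(v)
    cos_a = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    # Clamp to [-1, 1] to guard against rounding just outside the domain of acos.
    return degrees(acos(max(-1.0, min(1.0, cos_a))))


def filter_voxel(points, normals, max_angle_deg=2.0, min_points=3):
    """Keep points whose normal lies within max_angle_deg of the voxel's mean normal.

    Returns the kept points, or an empty list if too few remain for a seed region.
    """
    if not normals:
        return []
    mean = tuple(sum(n[i] for n in normals) / len(normals) for i in range(3))
    kept = [p for p, n in zip(points, normals)
            if angle_deg(n, mean) <= max_angle_deg]
    return kept if len(kept) >= min_points else []
```

Note that a strongly deviating normal also shifts the mean itself, so with a tight angle a single severe outlier can cause the whole voxel to be rejected in this simple form.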
3.4. Region Growing
To apply region growing to the seed regions derived by RANSAC, this study checks the connectivity between a seed region and its n adjacent regions, as shown in Figure 10a. However, regions share no points because of space decomposition. Thus, this study conducts connectivity checking on the 26 regions adjacent to the seed region (the red region in Figure 10b). Twenty-six (the gray regions in Figure 10b) is the maximum number of regions that share faces, edges, or vertices with the seed region. First, adjacent regions are checked for the presence of points. If points are present, connectivity between the seed region and the adjacent region is checked by comparing the normal vectors of the points inside the regions. The normal vectors of the points located in the colored regions are extracted, and the mean normal vectors of the colored regions are compared to check the connectivity, as shown in Figure 10c.
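The connectivity check over the 26 adjacent regions can be sketched as a breadth-first traversal of the voxel grid. This is a simplified illustration rather than the implementation of this study. It takes a hypothetical mapping from voxel index to the unit mean normal of that voxel, compares each voxel with the neighbor it grows from (not with the original seed), and ignores the sign of the normals; all three choices are assumptions. The default of five degrees follows the connectivity angle reported for the case study.

```python
from collections import deque
from itertools import product
from math import acos, degrees

# Offsets to the 26 voxels that share a face, edge, or vertex with a voxel.
NEIGHBOR_OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def normal_angle_deg(u, v):
    """Angle between two unit normals in degrees, ignoring their sign."""
    cos_a = abs(u[0] * v[0] + u[1] * v[1] + u[2] * v[2])
    return degrees(acos(min(1.0, cos_a)))


def grow_regions(voxel_normals, max_angle_deg=5.0):
    """Group voxels into connected regions.

    voxel_normals maps (i, j, k) -> unit mean normal of the voxel.
    Returns a list of sets of voxel indices.
    """
    unvisited = set(voxel_normals)
    regions = []
    while unvisited:
        seed = unvisited.pop()
        region, queue = {seed}, deque([seed])
        while queue:
            cur = queue.popleft()
            for di, dj, dk in NEIGHBOR_OFFSETS:
                nb = (cur[0] + di, cur[1] + dj, cur[2] + dk)
                # Only occupied, not yet assigned neighbours are candidates.
                if nb in unvisited and normal_angle_deg(
                        voxel_normals[cur], voxel_normals[nb]) <= max_angle_deg:
                    unvisited.remove(nb)
                    region.add(nb)
                    queue.append(nb)
        regions.append(region)
    return regions
```

Because adjacency is tested on voxel indices, the regions do not need to share points, which is the situation created by space decomposition.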
3.5. Building Component Detection
Building component detection in this study first divides the planar objects created through region growing into candidates of horizontal and vertical components (Figure 11a). Candidates are distinguished based on the normal vector (i.e., XY-plane = (0, 0, 1), XZ-plane = (0, 1, 0), YZ-plane = (1, 0, 0)), and then a building component is classified by checking whether it overlaps with other candidates. During classification, a candidate whose number of points is lower than the predefined number of points for eliminating outliers is removed. Overlap between candidates is determined by checking the points included in the candidates, as shown in Figure 11, and overlapping candidates are eliminated (Figure 11b). Candidates that satisfy these two criteria are classified as building components according to their normal vectors.
4. Case Study
This case study verifies the building component detection algorithm by detecting and segmenting building components from PCD acquired through an MLS without trajectory information. In particular, it focuses on verifying the applicability of the proposed algorithm to indoor PCD with a high error, which appears as the thickness of planar surfaces and is caused by factors such as the absence of additional sensors and the movement of the MLS. Although the limitation on detecting building components can be alleviated by employing trajectory information, this study proposes a novel approach based on geometric data that detects components from PCD regardless of data quality (e.g., registration error and interference points).
The algorithm was implemented in MATLAB R2020a. First, the precision and recall of the building components are calculated after applying the algorithm to indoor PCD. The precision and recall used in this study are indicators of the recognition rate of the algorithm. The results were derived by defining three cases: a true positive is the recognition of an actual component, a false positive is a recognized component that is not an actual component, and a false negative is an actual component that is not recognized. The precision and recall of the result are calculated by Equations (1) and (2) [63]. Next, the over- and under-segmentation rates of the recognized building components are derived. Over-segmentation is the case where a single component is segmented into two or more components, and under-segmentation is the case where two or more components are combined into one; these rates indicate the recognition precision [64].
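Equations (1) and (2) are not reproduced in this text. Assuming that they follow the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), the two indicators can be computed from the three cases defined above as in the sketch below. The function names and the convention of returning 0.0 when no component is counted are illustrative choices.

```python
def precision(tp, fp):
    """Share of recognized components that are actual components."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual components that were recognized."""
    return tp / (tp + fn) if (tp + fn) else 0.0
```

For instance, nine correctly recognized walls, one spurious wall, and three missed walls would give a precision of 0.9 and a recall of 0.75.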
4.1. Overview
The datasets in this case study were acquired through an MLS using the HDL-32E and VLP-16. Table 1 presents information on the PCD, whose environments include various obstacles (e.g., tables, chairs, and whiteboards) that occlude building components. To verify the accuracy and applicability of the proposed algorithm, this case study used three datasets that differ in shape, building type, interior objects, and PCD accuracy. In particular, because the main obstacle to segmenting indoor PCD is its non-rigid shape, this study employed datasets whose shapes clearly differ. The numbers of rooms in the datasets are seven, four, and five, respectively. Furthermore, Dataset #2 and Dataset #3 include spaces that are not rooms: a corridor and a kitchen integrated with a living room. By comparing the results of Datasets #1 and #2 with those of Dataset #3, the applicability of the algorithm to indoor PCDs that have a non-rigid shape is verified.
Next, datasets with different building types and interior objects were employed to test whether the proposed algorithm can properly detect and segment building components under different conditions. Datasets #1, #2, and #3 were captured in a detached house, an office, and an apartment, respectively.
Lastly, the applicability across LiDAR capacities is verified by applying the proposed algorithm to PCDs of different density captured through the HDL-32E and VLP-16. If the algorithm yields accurate results with both LiDARs, its applicability can be extended to low-density indoor PCD.
The parameters used in this case study are presented in Table 2. The number of points for estimating the normal vector based on the k-NN algorithm is 100, which is the maximum number of points that the hardware used in this study could process. Although the estimated normal vectors become more consistent as the number of points used for the calculation increases, an excessive number of points yields an inaccurate normal vector because all the gathered points are regarded as one object. This study therefore set the number of points to 100, which gives an approximate value of the normal vector according to iterative testing. In addition, the threshold for filtering outliers and the number of points for RANSAC are both three. Three is the minimum number of points needed to form a surface; thus, this study set the threshold to three to filter out outliers that are insufficient for creating a triangular plane. The angles for filtering outliers in a voxel and for checking the connectivity between regions are two and five degrees, respectively. These angles are optimized values derived by iteratively testing the result of building component detection; however, they depend on the thickness of the planar objects in the PCD. In particular, the LiDAR sensors used in the case study differ in accuracy and in the resulting thickness of the PCD; thus, the angle for checking connectivity in Datasets #2 and #3 is five degrees because the VLP-16 has lower accuracy (i.e., ±3 cm) than the HDL-32E (±2 cm).
4.2. Result
Figure 12, Figure 13 and Figure 14 present the raw PCD and the results of the building component detection derived through the algorithm for each dataset, and Table 3, Table 4 and Table 5 present the precision, recall, and over- and under-segmentation rates. Table 6 presents the running time of the processing steps for Datasets #1, #2, and #3.
The precision in this study is the ratio of detected objects that correspond to building components, and the recall is the ratio of actual building components in the building that are recognized. The higher these values, the more accurately the algorithm recognizes building components that exist indoors. As presented in Table 3, Table 4 and Table 5, the precision and recall of the floors and ceilings were 100% in all datasets except for the ceiling of Dataset #3; this was because there were few floors and ceilings in the buildings and because indoor objects interfered less with floors and ceilings than with walls. In Section 4.2.1, Section 4.2.2, Section 4.2.3 and Section 4.2.4, the results for vertical components (i.e., walls) and horizontal components (i.e., floors and ceilings) are analyzed to verify the accuracy and applicability of the algorithm.
The running times of the processing steps for Datasets #1, #2, and #3 are presented in Table 6. The most time-consuming step is pre-processing because normal estimation using the k-NN algorithm calculates the normal vector of every point. Furthermore, the time for normal estimation is proportional to the number of points in the PCD; thus, Dataset #3, which has the largest number of points, took 514.2 seconds for normal estimation with the k-NN algorithm in MATLAB R2020a. Datasets #1 and #2 took less time for normal estimation, 138.6 and 385.3 seconds, respectively, but this substep still accounted for most of the pre-processing time. The next most time-consuming step is region growing because the algorithm iteratively checks all regions generated in the seed region generation step. The time for region growing depends on the voxel size used in seed region generation and decreases as the voxels become larger; however, large voxels remove the topological features of objects, which makes it difficult to segment the PCD properly. On average, seed region generation and building component detection took 1.8% and 3.4% of the total processing time, respectively. In contrast to pre-processing and region growing, whose processing time depends on the number of points, the time for the other steps depends on the voxel size and can thus be optimized by adjusting it.
4.2.1. Precision and Recall of Vertical Component Detection
The detection of walls showed higher precision than recall in all datasets. The average precision was 91.63%, meaning that the proposed algorithm accurately detected components in the indoor PCD. Dataset #1 was acquired from a house with seven rooms (Figure 15a). Figure 15b compares the detected walls with the floor plan, and Figure 15c marks the undetected walls in yellow. In particular, the No. 1 wall in Figure 15c was very narrow, so the number of points in its voxels was very small, which created an error in the calculation of the normal vector based on the distance between points. The points of the No. 1 wall were therefore removed during region growing, and the wall was not detected. In addition, the No. 2 wall in Figure 15c was not detected because it was removed by the PCD segmentation in pre-processing. Most undetected components in Datasets #2 and #3 were columns inside the buildings, which were missed for the same reason as the No. 1 wall in Figure 15c: these columns were very narrow. Except for such special circumstances, the results verified that the precision of all walls exceeded 90%.
4.2.2. Segmentation of Vertical Component
The accurate segmentation of recognized components is also important because it is necessary to calculate the accurate dimension and location of the building components. In particular, the thickness of inner wall that shares two rooms is affected by their partition types (e.g., structural member, non-structural member, etc.), which contrasts with outer walls. PCD of inner walls acquired through MLS is expressed as two planes that are close to each other. Therefore, the inner walls that have little thickness may be segmented into a single plane. To prevent this under-segmentation, the size of voxel used in the RANSAC and region growing was optimized for separating the surfaces of inner wall into two rooms. This study verified that no under-segmentation occurred in all datasets. In contrast, over-segmentation occurred in all datasets because of wall segmentation by indoor curtains. In particular, Dataset #2 had the segmentation of one side wall into many walls because curtains had been installed in all windows to minimize the LiDAR error caused by light from outside (
Figure 16).
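The role of the voxel size in avoiding under-segmentation of thin inner walls can be illustrated with a small one-dimensional sketch. It is only an illustration under assumed values: the wall thickness, the face positions, the voxel sizes, and the helper names are hypothetical, and the actual optimization used in this study is not reproduced here.

```python
import math


def voxel_index(coordinate, voxel_size):
    """Index of the voxel that contains a coordinate along one axis."""
    return math.floor(coordinate / voxel_size)


def faces_share_voxel(face_a, face_b, voxel_size):
    """True if the two faces of a wall fall into the same voxel along the wall normal."""
    return voxel_index(face_a, voxel_size) == voxel_index(face_b, voxel_size)


# Hypothetical inner wall, 10 cm thick, with faces at x = 1.02 m and x = 1.12 m.
coarse = faces_share_voxel(1.02, 1.12, voxel_size=0.25)  # both faces in one voxel
fine = faces_share_voxel(1.02, 1.12, voxel_size=0.05)    # faces in different voxels
```

With the coarse voxel, points from both rooms are mixed in one voxel and the wall risks being treated as a single plane; with the fine voxel, the two faces can be processed separately.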
4.2.3. Precision and Recall of Horizontal Component Detection
Floors and ceilings, all of which are horizontal components, were fully detected in all datasets except for the ceiling in Dataset #3, as verified by the precision and recall of the components. The ceiling in Dataset #3 was not detected because its points were not acquired, owing to the narrow ceiling space in the bathroom attached to the master bedroom. In particular, the curb between the ceilings of the master bedroom and the attached bathroom interfered with the emitted light when the MLS entered the master bedroom, so the bathroom ceiling was not scanned (
Figure 17). Except for this detection error caused by these characteristics, the results verified that the precision of the horizontal building components was high for all datasets.
4.2.4. Segmentation of Horizontal Component
The over- and under-segmentation of horizontal building components in this study occurred because floors and ceilings were excessively separated or combined. In particular, ceilings were separated by the ceiling curbs that divide the rooms, whereas the floors of all rooms were connected. In this study, a floor was regarded as accurately detected when it was segmented as a single floor for a single building. As presented in
Table 3,
Table 4 and
Table 5, the algorithm segmented the ceilings accurately, whereas over-segmentation of floors occurred in Datasets #1 and #2. In particular, the objects inside the rooms in Dataset #2 came between the LiDAR and the floors (
Figure 18).
5. Discussion
This study verified the building component detection algorithm through a case study of indoor PCD captured with an MLS. With the proposed algorithm, the mean detection precision and recall exceeded 92%; in addition, the precision and recall obtained on the International Society for Photogrammetry and Remote Sensing (ISPRS) benchmark datasets exceeded 95%. The ISPRS benchmark datasets are public datasets provided by Working Group IV/5. The difference between the results for the raw PCD and the ISPRS datasets indicates that the quality of the PCD affects the detection result, since the ISPRS datasets have a more uniform point density than the PCD used in the case study; nevertheless, the proposed algorithm recognized over 92% of building components from low-quality data without trajectory information through its hierarchical steps. In particular, the detection error caused by the lack of normal vectors was minimized through the segmentation of the indoor PCD, and the accuracy of the overall process was ensured by filtering outliers in the raw points, segmented clutter, and grown regions in the detection step. Furthermore, RANSAC was used to separate outliers that were not eliminated in either the pre-processing or the seed region generation step. PCD without outliers provides more uniform feature descriptors, which improved the accuracy of detection.
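The use of RANSAC to separate remaining outliers from a planar region can be sketched as follows. This is a generic, minimal RANSAC plane fit written for illustration, not the implementation used in this study; the function names, the distance threshold, the iteration count, and the sample points are all assumptions.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal n, offset d) with n.x + d = 0, or None if the points are collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]          # cross product u x v
    length = sum(c * c for c in n) ** 0.5
    if length < 1e-12:
        return None
    n = [c / length for c in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d


def ransac_plane(points, distance_threshold=0.02, iterations=200, seed=0):
    """Return the largest set of points lying within distance_threshold of one plane."""
    rng = random.Random(seed)                # fixed seed keeps the sketch repeatable
    best_inliers = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [pt for pt in points
                   if abs(sum(n[i] * pt[i] for i in range(3)) + d) <= distance_threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Hypothetical data: 25 points on the wall plane x = 0 plus 3 outliers in front of it.
wall = [(0.0, 0.1 * j, 0.1 * k) for j in range(5) for k in range(5)]
outliers = [(0.5, 0.2, 0.3), (0.8, 0.1, 0.4), (0.3, 0.35, 0.1)]
inliers = ransac_plane(wall + outliers)
```

In this example the returned inliers are the wall points, and the three points in front of the wall are left out as outliers.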
The main parameter affecting the detection result is the degree for checking, since this parameter helps the algorithm properly connect adjacent regions. Unlike the degree for filtering, which removes useless points, this parameter can either connect different planar surfaces or separate one surface into multiple surfaces. In particular, when the parameter was set to a low degree, the algorithm divided one surface into numerous planes, since raw indoor PCD with high registration error generates multiple seed regions from a single planar surface. Conversely, a high degree caused under-segmentation, since the normal vector in the joint between adjacent planar building components changes gradually from that of one component to that of the other; thus, the similarity of the normal vectors makes it difficult for the algorithm to decide whether adjacent regions located in the joint belong to the same component. The next parameter is the number of points for k-NN, because it affects the accuracy of normal estimation. Since the algorithm uses the degree calculated from the normal vectors, the accuracy of normal estimation affects every step of the algorithm. However, because the normal vector of a planar surface can be accurately estimated by the k-NN algorithm, using excessive points for normal estimation only adds unnecessary processing time. The parameter for k-NN is commonly used in normal estimation for indoor PCD. To optimize the accuracy and efficiency of the algorithm, this study iteratively adjusted the parameters and set them as presented in
Table 2. In addition, two parameters, the threshold for filtering and the number of points for RANSAC, were determined based on the minimum number of points required to generate a plane. The degree for filtering was set to remove the outliers located in a voxel and thus plays the same outlier-detection role as the degree for checking; however, it is limited in removing regions because the points located in the same voxel share the same normal vector if their number is lower than the number of points for k-NN. In this study, this parameter was used to remove the outliers located in the joint part.
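The effect of the degree for checking can be illustrated with a minimal sketch of an angle test between two normal vectors. The function names, the example normals, and the threshold values are hypothetical, and ignoring the sign of the normals is an assumption made here because no trajectory information is available to orient them; the values in Table 2 are not reproduced.

```python
import math


def angle_between_normals(n1, n2):
    """Angle in degrees between two normals, ignoring their sign (result in 0 to 90)."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    cosine = min(1.0, abs(dot) / norm)       # clamp to guard against rounding above 1
    return math.degrees(math.acos(cosine))


def same_surface(n1, n2, degree_for_checking):
    """True if two adjacent regions would be connected under the given angle threshold."""
    return angle_between_normals(n1, n2) <= degree_for_checking


# Hypothetical normals: a wall, the same wall with 5 degrees of registration noise,
# and a region in the joint whose normal has drifted 40 degrees toward the next wall.
wall_normal = (1.0, 0.0, 0.0)
noisy_wall_normal = (math.cos(math.radians(5)), math.sin(math.radians(5)), 0.0)
joint_normal = (math.cos(math.radians(40)), math.sin(math.radians(40)), 0.0)
```

With a threshold of 10 degrees, the noisy region is connected to the wall and the joint region is not. A threshold of 2 degrees splits the noisy region from its own wall (over-segmentation), whereas a threshold of 45 degrees connects the joint region to the wall (under-segmentation).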
By applying the proposed algorithm, building component detection can be performed on indoor PCD that has high registration error, without the trajectory information of the MLS. The proposed algorithm provides three benefits that contribute to the body of knowledge in the AEC industry. First, the high accuracy of the algorithm makes it possible to derive a 3D geometric representation from scanning data of an indoor environment without the trajectory information of the LiDAR. The proposed algorithm employs the Manhattan-world assumption to check the overlap between the positions of building components. This overlap check increases the accuracy of the derived building components in the detection step on indoor PCD without the LiDAR trajectory. Second, the proposed algorithm secures consistent accuracy of building component detection regardless of the quality of the indoor PCD. The accuracy and proper segmentation rate of detection were over 80% in the case study, which used datasets captured with two kinds of LiDAR with different capacities. In particular, the results of building component detection on Datasets #1 and #3 show that the proposed algorithm can detect components accurately from low-quality indoor PCD. Consequently, the accuracy of detection depends only weakly on the LiDAR specification, which makes it possible to use low-cost LiDAR for building component detection. Lastly, the 3D geometric representation derived through the proposed algorithm can support accurate Building Information Modeling (BIM) generation from low-quality PCD. The executors who use the traditional method for generating BIM from PCD manually draw lines in accordance with the structural members. Unstructured PCD omits the information needed to distinguish clutter from structural members because it contains only geometry data; thus, executors who generate as-built BIM must judge whether the clutter is a structural member.
In contrast, the proposed algorithm provides the existence and location of building components, which lets the executors choose structural members from the properly segmented components. These candidate structural members can improve the accuracy of the judgment, since the segmentation in this algorithm removes the outliers that cause a false geometric representation of a member.
Nonetheless, the proposed algorithm is difficult to apply to non-aligned PCD, in which the building components are not aligned with the coordinate planes. To overcome this limitation, the algorithm needs an additional process for aligning the raw PCD. In addition, the results of the case study were not compared with those of previous studies because the existing methods are difficult to apply to our datasets, which have higher registration error than the ISPRS benchmark datasets [
62,
63,
64]; thus, this study verified the applicability of the algorithm to low-quality indoor PCD.
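One possible form of the additional alignment process mentioned above is sketched below. It is not part of the proposed algorithm: it is a common technique, assumed here for illustration, that estimates the dominant horizontal direction of wall normals under the Manhattan-world assumption (directions repeat every 90 degrees) and rotates the points about the vertical axis. All names and sample values are hypothetical.

```python
import math


def dominant_wall_angle(normals_xy):
    """Estimate the rotation about the vertical axis, in radians within (-pi/4, pi/4].

    Each horizontal normal direction is multiplied by 4 so that directions that
    differ by 90 degrees reinforce each other in the circular mean.
    """
    s = sum(math.sin(4 * math.atan2(ny, nx)) for nx, ny in normals_xy)
    c = sum(math.cos(4 * math.atan2(ny, nx)) for nx, ny in normals_xy)
    return math.atan2(s, c) / 4


def rotate_about_z(point, angle):
    """Rotate a 3D point about the vertical (z) axis by angle radians."""
    x, y, z = point
    ca, sa = math.cos(angle), math.sin(angle)
    return (ca * x - sa * y, sa * x + ca * y, z)


# Hypothetical horizontal wall normals of a building rotated by 30 degrees.
normals = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
           for a in (30, 120, 210, 300)]
theta = dominant_wall_angle(normals)

# Rotating by -theta brings a point on the 30-degree direction onto the x axis.
sample_point = (math.cos(math.radians(30)), math.sin(math.radians(30)), 1.5)
aligned = rotate_about_z(sample_point, -theta)
```

In practice the normals would come from the normal estimation step, and points on floors and ceilings, whose normals are nearly vertical, would need to be excluded before estimating the angle.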
6. Conclusions
This study proposed a building component detection algorithm suitable for indoor PCD without MLS trajectory information by using the features of the indoor environment. The proposed algorithm was verified through a case study in which acquiring MLS trajectory information was difficult. In particular, the algorithm achieved over 90% accuracy in building component detection from low-quality raw PCD without additional sensors. Furthermore, the building components detected by this algorithm have a 3D geometric representation that can be used for reconstructing the interior environment. By applying the proposed algorithm, we believe that reverse engineering of the as-is indoor environment is possible using indoor PCD without trajectory information. In future work, 3D reconstruction algorithms for building interiors will be developed to segment physically connected components regardless of the condition of the indoor environment.