1. Introduction
The burning of fossil fuels in boilers releases combustion gases into the atmosphere containing gaseous-phase pollutants and suspended solid or liquid particles. Environmental regulations have been defined around the world to limit the pollutants that can be emitted into the environment.
In order to control pollution in industrial boilers, the first step is to employ an adequate measurement system. A typical system for measuring polluting emissions in industrial boilers consists of Continuous Emissions Monitoring Systems (CEMS). CEMS are nevertheless expensive to operate, the reliability of their readings is sensitive to environmental conditions, and their installation and maintenance are difficult. Predictive Emissions Monitoring Systems (PEMS) are a recent approach intended to replace online analyzers such as CEMS by estimating emission concentrations from process data. PEMS relate the process to its emissions through nonlinear data-driven modelling.
Methods based on measured data from a process have been used for process modelling. Linear models are popular for their simplicity; however, in complex problems with strong non-linearity, their results are limited. It is therefore necessary to develop methodologies that allow non-linear modelling while seeking to preserve the advantages of linear models.
There are different tools to solve the model identification problem, depending on the system complexity and on the available knowledge. If the identification is based only on measured input and output data, assuming no knowledge about the system physics, the identification process is classically called black-box modelling. When the system physics is known in detail, purely mathematical models can be derived, and the identification process is called white-box modelling.
When modelling pollutant emissions in the combustion process, it is practically impossible to know all the thermal, physical, and chemical processes involved, and, even when this knowledge is available, it is dispersed. This complicates the development of a complete description of the system in terms of a continuous-time mathematical model. Transferring this knowledge to a discrete-time description is also difficult, since much of it is lost in the discretization process. Obtaining a black-box model of a complex system is not easy either, even with experience and knowledge of the system. Nevertheless, a basic understanding of the system dynamics, when available, is always useful and helps to establish the model structure.
Air quality is a widespread concern, driven mainly by population concentration in large cities, which increases traffic, industrialization, and the indiscriminate use of conventional energy [1,2]. Technological advances and the development of clean energy are outweighed by the indiscriminate use of fossil fuels. As a result, the emission of dangerous particles into the atmosphere is increasing [3], with catastrophic effects on all scales and enormous damage to public health [4]. Environmental pollution is regulated by global organizations, but most countries do not satisfy the requirements [1,5]; in addition, in some cases there are records of measurements that widely exceed the limits, thereby causing millions of deaths [4,6,7]. One of the most dangerous environmental pollutants with indiscriminate emissions is particulate matter (PM), which includes large particles (PM10), small particles (PM2.5), and fine particles (<100 nm) that are not regulated [8], which makes the problem larger [9]. A World Health Organization report on ambient air pollution stated that the average density of PM10 increased by about 5% between 2008 and 2013 in 720 cities around the world [10]. It has been reported that a reduction in the PM10 content by 5 μg per cubic meter in Europe would avoid between 3000 and 8000 premature deaths annually [11]. Similar estimates for PM2.5 suggest a loss of 7 to 8 months in life expectancy [12]. Equivalent studies of fine particulate matter estimate that it adds to the health damage caused by large particles, whose health effects are known, and is consequently expected to increase the costs of maintaining public health and avoiding premature deaths [6].
The formation of total suspended particle (TSP) emissions is widely reported but poorly understood. In order to reduce TSP emissions, theoretical TSP generation models have been studied [13]; these mainly discuss theories for linking discrete to continuum modelling but do not propose a general theory at the micro- and macroscale levels. Regression modelling of spatial particulate matter has been developed [14] for modelling the concentrations of suspended particles in the time and space domains in specific localities with a high population concentration. Studies on modelling TSP emissions generally focus on analyzing the effects of suspended particles on population health [15,16,17,18,19,20]. Modelling of TSP emission sources has been studied [21], conducting research on source apportionment of ambient particulate matter in Europe using receptor models such as principal component analysis, enrichment factors, classical factor analysis, and positive matrix factorization, and [22] building and applying a numerical air quality model that relies on scientific first principles to predict particle concentrations. A study [23] reported a computer-controlled ambient-simulation method to determine the source characteristic profiles of emissions from an oil-fired boiler through isokinetic withdrawal.
In relation to non-linear modelling, artificial neural networks (ANN) are widely applied in engineering processes, in particular for TSP modelling, and back-propagation networks such as the multi-layer perceptron (MLP) are the most used. However, there is a great diversity of neural network types for data-driven non-linear modelling. For instance, Ye [24] presented Bayesian–Gaussian neural networks (BGNN), a new methodology for the application of neural networks with improved training time, minimum location, and auto-tuning for online applications. A redefinition of the BGNN algorithm was presented by Liu [25], using genetic algorithms (GA) for the offline adjustment of the threshold matrix and a sliding data window for online applications, suitable for non-linear systems that change over time, a feature that makes this algorithm very attractive for online dynamic system modelling.

The radial basis function (RBF) network offers an alternative in signal processing applications using two-layer neural networks. Common learning algorithms for the RBF network are based on the random selection of function centers, which has some drawbacks. Reference [26] presents a method using orthogonal least squares, selecting the radial basis function centers one by one until an adequate network is built. Kassam [27] presents a training algorithm for the RBF network based on the stochastic gradient (SG) of the error and shows its versatility in non-linear signal processing applications. Because of its numerical features, the stochastic gradient algorithm is widely used to train networks with adaptive radial basis functions, but it presents a compromise between convergence speed and precision due to fixed step sizes. Zeng [28] addressed this problem by presenting the convex combination of multiple RBF (MCRBF) network algorithm, applying the SG learning algorithm and varying the configuration parameters.

Volterra polynomials have also been used for system identification [29]; there, the orthogonal least-squares method is used for the offline determination of the model framework.
Considering that the combustion process is complex, multivariable, and non-linear, which hinders the application of white-box modelling techniques, fitting models for PEMS development is a challenge. The goal of this work was to develop a non-linear model of TSP emissions in a 350 MW conventional boiler burning heavy fuel oil, in order to provide a good TSP prediction that can be used in PEMS applications using operational boiler data. The proposed non-linear model is based on the polynomial expansion of a multiple-variable function in the neural network framework, considering that a finite number of basis functions is sufficient for an ANN to estimate a non-linear function. The orthogonal least-squares algorithm was used as the parameter estimator because it makes it possible to easily determine a reduced subset of basis functions that best represents the TSP emission dynamics. A classical three-layer MLP ANN was implemented and compared with the proposed polynomial expansion network.
2. Methodology
Although all systems are non-linear, most of the literature on system identification refers to the identification of linear systems. The main reason for this is that the assumptions may become very restrictive because of the process complexity, which forces the designer to use strong simplifications or to fix the model components. Also, process innovation for improvement, together with diverse local environments, often results in significant differences between two apparently similar plants.
Power plant equipment and installations are usually fitted to suit the local conditions of a specific place. The construction depends on factors such as fuel availability, innovations, and local ambient conditions aimed at better thermal efficiency and emission control, etc. To make existing models adequate for different constructions, redesigning and tuning are required. Solving the model equations may also add problems to highly detailed first-principle models: mathematical knowledge is required to develop the model, and time-consuming iterative computations need to be performed.
Identification is the experimental approach to process modelling [30,31]; this approach includes the following steps:
An experimental design must be carried out in order to obtain data that represent the behavior of the process over the whole operating range of the dynamic system. In other words, it is necessary to apply physically supported values throughout the input range and measure the effect on the output. The corresponding input and output data sets are finally used to infer a system model. Some parameters to be defined in the experimental phase are: test preparation, sampling time choice, suitable experiment design, and data pre-processing. Data pre-processing includes, for example, testing the response time, removal of irregularities, and control of noise and other unexpected behaviors of the data.
Defining the model framework is called model structure selection. The structure is a framework that must be explored to obtain a good model, in which the model input–output signals and the internal interactions of the model are determined. The model structure is derived using prior knowledge. In general, model structure selection implies the selection of the model approach (multilayer perceptron networks, radial basis functions, etc.) considered appropriate to describe the system, and the selection of a subset of this model structure defining the appropriate number of parameters for a specific problem.
The values of the unknown parameters of a parametrized model structure are then estimated. Normally, the model that performs best according to the design specifications is chosen. The design specifications can be expressed in many different forms; preferably, they should be formulated with the final model application in mind. In general, the model is selected on the basis of the best possible model predictions, following some criterion of error magnitude between the observed output and the model estimate. The method for calculating the model parameters of the selected structure is developed according to statistical theory and is called estimation. The equivalent process in ANN theory is often called training or learning.
Once the model has been determined, it must be tested to verify whether it meets the design specifications, by evaluating precision, robustness, convergence, and generalization ability (interpolation). Validation is closely related to the final model application.
2.1. Polynomial Basis Functions
The model structure is a candidate model set, i.e., a set within which a model must be sought. In general, the model selection problem implies the selection of a model family. For TSP modelling, a polynomial expansion network was used as the model family because of its ability to model non-linear dynamics.
Consider a non-linear dynamic multiple-input–single-output (MISO) system represented by

y(k) = f(u_1(k), u_1(k−1), …, u_1(k−n_1), …, u_P(k), …, u_P(k−n_P)),  (1)

where f(·) is a non-linear function, u(k) = [u_1(k), …, u_1(k−n_1), …, u_P(k), …, u_P(k−n_P)]^T is the input vector, n_p is the number of delayed samples in the p-th input, and P is the number of input variables.

Define n = Σ_{p=1}^{P} (n_p + 1) as the total number of input terms and re-index the elements of u(k) as x_1, x_2, …, x_n, where x(k) = [x_1(k), …, x_n(k)]^T, and Equation (1) becomes

y(k) = f(x(k)).
Polynomial multivariable expansions have been suggested as candidate basis functions [32,33], and they are usually applied in function structures, mainly for one-input-variable functions [34]. Recently, the polynomial basis function (PBF), in the multivariable-function context, has been introduced within the neural network model structure. Its functional representation is described by
ŷ(k) = Σ_{i=1}^{M} θ_i φ_i(x(k)) + e_M(k),

where θ = [θ_1, …, θ_M]^T is the concatenated parameter vector, {φ_i(x)} is the set of basis functions formed from the polynomial input terms, M is the number of polynomial basis functions, ℓ is the polynomial expansion order, and e_M(k) indicates the approximation error generated by the order-ℓ expansion of the input vector. The basis functions are polynomials of some specific order of the input vector x(k). This process can be viewed as the transformation of the multivariable input vector into a higher-dimensional space.
A finite number of basis functions is sufficient to approximate a non-linear function with accuracy [35]. However, a practical method is necessary to determine these basis functions. The necessary approximation precision can be achieved with an acceptable number of linearly independent non-linear basis functions.
Polynomial expansion basis functions offer a good approximation of non-linear functions. The structure of the non-linear identification is shown in Figure 1. It is assumed that the non-linear function f(·) in the polynomial expansion basis functions is estimated by a single-layer ANN, which is a linear combination of non-linear polynomials.

As the order ℓ increases, the number of basis functions grows rapidly. The problem is therefore to estimate the function using a suitable ANN, dimensioned such that the estimation precision meets the specified requirements. The model framework selection and the parameter estimation of the one-layer ANN are detailed here.
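As an illustration, the expansion of a multivariable input vector into polynomial terms up to a given order can be sketched as follows (a minimal sketch; the function name and the use of itertools are illustrative and not taken from the paper):

```python
from itertools import combinations_with_replacement

import numpy as np

def polynomial_basis(x, order):
    """Evaluate all polynomial basis terms of the input vector x up to `order`."""
    x = np.asarray(x, dtype=float)
    phi = [1.0]  # constant term (order 0)
    for d in range(1, order + 1):
        # each combination (with repetition) of d input indices gives one monomial
        for idx in combinations_with_replacement(range(len(x)), d):
            phi.append(np.prod(x[list(idx)]))
    return np.array(phi)

# a 2-variable input expanded to order 2 gives [1, x1, x2, x1^2, x1*x2, x2^2]
print(polynomial_basis([2.0, 3.0], 2))  # → [1. 2. 3. 4. 6. 9.]
```

The number of terms grows combinatorially with the order and the number of inputs, which is why a mechanism for selecting a significant subset, such as orthogonal least squares, is needed.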
2.2. Model Structure Selection and Parameter Estimation
Obtaining a model that represents the TSP emission dynamics of a boiler with good precision requires searching for the best model structure, which means defining the number of parameters (basis functions) appropriate for the model and then estimating the parameters to obtain exact values for the model.
There are many ways to select the basis functions. In this case, the structure selection was executed offline using the orthogonal least-squares algorithm [36] to determine the most meaningful basis functions for TSP modelling.
Assume that the data {x(k), y(k)}, k = 1, …, N, from the system inputs and output are known. On the basis of Equation (6), the estimation function can be formulated in vector form as

y = Φθ + e,

where the output vector y = [y(1), …, y(N)]^T, the parameter vector θ = [θ_1, …, θ_M]^T, the error vector e = [e_M(1), …, e_M(N)]^T, and the basis function matrix Φ = [φ_1 φ_2 … φ_M] has columns φ_i = [φ_i(x(1)), …, φ_i(x(N))]^T.

The parameter vector θ is generally found by minimizing the norm of the error vector, that is,

θ̂ = arg min_θ ‖y − Φθ‖²,

obtaining the least-squares solution.

The vectors φ_i, for i = 1, …, M, form a basis vector set, and the least-squares solution satisfies the condition that Φθ̂ is the projection of y onto the space generated by the basis function vectors φ_i. The orthogonal least-squares method implies transforming the basis vector set into an orthogonal basis vector set and, therefore, makes it possible to compute the individual impact of each basis vector on the output. An orthogonal factorization of Φ can be obtained by means of the construction known as the Gram–Schmidt orthogonalization process.
First, set w_1 = φ_1. The following vectors are then given inductively in the following way: suppose that w_1, …, w_{k−1} have been chosen, so that for each i ≤ k − 1, {w_1, …, w_i} is an orthogonal basis for the vector subspace generated by {φ_1, …, φ_i}. To build the next vector w_k, set

w_k = φ_k − Σ_{i=1}^{k−1} α_{ik} w_i,  with α_{ik} = (w_i^T φ_k)/(w_i^T w_i).

Then w_k ≠ 0 since, otherwise, φ_k would be a linear combination of w_1, …, w_{k−1}. Furthermore, if 1 ≤ j ≤ k − 1, then

w_j^T w_k = w_j^T φ_k − α_{jk} (w_j^T w_j) = 0.

Therefore, {w_1, …, w_k} is an orthogonal set consisting of k nonzero vectors in the subspace generated by {φ_1, …, φ_k}. Because orthogonal nonzero vectors are linearly independent, this set is a basis for the subspace. So, the vectors w_k can be constructed one after the other according to Equation (14). In general, we have, for k = 1, 2, …, M:

w_1 = φ_1,  w_k = φ_k − Σ_{i=1}^{k−1} α_{ik} w_i.
The matrix Φ can then be written as

Φ = W A,

where the matrix W has size N × M with orthogonal columns, and A is a unit M × M upper triangular matrix with 1 on the main diagonal and 0 below the main diagonal.
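A minimal numerical sketch of this factorization, following the classical Gram–Schmidt construction described above (all function and variable names are illustrative):

```python
import numpy as np

def gram_schmidt_factorization(Phi):
    """Classical Gram-Schmidt factorization Phi = W @ A.

    W has orthogonal (not normalized) columns; A is unit upper triangular.
    """
    N, M = Phi.shape
    W = np.zeros((N, M))
    A = np.eye(M)  # 1 on the diagonal, 0 below
    for k in range(M):
        w = Phi[:, k].copy()
        for i in range(k):
            # projection coefficient alpha_ik = w_i^T phi_k / (w_i^T w_i)
            alpha = W[:, i] @ Phi[:, k] / (W[:, i] @ W[:, i])
            A[i, k] = alpha
            w -= alpha * W[:, i]
        W[:, k] = w
    return W, A

rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 4))
W, A = gram_schmidt_factorization(Phi)
assert np.allclose(W @ A, Phi)  # the factorization reproduces Phi
assert np.allclose(W.T @ W, np.diag(np.diag(W.T @ W)))  # columns of W are orthogonal
```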
The orthogonality properties of W are advantageous from the algorithmic point of view, in such a way that Equation (7) can be represented by

y = W A θ + e,

and, defining

g = A θ,

Equation (21) can be represented by

y = W g + e,

where g = [g_1, g_2, …, g_M]^T.

The vector g can be deduced as the optimal estimate ĝ as follows. From Equation (22), e can be represented by

e = y − W g;

pre-multiplying Equation (24) by W^T and using the orthogonality of the columns of W, we obtain, in vector form,

ĝ = (W^T W)^{−1} W^T y,  i.e.,  ĝ_i = (w_i^T y)/(w_i^T w_i),  i = 1, …, M,

so that ‖e‖² is minimal. The equivalent optimal parameter vector is

θ̂ = A^{−1} ĝ.

The Gram–Schmidt orthogonalization algorithm can be applied to calculate Equation (28) and, therefore, to solve the least-squares problem and to estimate and evaluate the variance contribution of each basis function w_i.
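This estimation step can be sketched numerically as follows: the orthogonal estimates ĝ are computed from the orthogonal columns, and θ̂ is recovered by back-substituting A θ = ĝ. As an illustrative shortcut (not the paper's implementation), the factorization Φ = W A is obtained here from NumPy's QR routine and rescaled so that A is unit upper triangular:

```python
import numpy as np

rng = np.random.default_rng(2)
Phi = rng.standard_normal((50, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = Phi @ theta_true  # noise-free output, so the parameters are recovered exactly

# Orthogonal factorization Phi = W A: rescale QR so that W has orthogonal
# (unnormalized) columns and A is unit upper triangular.
Q, R = np.linalg.qr(Phi)
W = Q * np.diag(R)
A = R / np.diag(R)[:, None]

g_hat = (W.T @ y) / np.sum(W * W, axis=0)  # g_i = w_i^T y / (w_i^T w_i)
theta_hat = np.linalg.solve(A, g_hat)      # back-substitute A theta = g
print(theta_hat)  # recovers theta_true = [1.0, -2.0, 0.5]
```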
The variance of the output can be written as

(1/N) y^T y = (1/N) Σ_{i=1}^{M} ĝ_i² w_i^T w_i + (1/N) e^T e.

Note that (1/N) Σ_{i=1}^{M} ĝ_i² w_i^T w_i is the part of the variance of the desired output which can be represented by the basis functions, and (1/N) e^T e is the variance not represented by them. Thus, ĝ_i² w_i^T w_i / N is the increment of the variance of the desired output represented by w_i, and the error reduction ratio due to w_i can be defined by:

[err]_i = (ĝ_i² w_i^T w_i)/(y^T y),  i = 1, …, M.
This ratio provides a simple and efficient mechanism for searching for a subset of significant basis functions. The implementation is based on the classical Gram–Schmidt method [36,37]; see Appendix A.
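The error reduction ratio and the greedy search for significant basis functions can be sketched as follows (an illustrative forward-selection sketch; the procedure in Appendix A may differ in detail, and all names are hypothetical):

```python
import numpy as np

def ols_forward_selection(Phi, y, n_select):
    """Greedy forward selection of basis functions by error reduction ratio.

    At each step, every remaining candidate column of Phi is orthogonalized
    against the already-selected columns, and the candidate with the largest
    [err]_i = g_i^2 (w_i^T w_i) / (y^T y) is kept.
    """
    yty = y @ y
    selected, err_list, W = [], [], []
    for _ in range(n_select):
        best_err, best_j, best_w = -1.0, None, None
        for j in range(Phi.shape[1]):
            if j in selected:
                continue
            w = Phi[:, j].copy()
            for wi in W:  # Gram-Schmidt step against the selected terms
                w -= (wi @ Phi[:, j]) / (wi @ wi) * wi
            wtw = w @ w
            if wtw < 1e-12:  # linearly dependent candidate, skip it
                continue
            g = (w @ y) / wtw
            err = g * g * wtw / yty
            if err > best_err:
                best_err, best_j, best_w = err, j, w
        selected.append(best_j)
        W.append(best_w)
        err_list.append(best_err)
    return selected, err_list

rng = np.random.default_rng(1)
Phi = rng.standard_normal((200, 5))
y = 2.0 * Phi[:, 3] - 1.0 * Phi[:, 1]  # output built from columns 3 and 1 only
sel, errs = ols_forward_selection(Phi, y, 2)
print(sel, sum(errs))  # expected to pick columns 3 then 1, with total ERR close to 1
```

Because the selected terms span the true output exactly in this toy example, the error reduction ratios of the chosen terms sum to one.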
Changing the order of the PBFs will change the error reduction ratios [err]_i. For M PBFs, there are M! possible arrangements. The conventional Gram–Schmidt algorithm can be applied to find the arrangement of the basis functions that best represents the output, in such a way that, at each step, the basis function with the largest error reduction ratio is selected. Thus, the priority of the basis functions is determined, yielding the best PBF arrangement and its corresponding parameter vector.
4. Conclusions
Orthogonal least-squares algorithms are a great tool that provides extra information about the internal model behavior. Finite polynomial expansion basis functions can be implemented with a one-layer ANN, in agreement with the universal approximation theorem. The user can decide the trade-off between model complexity and accuracy during model selection by choosing the polynomial order and the number of most significant basis functions.
PBF networks provide a viable alternative for the modelling of complex multivariable non-linear systems in industry, where there is little or no process knowledge. The model structure developed allows the estimation of TSP emissions due to fossil fuel combustion. Non-linear combinations of oxygen excess, atomization pressure, fuel temperature, load, and fuel flow values are sufficiently informative to predict TSP emissions with excellent precision. The TSP emission estimation can be improved by enlarging the training set with experimental tests. The estimation algorithm requires few resources for its implementation, providing a viable alternative for estimating pollutant emissions into the atmosphere during fossil fuel combustion processes.
This methodology can be replicated to estimate other pollutants emitted into the atmosphere in combustion processes, such as NOx and CO emissions, and has great application potential in process design, combustion process control and optimization, emission control, fault detection, etc., which would bring great benefits in the prevention of diseases, the mitigation of climate change, etc.