Application of multi-gene genetic programming technique for modeling and optimization of phycoremediation of Cr(VI) from wastewater

Sarkar et al. Beni-Suef Univ J Basic Appl Sci (2023) 12:27

1 Background

Modeling of chemical processes requires a deep understanding of phenomenology. Most chemical processes are enormously complicated, and knowledge of their mechanisms is limited. Because of this lack of understanding, developing a first-principles phenomenological model is challenging and time-consuming. Researchers have tried various numerical and mathematical algorithms to model a system from input–output data when the process phenomenology is not known. An alternative route of numerical analysis currently emerging is artificial intelligence (AI)-based modeling [7, 23, 28, 33, 42]. In this scenario, an AI-based data-driven modeling technique can be a feasible alternative because process phenomenology is not required. In this decade, artificial neural network (ANN) and support vector machine (SVM) algorithms have established themselves as powerful data-driven black-box modeling techniques. In the last 20 years, ANN and SVM have been applied to a large number of research and industrial applications in diverse fields. ANN, however, ends up generating a complex sigmoidal function with several tuning constants known as weights and biases. These complex sigmoidal equations are black box in nature and do not give any insight into the process phenomenology. In addition, it is very difficult to obtain a possible explanation of the process from these equations. Both ANN and SVM models suffer from these limitations of explainability despite their superior prediction capability. For these reasons, ANN and SVM models are not favoured by engineers working in industry or by researchers working in R&D laboratories. Plant engineers and R&D researchers typically want manageable equations in differential/algebraic form that relate output variables to input features so that they can better understand the desired benefits. In this work, an attempt has been made to explore alternative computational methods for modelling the less understood chemical processes.

It is found in the literature that genetic programming (GP), a branch of evolutionary modeling techniques, can remove the above drawbacks of ANN and SVM models. Genetic programming automatically generates nonlinear structured models as closed-form equations relating the input and output of the system from available data. GP not only identifies the structure of the equation but also estimates the parameters of the equation so that it accurately predicts the output. When MISO (multiple input, single output) models are built using GP, the probability that a particular model survives to the next generation depends on its prediction accuracy, following the 'survival of the fittest' principle. Components of the surviving models are continuously recombined to form new models, with the aim of increasing predictability in each generation. In the literature, the breakthrough in GP can be seen in the late 1980s with the experiments of Koza on symbolic regression [17]. The versatility of the GP algorithm was demonstrated by Koza and Rice, who applied it in diverse fields such as robotics, games, and control [15].

Multi-gene genetic programming (MGGP) is one of the robust variants of GP and is claimed to be more effective than GP in nonlinear modeling [35, 36]. MGGP is designed to generate mathematical models of predictor response data that are 'multi-gene' in nature, i.e. linear combinations of low-order nonlinear transformations of the input variables. Conventional GP is mainly based on the evaluation of a single tree (model) expression. In the multi-gene approach, several individual genes are combined to construct a single model [36]. It has been reported that MGGP regression can provide a more accurate and computationally efficient model than conventional GP [35, 36]. In many cases MGGP performed better than other machine learning methods, viz. ANN, SVM, etc., in terms of predictability and model simplicity [6, 10, 12, 24].

Since the beginning of industrialization, industrial wastewater carrying various contaminants has been dumped directly or indirectly into water bodies, disrupting the aquatic biota. This has made water pollution a serious environmental problem. Owing to their poisonous and carcinogenic nature, heavy metals in wastewater play a key role among other contaminants, including organic or inorganic compounds, dyes, and pesticides [37]. Additionally, they cannot biodegrade but rather bioaccumulate in living tissue, posing major health risks and even fatalities [40]. The specific gravities of heavy metals are five times larger than that of water, and their atomic weights range from 63.5 to 200.6 [39]. In 2015, the Agency for Toxic Substances and Disease Registry (ATSDR) ranked Cr(VI) as the sixteenth most dangerous substance [11]. According to a recent study by Yen et al. [43], Cr(VI) is the second most prevalent heavy metal (pollutant) in the environment, with concentrations in groundwater ranging from 0.008 to 173 mM. Natural and anthropogenic sources are the two main sources of chromium pollution. Leaching from rocks and from topsoil are two natural sources of chromium pollution of surface water. The effluents from industrial sectors such as electroplating, leather tanning, anodizing, ink manufacturing, pigments, dyeing, glass, ceramics, glues, wood processing, paint industry, metal cleaning, mining, and textile industries are some of the main anthropogenic sources of chromium in surface water bodies [30]. The most stable forms of chromium are Cr(III) and Cr(VI). Cr(III) is an essential nutrient for animals; however, due to its strong teratogenic, mutagenic, and carcinogenic effects, Cr(VI) is over 300 times more hazardous than Cr(III) [27]. Because of its high solubility, Cr(VI) can easily pass through cell membranes. The inappropriate disposal of solid waste from chromate-processing facilities is mostly to blame for groundwater pollution with chromium. The dose of chromium, the length of exposure, and the type of compound all affect health. Acute exposure to chromate leads to several health problems in human beings, such as gastrointestinal disorders, haemorrhagic diathesis, and convulsions. In extreme circumstances, it can also result in cardiovascular shock and death [32]. Additionally, hexavalent chromium is known to cause cancer in human beings and genetic changes that result in birth abnormalities in unborn infants [25]. The World Health Organization (WHO) states that the acceptable limits for total chromium in drinking water, which includes Cr(III) and Cr(VI), are 0.05 mg/L and 2 mg/L, respectively [14]. For public water systems, the Environmental Protection Agency (EPA), USA, limits the maximum total chromium contaminant level in drinking water to 100 µg/L or 100 ppb [41]. Therefore, chromium must be removed from wastewater before it is discharged into the environment. Chromium removal from wastewater has been accomplished using several conventional treatment methods, including solvent extraction, membrane separation, ion exchange, and chemical precipitation [26]. Conventional methods, however, have drawbacks such as high chemical or energy requirements, secondary pollution, high cost, and production of toxic sludge. Additionally, they are ineffective for metal concentrations below 100 mg/L [4, 14]. As a result, research is currently leaning towards alternative environmentally benign, economically feasible processes. There are about 7000 different types of algae worldwide. Microalgae can bio-reduce CO2 from waste gases and the atmosphere; they can use low-quality water, including municipal runoff and industrial wastewater containing toxic metals and organic matter. Moreover, algal bodies can survive in natural weather conditions. Algae can produce significantly more biomass when grown in wastewater; thus, high-quality agricultural land is not necessary to grow the algal cells [29]. Even though there have been many studies on the reduction of Cr(VI) using microalgal/cyanobacterial biomass, very few of them address the intricate mathematical analysis supported by experimental results. Sen et al. [38] used Chlorococcum sp. to remove Cr(VI) from simulated wastewater.

In the present article, the phycoremediation method for the removal of Cr(VI), a very complex process, was chosen as a case study. The experimental data reported by Sen et al. [38] were used for modeling and optimization using MGGP and grey wolf optimization (GWO) techniques, respectively. An attempt was made to explore the relationship between the removal of Cr(VI) and input parameters such as pH, inoculum size, initial concentration of Cr(VI), and time. Once a model was built with MGGP, that model was used to obtain phenomenological insights into the phycoremediation process. One major objective of applying MGGP was to develop an accurate phycoremediation model that is a closed-form equation, is portable, and can be used by plant engineers to gain insight into the process. The second objective of the study was to use the developed model to maximize the removal of Cr(VI). This was accomplished by using a nature-inspired metaheuristic optimization method for the optimization of the input process parameters [1–3, 5]. To produce Pareto optimal solutions that meet the goals in the most efficient way possible, grey wolf optimization (GWO) was used to improve the input space of the MGGP model for phycoremediation. GWO has the following advantages: (i) it is a modern nature-inspired optimization technique with widespread applicability; (ii) GWO has not been used so far to optimize complex phycoremediation models.

Therefore, the overall novelties of the present study are: (i) MGGP has been used to solve an industrial problem; (ii) as far as is known, MGGP has not been used before to build a phycoremediation model; (iii) out of several well-performing models, those that actually obey the internal phenomenology are sorted out; and (iv) grey wolf optimization (GWO), a nature-inspired metaheuristic optimization algorithm, has been used to optimize the process parameters.

Table 2 Some portion of input–output data used for GP training

pH (x₁)   Inoculum size (x₂) (%)   Initial conc. of Cr(VI) (x₃) (mg/L)   Time (x₄) (day)   Removal of Cr(VI) (y) (%)
7         10                       22.23                                 12                45.667
9         10                       10                                    5.68              25.000
9         10                       10                                    10                42.580
7         10                       17.64                                 12                61.351
10.62     10                       20                                    12                60.840
11        10                       20                                    11.467            53.778
9         2                        20                                    13.03             29.826
9         5                        20                                    5.37              21.005
9         4.249                    20                                    12                40.759
7.85      10                       20                                    12                65.200
9         10                       25                                    5.611             3.986
9         10                       20                                    9.82              57.358
9         10                       10                                    10.60             44.583
9         10                       20                                    3.63              26.307
7         10                       8.26                                  12                40.939
9         10                       15                                    5.12              25.904

2 Methods

2.1 Generation of data for model building

Sen et al. [38] performed experiments on the removal of Cr(VI) using Chlorococcum sp., a cyanobacterial species. They had 82 experimental data points.
In the present study, 897 data points were generated from these 82 data points with the 'Plot Digitizer' software for better MGGP model building. After a reliable MGGP model of the phycoremediation process is built, the model equation is optimized through the grey wolf optimization (GWO) technique. The ranges of the four variables, namely pH, inoculum size (IS), initial concentration (IC) of Cr(VI), and time, are shown in Table 1. Table 2 shows some of the experimental data generated by the 'Plot Digitizer' software.

Table 1 Name and range of input and output parameters

Parameter                               Range
Input parameters
  pH (x₁)                               7–11
  Inoculum size (x₂)                    2–10% (v/v)
  Initial concentration of Cr(VI) (x₃)  5–30 mg/L
  Time (x₄)                             2–14 day
Output parameters
  Removal of Cr(VI) (y)                 0–100%

2.2 Multi-gene genetic programming: at a glance

GP is a type of metaheuristic symbolic modeling technique that creates equations for solving problems based on the 'survival of the fittest' concept of Darwinian natural selection [13]. The general form of the process model to be obtained is given in Eq. (1):

$$y = f(X, \beta) \tag{1}$$

where y is the process output variable (removal of Cr(VI)) and X is the N-dimensional vector of input variables such as pH, inoculum size, and initial concentration of Cr(VI):

$$X = [x_1, x_2, \ldots, x_n, \ldots, x_N] \tag{2}$$

and f denotes a nonlinear function whose parameters are defined in terms of a P-dimensional vector, $\beta = [\beta_1, \beta_2, \ldots, \beta_P]$. Given phycoremediation data of input and output variables, the GP algorithm tries to fit the data as well as possible by changing both the functional form and the parameter vector β. The iterative modelling process of genetic programming is shown in Fig. 1.

Fig. 1 Genetic programming iterative modelling process

2.2.1 Executional steps of genetic programming

The algorithm of GP is illustrated in Fig. 2. The execution steps of GP are as follows [31]:

Step 1 (Initialization) In the first step, the GP algorithm creates random equations to fit the data defined in Eq. (1). These equations are commonly called the population of strings (chromosomes) representing candidate solutions. A population member consists of functions and terminals combined in a hierarchical manner, which is termed the tree. The function set may contain algebraic operators and Boolean logical operators. The terminal set may consist of variables and numerical and logical constants. A typical tree is shown in Fig. 1.

Step 2 (Generation) This step is an iterative procedure to generate a population with a high fitness value and consists of the following sub-steps (Fig. 2):

(a) The fitness of each population member is evaluated using a prespecified fitness function. A fitness function based on the coefficient of determination (R²) or on an error measure can be used for this purpose. The higher the value of R², or the lower the value of the error, the fitter the population member. In this study, root-mean-squared error (RMSE) was used as the error parameter.
(b) Select individual equations from the population with the help of probabilistic determination of fitness.
(c) Create new individual equations with the help of genetic operators such as (Fig. 2):
    (i) Reproduction During reproduction, the algorithm copies the existing population into a new generation without any change.
    (ii) Crossover In crossover, offspring are produced by interchanging chromosomes of the parent generation.
    (iii) Mutation In mutation, existing elements in the offspring are replaced with other elements.

The reproduction, crossover, and mutation steps are illustrated in Fig. 1.

Step 3 (Termination) When the termination criteria are met, the best program is the approximate solution to the problem (Fig. 2).

Fig. 2 Algorithm of genetic programming

2.2.2 Structure of MGGP

Usually, symbolic regression uses GP to create a population of trees. Each of these trees is basically a mathematical expression. In MGGP, a multiple tree structure is used to predict the output as shown below.

$$y = C_0 + C_1\,\mathrm{tree}_1 + C_2\,\mathrm{tree}_2 + \ldots + C_n\,\mathrm{tree}_n \tag{3}$$

Each of these trees can be considered a gene. A typical multi-gene model is shown in Fig. 3. This model predicts an output variable using three input variables (x₁, x₂, x₃). Although this model structure contains nonlinear terms (such as a square), it is linear in the parameters with respect to the coefficients C₀, C₁, and C₂. The maximum allowable number of genes (G_max) for a model and the maximum tree depth (D_max) have a profound effect on the final model structure and are usually specified by the user. From the literature, it is found that a maximum tree depth of 4 or 5 nodes usually gives a compact, efficient model. The linear coefficients C₀, C₁, and C₂ are usually evaluated by applying the ordinary least squares method to the training data. In the literature, multi-gene symbolic regression is reported as more accurate and efficient than standard GP.

Fig. 3 A typical MGGP model

2.2.3 Application of MGGP in phycoremediation process modeling

This paper explores the MGGP modeling technique for the phycoremediation process in order to obtain a meaningful nonlinear relationship between the removal of heavy metals and various input parameters such as pH, time, inoculum size, and initial concentration of heavy metals.

2.2.4 Selection of input and output variables for modeling

Since the removal percentage of Cr(VI) has a large impact on wastewater, this parameter is kept as the output variable. All the input variables that can affect Cr(VI) removal have been studied in the literature [8, 34, 37]. Four input variables were finally shortlisted and are shown in Table 1.

2.2.5 Modeling through multi-gene genetic programming (MGGP)

The experimental data were taken for GP-based modeling. The dataset, comprising four input variables and one output variable (Table 1), was randomly partitioned into a training set (80% of the whole data) and a test set (20% of the whole data). The training set was used to compute the model by maximizing the fitness value, whereas the test set was used for cross-validation of the developed expression. The main objective of cross-validation is to make the model more generalizable.

The open-source GPTIPS toolkit coupled with subroutines written in MATLAB 2019a was used to construct the GP-based model [36]. In this study, the root-mean-squared error (RMSE) between actual and predicted outputs was employed as the fitness function, and the program was run in such a way that the RMSE value was kept as low as possible. Due to the stochastic character of GP, the software was run 100 times to build the model.

2.3 Optimization through grey wolf optimization (GWO)

Mirjalili et al. [20] developed GWO, which is a stochastic metaheuristic optimization methodology. This bionic optimization algorithm simulates the rank-based mechanisms and attacking behaviour of the grey wolf pack. The lead wolf helps the other wolves to capture the prey through the surrounding, hunting, and attacking process. This large-scale search methodology is centred on the three best grey wolves, but there is no elimination mechanism. The optimization technique differs from others in terms of modeling. The pack constitutes a strict hierarchical pyramid. The group size is 5–12 on average. Layer α, consisting of a male and a female leader, is the strongest and most capable layer and decides the team's predation actions and other activities. Layer β and layer δ are the second and third layers in the hierarchy, respectively, and are responsible for assisting α in organizing the group. The bottom rank of the pyramid is occupied by the majority of the pack, named ω. They are mainly responsible for satisfying the entire pack by balancing the internal relationships of the population, looking after the young, and maintaining the dominance structure [20]. The social hierarchy, encircling, hunting, attacking prey (exploitation), and searching for prey (exploration) are the key elements of the GWO model. The detailed mathematical modeling of GWO has been described by Mirjalili et al. [20].

This metaheuristic approach is applied to various real-world problems because it performs efficiently and simply while requiring very few operators to be tuned [9, 16, 18, 21, 22]. Recent research in this regard looks forward to the further development of the optimization algorithm. The detail of the GWO algorithm is depicted in Fig. 4.

Fig. 4 Pseudocode of GWO algorithm

3 Results

3.1 Performance of the MGGP model

The main objective of the present study is to generate a closed-form model equation of the phycoremediation process that is accurate, simple, and portable. Eight hundred and ninety-seven data sets were qualified for model building. The values of the MGGP parameters required for modelling were decided based on a trial-and-error approach and a literature survey and are shown in Table 3 [36]. A sensitivity analysis was performed in which the population size, maximum generations, and maximum number of genes were varied. A balance was struck between model complexity and prediction accuracy, as shown in the Pareto diagram (Fig. 5). The maximum allowable number of genes (G_max) for a model and the maximum tree depth (D_max) have a profound effect on the final model structure and are usually specified by the user. From the literature, it is found that a maximum tree depth of 4 or 5 nodes usually gives a compact, efficient model. Population size, maximum generations, G_max, and D_max were kept at 500, 150, 6, and 4, respectively. The linear coefficients C₀, C₁, and C₂ are usually evaluated by applying the ordinary least squares method to the training data [36]. Each run consists of more than 10,000 iterations as long as MGGP is able to improve the R² value and minimize the RMSE value. MGGP stops when the RMSE drops below the threshold value of 0.001 or the maximum execution time is reached. Since MGGP is stochastic in nature, 20 such runs were done to conclude the final model. Increasing the number of runs further, up to 100, brought no substantial benefit in prediction accuracy.

Table 3 Run parameters

Run parameter                        Value
Population size                      500
Max. generations                     150
Generations elapsed                  150
Input variables                      4
Training instances                   718
Tournament size                      25
Elite fraction                       0.7
Lexicographic selection pressure     On
Probability of Pareto tournament     0.8
Max. genes                           6
Max. tree depth                      4
Max. total nodes                     Inf
ERC probability                      0.1
  Integer                            0.6
Crossover probability                0.84
  High level 0.2, low level 0.8
Mutation probabilities               0.14
  Subtree 0.9, Input 0.05, Perturb ERC 0.05
Complexity measure                   Expressional
Function set                         TIMES MINUS PLUS MULT3 ADD3

Fig. 5 Pareto diagram of model complexity vs. accuracy

3.2 Developing closed-form model equations

GP generates a lot of promising model equations during its run. Table 4 summarizes the most prominent closed-form equations generated by GP for the removal percentage of Cr(VI). For each run of GP, one Pareto diagram was developed (Fig. 5). Pareto diagrams plot expressional complexity (represented by the number of nodes in the equation) against prediction accuracy, represented by (1 − R²).

Table 4 Model equations: expressional complexity/performance characteristics (on training data) of symbolic models on the Pareto front

Model ID   Goodness of fit (R²)   Model complexity   Eq.
555        0.917                  56                 (4)
    2 2 3 0.0957x x − 2.64 exp (−4)x x 3 1 3 − 7.13x − 3.47 exp (−4)x x 3 4 + 0.179x (2x + x ) − 0.0641x + 8.81 4 2 3
560        0.915                  52                 (5)
    2 2 3 0.108x x − 3.01 exp (−4)x x − 10.2x 1 3 3 1 3 − 3.97 exp (−4)x x + 0.189x (2x + x ) + 20 4 4 2 3 3
596        0.895                  50                 (6)
    2 3 2 0.105x − 2.23 exp (−4)x x + 0.0892x x 2 1 1 3 3 + 0.174x (2x + x ) − 0.354x − 0.0154x x x − 24.3 4 2 3 1 3 4 3
2584       0.944                  147                (7)
    0.36x + 0.18x + 2.65x 1 3 4 − 0.526x (2x − 9.19) − 0.126x x 4 3 4 + 0.18x x x 1 3 4 2 2 + 2.34 exp (−6)x x (x − 18.4)(x − 12.7) 3 3 3 4 + 3.64 exp (−6)x x x (x − 12.7)(x − x ) + 0.394 3 4 3 2 3 1
2730       0.945                  159                (8)
    0.173(x + x + x ) + 2.75x − 0.508x (2x − 9.19) 1 2 3 4 4 3 − 0.121x x + 0.173x x x 4 1 3 4 + 3.87 exp (−6)x x (x − 18.4)(x − 12.7)(x − 12.9) 3 4 3 3 3 + 3.47 exp (−6)x x x (x − 12.7)(x − x ) + 0.415 3 4 3 2 3 1
2871       0.945                  152                (9)
    0.173(x + x + x ) + 2.74x − 0.507x (2x − 9.19) 1 2 3 4 4 3 − 0.121x x + 0.173x x x 4 1 3 4 2 2 + 3.83 exp (−6)x x (x − 18.4)(x − 12.7) 3 3 3 + 3.47 exp (−6)x x x (x − 12.7)(x − x ) + 0.447 1 3 4 3 2 3
4535       0.961                  171                (10)
    7 2 3 2.68 exp (−8)x − 1.62 exp (−5)x x + x + x x x 3 1 2 4 3 3 3 − 1.73 exp (−4)x (x + 2x − 5.02) + 0.0497x x x 1 4 1 2 4 − 1.81 exp(−7x x x + x ( ) 1 1 4 + 9.38 exp (−5)x x (x + 5.02)(x + x ) 1 3 1 4 + 2.07 exp (−5)

3.3 Controlling model complexity

Regression models frequently suffer from the phenomenon called 'bloat', which can be vertical or horizontal. The tendency to evolve trees with terms that provide little or no performance benefit is known as vertical bloat. This is connected to the phenomenon of overfitting in model development. Researchers have used different approaches to handle overfitting. Miriyala et al. conducted six tests for overfitting in their paper, viz. information theory test, cross-validation test, held-out sample test, testing with noisy data, and goodness-of-fit test [19]. In the present study, vertical bloat was ameliorated by restricting the tree depth and by using the Pareto tournament between expressional complexity and accuracy. The tree depth was kept at 4 by trial and error, and the Pareto diagram is generated as shown in Fig. 5.

Sometimes during execution, multi-gene models add additional genes that do not give significant performance enhancement but add to the model complexity. This phenomenon is known as horizontal bloat. The simplest technique to avoid horizontal bloat in multi-gene regression is to limit the number of genes allowed in the model. In this study, the maximum number of genes was kept at 6 after trial and error.

3.4 Shortlisting the models

Out of the potential candidates or representative model equations (Table 4), which vary in complexity and accuracy, a suitable model was selected with the following criteria in mind:

(i) Simplicity: The model complexity should be as low as possible.
(ii) Prediction accuracy: The developed model should have low RMSE and high R².
(iii) The model equation should capture the underlying physics of the process. In other words, a model equation should not be merely a predictive correlation but should also make physical sense for the system under study. This is a prime consideration in developing real-life models. To judge this capability, domain experts' qualitative knowledge about phycoremediation behaviour was collected from the literature and experiment. The operating skill, circumstance, and surveillance of phycoremediation behaviour are summarized in Table 5.
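The complexity/accuracy trade-off behind criteria (i) and (ii) can be illustrated with a small non-dominance check. The following Python sketch is only an illustration under stated assumptions: it uses the rounded R² and complexity values listed for the thirteen models in Table 4, and the helper names `dominates` and `pareto_front` are hypothetical, not part of GPTIPS or of the authors' code. Because the tabulated R² values are rounded, ties that did not exist in the original runs may appear here.

```python
# (model_id, R2, complexity) as tabulated in Table 4 (training data, rounded R2).
MODELS = [
    (555, 0.917, 56), (560, 0.915, 52), (596, 0.895, 50),
    (2584, 0.944, 147), (2730, 0.945, 159), (2871, 0.945, 152),
    (4535, 0.961, 171), (4812, 0.960, 165), (4924, 0.960, 161),
    (7484, 0.934, 64), (8003, 0.935, 72), (8122, 0.926, 59),
    (8591, 0.922, 58),
]


def dominates(a, b):
    """True if model a is at least as good as b on both objectives
    (higher R2, lower complexity) and strictly better on one of them."""
    _, r2_a, nodes_a = a
    _, r2_b, nodes_b = b
    no_worse = r2_a >= r2_b and nodes_a <= nodes_b
    strictly_better = r2_a > r2_b or nodes_a < nodes_b
    return no_worse and strictly_better


def pareto_front(models):
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Sort the non-dominated models by complexity for readability.
front = sorted(pareto_front(MODELS), key=lambda m: m[2])
for model_id, r2, nodes in front:
    print(f"model {model_id}: complexity={nodes}, 1-R2={1 - r2:.3f}")
```

A check like this only covers criteria (i) and (ii); criterion (iii), agreement with the process phenomenology, still has to be judged separately for each candidate.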
Table 4 (continued)

Model ID   Goodness of fit (R²)   Model complexity   Eq.
4812       0.96                   165                (11)
    7 2 3 2.86 exp (−8)x − 1.64 exp (−5)x x + x x x x 3 1 2 4 3 3 3 − 1.07 exp (−4)x (x + 3x ) + 0.0511x x x 1 4 1 2 4 + 1.28 exp (−4)x x (x + x ) 1 1 4 − 2.12 exp (−7)x x (x + x ) + 2.1 exp (−5) 1 1 4 3
4924       0.96                   161                (12)
    7 4 2.69 exp (−8)x − 1.64 exp (−4)x (x + 2x − 5.02) 1 4 3 1 2 3 − 1.57 exp (−5)x x + x − 5.02 + 0.0453x x x 1 1 2 4 3 3 − 1.86 exp −7x x (x + x ) 1 1 4 + 9.24 exp (−5)x x (x + 5.16)(x + x ) 1 3 1 4 + 2.19 exp (−5)
7484       0.934                  64                 (13)
    0.644x + 1.77x x − 1.56x x + 0.0115x x 1 2 4 3 4 4 3 2 − 0.23x x (x − x ) + 2.35 exp (−6)x x x (x − x ) + 0.0206 1 4 2 3 4 2 3 1 3
8003       0.935                  72                 (14)
    3.12(x − x ) − 13.4x + 0.225x x + 0.0186x x x 4 1 3 1 2 3 4 + 1.35 exp (−5)x (x − x )(x + x − 7) 3 3 4 1 3 − 0.00576x x (x + x ) + 48.9 1 1 3 3
8122       0.926                  59                 (15)
    2.92x − 11.4x + 1.27 exp (−5)x (x − x ) 4 3 3 4 + 0.201x x + 0.0178x x x 1 2 3 4 − 0.00522x x (x + x ) + 19.1 1 1 3 3
8591       0.922                  58                 (16)
    2 3 0.706x − 10.1x − 2.83 exp (−4)x x 4 3 1 3 2 2 + 0.103x x − 0.00238x x x 1 2 4 3 3 + 0.0661x x x + 25.7 2 3 4

Table 5 Rules to select the best model from real observations

Sl. No.   Parameter changed keeping all other parameters constant   Whether process phenomenology matches with model prediction or not
1         pH (x₁) increase/decrease                                 Yes
2         Inoculum size (x₂) increase/decrease                      Yes
3         Initial concentration of Cr(VI) (x₃) increase/decrease    Yes
4         Time (day) (x₄) increase/decrease                         Yes

3.5 Shortlisting the model based on phenomenology

All thirteen equations in Table 4 were subjected to the above scrutiny to judge whether the developed equation is in agreement with real observations. All the developed models were passed through rigorous testing. For example, ten test data sets were generated in which all the variables were kept at their 50-percentile value except pH, which was varied from its minimum value to its maximum value in 10 equal intervals. These test data were put into the equations of Table 4, and the respective 10 sets of removal percentage data were generated. After that, pH vs removal percentage was plotted first. From these plots, observation number 1 of Table 5 was verified. In the same way, all other plots were generated, which are shown in Fig. 6. The final model equation (Eq. 10, model 4535 in Table 4) is as follows:

    7 2 3 y = 2.68 exp (−8)x − 1.62 exp (−5)x x + x + x x x 3 1 2 4 3 3 3 − 1.73 exp (−4)x (x + 2x − 5.02) 1 4 + 0.0497x x x − 1.81 exp(−7x x (x + x ) 1 2 4 1 1 4 + 9.38 exp (−5)x x (x + 5.02)(x + x ) 1 3 1 4 + 2.07 exp (−5)    (10)

Table 6 shows the coefficient of determination (R²), root-mean-squared error (RMSE), and mean absolute error (MAE) values for training and test data.

Table 6 Performance of MGGP model

           R²     RMSE    MAE
Training   0.96   3.532   2.64
Testing    0.96   3.735   2.79

3.6 Generation of explainable model equations

One of the major advantages of the GP modeling technique over ANN and SVR techniques is that it generates a closed-form equation (like Eq. 10), which is portable and easy to understand. GP develops a closed-form equation with a high prediction capability, but the created equation is vast and complex, and it can be difficult to interpret directly. In the present study, a methodology is developed to enhance the interpretability of the developed equations. Figures 6 and 7 summarize the developed methodology. Figure 6 is created by varying one variable at a time from its minimum to its maximum value (10 steps) while keeping the other three input variables at their average values. The Cr(VI) removal equations developed by GP are used to predict the output value for each of these simulated test data. After plotting, a trend line was drawn through each data set. Based on visual inspection and the R² value, a trend line curve was chosen (such as a straight line, or a polynomial of degree 2 or 3 or more) that almost matches the data.

Fig. 6 Influence of each parameter on the removal of Cr(VI)

On the other hand, Fig. 7 depicts the actual phenomenology of the phycoremediation process for the removal of Cr(VI). From the figure, it is seen that the removal of Cr(VI) increases with the initial concentration of Cr(VI) up to 20 mg/L. A further increase in initial concentration decreases the removal percentage. pH was varied from 7 to 11, and it is seen that removal of Cr(VI) increases with time and reaches its maximum value at pH 9. Further increases in pH decrease the removal of Cr(VI). Inoculum size was varied from 2 to 10%, and it is observed that the removal of Cr(VI) increases monotonically with inoculum size. A kinetic study was performed with respect to the initial concentration (10–25 mg/L), pH (7–11), and inoculum size (2–10%). In all the cases of the kinetic study, removal of Cr(VI) increases with time.

Fig. 7 Phycoremediation process phenomenology for the removal of Cr(VI)

3.7 Optimization through GWO

Once reliable models were successfully developed, they were subjected to optimization. The purpose of optimization is to find the combination of input parameters that corresponds to the highest Cr(VI) removal. One of the most significant tasks in any process optimization is to define the search space in which the optimal process conditions should be found.

GWO code in MATLAB is used to find the input parameters (pH, inoculum size, and the number of days) that give maximum removal of Cr(VI). Optimization conditions and the optimal solutions are shown in Tables 7 and 8, respectively.

4 Discussion

4.1 MGGP modeling

In the current research work, basic arithmetic operators and functions, a population size of 500, a maximum of 150 generations, a maximum tree depth of 4, and a maximum of 6 genes proved to be the best for modeling (shown in Table 3). If these parameters are given higher values, the model accuracy may increase, but the complexity of the solutions also increases and the program becomes computationally expensive. As shown in Table 4, there is always a conflict between model complexity (denoted by the number of tree nodes) and model prediction accuracy (denoted by the coefficient of determination, R²). More accurate models are more complex and vice versa. Complex models are difficult to interpret and unnecessarily overfit the data.

Pareto diagrams plot expressional complexity (represented by the number of nodes in the equation) against prediction accuracy, represented by (1 − R²). An optimum point was found as the final model, with a low value of (1 − R²) and an acceptable value of model complexity. The red triangular point in the Pareto diagram was chosen as the model.

All thirteen equations in Table 4 were subjected to scrutiny to judge whether the developed equation is in agreement with real observations. All the developed models were passed through rigorous testing. The scrutiny is necessary because in any modeling, besides the accuracy of the model, the developed relationship between output and input variables should be in agreement with the underlying physics of the process. In most cases, data-driven models with high accuracy end up being only black-box models without any insight into the system. In this procedure, two criteria have been kept in mind: firstly accuracy, and secondly obeying the underlying physics.

From the shortlisted model equations, only one model equation for the removal percentage of Cr(VI) was finally selected (Eq. 10). This equation is considered the representative model equation for the removal of metals as it is highly accurate, obeys the Table 5 observations, and captures the internal physics of the process. Figure 7 shows the actual phenomenology of the phycoremediation process.

The high R² and low RMSE values for the Cr(VI) removal model (Table 6) show that the predicted and actual output values are comparable, and that the models created are trustworthy, fairly accurate, and represent the underlying physics of the phycoremediation process. The high [...] percentage of Cr(VI) and the operating parameters of the phycoremediation process. These trend lines can be used by plant operating engineers to get insights into how a particular input parameter affects the removal percentage of Cr(VI).

Table 7 Lower bounds and upper bounds for some optimization cases

Cases                     Bounds   x₁   x₂   x₃   x₄
Case 1 (IC: 5 mg/L)       LB       7    2    5    2
                          UB       11   10   5    14
Case 2 (IC: 10 mg/L)      LB       7    2    10   2
                          UB       11   10   10   14
Case 3 (IC: 15 mg/L)      LB       7    2    15   2
                          UB       11   10   15   14
Case 4 (IC: 20 mg/L)      LB       7    2    20   2
                          UB       11   10   20   14
Case 5 (IC: 25 mg/L)      LB       7    2    25   2
                          UB       11   10   25   14
Case 6 (IC: 30 mg/L)      LB       7    2    30   2
                          UB       11   10   30   14
Case 7 (IC: 5–30 mg/L)    LB       7    2    5    2
                          UB       11   10   30   14

4.2 GWO optimization

In the present study, the initial concentration of Cr(VI) is not under our control and is fixed by the wastewater data, so it is not considered an optimization parameter (Case 1–Case 6). Here, there are only three optimization variables (pH, inoculum size, and the number of days). The input parameters pH (x₁), inoculum size (x₂), and time (x₄) are varied in the ranges 7–11, 2–10%, and 2–14 days, respectively. In Case 7, the initial concentration of Cr(VI) is also considered an optimization parameter. The lower bounds (LB) and upper bounds (UB) explored in these scenarios are shown in Table 7.

It is observed from Table 8 that removal of Cr(VI) increases with the initial concentration of Cr(VI) up to 20 mg/L (Case 1–Case 4). A further increase in Cr(VI) concentration decreases the removal percentage (Case 5–Case 6).
In Case 7, pH, IS, IC, and time are the opti- R value on unseen test data and low RMSE further sug- mization parameters. It is observed from Table  8 that if gest the model’s generalizability and accurate learning on wastewater contains Cr(VI) at a concentration level of nonlinear input and output relationships. 14  mg/L, then if pH is kept at 9.48 (x ), IS (x ) at 10%, 1 2 Figure 8 depicts the models’ prediction performance on then within 13.4 days, 99% Cr(VI) removal is possible. training and testing data. The almost overlapping nature of the actual vs predicted curve indicates the model’s good prediction accuracy.5 Conclusions The generated model is highly accurate and dependa - This study uses multi-gene genetic programming tech - ble, as evidenced in Table 6, Figs. 6, 7, and 8. It also works niques to develop an accurate model of a phycoreme- well with unknown test data. diation process from the experimental data. The study’s As seen from Fig.  6, the developed trend lines are very key contribution is the development of an extremely pre- decisive and monotonically increasing/decreasing. As cise and explainable model equation that comes with an mentioned earlier, they all match the actual observations understanding of the process. The produced model equa - (Fig.  7) and obey Table  5. In short, developed models tions are based on the process’s underlying physics and capture the nonlinear relationship between the removal correspond to the expert’s observations and experiences. Table 8 Optimal solutions Cases Optimum values of input variables Optimum removal of pH (x ) Inoculum size (%) Initial concentration Time (day) (x ) 1 4 Cr(VI) (%) (v/v) (x ) (mg/L) (x ) 2 3 Case 1 (IC: 5 mg/L) 7 10 5 14 39.24 Case 2 (IC: 10 mg/L) 7.23 10 10 14 52.04 Case 3 (IC: 15 mg/L) 8.7 10 15 14 73.44 Case 4 (IC: 20 mg/L) 9.06 10 20 14 75.48 Case 5 (IC: 25 mg/L) 7 10 25 14 58.36 Case 6 (IC: 30 mg/L) 7 9.15 30 8.57 42.89 Case 7 (IC: 5–30 mg/L) 9.48 10 14.02 13.4 99.06 S arkar et al. 
Developed model equations are then used to generate optimal solutions to maximize the removal percentage of Cr(VI). Also, it has been shown that if the operating parameters are adjusted as prescribed by the optimization results, it is possible to increase the removal percentage up to 99%.

Fig. 8 Actual versus predicted plots of removal of Cr(VI) with a training data and b test data

Abbreviations

GP: Genetic programming
MGGP: Multi-gene genetic programming
GWO: Grey wolf optimization
ANN: Artificial neural network
SVM: Support vector machine
R&D: Research and development
MISO: Multiple input single output
WHO: World Health Organization
IS: Inoculum size
IC: Initial concentration
RMSE: Root-mean-squared error
$R^2$: Coefficient of determination
$G_{max}$: Maximum number of allowable genes
$D_{max}$: Maximum tree depth
MAE: Mean absolute error
LB: Lower bounds
UB: Upper bounds
AI: Artificial intelligence
ATSDR: Agency for Toxic Substances and Disease Registry
mM: Millimole
EPA: Environmental Protection Agency
USEPA: United States Environmental Protection Agency
$C_0$, $C_1$, $C_2$: Linear coefficients
ppb: Parts per billion

Acknowledgements

The authors sincerely acknowledge the computational and analytical instrumentation support received from DST-FIST, GOI by the Department of Chemical Engineering, National Institute of Technology, Durgapur, India.

Author contributions

BS is the investigator and performed the process optimization, modelling, and paper writing. SS is the investigator and was involved in paper writing. SD contributed to conceptualization, supervision, and reviewing and final editing the paper thoroughly. SKL was involved in conceptualization, supervision, modelling, optimization, and reviewing and final editing the paper thoroughly. All authors read and approved the final manuscript.

Funding

Not applicable.

Availability of data and materials

All the data presented in the article are produced or evaluated during the research.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
There are no competing interests between the authors.

Author details
Department of Chemical Engineering, National Institute of Technology Durgapur, Durgapur 713209, India.

Received: 8 December 2022 Accepted: 14 February 2023

References

1. Abo-Hammour Z, Alsmadi O, Momani S, Arqub OA (2013) A genetic algorithm approach for prediction of linear dynamical systems. Math Probl Eng. https://doi.org/10.1155/2013/831657
2. Abo-Hammour Z, Arqub OA, Momani S, Shawagfeh N (2014) Optimization solution of Troesch's and Bratu's problems of ordinary type using novel continuous genetic algorithm. Discrete Dyn Nat Soc. https://doi.org/10.1155/2014/401696
3. Abu O, Abo-hammour Z (2014) Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm. Inf Sci. https://doi.org/10.1016/j.ins.2014.03.128
4. Anjana K, Kaushik A, Kiran B, Nisha R (2007) Biosorption of Cr(VI) by immobilized biomass of two indigenous strains of cyanobacteria isolated from metal contaminated soil. J Hazard Mater 148:383–386. https://doi.org/10.1016/j.jhazmat.2007.02.051
5. Arqub OA, Abo-hammour Z, Momani S, Shawagfeh N (2012) Solving singular two-point boundary value problems using continuous genetic algorithm. Abstr Appl Anal. https://doi.org/10.1155/2012/205391
6. Barati R, Neyshabouri SAAS, Ahmadi G (2014) Development of empirical models with high accuracy for estimation of drag coefficient of flow around a smooth sphere: an evolutionary approach. Powder Technol 257:11–19. https://doi.org/10.1016/j.powtec.2014.02.045
7. Bilal S, Ali I, Akgu A, Botmart T, Sayed E, Yahia IS (2022) A comprehensive mathematical structuring of magnetically effected Sutterby fluid flow immersed in dually stratified medium under boundary layer approximations over a linearly stretched surface. Alex Eng J 61:11889–11898. https://doi.org/10.1016/j.aej.2022.05.044
8. Dorman L, Rodgers JH, Castle JW (2010) Characterization of ash-basin waters from a risk-based perspective. Water Air Soil Pollut 206:175–185. https://doi.org/10.1007/s11270-009-0094-9
9. Emary E, Zawbaa HM, Hassanien AE (2016) Binary grey wolf optimization approaches for feature selection. Neurocomputing 172:371–381. https://doi.org/10.1016/j.neucom.2015.06.083
10. Floares A, Luludachi I (2014) Inferring transcription networks from data XX. 1 Introduction and background. In: Springer Handbook of Bio-/Neuroinformatics, pp 311–326
11. ATSDR (Agency for Toxic Substances and Disease Registry). 2017. ATSDR's substance priority list. Accessed April 28, 2017. https://www.atsdr.cdc.gov/spl/
12. Gandomi AH, Alavi AH (2012) A new multi-gene genetic programming approach to nonlinear system modeling. Part I: materials and structural engineering problems. Neural Comput Appl 21:171–187. https://doi.org/10.1007/s00521-011-0734-z
13. Grosman B, Lewin DR (2002) Automated nonlinear model predictive control using genetic programming. Comput Chem Eng 26:631–640. https://doi.org/10.1016/S0098-1354(01)00780-3
14. Gupta VK, Rastogi A (2009) Biosorption of hexavalent chromium by raw and acid-treated green alga Oedogonium hatei from aqueous solutions. J Hazard Mater 163:396–402. https://doi.org/10.1016/j.jhazmat.2008.06.104
15. Koza JR, Rice JP (1992) Genetic programming: the movie. The MIT Press, Cambridge
16. Kohli M, Arora S (2018) Chaotic grey wolf optimization algorithm for constrained optimization problems. J Comput Des Eng 5:458–472. https://doi.org/10.1016/j.jcde.2017.02.005
17. Koza JR (1994) Genetic programming: on the programming of computers by means of natural selection. In: Koza JR (ed) A bradford book. MIT Press, Cambridge
18. Lu Q, He ZL, Graetz DA, Stoffella PJ, Yang X (2010) Phytoremediation to remove nutrients and improve eutrophic stormwaters using water lettuce (Pistia stratiotes L.). Environ Sci Pollut Res 17:84–96. https://doi.org/10.1007/s11356-008-0094-0
19. Miriyala SS, Mittal P, Majumdar S, Mitra K (2016) Comparative study of surrogate approaches while optimizing computationally expensive reaction networks. Chem Eng Sci 140:44–61. https://doi.org/10.1016/j.ces.2015.09.
20. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61. https://doi.org/10.1016/j.advengsoft.2013.12.007
21. Mirjalili S, Saremi S, Mirjalili SM, Coelho LDS (2016) Multi-objective grey wolf optimizer: a novel algorithm for multi-criterion optimization. Expert Syst Appl 47:106–119. https://doi.org/10.1016/j.eswa.2015.10.039
22. Mittal N, Singh U, Sohi BS (2016) Modified grey wolf optimizer for global engineering optimization. Appl Comput Intell Soft Comput. https://doi.org/10.1155/2016/7950348
23. Modanli M, Go E, Khalil EM, Akgu A (2022) Two approximation methods for fractional order Pseudo-Parabolic differential equations. Alex Eng J 61:10333–10339. https://doi.org/10.1016/j.aej.2022.03.061
24. Pan I, Pandey DS, Das S (2013) Global solar irradiation prediction using a multi-gene genetic programming approach. J Renew Sustain Energy. https://doi.org/10.1063/1.4850495
25. Pradhan D, Behari L, Sawyer M, Rahman PKSM (2017) Recent bioreduction of hexavalent chromium in wastewater treatment: a review. J Ind Eng Chem 55:1–20. https://doi.org/10.1016/j.jiec.2017.06.040
26. Qasem NAA, Mohammed RH, Lawal DU (2021) Removal of heavy metal ions from wastewater: a comprehensive and critical review. npj Clean Water. https://doi.org/10.1038/s41545-021-00127-0
27. Qu Y, Zhang X, Xu J, Zhang W, Guo Y (2014) Removal of hexavalent chromium from wastewater using magnetotactic bacteria. Sep Purif Technol 136:10–17. https://doi.org/10.1016/j.seppur.2014.07.054
28. Qureshi ZA, Sultana M, Botmart T, Zahran HY, Yahia IS (2022) Mathematical analysis about influence of Lorentz force and interfacial nano layers on nanofluids flow through orthogonal porous surfaces with injection of SWCNTs. Alex Eng J 61:12925–12941. https://doi.org/10.1016/j.aej.2022.07.010
29. Ramanan R, Kannan K, Deshkar A, Yadav R, Chakrabarti T (2010) Enhanced algal CO2 sequestration through calcite deposition by Chlorella sp. and Spirulina platensis in a mini-raceway pond. Bioresour Technol 101:2616–2622. https://doi.org/10.1016/j.biortech.2009.10.061
30. Rangabhashiyam S, Selvaraju N (2015) Adsorptive remediation of hexavalent chromium from synthetic wastewater by a natural and ZnCl2 activated Sterculia guttata shell. J Mol Liq 207:39–49. https://doi.org/10.1016/j.molliq.2015.03.018
31. Sadhu T, Banerjee I, Lahiri SK, Chakrabarty J (2020) Modeling and optimization of cooking process parameters to improve the nutritional profile of fried fish by robust hybrid artificial intelligence approach. J Food Process Eng 43:1–13. https://doi.org/10.1111/jfpe.13478
32. Saha R, Nandi R, Saha B (2011) Sources and toxicity of hexavalent chromium. J Coord Chem. https://doi.org/10.1080/00958972.2011.583646
33. Sajid M, Waqas M, Ahmed N, Akgül A, Rafiq M, Raza A (2023) Numerical simulations of nonlinear stochastic Newell-Whitehead-Segel equation and its measurable properties. J Comput Appl Math 418:114618. https://doi.org/10.1016/j.cam.2022.114618
34. Salama ES, Roh HS, Dev S, Khan MA, Abou-Shanab RAI, Chang SW, Jeon BH (2019) Algae as a green technology for heavy metals removal from various wastewater. World J Microbiol Biotechnol. https://doi.org/10.1007/s11274-019-2648-3
35. Searson D, Willis M, Montague G (2007) Co-evolution of non-linear PLS model components. J Chemom 21:592–603. https://doi.org/10.1002/cem.1084
36. Searson DP, Leahy DE, Willis MJ (2010) GPTIPS: an open source genetic programming toolbox for multigene symbolic regression. In: Proceedings of the International multiconference of engineers and computer scientists 2010, IMECS 2010 I, pp 77–80
37. Sen S, Dutta S, Guhathakurata S, Chakrabarty J, Nandi S, Dutta A (2017) Removal of Cr(VI) using a cyanobacterial consortium and assessment of biofuel production. Int Biodeterior Biodegrad 119:211–224. https://doi.org/10.1016/j.ibiod.2016.10.050
38. Sen S, Rai A, Chakrabarty J, Lahiri SK, Dutta S (2021) Parametric modeling and optimization of phycoremediation of Cr(VI) using artificial neural network and simulated annealing, Algae. Multifarious Appl Sustain World. https://doi.org/10.1007/978-981-15-7518-1_6
39. Shanab S, Essa A, Shalaby E (2012) Bioremoval capacity of three heavy metals by some microalgae species (Egyptian isolates). Plant Signal Behav 7:392–399. https://doi.org/10.4161/psb.19173
40. Tangahu BV, Sheikh Abdullah SR, Basri H, Idris M, Anuar N, Mukhlisin M (2011) A review on heavy metals (As, Pb, and Hg) uptake by plants through phytoremediation. Int J Chem Eng. https://doi.org/10.1155/2011/939161
41. USEPA. 2017. Chromium in drinking water. Accessed April 28, 2017. http://www.epa.gov/dwstandardsregulations/chromium-drinking-water
42. Xu C, Farman M, Hasan A, Akgu A, Zakarya M, Albalawi W, Park C (2022) Lyapunov stability and wave analysis of Covid-19 omicron variant of real data with fractional operator. Alex Eng J 61:11787–11802. https://doi.org/10.1016/j.aej.2022.05.025
43. Yen H, Chen P, Hsu C, Lee L (2017) The use of autotrophic Chlorella vulgaris in chromium (VI) reduction under different reduction conditions. J Taiwan Inst Chem Eng 74:1–6. https://doi.org/10.1016/j.jtice.2016.08.017

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Beni-Suef University Journal of Basic and Applied Sciences, Springer Journals

Application of multi-gene genetic programming technique for modeling and optimization of phycoremediation of Cr(VI) from wastewater


References (46)

Publisher
Springer Journals
Copyright
Copyright © The Author(s) 2023
eISSN
2314-8543
DOI
10.1186/s43088-023-00365-w

Abstract

1 Background

Modeling of chemical processes requires a deep understanding of phenomenology. The bulk of chemical processes is enormously complicated, with limited knowledge of their mechanisms. Due to a lack of understanding, developing the first principle-based phenomenological model is challenging and time-consuming. Different researchers have tried different numerical and mathematical algorithms to model the system from input–output data where process phenomenology is not known. An alternative route of numerical analysis currently emerging is the artificial intelligence (AI)-based modeling technique [7, 23, 28, 33, 42]. In this scenario, an artificial intelligence-based data-driven modeling technique can be a feasible alternative where process phenomenology is not required. In this decade, artificial neural network (ANN) and support vector machine (SVM) algorithms have established themselves as powerful data-driven black-box modeling techniques. In the last 20 years, ANN and SVM have been applied to a large number of research and industrial applications in diverse fields. ANN, on the other hand, generates a complex sigmoidal function at the end with several tuning constants known as weights and biases. These complex sigmoidal equations are black box in nature and do not give any insights into the process phenomenology. In addition, it is very difficult to get a possible explanation of the processes from these equations. Both ANN and SVM models suffer from these limitations of explainability despite their superior prediction capability. Due to these reasons, ANN and SVM models are not favoured by the engineers working in the industry and by the researchers working in R&D laboratories. Plant engineers and R&D researchers typically want manageable equations in differential/algebraic form that relate output variables to input features so that they can have a better understanding of the desired benefits. In this work, an attempt has been made to explore alternative computational methods for modelling the less understood chemical processes.

It is found in the literature that genetic programming (GP), which is a branch of evolutionary modeling techniques, has the capability to remove the above drawbacks of ANN and SVM models. Genetic programming automatically generates nonlinear structured models as closed-form equations relating the input and output of the system from available data. Not only does GP identify the structure of the equation, but it also estimates the different parameters of the equations so that it accurately predicts the output. At the time of building up MISO (multiple input, single output) models by using GP, the probability of survival of a particular model to its next generations depends on its prediction accuracy and follows 'survival of the fittest' principles. Recombination of components of survived models continuously takes place to form a new model aiming at increasing predictability in each generation. In the literature, the breakthrough in GP can be seen in the late 1980s with the experiments of Koza on symbolic regression [17]. The versatility of the GP algorithm is proved by Koza and Rice by applying it in diverse fields of robotics, games, control, etc. [15].

Multi-gene genetic programming (MGGP) is one of the robust variants of GP and claims to be more effective than GP in nonlinear modeling [35, 36]. MGGP is designed to generate mathematical models of predictor response data that are 'multi-gene' in nature, i.e. linear combinations of low-order nonlinear transformations of the input variables. The conventional GP is mainly based on the evaluation of a single tree (model) expression. In the multi-gene approach, several individual genes are combined to construct a single one [36]. It has been proved that MGGP regression is capable of providing a more accurate and computationally efficient model in comparison to conventional GP [35, 36]. It can be seen in many cases that MGGP performed better than other machine learning methods, viz. ANN, SVM, etc., in terms of predictability and model simplicity [6, 10, 12, 24].

Since the beginning of industrialization, industrial wastewater carrying various contaminants has been dumped directly or indirectly into water bodies, disrupting the aquatic biota. This has made water pollution a serious environmental problem. Due to their poisonous and carcinogenic nature, heavy metals in wastewater play a key role among other contaminants including organic or inorganic compounds, dyes, and pesticides [37]. Additionally, they cannot biodegrade but rather bioaccumulate in living tissue, posing major health risks and even fatalities [40]. The specific gravities of heavy metals are five times larger than those of water, and their atomic weights range from 63.5 to 200.6 [39]. In 2015, the Agency for Toxic Substances and Disease Registry (ATSDR) ranked Cr(VI) as the sixteenth most dangerous substance [11]. According to a recent study by Yen et al. [43], Cr(VI) is the second most prevalent heavy metal (pollutant) in the environment, with concentration in groundwater ranging from 0.008 to 173 mM [43]. Natural sources and anthropogenic sources are the two main sources of chromium pollution. Leaching from rocks and topsoil are two natural sources of chromium pollution of surface water. The effluents from industrial sectors such as electroplating, leather tanning, anodizing, ink manufacturing, pigments, dyeing, glass, ceramics, glues, wood processing, paint industry, metal cleaning, mining, and textile industries are some of the main anthropogenic sources of chromium in surface water bodies [30]. The most stable forms of chromium are Cr(III) and Cr(VI). Cr(III) is used as an essential nutrient for animals; however, due to its strong teratogenic, mutagenic, and carcinogenic effects, Cr(VI) is over 300 times more hazardous than Cr(III) [27]. Due to its high solubility, Cr(VI) can easily pass through cell membranes. The inappropriate disposal of solid waste from chromate-processing facilities is mostly to blame for groundwater pollution with chromium. The dose of chromium, the length of exposure, and the type of compound all affect one's health. Acute exposure to chromate leads to several health problems in human beings, like gastrointestinal disorders, haemorrhagic diathesis, and convulsions. In extreme circumstances, it could also result in cardiovascular shocks and death [32]. Additionally, it is known that hexavalent chromium causes cancer in human beings and genetic changes that result in birth abnormalities in unborn infants [25]. The World Health Organization (WHO) states that the acceptable limits for total chromium in drinking water, which includes Cr(III) and Cr(VI), are 0.05 mg/L and 2 mg/L, respectively [14]. For public water systems, the Environmental Protection Agency (EPA), USA, limits the maximum total chromium contaminant level in drinking water to 100 µg/L or 100 ppb [41]. Therefore, before discharging wastewater into the environment, chromium must be removed from it. Chromium removal from wastewater has been accomplished using several conventional treatment methods, including solvent extraction, membrane separation, ion exchange, and chemical precipitation [26]. Conventional methods, however, have drawbacks such as high chemical or energy requirements, secondary pollution development, high cost, production of toxic sludge, etc. Additionally, they are ineffective for metal concentrations below 100 mg/L [4, 14]. As a result, research is currently leaning towards alternative environmentally benign, economically feasible processes. There are about 7000 different types of algae worldwide. Microalgae can bio-reduce CO2 from waste gases and the atmosphere; they can use low-quality water, including municipal runoff and industrial wastewater containing toxic metals and organic matter. Moreover, algal bodies can survive in natural weather conditions. Algae can produce significantly more biomass when grown in wastewater; thus, high-quality agricultural land is not necessary to grow the algal cells [29]. Even though there have been many studies on the reduction of Cr(VI) utilizing microalgal/cyanobacterial biomass, very few of them address the intricate mathematical analysis supported by experimental results. Sen et al. [38] used Chlorococcum sp. to remove Cr(VI) from simulated wastewater [38].

In the present article, the phycoremediation method for the removal of Cr(VI), a very complex process, was chosen as a case study. The experimental data as reported by Sen et al. [38] were used for modeling and optimization using MGGP and grey wolf optimization (GWO) techniques, respectively. In the present study, an attempt was made to explore the relationship between the removal of Cr(VI) and input parameters like pH, inoculum size, initial concentration of Cr(VI), and time. Once a model was built through the application of MGGP, that model was utilized to get phenomenological insights into the phycoremediation process. One major objective of the application of MGGP was to develop an accurate phycoremediation model which is a closed-form equation, portable, and can be used by the plant engineers to gain insight into the process. The second objective of the study was to use the developed model to maximize the removal of Cr(VI). This was accomplished by using a nature-inspired metaheuristic optimization method for the optimization of the input process parameters [1–3, 5].

Table 2 Some portion of input–output data used for GP training
| pH ($x_1$) | Inoculum size ($x_2$) (%) | Initial conc. of Cr(VI) ($x_3$) (mg/L) | Time ($x_4$) (day) | Removal of Cr(VI) ($y$) (%) |
| 7 | 10 | 22.23 | 12 | 45.667 |
| 9 | 10 | 10 | 5.68 | 25.000 |
| 9 | 10 | 10 | 10 | 42.580 |
| 7 | 10 | 17.64 | 12 | 61.351 |
| 10.62 | 10 | 20 | 12 | 60.840 |
| 11 | 10 | 20 | 11.467 | 53.778 |
| 9 | 2 | 20 | 13.03 | 29.826 |
| 9 | 5 | 20 | 5.37 | 21.005 |
| 9 | 4.249 | 20 | 12 | 40.759 |
| 7.85 | 10 | 20 | 12 | 65.200 |
| 9 | 10 | 25 | 5.611 | 3.986 |
| 9 | 10 | 20 | 9.82 | 57.358 |
| 9 | 10 | 10 | 10.60 | 44.583 |
| 9 | 10 | 20 | 3.63 | 26.307 |
| 7 | 10 | 8.26 | 12 | 40.939 |
| 9 | 10 | 15 | 5.12 | 25.904 |

To produce Pareto optimal solutions that meet the goals in the most efficient way possible, the grey wolf optimization (GWO) was utilized to improve the input space of the MGGP model for phycoremediation. GWO has the following advantages: (i) it is a nature-inspired modern optimization technique and has widespread applicability; (ii) GWO has not been used so far to optimize complex phycoremediation models.

Therefore, the overall novelties of the present study are: (i) MGGP has been used to solve an industrial problem; (ii) as far as known, MGGP has not been used to build a phycoremediation model; (iii) out of several good performing models, those models are sorted out which actually obey the internal phenomenology; and (iv) grey wolf optimization (GWO), a nature-inspired metaheuristic optimization algorithm, has been used to optimize the process parameters.

2 Methods

2.1 Generation of data for model building

Sen et al. [38] performed experiments on the removal of Cr(VI) using the Chlorococcum sp., a cyanobacterial species. They had 82 experimental data points. In the present study, using the 82 data points and 'Plot Digitizer' software, 897 data points were generated for better MGGP model building. After building a reliable MGGP model for the phycoremediation process, the model equation is optimized through the grey wolf optimization (GWO) technique. The ranges of four variables, including pH, inoculum size (IS), initial concentrations (IC) of Cr(VI), and time, are shown in Table 1. Table 2 shows some of the experimental data which are generated by 'Plot Digitizer' software.

Table 1 Name and range of input and output parameters

| Parameter | Range |
| Input parameters | |
| pH ($x_1$) | 7–11 |
| Inoculum size ($x_2$) | 2–10% (v/v) |
| Initial concentration of Cr(VI) ($x_3$) | 5–30 mg/L |
| Time ($x_4$) | 2–14 day |
| Output parameters | |
| Removal of Cr(VI) ($y$) | 0–100% |

2.2 Multi-gene genetic programming: at a glance

GP is a type of metaheuristic symbolic modeling technique that creates equations for solving problems based on Darwinian natural selection's 'survival of the fittest' concept [13]. The following is the general form of the process model to be obtained (Eq. 1):

$$y = f(X, \beta) \tag{1}$$

where $y$ indicates the process output variable (removal of Cr(VI)); $X$ is the $N$-dimensional vector of input variables like pH, inoculum size, initial concentration of Cr(VI), etc.

$$X = [x_1, x_2, \ldots, x_n, \ldots, x_N] \tag{2}$$

and $f$ denotes a nonlinear function whose parameters are defined in terms of a $P$-dimensional vector, $\beta = [\beta_1, \beta_2, \ldots, \beta_P]$. If phycoremediation data of input and output variables are given, the GP algorithm tries to best fit the data by changing its functional form and parameter vector $\beta$. The genetic programming iterative modelling process is shown in Fig. 1.

Fig. 1 Genetic programming iterative modelling process

2.2.1 Executional steps of genetic programming

The algorithm of GP is illustrated in Fig. 2. The execution steps of GP are mentioned below [31]:

Step 1 (Initialization) In the first step, the GP algorithm creates random equations to fit the data defined in Eq. (1). These equations are commonly called the population of strings (chromosomes) representing candidate solutions. Basically, a population member consists of functions and terminals combined in a hierarchical manner, which is termed the tree. The function set may contain algebraic operators and Boolean logical operators. The terminal set may consist of variables and numerical and logical constants. A typical tree is shown in Fig. 1.

Step 2 (Generation) This step is an iterative procedure to generate the population with a high fitness value and consists of the following sub-steps (Fig. 2):

(a) The fitness of each population is evaluated using a prespecified fitness function. A coefficient of determination ($R^2$) dependent or error-dependent fitness function can be used for this purpose. The higher the value of $R^2$ or the lower the value of error, the greater the fitness of a particular population. In this study, root-mean-squared error (RMSE) was considered as the error parameter.
(b) Select individual equations from the population with the help of probabilistic determination of fitness.
(c) Create new individual equations with the help of genetic operators such as (Fig. 2):
(i) Reproduction During reproduction, the algorithm copies the existing population into a new generation without any change.
(ii) Crossover In the crossover, offspring is produced by the interchanging of chromosomes of the parent generation.
(iii) Mutation In mutation, the replacement of existing elements in offspring with other elements takes place.

The reproduction, crossover, and mutation steps are illustrated in Fig. 1.

Step 3 (Termination) When the termination criteria are met, the best program will be the approximate solution to the problem (Fig. 2).

Fig. 2 Algorithm of genetic programming

2.2.2 Structure of MGGP

Usually, symbolic regression uses GP to create a population of trees. Each of these trees is basically a mathematical expression. In MGGP, a multiple tree structure is used to predict the output as shown below.

$$y = C_0 + C_1\,\mathrm{tree}_1 + C_2\,\mathrm{tree}_2 + \ldots + C_n\,\mathrm{tree}_n \tag{3}$$

Each of these trees can be considered a gene. A typical multi-gene model is shown in Fig. 3. This model predicts an output variable using three input variables ($x_1$, $x_2$, $x_3$). The characteristic of this model structure is that, though it contains nonlinear terms (like a square), it is linear in the parameters with respect to the coefficients $C_0$, $C_1$, and $C_2$. The maximum allowable number of genes ($G_{max}$) for a model and the maximum tree depth ($D_{max}$) have a profound effect on the final model structure and are usually specified by the users. From the literature, it is found that a maximum tree depth of 4 or 5 nodes usually gives a compact efficient model. The linear coefficients $C_0$, $C_1$, and $C_2$ are usually evaluated by applying the ordinary least square method to the training data. In the literature, multi-gene symbolic regression is reported as more accurate and efficient than the standard GP.

Fig. 3 A typical MGGP model

2.2.3 Application of MGGP in phycoremediation process modeling

This paper tries to explore the MGGP modeling technique [...] time, inoculum size, and initial concentrations of heavy metals.

2.2.4 Selection of input and output variables for modeling

Since the removal percentage of Cr(VI) has a large impact on wastewater, this parameter is kept as the output variable. All the input variables which can impact the Cr(VI) removal are studied in the literature [8, 34, 37]. Four input variables are finally shortlisted and shown in Table 1.

2.2.5 Modeling through multi-gene genetic programming (MGGP)

The experimental data were taken for GP-based modeling. Random partitioning of the dataset comprising four input variables and one output variable (Table 1) into a training set (80% of whole data) and test set (20% of whole data) was performed. The training set was used to compute the model by maximizing the fitness value, whereas the test data set was used for cross-validation of the expression developed. The main objective of cross-validation is to make the model more generalizable.

The open-source GPTIPS toolkit coupled with subroutines written in MATLAB 2019a was utilized to construct the GP-based model [36]. In this study, the root-mean-squared error (RMSE) between actual and predicted outputs was employed as a fitness function, and the program was run in such a way that the RMSE value was kept as low as possible. Due to the stochastic character of the GP, the software was run 100 times to build the model.

2.3 Optimization through grey wolf optimization (GWO)

Mirjalili et al. [20] developed GWO, which is a stochastic and metaheuristic optimization methodology. This bionic optimization algorithm simulates the rank-based mechanisms and attacking behaviour of the grey wolf pack. The lead wolf helps the other wolves to capture the prey through the surrounding, hunting, and attacking process. This large-scale search methodology is centred on the three best grey wolves, but there is no elimination mechanism. The optimization technique is different from others in terms of modeling. It constitutes a strict hierarchical pyramid. The group size is 5–12 on average. Layer α, consisting of a male and a female leader, is the strongest and most capable individual for deciding the team's predation actions and other activities. Layer β and layer δ are the second and third layers, respectively, in the hierarchy, responsible for assisting α in the behaviour of group organizations. The bottom ranking of the pyramid is occupied by the majority of the total, named ω.
They nique for the phycoremediation process to obtain a are mainly responsible for satisfying the entire pack by meaningful nonlinear relationship between the removal balancing the internal relationship of the populations, of heavy metals with various input parameters like pH, Sarkar et al. Beni-Suef Univ J Basic Appl Sci (2023) 12:27 Page 8 of 17 Fig. 4 Pseudocode of GWO algorithm looking after the young, and maintaining the dominant trial-and-error approach and literature survey and are structure [20]. The social hierarchy, encircling, hunting, shown in Table  3 [36]. A sensitivity analysis was per- attacking prey (exploitation), and searching for prey are formed where the population size, maximum genera- the main key elements of the GWO model (exploration). tion, and maximum number of genes were varied. A The detailed mathematical modeling of GWO has been balance has been made between model complexity and described by Mirjalili et al. [20]. prediction accuracy as shown in Pareto diagram (Fig.  5). This metaheuristic approach is applied in various real- The maximum allowable number of genes ( G ) for a max world problems because of its efficient and simple perfor - model and the maximum tree depth ( D ) have a pro- max mance ability by tuning the fewest operators [9, 16, 18, found effect on the final model structure and are usu - 21, 22]. Recent research in this regard looks forward to ally specified by the users. From the literature, it is found the further development of the optimization algorithm. that a maximum tree depth of 4 or 5 nodes usually gives The detail of the GWO algorithm is depicted in Fig. 4. a compact efficient model. Population size, maximum generations, G andD were kept at 500, 150, 6, and max max 3 Results 4, respectively. 
3.1 Performance of the MGGP model
The main objective of the present study is to generate a closed-form model equation of the phycoremediation process that is accurate, simple, and portable. Eight hundred and ninety-seven data sets qualified for model building. The linear coefficients $C_0$, $C_1$, and $C_2$ are usually evaluated by applying the ordinary least squares method to the training data [36]. Each run consists of more than 10,000 iterations, continuing as long as MGGP is able to improve the $R^2$ value and reduce the RMSE. MGGP stops when the RMSE drops below the threshold value of 0.001 or the maximum execution time is reached. Since MGGP is stochastic in nature, 20 such runs were performed to arrive at the final model. Increasing the number of runs further, up to 100, gave no substantial benefit in prediction accuracy.

Table 3 Run parameters
Run parameter | Value
Population size | 500
Max. generations | 150
Generations elapsed | 150
Input variables | 4
Training instances | 718
Tournament size | 25
Elite fraction | 0.7
Lexicographic selection pressure | On
Probability of Pareto tournament | 0.8
Max. genes | 6
Max. tree depth | 4
Max. total nodes | Inf
ERC probability | 0.1
Integer | 0.6
Crossover probability | 0.84 (High level 0.2, low level 0.8)
Mutation probabilities | 0.14 (Subtree 0.9, Input 0.05, Perturb ERC 0.05)
Complexity measure | Expressional
Function set | TIMES MINUS PLUS MULT3 ADD3

3.2 Developing closed-form model equations
GP generates many promising model equations during its run. Table 4 summarizes the most prominent closed-form equations generated by GP for the removal percentage of Cr(VI). For each run of GP, one Pareto diagram was developed (Fig. 5). The Pareto diagram plots expressional complexity (represented by the number of nodes in the equation) against prediction accuracy, represented by $(1 - R^2)$.

Fig. 5 Pareto diagram of model complexity vs. accuracy

3.3 Controlling model complexity
Regression models frequently suffer from a phenomenon called 'bloat', which may be vertical or horizontal. The tendency to evolve trees with terms that provide little or no performance benefit is known as vertical bloat. In terms of model development, it is connected to the phenomenon of overfitting. Researchers have used different approaches to handle overfitting. Miriyala et al. conducted six tests for overfitting, including an information theory test, a cross-validation test, a held-out sample test, testing with noisy data, and a goodness-of-fit test [19]. In the present study, vertical bloat was reduced by restricting the tree depth and by using a Pareto tournament between expressional complexity and accuracy. The tree depth was fixed at 4 by trial and error, and the resulting Pareto diagram is shown in Fig. 5.

During execution, multi-gene models sometimes acquire additional genes that do not give a significant performance enhancement but add to the model complexity. This phenomenon is known as horizontal bloat. The simplest way to avoid horizontal bloat in multi-gene regression is to limit the number of genes allowed in the model. In this study, the maximum number of genes was set to 6 by trial and error.

3.4 Shortlisting the models
From the candidate model equations (Table 4), which have varying degrees of complexity and accuracy, a suitable model was selected with the following criteria in mind:

(i) Simplicity: The model complexity should be as low as possible.
(ii) Prediction accuracy: The developed model should have low RMSE and high $R^2$.
(iii) The model equation should capture the underlying physics of the process. In other words, a model equation should not be merely a predictive correlation but should also make physical sense for the system under study. This is a prime consideration in developing real-life models. To judge this capability, domain experts' qualitative knowledge about phycoremediation behaviour was collected from the literature and experiments. The operating skill, circumstance, and surveillance of phycoremediation behaviour are summarized in Table 5.

3.5 Shortlisting the model based on phenomenology
All thirteen equations in Table 4 were subjected to the above scrutiny to judge whether each developed equation agrees with real observations.

Table 4 Model equations: expressional complexity/performance characteristics (on training data) of symbolic models on the Pareto front
Model ID | Goodness of fit ($R^2$) | Model complexity | Eq. | Model
555 | 0.917 | 56 | (4) | 2 2 3 0.0957x x − 2.64 exp (−4)x x 3 1 3 − 7.13x − 3.47 exp −4 x x ( ) 3 4 + 0.179x (2x + x ) − 0.0641x + 8.81 4 2 3
560 | 0.915 | 52 | (5) | 2 2 3 0.108x x − 3.01 exp (−4)x x − 10.2x 1 3 3 1 3 − 3.97 exp (−4)x x + 0.189x (2x + x ) + 20 4 4 2 3 3
596 | 0.895 | 50 | (6) | 2 3 2 0.105x − 2.23 exp (−4)x x + 0.0892x x 2 1 1 3 3 + 0.174x (2x + x ) − 0.354x − 0.0154x x x − 24.3 4 2 3 1 3 4 3
2584 | 0.944 | 147 | (7) | 0.36x + 0.18x + 2.65x 1 3 4 − 0.526x (2x − 9.19) − 0.126x x 4 3 4 + 0.18x x x 1 3 4 2 2 + 2.34 exp −6 x x x − 18.4 x − 12.7 ( ) ( )( ) 3 3 3 4 + 3.64 exp (−6)x x x (x − 12.7)(x − x ) + 0.394 3 4 3 2 3 1
2730 | 0.945 | 159 | (8) | 0.173(x + x + x ) + 2.75x − 0.508x (2x − 9.19) 1 2 3 4 4 3 − 0.121x x + 0.173x x x 4 1 3 4 + 3.87 exp (−6)x x (x − 18.4)(x − 12.7)(x − 12.9) 3 4 3 3 3 + 3.47 exp (−6)x x x (x − 12.7)(x − x ) + 0.415 3 4 3 2 3 1
2871 | 0.945 | 152 | (9) | 0.173(x + x + x ) + 2.74x − 0.507x (2x − 9.19) 1 2 3 4 4 3 − 0.121x x + 0.173x x x 4 1 3 4 2 2 + 3.83 exp (−6)x x (x − 18.4)(x − 12.7) 3 3 3 + 3.47 exp (−6)x x x (x − 12.7)(x − x ) + 0.447 1 3 4 3 2 3
4535 | 0.961 | 171 | (10) | 7 2 3 2.68 exp (−8)x − 1.62 exp (−5)x x + x + x x x 3 1 2 4 3 3 3 − 1.73 exp (−4)x (x + 2x − 5.02) + 0.0497x x x 1 4 1 2 4 − 1.81 exp(−7x x x + x ( ) 1 1 4 + 9.38 exp (−5)x x (x + 5.02)(x + x ) 1 3 1 4 + 2.07 exp −5 ( )
4812 | 0.96 | 165 | (11) | 7 2 3 2.86 exp (−8)x − 1.64 exp (−5)x x + x x x x 3 1 2 4 3 3 3 −1.07 exp (−4)x (x + 3x ) + 0.0511x x x 1 4 1 2 4 +1.28 exp (−4)x x (x + x ) 1 1 4 −2.12 exp (−7)x x (x + x ) + 2.1 exp (−5) 1 1 4 3
4924 | 0.96 | 161 | (12) | 7 4 2.69 exp (−8)x − 1.64 exp (−4)x (x + 2x − 5.02) 1 4 3 1 2 3 −1.57 exp (−5)x x + x − 5.02 + 0.0453x x x 1 1 2 4 3 3 −1.86 exp −7x x (x + x ) 1 1 4 +9.24 exp (−5)x x (x + 5.16)(x + x ) 1 3 1 4 +2.19 exp (−5)
7484 | 0.934 | 64 | (13) | 0.644x + 1.77x x − 1.56x x + 0.0115x x 1 2 4 3 4 4 3 2 −0.23x x (x − x ) + 2.35 exp (−6)x x x (x − x ) + 0.0206 1 4 2 3 4 2 3 1 3
8003 | 0.935 | 72 | (14) | 3.12(x − x ) − 13.4x + 0.225x x + 0.0186x x x 4 1 3 1 2 3 4 +1.35 exp (−5)x (x − x )(x + x − 7) 3 3 4 1 3 −0.00576x x (x + x ) + 48.9 1 1 3 3
8122 | 0.926 | 59 | (15) | 2.92x − 11.4x + 1.27 exp (−5)x (x − x ) 4 3 3 4 +0.201x x + 0.0178x x x 1 2 3 4 −0.00522x x (x + x ) + 19.1 1 1 3 3
8591 | 0.922 | 58 | (16) | 2 3 0.706x − 10.1x − 2.83 exp (−4)x x 4 3 1 3 2 2 +0.103x x − 0.00238x x x 1 2 4 3 3 +0.0661x x x + 25.7 2 3 4

Table 5 Rules to select the best model from real observations
Sl. No. | Parameter changed keeping all other parameters constant | Whether process phenomenology matches with model prediction or not
1 | pH ($x_1$) increase/decrease | Yes
2 | Inoculum size ($x_2$) increase/decrease | Yes
3 | Initial concentration of Cr(VI) ($x_3$) increase/decrease | Yes
4 | Time (day) ($x_4$) increase/decrease | Yes
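The trade-off between complexity and accuracy among the Table 4 candidates can be checked numerically. The sketch below is an illustration, not part of the original GPTIPS workflow. It takes the rounded $R^2$ and node counts exactly as listed in Table 4 and keeps the models that no other model beats on both criteria (fewer nodes and higher $R^2$). The function name `pareto_front` is hypothetical.

```python
"""Non-dominated filtering of the Table 4 candidates (complexity vs. R^2)."""
from typing import List, Tuple

Model = Tuple[int, float, int]  # (model ID, R^2, expressional complexity in nodes)

# Values as listed in Table 4 (R^2 is rounded there).
TABLE4: List[Model] = [
    (555, 0.917, 56), (560, 0.915, 52), (596, 0.895, 50),
    (2584, 0.944, 147), (2730, 0.945, 159), (2871, 0.945, 152),
    (4535, 0.961, 171), (4812, 0.96, 165), (4924, 0.96, 161),
    (7484, 0.934, 64), (8003, 0.935, 72), (8122, 0.926, 59),
    (8591, 0.922, 58),
]


def pareto_front(models: List[Model]) -> List[Model]:
    """Keep models that are not dominated: no other model is at least as good
    on both criteria and strictly better on one."""
    front = []
    for mid, r2, nodes in models:
        dominated = any(
            o_nodes <= nodes and o_r2 >= r2 and (o_nodes < nodes or o_r2 > r2)
            for _, o_r2, o_nodes in models
        )
        if not dominated:
            front.append((mid, r2, nodes))
    return sorted(front, key=lambda m: m[2])  # simplest first


front = pareto_front(TABLE4)
front_ids = [m[0] for m in front]
most_accurate = max(front, key=lambda m: m[1])[0]
```

With the rounded values, models 2730 and 4812 tie on $R^2$ with a simpler neighbour (2871 and 4924, respectively) and therefore drop out. The unrounded values behind Table 4 may well separate them. The most accurate model on the front is 4535, which corresponds to Eq. (10).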
All the developed models were passed through rigorous testing. For example, ten test data sets were generated in which all variables were kept at their 50th-percentile values except pH, which was varied from its minimum to its maximum value in 10 equal intervals. These test data were substituted into the equations of Table 4, and the corresponding ten removal percentages were generated. The removal percentage was then plotted against pH. From these plots, observation number 1 of Table 5 was verified. In the same way, all other plots were generated, as shown in Fig. 6. The final model equation (Eq. 10 in Table 4) is as follows:

7 2 3 y = 2.68 exp (−8)x − 1.62 exp (−5)x x + x + x x x 3 1 2 4 3 3 3 − 1.73 exp (−4)x (x + 2x − 5.02) 1 4 + 0.0497x x x − 1.81 exp(−7x x (x + x ) 1 2 4 1 1 4 + 9.38 exp (−5)x x (x + 5.02)(x + x ) 1 3 1 4 + 2.07 exp (−5) (10)

Table 6 shows the coefficient of determination ($R^2$), root-mean-squared error (RMSE), and mean absolute error (MAE) values for training and test data.

Table 6 Performance of MGGP model
Data | $R^2$ | RMSE | MAE
Training | 0.96 | 3.532 | 2.64
Testing | 0.96 | 3.735 | 2.79

3.6 Generation of explainable model equations
One of the major advantages of the GP modeling technique over ANN and SVR techniques is that it generates a closed-form equation (like Eq. 10) that is portable and easy to understand. GP develops a closed-form equation with high prediction capability, but the resulting equation is large and complex and can be difficult to interpret directly. In the present study, a methodology was developed to enhance the interpretability of the developed equations. Figures 6 and 7 summarize this methodology. Figure 6 was created by varying one variable at a time from its minimum to its maximum value (10 steps) while holding the other three input variables at their average values. The Cr(VI) removal equations developed by GP were used to predict the output for each of these simulated test data sets. After plotting, a trend line was drawn through each data series. Based on visual inspection and the $R^2$ value, a trend line (such as a straight line or a polynomial of degree 2, 3, or more) that closely matches the data was chosen.

Fig. 6 Influence of each parameter on the removal of Cr(VI)

Figure 7, on the other hand, depicts the actual phenomenology of the phycoremediation process for the removal of Cr(VI). The figure shows that the removal of Cr(VI) increases with the initial concentration of Cr(VI) up to 20 mg/L; a further increase in initial concentration decreases the removal percentage. pH was varied from 7 to 11, and the removal of Cr(VI) increases with pH and reaches its maximum at pH 9; further increases in pH decrease the removal of Cr(VI). Inoculum size was varied from 2 to 10%, and the removal of Cr(VI) increases monotonically with inoculum size. A kinetic study was performed with respect to the initial concentration (10–25 mg/L), pH (7–11), and inoculum size (2–10%). In all cases of the kinetic study, the removal of Cr(VI) increases with time.

3.7 Optimization through GWO
Once reliable models had been developed, they were subjected to optimization. The purpose of optimization is to find the best combination of input parameters, that is, the one that corresponds to the highest Cr(VI) removal. One of the most significant tasks in any process optimization is to define the search space in which the optimal process conditions are to be found.

GWO code in MATLAB was used to optimize the input parameters (pH, inoculum size, and the number of days) that give the maximum removal of Cr(VI). The optimization conditions and the optimal solutions are shown in Tables 7 and 8, respectively.

4 Discussion
4.1 MGGP modeling
In the current research work, basic arithmetic operators and functions, a population size of 500, a maximum of 150 generations, a maximum tree depth of 4, and a maximum of 6 genes proved to be the best for modeling (Table 3). If higher values are taken for these parameters, the model accuracy may increase, but the complexity of the solutions also increases and the program becomes computationally expensive. As shown in Table 4, there is always a conflict between model complexity (denoted by the number of tree nodes) and model prediction accuracy (denoted by the coefficient of determination, $R^2$). More accurate models are more complex, and vice versa. Complex models are difficult to interpret and unnecessarily overfit the data.

Pareto diagrams plot expressional complexity (represented by the number of nodes in the equation) against prediction accuracy, represented by $(1 - R^2)$. An optimum point with a low value of $(1 - R^2)$ and an acceptable model complexity was taken as the final model. The red triangular point in the Pareto diagram was chosen as the model.

All thirteen equations in Table 4 were subjected to scrutiny to judge whether each developed equation agrees with real observations. All the developed models were passed through rigorous testing. The scrutiny is necessary because in any modeling, besides the accuracy of the model, the developed relationship between the output and input variables should agree with the underlying physics of the process. In most cases, data-driven models with high accuracy end up being only black-box models without any insight into the system. In this procedure, two criteria were kept in mind: first accuracy, and second obedience to the underlying physics.

From the shortlisted model equations, only one model equation for the removal percentage of Cr(VI) was finally selected (Eq. 10). This equation is considered the representative model equation for the removal of metals, as it is highly accurate, obeys the Table 5 observations, and captures the internal physics of the process. Figure 7 shows the actual phenomenology of the phycoremediation process.

Fig. 7 Phycoremediation process phenomenology for the removal of Cr(VI)

The high $R^2$ and low RMSE values for the Cr(VI) removal model (Table 6) show that the predicted and actual output values are comparable and that the models created are trustworthy, fairly accurate, and represent the underlying physics of the phycoremediation process. The high $R^2$ value and low RMSE on unseen test data further suggest that the model generalizes and has accurately learned the nonlinear input–output relationships.

Figure 8 depicts the models' prediction performance on training and testing data. The nearly overlapping actual and predicted curves indicate the model's good prediction accuracy.

The generated model is highly accurate and dependable, as evidenced by Table 6 and Figs. 6, 7, and 8. It also works well with unseen test data.

As seen from Fig. 6, the developed trend lines are very decisive and monotonically increasing/decreasing. As mentioned earlier, they all match the actual observations (Fig. 7) and obey Table 5. In short, the developed models capture the nonlinear relationship between the removal percentage of Cr(VI) and the operating parameters of the phycoremediation process. These trend lines can be used by plant operating engineers to gain insight into how a particular input parameter affects the removal percentage of Cr(VI).

4.2 GWO optimization
In the present study, the initial concentration of Cr(VI) is not under our control and is fixed by the wastewater data, so it is not treated as an optimization parameter in Case 1–Case 6. In these cases, there are only three optimization variables (pH, inoculum size, and the number of days). The input parameters pH ($x_1$), inoculum size ($x_2$), and time ($x_4$) are varied in the ranges 7–11, 2–10%, and 2–14 days, respectively. In Case 7, the initial concentration of Cr(VI) is also considered an optimization parameter. The lower bounds (LB) and upper bounds (UB) explored in these scenarios are shown in Table 7.

Table 7 Lower bounds and upper bounds for some optimization cases
Cases | Bounds | $x_1$ | $x_2$ | $x_3$ | $x_4$
Case 1 (IC: 5 mg/L) | LB | 7 | 2 | 5 | 2
Case 1 (IC: 5 mg/L) | UB | 11 | 10 | 5 | 14
Case 2 (IC: 10 mg/L) | LB | 7 | 2 | 10 | 2
Case 2 (IC: 10 mg/L) | UB | 11 | 10 | 10 | 14
Case 3 (IC: 15 mg/L) | LB | 7 | 2 | 15 | 2
Case 3 (IC: 15 mg/L) | UB | 11 | 10 | 15 | 14
Case 4 (IC: 20 mg/L) | LB | 7 | 2 | 20 | 2
Case 4 (IC: 20 mg/L) | UB | 11 | 10 | 20 | 14
Case 5 (IC: 25 mg/L) | LB | 7 | 2 | 25 | 2
Case 5 (IC: 25 mg/L) | UB | 11 | 10 | 25 | 14
Case 6 (IC: 30 mg/L) | LB | 7 | 2 | 30 | 2
Case 6 (IC: 30 mg/L) | UB | 11 | 10 | 30 | 14
Case 7 (IC: 5–30 mg/L) | LB | 7 | 2 | 5 | 2
Case 7 (IC: 5–30 mg/L) | UB | 11 | 10 | 30 | 14

It is observed from Table 8 that the removal of Cr(VI) increases with the initial concentration of Cr(VI) up to 20 mg/L (Case 1–Case 4). A further increase in Cr(VI) concentration decreases the removal percentage (Case 5–Case 6). In Case 7, pH, IS, IC, and time are the optimization parameters. Table 8 shows that if the wastewater contains Cr(VI) at a concentration of about 14 mg/L, then with pH ($x_1$) kept at 9.48 and IS ($x_2$) at 10%, 99% Cr(VI) removal is possible within 13.4 days.

Table 8 Optimal solutions
Cases | pH ($x_1$) | Inoculum size (%) (v/v) ($x_2$) | Initial concentration (mg/L) ($x_3$) | Time (day) ($x_4$) | Optimum removal of Cr(VI) (%)
Case 1 (IC: 5 mg/L) | 7 | 10 | 5 | 14 | 39.24
Case 2 (IC: 10 mg/L) | 7.23 | 10 | 10 | 14 | 52.04
Case 3 (IC: 15 mg/L) | 8.7 | 10 | 15 | 14 | 73.44
Case 4 (IC: 20 mg/L) | 9.06 | 10 | 20 | 14 | 75.48
Case 5 (IC: 25 mg/L) | 7 | 10 | 25 | 14 | 58.36
Case 6 (IC: 30 mg/L) | 7 | 9.15 | 30 | 8.57 | 42.89
Case 7 (IC: 5–30 mg/L) | 9.48 | 10 | 14.02 | 13.4 | 99.06

5 Conclusions
This study uses the multi-gene genetic programming technique to develop an accurate model of a phycoremediation process from experimental data. The study's key contribution is the development of an extremely precise and explainable model equation that comes with an understanding of the process. The produced model equations are based on the process's underlying physics and correspond to the expert's observations and experiences.
The developed model equations are then used to generate optimal solutions that maximize the removal percentage of Cr(VI). It has also been shown that if the operating parameters are adjusted as prescribed by the optimization results, the removal percentage can be increased up to 99%.

Fig. 8 Actual versus predicted plots of removal of Cr(VI) with (a) training data and (b) test data

Abbreviations
GP: Genetic programming
MGGP: Multi-gene genetic programming
GWO: Grey wolf optimization
ANN: Artificial neural network
SVM: Support vector machine
R&D: Research and development
MISO: Multiple input single output
WHO: World Health Organization
IS: Inoculum size
IC: Initial concentration
RMSE: Root-mean-squared error
$R^2$: Coefficient of determination
$G_{max}$: Maximum number of allowable genes
$D_{max}$: Maximum tree depth
MAE: Mean absolute error
LB: Lower bounds
UB: Upper bounds
AI: Artificial intelligence
ATSDR: Agency for Toxic Substances and Disease Registry
mM: Millimole
EPA: Environmental Protection Agency
USEPA: United States Environmental Protection Agency
$C_0$, $C_1$, $C_2$: Linear coefficients
ppb: Parts per billion

Acknowledgements
The authors sincerely acknowledge the computational and analytical instrumentation support received from DST-FIST, GOI by the Department of Chemical Engineering, National Institute of Technology, Durgapur, India.

Author contributions
BS is the investigator and performed the process optimization, modelling, and paper writing. SS is the investigator and was involved in paper writing. SD contributed to conceptualization, supervision, and reviewing and final editing the paper thoroughly. SKL was involved in conceptualization, supervision, modelling, optimization, and reviewing and final editing the paper thoroughly. All authors read and approved the final manuscript.

Funding
Not applicable.

Availability of data and materials
All the data presented in the article are produced or evaluated during the research.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
There are no competing interests between the authors.

Author details
Department of Chemical Engineering, National Institute of Technology Durgapur, Durgapur 713209, India.

Received: 8 December 2022 Accepted: 14 February 2023

References
1. Abo-Hammour Z, Alsmadi O, Momani S, Arqub OA (2013) A genetic algorithm approach for prediction of linear dynamical systems. Math Probl Eng. https://doi.org/10.1155/2013/831657
2. Abo-Hammour Z, Arqub OA, Momani S, Shawagfeh N (2014) Optimization solution of Troesch's and Bratu's problems of ordinary type using novel continuous genetic algorithm. Discrete Dyn Nat Soc. https://doi.org/10.1155/2014/401696
3. Abu O, Abo-hammour Z (2014) Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm. Inf Sci. https://doi.org/10.1016/j.ins.2014.03.128
4. Anjana K, Kaushik A, Kiran B, Nisha R (2007) Biosorption of Cr(VI) by immobilized biomass of two indigenous strains of cyanobacteria isolated from metal contaminated soil. J Hazard Mater 148:383–386. https://doi.org/10.1016/j.jhazmat.2007.02.051
5. Arqub OA, Abo-hammour Z, Momani S, Shawagfeh N (2012) Solving singular two-point boundary value problems using continuous genetic algorithm. Abstr Appl Anal. https://doi.org/10.1155/2012/205391
6. Barati R, Neyshabouri SAAS, Ahmadi G (2014) Development of empirical models with high accuracy for estimation of drag coefficient of flow around a smooth sphere: an evolutionary approach. Powder Technol 257:11–19. https://doi.org/10.1016/j.powtec.2014.02.045
7. Bilal S, Ali I, Akgu A, Botmart T, Sayed E, Yahia IS (2022) A comprehensive mathematical structuring of magnetically effected Sutterby fluid flow immersed in dually stratified medium under boundary layer approximations over a linearly stretched surface. Alex Eng J 61:11889–11898. https://doi.org/10.1016/j.aej.2022.05.044
8. Dorman L, Rodgers JH, Castle JW (2010) Characterization of ash-basin waters from a risk-based perspective. Water Air Soil Pollut 206:175–185. https://doi.org/10.1007/s11270-009-0094-9
9. Emary E, Zawbaa HM, Hassanien AE (2016) Binary grey wolf optimization approaches for feature selection. Neurocomputing 172:371–381. https://doi.org/10.1016/j.neucom.2015.06.083
10. Floares A, Luludachi I (2014) Inferring transcription networks from data XX. 1 Introduction and background. In: Springer Handbook of Bio-/Neuroinformatics, pp 311–326
11. ATSDR (Agency for Toxic Substances and Disease Registry). 2017. ATSDR's substance priority list. Accessed April 28, 2017. https://www.atsdr.cdc.gov/spl/
12. Gandomi AH, Alavi AH (2012) A new multi-gene genetic programming approach to nonlinear system modeling. Part I: materials and structural engineering problems. Neural Comput Appl 21:171–187. https://doi.org/10.1007/s00521-011-0734-z
13. Grosman B, Lewin DR (2002) Automated nonlinear model predictive control using genetic programming. Comput Chem Eng 26:631–640. https://doi.org/10.1016/S0098-1354(01)00780-3
14. Gupta VK, Rastogi A (2009) Biosorption of hexavalent chromium by raw and acid-treated green alga Oedogonium hatei from aqueous solutions. J Hazard Mater 163:396–402. https://doi.org/10.1016/j.jhazmat.2008.06.104
15. Koza JR, Rice JP (1992) Genetic programming: the movie. The MIT Press, Cambridge
16. Kohli M, Arora S (2018) Chaotic grey wolf optimization algorithm for constrained optimization problems. J Comput Des Eng 5:458–472. https://doi.org/10.1016/j.jcde.2017.02.005
17. Koza JR (1994) Genetic programming: on the programming of computers by means of natural selection. In: Koza JR (ed) A bradford book. MIT Press, Cambridge
18. Lu Q, He ZL, Graetz DA, Stoffella PJ, Yang X (2010) Phytoremediation to remove nutrients and improve eutrophic stormwaters using water lettuce (Pistia stratiotes L.). Environ Sci Pollut Res 17:84–96. https://doi.org/10.1007/s11356-008-0094-0
19. Miriyala SS, Mittal P, Majumdar S, Mitra K (2016) Comparative study of surrogate approaches while optimizing computationally expensive reaction networks. Chem Eng Sci 140:44–61. https://doi.org/10.1016/j.ces.2015.09.
20. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61. https://doi.org/10.1016/j.advengsoft.2013.12.007
21. Mirjalili S, Saremi S, Mirjalili SM, Coelho LDS (2016) Multi-objective grey wolf optimizer: a novel algorithm for multi-criterion optimization. Expert Syst Appl 47:106–119. https://doi.org/10.1016/j.eswa.2015.10.039
22. Mittal N, Singh U, Sohi BS (2016) Modified grey wolf optimizer for global engineering optimization. Appl Comput Intell Soft Comput. https://doi.org/10.1155/2016/7950348
23. Modanli M, Go E, Khalil EM, Akgu A (2022) Two approximation methods for fractional order Pseudo-Parabolic differential equations. Alex Eng J 61:10333–10339. https://doi.org/10.1016/j.aej.2022.03.061
24. Pan I, Pandey DS, Das S (2013) Global solar irradiation prediction using a multi-gene genetic programming approach. J Renew Sustain Energy. https://doi.org/10.1063/1.4850495
25. Pradhan D, Behari L, Sawyer M, Rahman PKSM (2017) Recent bioreduction of hexavalent chromium in wastewater treatment: a review. J Ind Eng Chem 55:1–20. https://doi.org/10.1016/j.jiec.2017.06.040
26. Qasem NAA, Mohammed RH, Lawal DU (2021) Removal of heavy metal ions from wastewater: a comprehensive and critical review. npj Clean Water. https://doi.org/10.1038/s41545-021-00127-0
27. Qu Y, Zhang X, Xu J, Zhang W, Guo Y (2014) Removal of hexavalent chromium from wastewater using magnetotactic bacteria. Sep Purif Technol 136:10–17. https://doi.org/10.1016/j.seppur.2014.07.054
28. Qureshi ZA, Sultana M, Botmart T, Zahran HY, Yahia IS (2022) Mathematical analysis about influence of Lorentz force and interfacial nano layers on nanofluids flow through orthogonal porous surfaces with injection of SWCNTs. Alex Eng J 61:12925–12941. https://doi.org/10.1016/j.aej.2022.07.010
29. Ramanan R, Kannan K, Deshkar A, Yadav R, Chakrabarti T (2010) Enhanced algal CO2 sequestration through calcite deposition by Chlorella sp. and Spirulina platensis in a mini-raceway pond. Bioresour Technol 101:2616–2622. https://doi.org/10.1016/j.biortech.2009.10.061
30. Rangabhashiyam S, Selvaraju N (2015) Adsorptive remediation of hexavalent chromium from synthetic wastewater by a natural and ZnCl2 activated Sterculia guttata shell. J Mol Liq 207:39–49. https://doi.org/10.1016/j.molliq.2015.03.018
31. Sadhu T, Banerjee I, Lahiri SK, Chakrabarty J (2020) Modeling and optimization of cooking process parameters to improve the nutritional profile of fried fish by robust hybrid artificial intelligence approach. J Food Process Eng 43:1–13. https://doi.org/10.1111/jfpe.13478
32. Saha R, Nandi R, Saha B (2011) Sources and toxicity of hexavalent chromium. J Coord Chem. https://doi.org/10.1080/00958972.2011.583646
33. Sajid M, Waqas M, Ahmed N, Akgül A, Rafiq M, Raza A (2023) Numerical simulations of nonlinear stochastic Newell-Whitehead-Segel equation and its measurable properties. J Comput Appl Math 418:114618. https://doi.org/10.1016/j.cam.2022.114618
34. Salama ES, Roh HS, Dev S, Khan MA, Abou-Shanab RAI, Chang SW, Jeon BH (2019) Algae as a green technology for heavy metals removal from various wastewater. World J Microbiol Biotechnol. https://doi.org/10.1007/s11274-019-2648-3
35. Searson D, Willis M, Montague G (2007) Co-evolution of non-linear PLS model components. J Chemom 21:592–603. https://doi.org/10.1002/cem.1084
36. Searson DP, Leahy DE, Willis MJ (2010) GPTIPS: an open source genetic programming toolbox for multigene symbolic regression. In: Proceedings of the International multiconference of engineers and computer scientists 2010, IMECS 2010 I, pp 77–80
37. Sen S, Dutta S, Guhathakurata S, Chakrabarty J, Nandi S, Dutta A (2017) Removal of Cr(VI) using a cyanobacterial consortium and assessment of biofuel production. Int Biodeterior Biodegrad 119:211–224. https://doi.org/10.1016/j.ibiod.2016.10.050
38. Sen S, Rai A, Chakrabarty J, Lahiri SK, Dutta S (2021) Parametric modeling and optimization of phycoremediation of Cr(VI) using artificial neural network and simulated annealing, Algae. Multifarious Appl Sustain World. https://doi.org/10.1007/978-981-15-7518-1_6
39. Shanab S, Essa A, Shalaby E (2012) Bioremoval capacity of three heavy metals by some microalgae species (Egyptian isolates). Plant Signal Behav 7:392–399. https://doi.org/10.4161/psb.19173
40. Tangahu BV, Sheikh Abdullah SR, Basri H, Idris M, Anuar N, Mukhlisin M (2011) A review on heavy metals (As, Pb, and Hg) uptake by plants through phytoremediation. Int J Chem Eng. https://doi.org/10.1155/2011/939161
41. USEPA. 2017. Chromium in drinking water. Accessed April 28, 2017. http://www.epa.gov/dwstandardsregulations/chromium-drinking-water
42. Xu C, Farman M, Hasan A, Akgu A, Zakarya M, Albalawi W, Park C (2022) Lyapunov stability and wave analysis of Covid-19 omicron variant of real data with fractional operator. Alex Eng J 61:11787–11802. https://doi.org/10.1016/j.aej.2022.05.025
43. Yen H, Chen P, Hsu C, Lee L (2017) The use of autotrophic Chlorella vulgaris in chromium (VI) reduction under different reduction conditions. J Taiwan Inst Chem Eng 74:1–6. https://doi.org/10.1016/j.jtice.2016.08.017

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Journal

Beni-Suef University Journal of Basic and Applied Sciences, Springer Journals

Published: Feb 28, 2023

Keywords: Genetic programming; Multi-gene genetic programming; Grey wolf optimization; Artificial intelligence; Cr(III); Cr(VI); Wastewater
