 Research
 Open Access
 Published:
Robust mutant strain design by pessimistic optimization
BMC Genomics volume 18, Article number: 677 (2017)
Abstract
Background
Flux Balance Analysis (FBA) based mathematical modeling enables in silico prediction of systems behavior for genomescale metabolic networks. Computational methods have been derived in the FBA framework to solve bilevel optimization for deriving “optimal” mutant microbial strains with targeted biochemical overproduction. The common inherent assumption of these methods is that the surviving mutants will always cooperate with the engineering objective by overproducing the maximum desired biochemicals. However, it has been shown that this optimistic assumption may not be valid in practice.
Methods
We study the validity and robustness of existing bilevel methods for strain optimization under uncertainty and noncooperative environment. More importantly, we propose new pessimistic optimization formulations: PROOM and POptKnock, aiming to derive robust mutants with the desired overproduction under two different mutant cell survival models: (1) ROOM assuming mutants have the minimum changes in reaction fluxes from wildtype flux values, and (2) the one considered by OptKnock maximizing the biomass production yield. When optimizing for desired overproduction, our pessimistic formulations derive more robust mutant strains by considering the uncertainty of the cell survival models at the inner level and the cooperation between the outer and innerlevel decision makers. For both PROOM and POptKnock, by converting multilevel formulations into singlelevel Mixed Integer Programming (MIP) problems based on the strong duality theorem, we can derive exact optimal solutions that are highly scalable with large networks.
Results
Our robust formulations PROOM and POptKnock are tested with a small E. coli core metabolic network and a largescale E. coli iAF1260 network. We demonstrate that the original bilevel formulations (ROOM and OptKnock) derive mutants that may not achieve the predicted overproduction under uncertainty and noncooperative environment. The knockouts obtained by the proposed pessimistic formulations yield higher chemical production rates than those by the optimistic formulations. Moreover, with higher uncertainty levels, both cellular models under pessimistic approaches produce the same mutant strains.
Conclusions
In this paper, we propose a new pessimistic optimization framework for mutant strain design. Our pessimistic strain optimization methods produce more robust solutions regardless of the innerlevel mutant survival models, which is desired as the models for cell survival are often approximate to realworld systems. Such robust and reliable knockout strategies obtained by the pessimistic formulations would provide confidence for invivo experimental design of microbial mutants of interest.
Introduction
Wholegenome highthroughput profiling techniques have enabled the systematic study of biological systems at the genome scale [1, 2]. In particular, systems models and computational methods for analyzing and controlling genomescale metabolic networks have greatly advanced the field of metabolic engineering [3]. With the better understanding of the systems behavior of microbial metabolism, metabolic engineering based on the metabolic network models can help predict metabolic phenotypes [4, 5] and derive engineering strategies for strain design by manipulating the native microbial pathways to produce chemicals of commercial and biomedical benefits [6–11].
Mathematical modeling of metabolism often focuses on steadystate behavior, especially when longterm metabolic dynamics is of interest, as accurate reconstruction of genomescale kinetic models is challenging when considering the large model space and parameter uncertainties. Constraintbased approaches based on the reaction stochiometry, notably Flux Balance Analysis (FBA) by Linear Programming (LP) formulations, study genomescale dynamics by massbalance equations at steady states to understand and predict macrolevel microbial behavior in the presence of perturbation, for example caused by mutations or environmental changes [12–15]. By adding thermodynamic and flux capacity constraints, FBA often models steadystate behavior by assuming that the cell growth flux needs to be maximized based on biomass composition [16].
Many computational approaches have been proposed in this computational framework for in silico prediction of potentially feasible metabolic phenotypes and evaluation of theoretical limits of metabolic performance after knocking out certain genes or reactions for strain design [14, 16]. Metabolic engineering with such computationally derived knockout strategies has shown to be effective to achieve overproduction of biochemicals of interest [17]. Those strain optimization methods are generally modeled as bilevel optimization problems that seek for maximum overproduction of a desired biochemical at the outer level under the condition that mutant cells are still alive, modeled as the innerlevel optimization problem. For example, OptKnock [17], ROOM (Regulatory On/Off Minimization) [18], and MOMAKnock [19] all fall under this category with different mutant cell survival models at the inner level detailed in the “Background” section.
These existing bilevel formulations for microbial strain design have inherent assumptions that nature will always cooperate with the human desire to select the mutants that serve the outerlevel engineering objective the best, namely they assume that a cooperative environment exists between the outerlevel (human) and innerlevel (microbial strain) decision makers. However, in practice, surviving mutant strains may not always fit the best with the engineering goal. In addition, the model assumptions for the cell survival at the inner level are often the approximations to the realworld scenarios, which may result in the derived knockout strategies not overproducing the predicted amount of desirable biochemicals. Therefore, the robustness and viability of these predicted knockout strategies may need to be reevaluated. For example, if the perturbed strains do not cooperate with the knockout implementer, they may do very opposite to the engineering objective and these optimistic bilevel strategies may not work in practice. In [20], we have shown that OptKnockderived knockout strategies may produce the desired chemicals at levels much lower than the expected ones, when the aforementioned two assumptions are relaxed.
In this paper, we innovate a computational framework to predict robust knockout strategies for overproduction of desirable chemicals with different innerlevel mutant cell survival models. Specifically, we develop a pessimistic optimization formulation [21] that considers the uncertainty of the cell survival models at the inner level and the cooperation between the outer and innerlevel optimization decisions. Figure 1 illustrates our pessimistic bilevel optimization framework for identifying the knockouts to have the maximum overproduction of a desired chemical as a bioengineering objective under the worstcase or leastfavorable scenario. The innerlevel cellular model has a competing objective function to maintain the cell survival. According to the cellular model, different objective functions can be chosen. We study such pessimistic formulations with both the cell survival criteria of (1) minimization of significant flux changes with respect to wildtype flux values as in ROOM [18]; and (2) maximization of biomass growth as in OptKnock [17]. Based on the corresponding pessimistic formulations, which we call PROOM and POptKnock respectively, we will optimize the desired overproduction in the least favorable situation (i.e., the noncooperative situation) and investigate the sensitivity of resulting knockouts with respect to cell response and uncertainty introduced by the innerlevel cell survival models. Since the innerlevel cellular model is linear given the outerlevel binary decision variables, we can employ the LP duality theory and convert the pessimistic bilevel optimization problem to a singlelevel MIP problem by applying two dual transformations as described in Methods. Experimental results on both a core E. coli metabolic network and a genomescale iAF1260 network [22] show that our pessimistic formulations can generate more robust knockout strategies compared to the existing bilevel optimization methods for strain design. Regardless of the choice of innerlevel surrogate functions for cell survival, PROOM and POptKnock produce consistent and stable knockout strategies for the targeted biochemical overproduction as the formulations are specifically designed to take care of the model uncertainty.
Background
Before we present pessimistic optimization formulations to derive robust knockout strategies for strain design, we briefly summarize the background and important mathematical notations for FBA and the bilevel optimization models for deriving knockouts, including OptKnock [17] and ROOM [18].
FBA
FBA is an LP problem for the analysis of stoichiometricbased metabolic network models of involved reactions at the genome scale. A linear objective function is minimized or maximized subject to massbalance, thermodynamic and capacity constraints, with respect to reaction fluxes in a vector form v. The LP formulation of FBA can be expressed as:
Under the steadystate assumption, massbalance constraints constitute a system of linear equations where the weighted sum of fluxes, based on stoichiometric coefficients given in a matrix form S, is 0. Thermodynamic and capacity constraints are defined as lower and upper bounds on reaction fluxes. In the FBA framework, maximization of biomass growth is often adopted as the objective function for modeling cell survival, where c becomes a vector with all values of 1 for the reactions corresponding to the biomass formation.
OptKnock
FBA enables efficient computational design of beneficial genetically engineered microbial strains. The pioneering work by the authors of [17], OptKnock, searches for potential genetic perturbations (e.g., gene or reaction knockouts) for redirection of metabolic flux to overproduce desired biochemicals and maintain cellular growth. This bilevel optimization problem captures the engineering objective at the outer level (e.g., to maximize the overproduction) while the innerlevel problem models a cellular fitness objective (e.g., maximization of biomass growth). The mathematical programming formulation of OptKnock can be expressed as follows:
where I and J are the sets of metabolites and reactions, respectively. In this model, v _{ j } represents the flux of reaction j in J. Each element of the matrix S is the stoichiometric coefficient S _{ ij } of metabolite i in reaction j. The outerlevel binary decision variable z _{ j } is 1 if the corresponding reaction flux v _{ j } is active and 0 otherwise. The overproduction of a chemical of interest is maximized at the outer level allowing K possible knockout reactions. The innerlevel cell survival model is based on steadystate analysis with FBA constraints. Depending on the availability of nutrients or the maximal fluxes that can be supported by enzymatic pathways [13], \(v_{j}^{min}\) and \(v_{j}^{max}\) are the lowest and highest possible reaction fluxes for the reaction j, respectively. The glucose consumption rate v _{ glc } is often set to a fixed value as \(v_{glc\_uptake}\), and \(v_{biomass}^{target}\) is the minimum level of biomass production. As the biomass growth is a linear objective function of metabolic reaction fluxes, strong duality of the innerlevel optimization helps to convert the original bilevel optimization problem into a Mixed Integer Linear Programming (MILP) problem, which can be solved efficiently for largescale metabolic networks [17, 23].
ROOM
Observed experimental flux values reported in [24] showed that maximizing biomass reaction flux as a surrogate function for the most probable physiological state of the metabolic model is indeed effective for wildtype strains as they have been exposed to longterm evolutionary pressure. However, it may not be valid for the engineered mutants without such a longterm progress. Therefore, it calls for other realistic cell survival models for mutants. ROOM [18] is one of such models, in which binary variables y _{ j }’s are introduced to capture significant (up/down regulated) or insignificant reaction flux changes. Instead of maximizing the biomass growth, ROOM [18] aims to minimize the number of significant flux changes. Relaxing the binary variables y _{ j }’s in the original MILP formulation leads to the LP variant of ROOM:
in which w _{ j } is the wildtype flux values that can be solved by FBA. In this bilevel optimization model (2), the innerlevel cellular objective is to minimize the flux changes. The inequalities with y _{ j }’s constrain the flux values not to deviate from the wildtype flux values. We again can employ the strong duality condition to convert the bilevel optimization problem into a MILP to solve (2).
Methods
The above optimistic bilevel optimization methods effectively model the interacting objectives of the outerlevel knockout implementer and the innerlevel microbial cells. However, the effectiveness of these optimistic formulations depends on the inherent assumption that the outerlevel engineering objective and innerlevel cellular fitness function behave cooperatively by selecting the innerlevel solutions in favor of the outerlevel optimization problem. In practice, when the innerlevel problem either does not faithfully reflect cellular fitness objectives or has nonunique solutions, there is no guarantee that the cell response will be in cooperation with the engineering objective to maximize the desirable biochemical product formation as we recently found in [20]. The main questions that we would like to address here are:

1.
How robust are the derived knockout strategies based on these optimistic bilevel optimization formulations?

2.
Does the robustness depend on how the innerlevel optimization problem approximates mutant cell objectives?

3.
Can we formulate new optimization problems that derive more robust solutions?
We answer the first two questions by allowing the innerlevel problem to take nonoptimistic solutions as we have done in [20] to evaluate the knockout strategies derived by OptKnock and ROOM. More importantly, we propose a novel pessimistic bilevel optimization framework in this paper to derive robust knockout strategies, which consider the uncertainty introduced by the innerlevel models from the aforementioned noncooperative and nonunique issues.
We now present our pessimistic optimization formulations for deriving robust mutant strains and provide the detailed derivation of the optimization solutions to our pessimistic models, in which we maximize the desired overproduction for the worstcase scenario due to uncertainty and noncooperative environment with different innerlevel mutant cell survival models.
Pessimistic bilevel optimization
As discussed, due to the innerlevel model uncertainty and noncooperative issues, the existing optimistic bilevel strain optimization methods may not be able to produce reliable results when there exist nonunique solutions to the innerlevel problem. In order to overcome this limitation, we propose a pessimistic optimization framework to optimize for the worstcase scenario under uncertainty and noncooperative environment. The mathematical programming formulation of pessimistic bilevel optimization can be represented in the following general form [21]:
where x and y are the outerlevel and innerlevel decision variables with the corresponding feasible sets \(\mathbb {X}\) and \(\mathbb {Y}\). In the above formulation, F represents the outerlevel objective function, \(\mathbf {F}:\mathbb {X}\times \mathbb {Y}\rightarrow \mathbb {R}\), and f represents innerlevel objective function, \(\mathbf {F}:\mathbb {X}\times \mathbb {Y}\rightarrow \mathbb {R}\). G and g represent the inequality constraint functions at the outer and inner levels, respectively. Y(x) denotes the set of optimal solutions to the innerlevel problem for a given x, which may contain multiple elements. Solving such a pessimistic bilevel optimization problem is computationally very difficult. Even a linear optimistic bilevel problem, where both the outerlevel and the innerlevel problems are linear programs, is theoretically NPhard [25]. In our study, to solve the more difficult pessimistic bilevel optimization problem, we first employ the LP duality theory based on the fact that the innermost level problem is a linear program. It converts the pessimistic bilevel model into a standard bilevel problem, which enables us to fully make use of existing bilevel optimization algorithms. For example, we transform the standard bilevel model into a single level mixed integer programming (MIP) problem using the strong duality theory, which can be solved through the commercial MIP solver CPLEX. As those operations are rather simple, we make a challenging pessimistic bilevel problem practically solvable even for largescale cases.
We mention that for PROOM (4), there are J binary variables in the outer level, 2J continuous variables and O(I+5J) constraints in the inner level. By taking the dual of the innermost problem, we convert (4) into a standard bilevel problem (7), whose innerlevel problem has O(I+7J) variables and O(I+7J) constraints. Comparing to the traditional bilevel ROOM model (2), which has the identical outerlevel problem, and 2J continuous variables and O(I+5J) constraints in the inner level, (7) does not have a much larger or more complicated structure. Hence, it can be expected that the computational complexity of (7) will not be drastically more than that of the traditional ROOM model. Moreover, our numerical study shows that the single level MIP reformulation of (7) for a largerscale network can be computed by CPLEX in a reasonable time, which indicates the efficiency of our proposed solution strategy.
Inspired by this pessimistic optimization framework, we propose pessimistic models for mutant strain optimization under model uncertainty. We note that the notations appearing in the following pessimistic formulations have the same definitions as in the “Background” section.
Pessimistic strain optimization I: PROOM
In the following, we propose the pessimistic formulation based on the original ROOM model (2) in the same minimax flavor as in (3):
Originally introduced in [21], the εapproximation extension of the above pessimistic formulation is flexible when model uncertainty needs to be considered for bilevel decision making. Specifically, the εapproximation of the pessimistic problem is to allow a proportional gap of ε for the innerlevel objective function value from the optimal value that the actual knockout solutions would take. When ε=0, we have the original pessimistic formulations assuming that the innerlevel models are faithful, but innerlevel decision variables may act against outerlevel engineering objectives. Higher ε values reflect that the innerlevel mutant strains have a higher tolerance level for innerlevel model uncertainty when a noncooperative decision is taken against the outerlevel engineering objectives. Such an εapproximation of the pessimistic problem can be written as follows:
where φ(v ^{∗},y ^{∗}) is the innerlevel function value changing with respect to the innerlevel decision variables v ^{∗} and y ^{∗}. As the innerlevel problem is LP, we can convert the problem into its dual representation by the strong duality theorem as in [17, 26], which is:
where u _{ i } denotes the dual variable associated with the massbalance constraint, \(\sum _{j} S_{ij}v_{j} = 0\) for metabolite i, glc is the dual variable associated with the glucose uptake constraint, μ _{ biom } is the dual variable associated with the minimum biomass threshold constraint, and μ _{ m i n,j } and μ _{ m a x,j } are the dual variables for the two directions of the inequality constraint \(v_{j}^{min}z_{j} \leqslant v_{j} \leqslant v_{j}^{max}z_{j}\) for each reaction j. Finally, μ _{ m i n2,j } and μ _{ m a x2,j } are the dual variables associated with the constraints \(v_{j}y_{j}\left (v_{j}^{min}w_{j}\right) \geq w_{j}\) and \(v_{j}y_{j}\left (v_{j}^{max}w_{j}\right)\leq w_{j}\), denoting the flux changes with respect to wildtype flux values, respectively, and a _{ j } is the dual variable associated with the upper bound constraint on y _{ j }. Note that the constraint \(v_{j}^{min}z_{j} \leqslant v_{j} \leqslant v_{j}^{max}z_{j}\) links the outermost level binary decision variables z _{ j } and innerlevel decision variables v _{ j }.
By aggregating the innerlevel constraints to the outer level with those introduced by its dual in (6) with the adopted εapproximation, the final pessimistic formulation PROOM aims to solve the following maxmin problem;
In order to solve the maxmin problem above, we use the strong duality theorem again to transform the minimization problem into its dual maximization representation, resulting in its equivalent singlelevel MIP as:
Since some knockout strategies z _{ j } may not have corresponding feasible solutions to the inner problem of (7), causing the corresponding dual solutions to be unbounded, new continuous decision variables s _{ j } and r _{ j } are added here in order to enforce the feasibility of the inner primal problem of (7) to avoid such degenerated cases. The bilinear terms in the objective function and the constraints in (8) can be further linearized by using the commonly adopted bigM method. This singlelevel MILP problem can be solved efficiently for largescale problems by available solvers.
Pessimistic strain optimization II: POptKnock
In OptKnock [17], the innerlevel mutant survival model is approximated by the maximum biomass growth condition. In order to optimize for the overproduction of the target biochemicals in worsecase scenario, we can similarly write the corresponding pessimistic formulation as what we have derived for PROOM:
By adopting the same εapproximation to the pessimistic problem in (9) to incorporate the innerlevel model uncertainty, we have the POptKnock formulation as:
where φ(v ^{∗}) is the innermost level function value of a given innerlevel decision variable v ^{∗}. Similar to the approach we have employed for PROOM, we can convert this optimization problem (10) to a singlelevel MIP. We note that the singlelevel MIP equivalent of the pessimistic formulation POptKnock is indeed similar to the work in [26] when the tolerance level ε is chosen as 0. However, we emphasize that our proposed pessimistic framework is more general and can be extended to different bilevel strain optimization formulations that employ various cellsurvival models.
Results and discussion
In this section, we first evaluate the knockout solutions by optimistic bilevel optimization methods on a core E. coli metabolic network [27] when the outer and innerlevel decision makers are not cooperative. We then test our robust strain optimization methods, namely PROOM and POptKnock, derived in the “Methods” section, on the core network and a large E. coli metabolic network, iAF1260 [22], for overproduction of Succinate. The models are optimized using the MILP solver CPLEX 12.6.3 [28].
Succinate overproduction on AntCore metabolic network
We derive knockout solutions for a core E. coli metabolic network model [27] with 74 chemicals and 75 reactions for Succinate overproduction. The network model is from the OptKnock package [17], in which the glucose uptake is set at a fixed value of 100mmol/gDW.hr and the minimum biomass is set as 5 mmol/gDW.hr. Since the glucoseuptake rate is fixed, the Succinate yield is equivalent to the corresponding flux value considering steadystate stoichiometry constraint. The wildtype flux distribution is computed by maximizing the biomass in the FBA framework. In this set of experiments, we set the allowable knockout numbers K=3, 4, and 5. All the reported experiments are based on the aerobic condition.
We first evaluate the robustness of two optimistic bilevel programs OptKnock (1) and ROOM (2) by computing the performance of least favorable Succinate overproduction under uncertainty for the derived knockouts z ^{∗} by solving (1) and (2). As discussed in the “Methods” section, we introduce a parameter ε as the model tolerance level to capture how faithful the innerlevel mutant survival model is and whether the mutants will cooperate with the outerlevel overproduction objectives. By allowing the gap ε between the realistic responses and the optimal innerlevel objective function value, the increase in ε reflects the higher model tolerance by approximating the innerlevel model uncertainty and noncooperativeness. The pessimistic Succinate overproduction rate evaluations of optimistic knockout solutions are derived by finding the worstcase solutions of reaction flux values in εapproximation sets.
The computed evaluation values for different ε values (0≤ε≤0.4) are plotted in cyan and red lines in Fig. 2 with different numbers of allowed knockouts. It is clear that with higher model uncertainty (larger ε values), both ROOM and OptKnock strategies lead to smaller Succinate overproduction compared to the predicted optimistic overproduction rates. We also note that the Succinate rates drop very quickly even with small ε, which implies that these optimistic knockout strategies are not robust. Finally, when the innerlevel mutant survival models are not as faithful as desired, the derived knockout strategies may not be effective at all. For example, when K=3 and 4, the knockouts suggested by OptKnock are too “optimistic” as the corresponding evaluation Succinate rates go to 0 as shown in Fig. 2a and b. Compared with the results from ROOM, this validates that the minimum flux change model may be a better mutant cell survival model than the maximum biomass growth model adopted in OptKnock. These experiments bring out that the cellular and engineering objectives may behave in opposite directions and the optimistic knockout strategies may not work in practice.
With the same εapproximation model uncertainty, we now compare the corresponding derived knockout strategies based on our proposed pessimistic models: POptKnock and PROOM. Figure 2 illustrates the εoptimal Succinate overproduction rates of the derived knockout solutions by POptKnock and PROOM in blue and black lines, respectively. Obviously, our pessimistic strategies derive more robust solutions, for which the worstcase or least favorable Succinate rates for different ε values are consistently on top of the rates from optimistic strategies. The least favorable Succinate rates decrease monotonically with increasing ε corresponding to higher model uncertainty. More interestingly, as ε increases, both POptKnock and PROOM converge to the stable Succinate overproduction rates. In fact, they derive the same knockout reactions as provided in Table 1. This clearly shows that our pessimistic formulations can produce consistent and robust mutant strains with respect to the innerlevel mutant survival model uncertainty. As a side note, when K=5, the pessimistic Succinate values produced by PROOM and its optimistic counterpart are almost the same for different tolerance levels, probably because the surviving mutants are more restricted with the increasing number of knockout reactions. This also shows that the minimum flux change is a better mutant survival model.
In Table 1, the succinate overproduction rates are given by the pessimistic formulations PROOM and POptKnock for the tolerance values ε in which both models became stable (ε=0.4) as shown in Fig. 2. Although the pessimistic models provide different Succinate values for different ε values in Fig. 2, the knockout strategies and the Succinate values for both PROOM and POptKnock are identical when they reach stability. When K=3, one of the knockouts suggested by PROOM and POptKnock involves competing byproduct metabolism pathways for succinate such as 6PhosphoDgluconate (6pg) and Ribulose 5phosphate (ru5p). With K=4, the reaction decomposing succinate (suc) is also eliminated. As more knockouts are allowed, the resulting succinate production can be further increased. We should also note that the removal of an important reaction for Transhyrogenation (nadh ⇔nadph) may cause significant reduction in biomass flux value. For the knockout strategies suggested by ROOM and OptKnock, when assuming the optimistic environment, the succinate rates are larger than the values suggested by pessimistic knockouts as expected since optimistic formulation provides an upper bound for the pessimistic formulation. From all the experiments with different K’s, we demonstrate that, when we formulate the mutant optimization problem as pessimistic optimization by incorporating the model uncertainty, both POptKnock and PROOM can suggest consistent and robust knockouts, making the mutant strain optimization problem more reliable, accurate, and independent from the choice of innerlevel surrogate functions for mutant cell objectives.
Succinate overproduction on iAF1260 network
We have shown that the derived knockout strategies based on optimistic methods may not be robust and often “overoptimistic” by the experiments on the core E. coli metabolic network, while our proposed methods PROOM and POptKnock based on pessimistic optimization for mutant strain design have achieved robust and more practical knockout strategies. We further test PROOM and POptKnock to derive mutant strains for the overproduction of Succinate on a large E. coli metabolic network model, iAF1260 [22], which has 1658 metabolites and 2936 reactions including the pseudo reactions required for the computational model, also from the OptKnock package [17]. The glucose uptake rate is fixed at 100 mmol/gDW.hr, and the minimum biomass value is also set to 5 mmol/gDW.hr. The reported experiments are based on the anaerobic environment. We allow K=3 knockout reactions on this large network.
Figure 3 provides the εoptimal Succinate flux rates, which are the outerlevel objective function values based on two pessimistic models of PROOM and POptKnock at different ε values. When there is no model uncertainty (ε=0), PROOM clearly provides higher Succinate rate than the value POptKnock suggests. This again suggests that modeling mutant cell survival by minimization of flux changes plays a role in favor of engineering objectives with improved targeted productions. At higher ε values, PROOM and POptKnock reach stability, and they derive the same knockout reactions as provided in Table 2 as we also observed in the experiments on the E. coli core metabolic network. If we consider the fact that the innerlevel cell survival models are approximate to the realworld systems, incorporating this model uncertainty with εapproximation in our formulated pessimistic mutant strain optimization methods, PROOM and POptKnock, has a key role to achieve robust knockout solutions that are less affected by these approximate cell survival models.
Table 2 provides the predicted knockouts by PROOM and POptKnock when K=3. The minimum and maximum Succinate flux rates are also given for the mutant and also wildtype strains. The Succinate flux rates of pessimistic models are given for two different ε values: ε=1 denoting the highest tolerance level for the innerlevel model uncertainty, and ε=0 denoting 0 tolerance level. In Table 2, both PROOM and POptKnock predict three identical knockouts that yield a minimum production rate of the Succinate that is higher than the predicted production rate in the wildtype strains. It validates that our pessimistic mutant strain optimization methods guarantee a higher Succinate production rate than that of wildtype strains even in the worstcase scenario. Furthermore, minimum and maximum Succinate rates given the predicted knockouts define a larger range for the OptKnock model, which may indicate the innerlevel mutant cell survival model by biomass maximization is less faithful than ROOM’s minimum flux change model.
We have also included the derived knockouts by ROOM and OptKnock and corresponding minimum and maximum succinate rates in Table 2. The mutant strain with the knockouts obtained by ROOM produces 0 minimal succinate flux value with the highest innerlevel model uncertainty. However, PROOM succeeds in getting higher production rates even when ε=1. The minimal succinate rate when ε=0 in ROOM is greater than the minimal succinate rate for PROOM which is 8.09. This is because the reported pessimistic knockouts are identified when the model uncertainty is taken as ε=1. We note that the mutant with the pessimistic knockouts identified by PROOM when ε=0 actually produces the minimal succinate rate as 31.36. The knockouts captured by OptKnock are same as the ones identified by pessimistic formulations, leading to the same mutant with the same succinate production rates as POptKnock. Pessimistic formulations identified the succinate dehydrogenase reaction (SUCDi) as one of the suggested knockouts, which directly consumes succinate (succ), and the reaction pyruvate formate lyase (PFL). These have been reported as effective knockout strategies in [17, 29].
As demonstrated by the results from both the small core E. coli metabolic network and large iAF1260 network, our pessimistic formulations for strain optimization produce robust and stable knockout solution strategies under model uncertainty. Especially when comparing the results from PROOM and POptKnock, both models derive the same stable knockout solutions regardless of the innerlevel surrogate functions for mutant cell survival. Such consistency is critical when considering the translation of computationally derived results into in vivo experiments in practical metabolic engineering.
Conclusions
In this paper, we have proposed a new pessimistic optimization framework to identify optimal knockout strategies for maximum targeted bioproduction under model uncertainty. We specifically have investigated two cell survival models and presented two corresponding pessimistic models to derive robust knockout reactions to achieve maximum biochemical overproduction: (1) PROOM under the minimum flux change condition; and (2) POptKnock under the maximum biomass condition. The experiments on both the core E. coli metabolic network [27] and largescale iAF1260 network [22] have demonstrated that the pessimistic models for strain optimization derive robust and stable metabolic strain perturbation strategies through genomescale steadystate stoichiometric modeling. Formulating the strain optimization problem based on pessimistic optimization considering the model uncertainty leads to consistent and reliable solutions regardless of the innerlevel mutant survival models, which can provide high confidence for future in vivo experimental design of promising microbial mutants that benefit the human society.
References
 1
Barrett CL, Kim TY, Kim HU, Palsson BØ, Lee SY. Systems biology as a foundation for genomescale synthetic biology. Curr Opin Biotechnol. 2006; 17(5):488–92.
 2
Esvelt KM, Wang HH. Genomescale engineering for systems and synthetic biology. Mol Syst Biol. 2013; 9(1):641.
 3
Palsson BØ. Systems Biology: Constraintbased Reconstruction and Analysis. Cambridge: Cambridge University Press; 2015.
 4
McCloskey D, Palsson BØ, Feist AM. Basic and applied uses of genomescale metabolic network reconstructions of Escherichia coli. Mol Syst Biol. 2013; 9(1):661.
 5
Terzer M, Maynard ND, Covert MW, Stelling J. Genomescale metabolic networks. Wiley Interdiscip Rev Syst Biol Med. 2009; 1(3):285–97.
 6
Chotani G, Dodge T, Hsu A, Kumar M, LaDuca R, Trimbur D, Weyler W, Sanford K. The commercial production of chemicals using pathway engineering. Biochim Biophys Acta Protein Struct Mol Enzymol. 2000; 1543(2):434–55.
 7
Lee JW, Na D, Park JM, Lee J, Choi S, Lee SY. Systems metabolic engineering of microorganisms for natural and nonnatural chemicals. Nat Chem Biol. 2012; 8(6):536–46.
 8
Broa C, Regenberga B, Förster J, Nielsen J. In silico aided metabolic engineering of saccharomyces cerevisiae for improved bioethanol production. Metab Eng. 2006; 8:102–11.
 9
Lu J, Sheahan C, Fu P. Metabolic engineering of algae for fourth generation biofuels production. Energy Environ Sci. 2011; 4:2451–66.
 10
Luengo JM, Garcia B, Sandoval A, Naharro G, Olivera ER. Bioplastics from microorganisms. Curr Opin Microbiol. 2003; 6:251–60.
 11
Ohta K, Beall DS, Mejia JP, Shanmugam KT, Ingram LO. Metabolic engineering of klebsiella oxytoca m5a1 for ethanol production from xylose and glucose. Appl Environ Microbiol. 1991; 57:2810–1815.
 12
Burgard AP, Maranas CD. Probing the performance limits of the Escherichia coli metabolic network subject to gene additions or deletions. Biotech Bioeng. 2001; 74(5):364–75.
 13
Segre D, Vitkup D, Church GM. Analysis of optimality in natural and perturbed metabolic networks. Proc Natl Acad Sci. 2002; 99(23):15112–7.
 14
Varma A, Palsson BØ. Metabolic flux balancing: Basic concepts, scientific and practical use. Nat Biotechnol. 1994; 12:994–8.
 15
Lee JM, Gianchandani EP, Papin JA. Flux balance analysis in the era of metabolomics. Brief Bioinform. 2006; 7(2):140–50.
 16
Edwards JS, Palsson BØ. The escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc Natl Acad Sci. 2000; 97(10):5528–33.
 17
Burgard AP, Pharkya P, Maranas CD. Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotech Bioeng. 2003; 84(6):647–57.
 18
Shlomi T, Berkman O, Ruppin E. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc Natl Acad Sci USA. 2005; 102(21):7695–700.
 19
Ren S, Zeng B, Qian X. Adaptive bilevel programming for optimal gene knockouts for targeted overproduction under phenotypic constraints. BMC Bioinforma. 2013; 14(2):1.
 20
Apaydin M, Zeng B, Qian X. A reliable alternative of OptKnock for desirable mutant microbial strains. In: IEEEEMBS International Conference on Biomedical and Health Informatics (BHI).2016. p. 573–576. doi:10.1109/BHI.2016.7455962.
 21
Zeng B. Easier than we thoughta practical scheme to compute pessimistic bilevel optimization problem. 2015. Available at SSRN: https://ssrn.com/abstract=2658342 or http://0dx.doi.org.brum.beds.ac.uk/10.2139/ssrn.2658342.
 22
Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BØ. A genomescale metabolic reconstruction for Escherichia coli K12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007; 3(1):121.
 23
Kim J, Reed JL, Maravelias CT. Largescale bilevel strain design approaches and mixedinteger programming solution techniques. PLoS ONE. 2011; 6(9):24162.
 24
Burgard AP, Maranas CD. Optimizationbased framework for inferring and testing hypothesized metabolic objective functions. Biotech Bioeng. 2003; 82(6):670–7.
 25
Deng X. Complexity issues in bilevel linear programming. In: Multilevel Optimization: Algorithms and Applications. US: Springer: 1998. p. 149–64.
 26
Tepper N, Shlomi T. Predicting metabolic engineering knockout strategies for chemical production: accounting for competing pathways. Bioinformatics. 2010; 26(4):536–43.
 27
Antoniewicz MR, Kraynie DF, Laffend LA, GonzalezLergier J, Kelleher JK, Stephanopoulos G. Metabolic flux analysis in a nonstationary system: fedbatch fermentation of a high yielding strain of E. coli producing 1, 3propanediol. Metab Eng. 2007; 9(3):277–92.
 28
IBM ILOG CPLEX Optimizer 12.6.3. 2015, IBM Corporation.
 29
Stols L, Donnelly MI. Production of succinic acid through overexpression of NAD (+)dependent malic enzyme in an Escherichia coli mutant. Appl Environ Microbiol. 1997; 63(7):2695–701.
Acknowledgments
This work was partially supported by CCF1553281 and CMMI1642514 from the National Science Foundation.
Funding
The publication costs of this article was funded by Award CCF1553281 from the National Science Foundation.
Availability of data and materials
The C++ source code and relevant network data are available upon request.
About this supplement
This article has been published as part of BMC Genomics Volume 18 Supplement 6, 2017: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM) 2016: genomics. The full contents of the supplement are available online at https://0bmcgenomicsbiomedcentralcom.brum.beds.ac.uk/articles/supplements/volume18supplement6.
Author information
Affiliations
Contributions
Conceived and designed the experiments: MA, LX, BZ, XQ. Designed and Implemented the algorithm: MA, LX, BZ, XQ. Performed the experiments: MA. Analyzed the results: MA, LX, BZ, XQ. Wrote the paper: MA, LX, BZ, XQ. Supervised the study: BZ, XQ. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Apaydin, M., Xu, L., Zeng, B. et al. Robust mutant strain design by pessimistic optimization. BMC Genomics 18, 677 (2017). https://0doiorg.brum.beds.ac.uk/10.1186/s1286401740257
Published:
DOI: https://0doiorg.brum.beds.ac.uk/10.1186/s1286401740257
Keywords
 Strain optimization
 Pessimistic bilevel optimization
 Stoichiometric models