Different optimization strategies for the optimal control of tumor growth

Mutations is the major consequence of abnormal behavior of the genetic material DNA. Mutations affect normal growth and division i.e., mutations cause uncontrollable growth of cells, these mutations are caused by two reasons which are the signals telling cells to begin dividing are left on continuously or growth suppressing signals telling cells not divide are turned off [19]. The process starts from an evolutionary process which may give rise to abnormal DNA when a cell duplicates its genome due to defects in tumor suppressor or DNA mismatch repair genes. Tumors releases hormones that alter body function, they can grow and interfere with the digestive, nervous and circulatory systems, they can also invade nearly tissues and successfully spread to other parts of the body and grow, these tumors are of malignant type, whereas benign tumors are not cancerous and not spread to other parts of the body.


Introduction
Mutations is the major consequence of abnormal behavior of the genetic material DNA. Mutations affect normal growth and division i.e., mutations cause uncontrollable growth of cells, these mutations are caused by two reasons which are the signals telling cells to begin dividing are left on continuously or growth suppressing signals telling cells not divide are turned off [19]. The process starts from an evolutionary process which may give rise to abnormal DNA when a cell duplicates its genome due to defects in tumor suppressor or DNA mismatch repair genes. Tumors releases hormones that alter body function, they can grow and interfere with the digestive, nervous and circulatory systems, they can also invade nearly tissues and successfully spread to other parts of the body and grow, these tumors are of malignant type, whereas benign tumors are not cancerous and not spread to other parts of the body.
The immune system is composed of a wide variety of cells with different functions, most important types of them are innate, or nonspeci ic and speci ic, or adaptive, immune response [1], the similarity between innate and adaptive responses is that both must contact with target tumor cells in order to be able to kill them, whereas the difference between them is that the innate cells are always on patrol, and kill tumor cells not recognized them as self, so it is an early defense against pathogens. While adaptive, immune response must be primed to recognize antigen speci ic to the tumor cells. Cell-mediated immunity, also called cellular immunity, is mediated by T lymphocytes (also called T cells) are part of the adaptive immune response. as well as the Killer T cells are referred to as CTL "Cytotoxic T Lymphocyte" cells, or CD8+ T cells, which directly attack and eliminate infected cells, while, the CD4 helper T cells, assist other cells of the immune system during an infection.
The immune system alone usually fails to effectively ight the tumor for the following reasons: 1. The irst is an insuf icient immune response due to the tumor being poorly antigenic. This is described by Curilas [13] "too little of a good thing" Cytotoxic T cells are not suf iciently activated by the tumor, and therefore the response is minimal.
2. The second cause for failure is "too much of a bad thing" in which immunosuppressive factors damage otherwise capable immune system. Over the last few decades, the importance of the second paradigm has become clear; immune suppression is likely to be a signi icant factor when cancer is present in the host.
Several answers to the question how to choose best possible dose to reduce harm to healthy cells while beat cancer are introduced. These answers was introduced before based on empirical methods, empirical methods depend basically on drug holidays, which are rest periods give the health cells the opportunity to recover from the toxic attack of the drug these periods are speci ied by try and error methods. Apart from the empirical methods the mathematical modeling and optimal control can answer this question, for more details see [35].
In this article a base line model [37] is used based on tumorimmune model. A comparative study between the results of the Computation method used in our baseline article [37] which uses indirect method resulted in two point boundary value problem (TPBVP) and solved by collocation method is compared with another method also under the category of indirect method which also resulted in TPBVP but solved by other method named forward backward sweep method (FBWM), results of the base line article [37] also compared with other three different computational methods under the category of direct methods.
Previous comparative studies between direct and indirect control given in [38], where this paper two different approaches (direct and indirect) for the numerical solution of fractional optimal control problems (FOCPs) based on a spectral method using Chebyshev polynomial are presented. Moreover, in [18] a comparative study between singular arc method (without con-sidering inequality constraints) as indirect method and three different direct methods namely Hermite-Simpson's collocation method, 5 th degree Gauss-Lobatto collocation method, Radau Psuedospectral collocation method using GPOP [34]. In [9] a comparative study between dynamic programming, indirect methods and direct methods is presented.
This article is organized as follows: in section 2 we consider a tumor growth model exhibiting the effect of tumor-immune interaction with chemotherapeutic. The model construction and assumptions is introduced in "Mathematical Model" section. In section 3, the optimization stratigetis is presented. In section 4, the solution of OC problem is discussed. In section 5 a comparative study for solving OC problem is presented. The conclusion is given is section 6.

Mathematical modeling and optimal control problem for a tumor-immune system
In this section, a mathematical model problem is introduced, for more details see [37] as a description of the phenomena of tumor-immune interaction and the prediction of the outcome of the application of chemotherapy regimen is examined by using optimal control [5,33]. We can consider optimal control problem as a type of optimization problem where the objective is to determine the inputs (equivalently, the trajectory, state or path), the control inputs (equivalently, the trajectory, state or path), the control input u(t) R m , the initial time 0 the independent variable) of the dynamical system [34] optimize (i.e., minimize or maximize) a specified objective function (performance index) while satisfying any constraints on the motion of the system.
the Objective function (performance index) is represented by: where B 1 ,B 2 are positive constants representing the weights of the terms. The first term represents the tumor cell populations and the second term represents the harmful effects of drug on body. The square of the control variable     2 u t reflects the severity of the side effects of the drug imposed, for more details see [22,42] and the references cited therein. When chemotherapeutic drugs are administered in high dose, they are toxic to the human body, which justifies the quadratic terms in the functional. So the functional given in Eq. (2.1) should be minimized.
The dynamical system is defined by a set of ordinary differential equations (ODE's):

Where
• T (t) is the numbers of tumor cell.
• I H (t) is the active CTL cells (hunting CTL cells).

• I R (t) is the helper T-cells (resting T-cells).
D(t) is the density of chemotherapeutic drug at time t.
The tumor-immune model of the base line paper is originally developed by de Pillis and Radunskaya [31], an optimal control for this model is introduced in [ [37] which is based on [14,15] the model assumptions are as follows: • Hunting CTL cells are capable of killing tumor cells, the effect of CTL on tumor cells in represented by both the terms: • Mass-action kill rate assumes that all immune cells are similarly prone to communicate with any tumor cell: it assumes spatial homogeneity.
• I H helper (resting) T-cells are not able to attack and destroy tumor cells directly but it convert CTLs into active (hunting) CTL helper either by releasing cytokines (interleukin-2) or by direct contact with them.
• I R active (hunting) CTL cells have the role of attacking, destroying, or ingesting the tumor cells.
• Chemotherapeutic drug destroys tumor cells as well as helper T-cells and active CTL cells; that is, chemotherapeutic drug has a negative effect on both tumor cells and immune cells and this negative effect is expressed by subtracting this destroy from each cell population equation.
The model parameters are described as follows, for more details see [37] (Table 1).

Optimization strategies for OCP
The problem formulation which is given in (2.1), (2.2), based on open loop control (also referred to it in literatures as dynamic optimization ), i.e., feedback is not utilized (control input u(t) is independent of state). There are three main approaches to numerically solve continuous time OCP:

Dynamic programming methods:
The optimal criterion in continuous time is based on the Hamilton-Jacobi-Bellman partial differential equation, for more details see [4], which is not in the scope of this paper.

Indirect methods:
It take an approach optimize irst then discretize, also relies on Pontryagin's Maximum Principle (PMP), for more details see [32]. Typically, the optimal control problem is turned into TPBVP containing the same mathematical information as the original one by means of necessary conditions of optimality, for more details see [3], [38] and [39].

Direct methods:
It take an approach discretize irst then optimize, it can be applied without deriving the necessary condition of optimality. Direct methods are based on a inite dimensional parameterization of the in inite dimensional problem. The inite dimensional problem is typically solved using an optimization method, such as nonlinear programming (NLP) techniques. NLP problems can be solved to local optimality relying on the so called Karush-Kuhn-Tucker conditions (KKT), for more details see [3], if we are using KKT we can claim the irst-order conditions of optimality. These conditions were irst derived by Karush in 1939 [23], and later, in 1951, independently by Kuhn and Tucker [26].

Indirect methods
In indirect methods the necessary optimality conditions is derived by using Pontryagin's maximum principle [32], by considering a simple optimal control problem.
by setting irst variation of the Lagrangian to zero i.e., Integrating by parts the last term on the right side in Eq.(3.2), it yields: We can conclude that the necessary optimality conditions for the unconstrained optimal is derived which stated as: They are referred to as the Euler-Lagrange equations.
For more details see [11], which will be solved numerically. Many numerical methods which are based on the Euler-Lagrange differential equation (EL-DEQ) are available to solve the TPBVP. One may classify these numerical methods according to the particular approach used in [2].
In this article we introduce two approaches namely: The purpose of examining two methods is to choose the method with the higher accuracy and set this method as the base method to compare against it. In the following we explained one of the most important methods for solving resultant system of the differential equations.

Indirect collocation method [36]: The Indirect
Collocation Method is the merit of the techniques for solving TPBVP, inite difference method with continuous extension as well as collocation method are the mathematical tools for this method. As a difference methods it is based on Implicit Runge-Kutta (IRK) method which is equivalent to collocation method according to the following theorem.
Referring to the theorem of Guillou & Soule´1969, Wright 1970 which states that: The collocation method de ined as: given s positive integer and c 1 , ..., c s distinct real numbers (typically between 0 and 1), the corresponding collocation polynomial u(x) of degree s is de ined by is equivalent to the s-stage IRK-method which is de ined as ( ) for theorem proof see [21].

Forward-backward sweep method [27]:
In FBSM the initial value problem of the state equation is initialized by using an estimate for the control and costate variables and solved forward in time. Then the costate inal value problem is solved backwards in time. An early reference to a technique that has the forward-backward lavor is [30]. FBSM method can be summarized by the following algorithm.
Information about convergence and stability of Runge-Kutta 4 ODE's solver can be found in [20].

Direct methods
Sequential and simultaneous, are the two main classes of direct methods: for more details see [41]. Sequential methods only parameterize the control while simultaneous methods parameterize both the state and control.
The differences on how we discretize the problem and how the continuity between discritizated intervals are defined result in different transcription method. In the next sections different methods depend on different discretization schemes and different continuity conditions are presented namely multiple shooting method, trapezoidal direct collocation and Hermite Simpson's direct collocation.

Multiple shooting method:
We start with the single shooting method, since single shooting is a special case of multiple shooting [12].

Step1: Transcription:
In single shooting system the time interval [t 0 , t f ] is divided into equal sub intervals (segments) where N is the total number of sub intervals.
Then the control vector is transformed into a parameterized inite dimensional control vector u(t, q) that depends on the inite dimensional parameter vector N q  R there are several parametrization schemes for more details see [3,8], we assume piece wise constant control: which is solved to yield the state vector    . , ) Which is an NLP problem that is solved using the Interior Point Method (IP) in this paper.

Continuity constraints:
Since the simulation is done over the whole time horizon this method does not have continuity constraints.
Accuracy: Determine the accuracy of the inite-dimensional approximation and if necessary repeat the transcription and optimization steps (i.e. go to step 1).
single shooting method has the following drawbacks: • Convergence of the NLP solution is slow, because of the high nonlinear dependence of the objective and constraint functions on the variable u, • NLP solver cannot initialize with an initial guess for x 1 , ..., x N even it is available.
• Parallel evaluation of the states and objective functions is impossible because of the recursive elimination for both states and objective functions.
Direct multiple shooting method [10]: This method combines the advantages of simultaneous methods like collocation method with the main advantages of the single shooting method, so that it is sometimes called a hybrid method [17].
We follow the same steps in single shooting method except that we solve ODE's in each interval [t i , t i+1 ] independently, starting with an arti icial initial value S j :   Make an initial guess for u  over the interval.
1: while not converged do 2: Using the initial condition x 0 = x (t 0 ) and the values for u  , solve x  forward in time according to its differential equation in the optimality system using Runge-Kutta 4 ODE's solver 3: Using the transversality condition Continuity constrains: To enforce continuity between discretized intervals, the following NLP constraints is added at the interface of each sub-interval, this constraint is formed such that the propagated (as an illustration for propagation means consider a cannon be aimed such that the cannonball hit its target i.e., shoot) or integrated value of the state from the previous phase match the value of the state at the current state. The continuity depends on propagation which is approximate because propagation is using algebraic formulas based on numerical integration schemes (or discretization schemes).
So Continuity Constrains is de ined as: We have the following NLP, but contains the extra variables s i , and has a block sparse structure. Direct collocation: Problem size, non-linearity and sparsity of the NLP resulting from direct transcription methods are the prices of moving from one method to another. In single shooting the NLP is highly nonlinear while its size is small. Multiple shooting is less nonlinear but larger and sparser. The direct collocation goes further in the same direction as it is less nonlinear, sparser but even larger, direct collocation method is introduced by Dickmanns [ It employs an interpolating function to approximate the state of the system. Usually the interpolating function is poly nomial.
Polynomials consider the following K th -degree piece wise polynomial: suppose further that the coef icients (a 0 ,..., a K ) of the piece wise polynomial are chosen to match the value of the function at the beginning of the step, i.e.,    . Trapezoidal method and Hermite-Simpson's method are the two direct collocation methods we consider in this article.

Defects constraints:
The difference between the irst derivative of the interpolating polynomial at the midpoint of a segment and the irst derivative calculated from the equations of motion at the segment midpoint is used as the defect. To have a good approximation of the actual states the defects must approach zero and thus interpolating polynomial is a good approximation, the defect is written in the following form: • The quadrature rule used for integration is consistent with the numerical method used for solving the differential equa-tion. If one is using a Runge-Kutta method for solving the differential equation, the cost would also be approximated using Runge-Kutta integration. In the case of an Hermite-Simpson's collocation method, the integration rule is Her-mite-Simpson collocated quadrature rule [33].
• Thus the optimal control problem Eq.

Solution of the NLP problem
Numerical methods for solving NLPs fall into categories: gradient based (local) methods and heuristic (global) methods, gradient based (local) approach is now described in the following algorithm (the heuristic method is out of the interest of this article, for more details see [40]).
The convergence may depend on a given tolerance or depends on no further change in the objective function after several iterations. There are many ways to modify direction selection and step size, and this leads to many different algorithms. Some of common ones included Steepest Descent Methods, Conjugate Direction Method, Simplex Method, Interior Point Method (IP) and Sequential Quadratic Programming, more details on the IP methods will be given here. holds with strict inequality, which may require reformulating some decision variables as parameters, they formulate a barrier problem of the form:

IP
Sequences of barrier problems for increasing values of the barrier parameter μ are then solved with Newton-type methods, for more details see [7,40].

Accuracy
Since the solution obtained from the NLP problem is a discrete set of numbers we don't know what happen between the discrete points, a continuous approximation to the discrete solution resulted from the NLP is needed, i.e., representing the solution (interpolation) [6].

Spline representation:
To get the solution as continuous approximation [y˜(t), u˜(t)] from the NLP solution we use spline representation (where spline is a sequence i.e., whole collection of polynomial) de ined as follows: where n 1 = 2M, M is the number of mesh points, y(t), u(t) are the true state and control respectively, y˜(t), u˜(t) are the approximate state and control respectively, the functions β i (t) form a basis for C 0 or C 1 cubic B-splines with n 1 = 2M, where M is the number of mesh points. The coef icients α i in the state or control variable representation which is de ined by different Interpolation of discrete solution depending on the discritization method, the spline approximation Eq. Notations: Let z is the unknown decision vector, k is the iteration counter, p is the search direction along which to change the current value z k, α k is the magnitude of the change in z k.
1: while not converged do.

2:
Steps are taken in a certain direction i.e., the k th iteration, a search direction is p k and a step length, α k are determined. The update from z k, to z k+1 has the form: z k+1 = z k + αp k, 3: The objective function is evaluated, in case of minimization, the search direction is chosen to suffi ciently decrease the objective function in the form.
where y and u are the true state and control values, we can consider the approximation: from Eq.(3.8) we can de ine the discretization error on the k th mesh iteration as : de ines the error in the differential equation as a function of t. An accurate estimate for the integral in Eq.(3.9) is evaluated by using a standard quadrature method, (i.e., Romberg quadrature algorithm).

Comparative Study and Results
In this part we compare the results from the our base line article [37], with another indirect method solved by sweep method [27] and another three direct method namely direct multiple shooting, trapezoidal direct collocation and Hermite-Simpsons's direct collocation.
In all the previous igures the direct methods achieve better results than indirect methods. The values of controls and number of tumor cells made the big differences in the objective function of the theses methods. The number of tumor             https://www.heighpubs.org/hjcsr 061 cells is lower in direct method than indirect methods also direct methods achieve better control than indirect method the following table shows the value of objective function for ive different methods.
Since the direct methods achieve better results than indirect method this motivate us to give more insight into direct methods through new merits such as discretization error, absolute local error, number of NLP iterations, NLP time the following table gives the values of the previous merits for the methods of direct methods only.
Within the direct method apart from indirect methods the results are butter in both Hermite Simpson DT and direct multiple shooting is butter than the trapozidal since both two methods represented by polynomials, the Hermite Simpson's DT is better than the Trapezoidal DT since Hermite Simpson's DT uses a higher order polynomial than trapezoidal DT which is an agreement with the concept of (p-method), for more details see [29], the hermite Simpson is the most accurate method on the price of time.
The following plots shows the different discretization errors and state errors for the three method.

Conclusion
In this article ive numerical techniques to study the tumor immune dynamic optimization model are discussed, these methods are indirect method in which the resulting TPBVP is studied by collocation method, and by the backward forward sweep method and direct methods which are multiple shooting method, trapezoid direct collocation method, Hermite-Simpson's direct collocation. Many studies claim that indirect methods have drawbacks. The two major drawbacks are that the differential equations obtained are often dif icult to solve due to strong nonliterary and instability and also de ine a guess initial solution for the Lagrangian multiplier. As a matter of fact, these variables do not have a physical meaning, this leads to problems in the de inition of an initial guess solution to start the algorithm. Another problem that it is necessary to take into account in the resolution of a BVP is that the existence and uniqueness of solution is not guaranteed as in an initial value problem. Each problem may have a unique solution, several solutions or no solution at all. By comparing different methods both in direct and indirect methods we can claim that direct method can be used to get better results than indirect method and it is easy to implement. A question that is needed to be answered is how to choose the discretization method among different methods related to direct method to get the best results this question can be a future work.