1 Introduction
The design of stellarator magnetohydrodynamic (MHD) equilibria requires optimizing within a high-dimensional space due to the fully three-dimensional nature of the magnetic field configuration and the sensitive dependence of charged particle trajectories in such field configurations (Boozer Reference Boozer2015). While there are many possible choices for the space in which to optimize, a common choice is the space of the shape of the outer boundary of the plasma (Hirshman et al. Reference Hirshman, Spong, Whitson, Nelson, Batchelor, Lyon, Sanchez, Brooks, Y.-Fu and Goldston1999; Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2018). The confining electromagnetic coils must then be designed to reproduce the desired plasma boundary. This approach was used to design Wendelstein 7-X (Grieger et al. Reference Grieger, Lotz, Merkel, Nührenberg, Sapper, Strumberger, Wobig, Burhenn, Erckmann and Gasparino1992) and the Helically Symmetric Experiment (HSX) (Anderson et al. Reference Anderson, Almagri, Anderson, Matthews, Talmadge and Shohet1995). An alternative approach is to optimize the shape of electromagnetic coils directly to minimize an objective function which includes functions of the equilibria (Hanson & Cary Reference Hanson and Cary1984). This can be performed with the merged STELLOPT/COILOPT code, which was used in the late stages of the National Compact Stellarator Experiment (NCSX) design (Strickler et al. Reference Strickler, Hirshman, Spong, Cole, Lyon, Nelson, Williamson and Ware2004). To navigate such spaces, it is often useful to employ gradient-based optimization techniques. For this reason, it is desirable to compute derivatives with respect to shape.
As the target optimized configuration can never be realized exactly, an analysis of the sensitivity to perturbations, such as errors in coil fabrication or assembly, is central to the success of a stellarator. Tight tolerances have proven to be a significant driver of the cost of stellarator experiments (Strykowsky et al. Reference Strykowsky, Brown, Chrzanowski, Cole, Heitzenroeder, Neilson, Rej and Viol2009; Klinger et al. Reference Klinger, Baylard, Beidler, Boscary, Bosch, Dinklage, Hartmann, Helander, Maßberg and Peacock2013); thus an improvement to the algorithms used to conduct sensitivity studies can have great impact on the field. To quantify the coil tolerances for flux surface quality of Large Helical Device (LHD) (Yamazaki et al. Reference Yamazaki, Yanagi, Ji, Kaneko, Ohyabu, Satow, Morimoto, Yamamoto and Motojima1993) and NCSX (Brooks & Reiersen Reference Brooks and Reiersen2003; Williamson et al. Reference Williamson, Brooks, Brown, Chrzanowski, Cole, Fan, Freudenberg, Fogarty, Hargrove and Heitzenroeder2005), perturbations of several distributions were manually applied to each coil. Sensitivity analysis can also be performed with analytic derivatives. Numerical derivatives with respect to tilt angle and coil translation of the Columbia Non-neutral Torus (CNT) coils have been used to compute the sensitivity of the rotational transform on axis (Hammond et al. Reference Hammond, Anichowski, Brenner, Pedersen, Raftopoulos, Traverso and Volpe2016). Analytic derivatives have recently been applied to study coil sensitivities of the CNT stellarator by considering the eigenvectors of the Hessian matrix (Zhu et al. Reference Zhu, Hudson, Lazerson, Song and Wan2018). Thus, in addition to gradient-based optimization, derivatives with respect to shape can be applied to sensitivity analysis.
The gradients of figures of merit with respect to shape have often been represented as derivatives with respect to whichever quantities parameterize the shape. Examples of such quantities are the toroidal flux dependent amplitudes ( $R_{mn}^{c}$ , $Z_{mn}^{s}$ ) in the double Fourier series for the cylindrical coordinates ( $R$ , $Z$ ) of a toroidal flux surface. Here superscripts c and s refer to cosine and sine, while integers $m$ and $n$ denote the poloidal and toroidal mode numbers. Another way to represent derivatives with respect to shape is the shape gradient (Landreman & Paul Reference Landreman and Paul2018), which provides a local and coordinate-independent form. Consider any scalar figure of merit, $f$ , that depends on a three-dimensional (3-D) MHD equilibrium solution. We can consider $f$ to be a functional of the shape of the outer boundary of the plasma, $S_{P}$ . In this case, a differential change to the boundary, $\unicode[STIX]{x1D6FF}\boldsymbol{r}$ , causes a corresponding change to the figure of merit, $\unicode[STIX]{x1D6FF}f$ ,
Here $\boldsymbol{n}$ is the outward unit normal and $S$ is the shape gradient. The shape gradient quantifies the local linear sensitivity of a figure of merit to differential perturbations of the shape. As tangential displacements to $S_{P}$ do not cause any changes to $f$ , $\unicode[STIX]{x1D6FF}f$ only depends on the normal component of $\unicode[STIX]{x1D6FF}\boldsymbol{r}$ . The Hadamard–Zolésio structure theorem (Delfour & Zolésio Reference Delfour and Zolésio2011) states that under certain assumptions of smoothness, the shape derivative of a functional can be written in the form of (1.1). This can be thought of an instance of the Riesz representation theorem, which states that any linear functional, such as $\unicode[STIX]{x1D6FF}f(\unicode[STIX]{x1D6FF}\boldsymbol{r})$ , can be written as an inner product over the appropriate space (Rudin Reference Rudin2006). The motivation for the form of (1.1) is discussed in more detail in section 2 of Landreman & Paul (Reference Landreman and Paul2018).
If we consider $f$ to be a functional of the shape of the electromagnetic coils, a differential change to the coils, $\unicode[STIX]{x1D6FF}\boldsymbol{r}_{C}$ , will cause a corresponding change to the figure of merit, $\unicode[STIX]{x1D6FF}f$ ,
Here the sum is taken over the coils, $C_{k}$ is a curve describing the filamentary shape of each coil, $C=\{C_{k}\}$ , and $\unicode[STIX]{x1D6FF}\boldsymbol{r}_{C}=\{\unicode[STIX]{x1D6FF}\boldsymbol{r}_{C_{k}}\}$ . The shape gradient for coil $k$ is $\boldsymbol{S}_{k}$ . As tangential displacements to the coil do not change $f$ , $\boldsymbol{S}_{k}$ must be perpendicular to the tangent vector along $C_{k}$ .
As the shape gradient provides local sensitivity information, it can be used to quantify engineering tolerances with respect to the displacement of coils or magnetic perturbations. The shape gradient representation can be computed from parameter derivatives by solving a small linear system (Landreman & Paul Reference Landreman and Paul2018).
However, computing parameter derivatives can often be computationally expensive, as numerical derivatives require evaluating the objective function at least $N+1$ times for $N$ parameters if one-sided finite differences are used, or $2N$ times for centred differences. As computing the objective function often involves solving a linear or nonlinear system, such as the MHD equilibrium equations, this implies solving the system of equations ${\geqslant}N+1$ times. Numerical derivatives also introduce additional noise, and the finite difference step size must be chosen carefully. To avoid these difficulties, it is advantageous to compute shape gradients using adjoint methods (Pironneau Reference Pironneau1974; Glowinski & Pironneau Reference Glowinski and Pironneau1975; Dekeyser, Reiter & Baelmans Reference Dekeyser, Reiter and Baelmans2012, Reference Dekeyser, Reiter and Baelmans2014a ,Reference Dekeyser, Reiter and Baelmans b ). Adjoint methods allow the analytic derivative with respect to all $N$ parameters to be computed with only two solutions to the system of equations. Adjoint methods are thus much more efficient for computing derivatives with respect to many parameters, and they do not introduce the noise of numerical derivatives. Adjoint methods were recently used in the context of stellarator design by Paul et al. (Reference Paul, Landreman, Bader and Dorland2018) for shape optimization of coil winding surfaces.
Rather than use parameter derivatives, in this work we will use an adjoint method to compute the shape gradient directly. This is sometimes termed adjoint shape sensitivity or adjoint shape optimization, which has its origins in aerodynamic engineering and computational fluid dynamics (Pironneau Reference Pironneau1974; Glowinski & Pironneau Reference Glowinski and Pironneau1975). As with adjoint methods for parameter derivatives, this technique only requires the solution of two linear or nonlinear systems of equations. This technique has been applied to magnetic confinement fusion for the design of tokamak divertor shapes by solving forward and adjoint fluid edge equations (Dekeyser et al. Reference Dekeyser, Reiter and Baelmans2012, Reference Dekeyser, Reiter and Baelmans2014a ,Reference Dekeyser, Reiter and Baelmans b ). As stellarators require many parameters to fully describe their shape, adjoint shape sensitivity could significantly decrease the cost of computing the shape gradient. If one is optimizing in the space of parameters describing the boundary of the plasma or the shape of coils, the shape gradient representation obtained from the adjoint method can be converted to parameter derivatives upon multiplication with a small matrix (Landreman & Paul Reference Landreman and Paul2018).
In § 2, the fundamental adjoint relations for perturbations to MHD equilibria are derived and discussed. These relations take a form that is similar to that of transport coefficients that are related by Onsager symmetry (Onsager Reference Onsager1931). Specifically, perturbations to the equilibrium are characterized as a set of generalized responses to a complementary set of generalized forces. The responses and forces can be thought of as being related by a matrix operator, which is symmetric. The resulting relations among forces and responses can be used to compute the shape gradient of functions of the equilibria with respect to displacements of the plasma boundary, as in (1.1), or the coil shapes, as in (1.2), Several applications to stellarator figures of merit will be demonstrated in § 3. While the primary application considered in this work will be stellarator optimization, the relations we obtain are equally applicable for 2-D equilibria.
2 Driven linear perturbations of 3-D MHD equilibria
The goal is to find a relation between small perturbations to a 3-D magnetic confinement equilibrium configuration and the resulting change in a figure of merit of interest. As mentioned, the perturbations may be prescribed in one of two ways: either as a change in the shape of the outermost flux surface (fixed boundary) or as a change in the position, shape or current strength in the coils confining the plasma (free boundary). (Even though the boundary shape changes in the former case, we refer to it as ‘fixed boundary’ since the equilibrium code is run in fixed-boundary mode, and since the associated adjoint problem will turn out to have no boundary perturbation.) These perturbations will be referred to as the ‘true’ perturbations and their direct calculation needs to be repeated many times for each possible change in shape of the outer flux surface or each change in the coil configuration to determine fully the sensitivity of the equilibrium.
The approach we use is to instead calculate a different change in the equilibrium, which we refer to as the ‘adjoint’ perturbation. The adjoint perturbation will correspond to the change in the equilibrium when an additional bulk force acts on the plasma, or the toroidal current profile is changed. For the adjoint perturbation there is no change to the outer flux surface in the fixed-boundary case or to the coil currents in the free boundary case. In this section we will show that aspects of the true and adjoint changes are related to each other in a manner similar to Onsager symmetry. Thus, it will be shown that by calculating the adjoint perturbation, with a judiciously chosen added force or change in the toroidal current profile, the solution to the true problem can be determined.
We consider equilibria in which the magnetic field in the plasma can be expressed in terms of scalar of functions $\unicode[STIX]{x1D713}(\boldsymbol{x}),\unicode[STIX]{x1D6F7}(\unicode[STIX]{x1D713}),\unicode[STIX]{x1D703}(\boldsymbol{x})$ and $\unicode[STIX]{x1D701}(\boldsymbol{x})$ ,
We will regard $\unicode[STIX]{x1D713}$ as labelling the flux surfaces and consider toroidal geometries for which
label field lines in a flux surface, where $\unicode[STIX]{x1D703}$ is a poloidal angle, $\unicode[STIX]{x1D701}$ is a toroidal angle and $\unicode[STIX]{x1D704}(\unicode[STIX]{x1D713})=\text{d}\unicode[STIX]{x1D6F7}/\text{d}\unicode[STIX]{x1D713}$ is the rotational transform, with $\unicode[STIX]{x1D6F7}$ being the poloidal flux function. (Any straight-field-line angles may be used.) With these definitions the magnetic flux passing toroidally through a poloidally closed curve of constant $\unicode[STIX]{x1D713}$ is $2\unicode[STIX]{x03C0}\unicode[STIX]{x1D713}$ . The flux passing poloidally between the magnetic axis and the surface of constant $\unicode[STIX]{x1D713}$ is $2\unicode[STIX]{x03C0}\unicode[STIX]{x1D6F7}(\unicode[STIX]{x1D713})$ . Thus, we assume that good flux surfaces exist and leave aside the issues of islands and chaotic field lines.
In addition to the representation of the magnetic field, we assume that MHD force balance is satisfied with a scalar pressure, $p(\unicode[STIX]{x1D713})$ ,
where the current density, $\boldsymbol{J}$ , satisfies Ampere’s law,
and $c$ is the speed of light. As mentioned, we will consider two cases, a fixed-boundary case in which the shape of the outer flux surface is prescribed, and a free boundary case for which outside the plasma, whose surface is defined by a particular value of toroidal flux, the force balance equation (2.3) does not apply, but rather, the magnetic field is determined by Ampere’s law (2.4) with a given current density $\boldsymbol{J}_{c}$ , representing current flowing in a set of coils.
From (2.3) it follows that current density streamlines also lie in the $\unicode[STIX]{x1D713}=$ constant surfaces. The toroidal current passing through a surface, $S_{T}(\unicode[STIX]{x1D713})$ , whose perimeter is a closed poloidal loop at constant $\unicode[STIX]{x1D713}$ is given by
where $\sqrt{g}^{-1}=\unicode[STIX]{x1D735}\unicode[STIX]{x1D701}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}\times \unicode[STIX]{x1D735}\unicode[STIX]{x1D703}$ .
Equations (2.1)–(2.5) describe our base equilibrium configuration. We now consider small changes in the equilibrium that are assumed to yield a second equilibrium state of the same form as (2.1), but with new functions such that $\boldsymbol{B}^{\prime }=\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}^{\prime }\times \unicode[STIX]{x1D735}\unicode[STIX]{x1D703}^{\prime }-\unicode[STIX]{x1D735}\unicode[STIX]{x1D6F7}^{\prime }(\unicode[STIX]{x1D713}^{\prime })\times \unicode[STIX]{x1D735}\unicode[STIX]{x1D701}^{\prime }$ . Each of the primed variables is assumed to differ from the corresponding unprimed variables by a small amount (e.g. $\unicode[STIX]{x1D713}^{\prime }=\unicode[STIX]{x1D713}+\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D713}(\boldsymbol{x})$ ). The perturbed magnetic field can then be expressed $\boldsymbol{B}^{\prime }=\boldsymbol{B}+\unicode[STIX]{x1D6FF}\boldsymbol{B}$ , where
We write the perturbed poloidal flux as the sum of a term resulting from the perturbation of toroidal flux at fixed rotational transform, $\unicode[STIX]{x1D704}\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D713}$ , and a term representing the perturbed rotational transform, $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}(\unicode[STIX]{x1D713})$ . Thus, we can regroup the terms in (2.6) as follows
The group of terms in parentheses in (2.7) corresponds to perturbations of the magnetic field allowed by ideal MHD, which is constrained by the ‘frozen in law’, and which preserves the rotational transform, ( $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D704}=0$ ). The last term in (2.7) allows for changes in the rotational transform, ( $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D704}=\text{d}\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}/\text{d}\unicode[STIX]{x1D713}$ ). Note also that the expression in parentheses in (2.7) can be written as a sum of terms parallel to $\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}$ and $\unicode[STIX]{x1D735}\unicode[STIX]{x1D6FC}$ , and hence it is perpendicular to $\boldsymbol{B}$ . The group of terms in parentheses in (2.7) can thus be expressed in terms of a vector potential that is perpendicular to the equilibrium magnetic field, while the last term in (2.7) can be represented in terms of a vector potential in the toroidal direction, which thus has a component parallel to the equilibrium field. We can therefore write $\unicode[STIX]{x1D6FF}\boldsymbol{B}=\unicode[STIX]{x1D735}\times \unicode[STIX]{x1D6FF}\boldsymbol{A}$ , where
Here, the variable $\unicode[STIX]{x1D743}$ can be taken to be perpendicular to the applied magnetic field, and can be thought of as a displacement of the equilibrium magnetic field line. Using (2.6) we write
and from this we can see that perturbations of the toroidal and poloidal angles correspond to displacements in the flux surface, and the perturbation $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D713}$ gives a displacement out of the flux surface. A surface containing a prescribed toroidal flux $2\unicode[STIX]{x03C0}\unicode[STIX]{x1D713}_{0}$ in the unperturbed field is defined by $\unicode[STIX]{x1D713}(\boldsymbol{x})=\unicode[STIX]{x1D713}_{0}$ . From (2.9), the perturbation to $\unicode[STIX]{x1D713}$ at fixed position is $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D713}=-\unicode[STIX]{x1D743}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}$ . When the field is perturbed the surface moves. As $\unicode[STIX]{x1D743}$ is a vector which describes the motion of field lines, the perturbed toroidal flux label moving with the $\unicode[STIX]{x1D713}$ surface, as measured in the unperturbed coordinate system, is given by $\unicode[STIX]{x1D713}(\boldsymbol{x})=\unicode[STIX]{x1D713}_{0}+\unicode[STIX]{x1D743}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}(\boldsymbol{x})$ .
The change in toroidal current flowing through the perturbed surface is
where $S_{T}(\unicode[STIX]{x1D713})$ is a surface at constant toroidal angle bounded by the $\unicode[STIX]{x1D713}$ surface and $\unicode[STIX]{x2202}S_{T}(\unicode[STIX]{x1D713})$ is the boundary of such surface, a closed poloidal loop. Here the first term accounts for the displacement of the flux surface and the second term accounts for the change in toroidal current density.
We now consider two distinct perturbations of the equilibrium of the type described by (2.8)–(2.10), which we denote with subscripts 1 and 2. In general, variables with subscript 1 will be associated with the true perturbation, and those with subscripts 2 will be associated with the adjoint perturbation. We then form the quantity
where the variables $\unicode[STIX]{x1D6FF}\boldsymbol{J}$ and $\unicode[STIX]{x1D6FF}\boldsymbol{A}$ are the changes in current density and vector potential associated with the two perturbations, and the integral is, for the time being, over all space. That $U_{T}=0$ follows from expressing the change in current densities in terms of the change in magnetic fields via Ampere’s law. This turns the integrand in (2.11) into a divergence, which in turn becomes a surface integral, which we take to be at infinity where fields vanish sufficiently fast.
We now express the volume integral in (2.11) as the sum of three terms,
Here $U_{P}$ is the contribution from volume in the plasma, integrated just up to the plasma–vacuum boundary. For this term we represent the vector potentials using (2.8)
To evaluate (2.13) we use the perturbed force balance relation
where $\unicode[STIX]{x1D6FF}\boldsymbol{F}$ is an additional perturbed force to be prescribed. The term $\unicode[STIX]{x1D735}(\unicode[STIX]{x1D743}\boldsymbol{\cdot }\unicode[STIX]{x1D735}p)$ represents the perturbed force associated with the perturbed pressure in the case of a true solution for which the dependence of the pressure on flux is preserved. As the perturbation to the flux label at fixed position is $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D713}=-\unicode[STIX]{x1D743}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}$ and the pressure is assumed to be a fixed function $p(\unicode[STIX]{x1D713})$ , then the perturbation to the pressure at fixed position is $\unicode[STIX]{x1D6FF}p=-\unicode[STIX]{x1D743}\boldsymbol{\cdot }\unicode[STIX]{x1D735}p(\unicode[STIX]{x1D713})$ . If the equilibrium calculation is performed with specified $p(\unicode[STIX]{x1D713})$ , this does not imply any additional restrictions.
The term $U_{B}$ comes from integrating over a thin layer at the plasma–vacuum boundary. At the boundary the difference between the perturbed and unperturbed current density has the character of a current sheet due to the displacement of the outermost flux surface. This effective current sheet causes a jump in the tangential components of the perturbation to the magnetic fields at the surface. This jump implies that care must be taken in evaluating the perturbed magnetic fields at the surface as they have different values on either side of the plasma–vacuum surface. However, the vector potential is continuous at the plasma–vacuum boundary. Thus, we write
where the vector potentials are expressed as in (2.8). Using this expression for the vector potentials and expressing the surface integral as an integral over the toroidal and poloidal angles gives
Here we note the terms in the vector potential coming from the MHD displacement cancel.
Last, the quantity $U_{C}$ represents the contribution from the integral over the volume outside the plasma where only the coil currents need to be included
where $\unicode[STIX]{x1D6FF}\boldsymbol{A}_{V_{1,2}}$ is the change in the vacuum vector potential, and $\unicode[STIX]{x1D6FF}\boldsymbol{J}_{C_{1,2}}$ is the change in the coil current density.
Combining $U_{P}$ , $U_{B}$ and $U_{C}$ gives the following relation appropriate to the free boundary case $U_{T}=U_{P}+U_{B}+U_{C}=0$ , or
The many steps leading to (2.18) are outlined in appendix A.
A similar relation can be obtained in the fixed-boundary case. Here the integral over the plasma volume, (2.13), can be written as a surface integral by applying the divergence theorem,
Again, following steps outlined in (appendix A), this may be rewritten in the following form,
The fixed-boundary adjoint relation can also be obtained by applying the self-adjointness of the MHD force operator (appendix B). If the second term in (2.20) is integrated by parts in $\unicode[STIX]{x1D713}$ , we see that the fixed and free boundary adjoint relations share the terms involving the products of displacements with bulk forces and perturbed fluxes with perturbed toroidal currents. The integral over the vacuum region in (2.18) is replaced by an integral over the plasma boundary and a boundary term from the integration by parts in $\unicode[STIX]{x1D713}$ in (2.20).
We now have two integral relations between perturbations 1 and 2, viz. (2.18) and (2.20). They have a common form in that they each are the sum of three integrals: the first involving forces and displacements, the second involving the toroidal current and poloidal flux profiles and the third involving the manner in which the plasma boundary is prescribed. In (2.18), the free boundary case, the changes in coil current densities are specified. In (2.20), the fixed-boundary case, the displacement of the outer flux surface is prescribed. Equations (2.18) and (2.20) can also be viewed as the difference in sums of generalized forces and responses. For example, in (2.18) we can consider the quantities $\unicode[STIX]{x1D6FF}\boldsymbol{F}$ , $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}$ , $\unicode[STIX]{x1D6FF}\boldsymbol{J}_{c}$ as forces and $\unicode[STIX]{x1D743}$ , $\text{d}\unicode[STIX]{x1D6FF}I_{T}/\text{d}\unicode[STIX]{x1D713}$ , $\unicode[STIX]{x1D6FF}\boldsymbol{A}_{V}$ as responses. The fact that the sum of the products of true forces and adjoint responses less the products of adjoint forces and true responses vanishes is similar to the relation between forces and fluxes related by Onsager symmetry. In the case of Onsager symmetry this relation follows from the self-adjoint property of the collision operator. In the case of MHD equilibria it is known that the force operator is self-adjoint. Here we see that the self-adjoint property is extended to cases in which the MHD frozen-in constraint is broken.
These relations (2.18) and (2.20) can be used to generate the shape gradient if we manipulate the terms to be zero, or some known quantity. For example, if we impose that both the true and adjoint solutions maintain pressure as a function of flux, or equivalently there are no added forces, then the first terms in both (2.18) and (2.20) vanish. If we impose that the adjoint solution in (2.20) involves no change in shape of the outer flux surface, the boundary term involving $\boldsymbol{n}\boldsymbol{\cdot }\unicode[STIX]{x1D743}_{2}$ vanishes. This leaves
Thus, if we solve the adjoint problem with a prescribed change in toroidal current profile, and ask what is the change in rotational transform in the true problem when the current profile is unchanged we find
The right-hand side has the form of (1.1), with $\unicode[STIX]{x1D6FF}\boldsymbol{B}_{2}\boldsymbol{\cdot }\boldsymbol{B}/4\unicode[STIX]{x03C0}$ playing the role of the shape gradient. Thus, by picking an adjoint current profile change that is localized to a particular flux surface, we determine the sensitivity of the rotational transform at that flux surface to changes in the boundary shape. Note that even if the rotational transform is allowed to vary, the normal component of $\unicode[STIX]{x1D743}_{1}$ describes the displacement of the boundary, as discussed in (appendix C). Thus the form of the shape gradient in (1.1) can be used with $\unicode[STIX]{x1D6FF}\boldsymbol{r}\boldsymbol{\cdot }\boldsymbol{n}$ replaced with $\unicode[STIX]{x1D743}_{1}\boldsymbol{\cdot }\boldsymbol{n}$ .
If we impose that the adjoint solution in (2.18) involves no change in coil currents, upon integrating the middle term in (2.18) by parts a similar relation can be obtained for the free boundary case,
where it has been assumed that $\unicode[STIX]{x1D6FF}I_{T,2}=0$ at the boundary.
When the coil currents are confined to wires, the right-hand side can be expressed as changes in currents and fluxes and integrals along wires,
Here $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}_{C_{2,k}}$ , $\unicode[STIX]{x1D6FF}I_{C_{1,k}}$ are the change in adjoint flux through and change in true current in coil $k$ and $I_{C_{k}}$ is the current through the unperturbed coil. The unit tangent vector along $C_{k}$ is $\boldsymbol{t}$ . The second term describes the effect of moving the coil. It is proportional to the current in the coil and the line integral of the displacement of the coil dotted with the cross product of the tangent to the coil and the perturbed adjoint magnetic field.
The sensitivity of other figures of merit may be evaluated by considering cases for which both the true and adjoint solutions preserve the rotational transform or the toroidal current profile. In this case a perturbed force density is included in the adjoint calculation
In this case one must find an expression for the perturbed adjoint force, $\unicode[STIX]{x1D6FF}\boldsymbol{F}_{2}$ , such that the left-hand side is the change in a figure of merit of interest. Some examples will be discussed in the next section.
3 Applications
In this section we will consider figures of merit which depend on the shape of the outer boundary of the plasma (§§ 3.1 and 3.2) and on the shape of the electromagnetic coils (§ 3.3). The shape gradients of these figures of merit will be computed using both a direct method and an adjoint method, to demonstrate that the adjoint method produces identical results to the direct method but at much lower computational expense.
3.1 Surface shape gradient for $\unicode[STIX]{x1D6FD}$
Consider a figure of merit, the volume-averaged $\unicode[STIX]{x1D6FD}$ ,
where
and
(This definition of volume-averaged $\unicode[STIX]{x1D6FD}$ is the one employed in the VMEC code (Hirshman & Whitson Reference Hirshman and Whitson1983).) While $f_{\unicode[STIX]{x1D6FD}}$ is a figure of merit not often considered in stellarator shape optimization, we include this calculation to demonstrate the adjoint approach, as its shape gradient can be computed without modifications to an equilibrium code.
The differential change in $f_{P}$ associated with displacement $\unicode[STIX]{x1D743}_{1}$ is
The first term accounts for the change in $p$ at fixed position due to the motion of the flux surfaces, and the second term accounts for the motion of the boundary. The differential change in $f_{B}$ associated with $\unicode[STIX]{x1D743}_{1}$ is
where we have assumed a perturbation that preserves the rotational transform ( $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}_{1}=0$ ), for which $\unicode[STIX]{x1D6FF}B^{2}=-2(B^{2}\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D743}_{1}+\unicode[STIX]{x1D743}_{1}\boldsymbol{\cdot }\unicode[STIX]{x1D735}(B^{2}+4\unicode[STIX]{x03C0}p))$ is the perturbation to the magnetic field strength at fixed position. The first term in (3.5) corresponds to the change in $f_{B}$ due to the perturbation to the field strength, while the second term accounts for the motion of the boundary. Applying the divergence theorem we obtain,
The differential change in $f_{\unicode[STIX]{x1D6FD}}$ associated with displacement $\unicode[STIX]{x1D743}_{1}$ satisfies
The first term on the right of (3.7) is already in the form of a shape gradient. To evaluate the second term, we turn to the adjoint problem, and we choose $\unicode[STIX]{x1D6FF}\boldsymbol{F}_{2}=-\unicode[STIX]{x1D735}(\unicode[STIX]{x0394}p)$ , where $\unicode[STIX]{x1D6E5}\ll 1$ is a constant scalar. That is, we add a force which is proportional to the equilibrium pressure force with a small multiplier, $\unicode[STIX]{x1D6E5}$ . This additional force produces a proportional change in magnetic field at the boundary and thus from (2.20), we find
Thus, we can obtain the shape gradient without perturbing the shape of the surface,
Obtaining $S$ amounts to computing an equilibrium with unperturbed $\unicode[STIX]{x1D704}$ , unperturbed boundary and perturbed pressure $p^{\prime }=(1+\unicode[STIX]{x1D6E5})p$ .
A similar expression can be obtained for equilibria for which the rotational transform is allowed to vary, but the toroidal current is held fixed ( $\unicode[STIX]{x1D6FF}I_{T,1}=0$ ). In this case, the toroidal current for the adjoint problem is chosen to be $\unicode[STIX]{x1D6FF}I_{T,2}=-\unicode[STIX]{x1D6E5}I_{T}(1/f_{P}-1/f_{B})^{-1}(1/f_{B})$ where $I_{T}$ is the unperturbed current profile, and again $\unicode[STIX]{x1D6FF}\boldsymbol{F}_{2}=-\unicode[STIX]{x1D735}(\unicode[STIX]{x0394}p)$ . The shape gradient can then be obtained from (3.9).
To demonstrate, we use the NCSX stellarator LI383 equilibrium (Zarnstorff et al. Reference Zarnstorff, Berry, Brooks, Fredrickson, Fu, Hirshman, Hudson, Ku, Lazarus and Mikkelsen2001). The pressure profile was perturbed with $\unicode[STIX]{x1D6E5}=0.01$ . The unperturbed and adjoint equilibria are computed with the VMEC code (Hirshman & Whitson Reference Hirshman and Whitson1983). The shape gradient obtained with the adjoint solution, $S_{\text{adjoint}}$ , and that obtained with the direct approach, $S_{\text{direct}}$ , are shown in figure 1(a). Positive values of the shape gradient indicate that $f_{\unicode[STIX]{x1D6FD}}$ increases if a normal perturbation is applied at a given location as indicated by (1.1). For the direct approach, parameter derivatives $(\unicode[STIX]{x2202}f_{\unicode[STIX]{x1D6FD}}/\unicode[STIX]{x2202}R_{mn}^{c},\unicode[STIX]{x2202}f_{\unicode[STIX]{x1D6FD}}/\unicode[STIX]{x2202}Z_{mn}^{s})$ are computed with a centred 4-point stencil for $m\leqslant 15$ and $|n|\leqslant 9$ using a polynomial fitting technique. The shape gradient is obtained using the method outlined in §4.2 of Landreman & Paul (Reference Landreman and Paul2018). The fractional difference between the two methods,
is shown in figure 1(c). The surface-averaged value of $S_{\text{residual}}$ is $1.7\times 10^{-3}$ .
The parameter $\unicode[STIX]{x1D6E5}$ must be chosen carefully, as the perturbation must be large enough that the result is not dominated by round-off error, but small enough that nonlinear effects do not become important. The relationship between $S_{\text{residual}}$ and $\unicode[STIX]{x1D6E5}$ is shown in figure 1(d). Here $S_{\text{direct}}$ is computed using the parameters reported above such that convergence is obtained. We find that $S_{\text{residual}}$ decreases as $(\unicode[STIX]{x1D6E5})^{1}$ until $\unicode[STIX]{x1D6E5}\approx 0.5$ , at which point round-off error begins to dominate. This scaling is to be expected, as $\unicode[STIX]{x1D6FF}\boldsymbol{B}_{2}$ is computed with a forward difference derivative with step size $\unicode[STIX]{x1D6E5}$ .
For this and the later examples, the computational cost of transforming the parameter derivatives to the shape gradient was negligible compared to the cost of computing the parameter derivatives. The direct approach used 2357 calls to VMEC while the adjoint approach only required two. It is clear that the adjoint method yields nearly identical derivative information to the direct method but at substantially reduced computational cost.
In figure 1 we find that the shape gradient for $f_{\unicode[STIX]{x1D6FD}}$ is everywhere positive. This reflects the fact that the toroidal flux enclosed by $S_{P}$ is fixed. As perturbations which displace the plasma surface outward increase the surface area of a toroidal cross-section, the toroidal field must correspondingly decrease, thus increasing $f_{\unicode[STIX]{x1D6FD}}$ . We find that the shape gradient is increased in regions of large field strength, as indicated by the second term in (3.9).
3.2 Surface shape gradient for rotational transform
Consider a figure of merit, the average rotational transform in a radially localized region,
Here $w(\unicode[STIX]{x1D713})$ is a normalized weighting function,
and $\unicode[STIX]{x1D713}_{m}$ and $\unicode[STIX]{x1D713}_{w}$ are parameters defining the centre and width of the Gaussian weighting, respectively.
The differential change of $f_{\unicode[STIX]{x1D704}}$ associated with perturbation $\unicode[STIX]{x1D743}_{1}$ is
For the adjoint problem, we can choose a toroidal current profile $\unicode[STIX]{x1D6FF}I_{T,2}=I_{\unicode[STIX]{x1D6E5}}w(\unicode[STIX]{x1D713})$ , where $I_{\unicode[STIX]{x1D6E5}}$ is a scalar constant, and we can take $\unicode[STIX]{x1D6FF}\boldsymbol{F}_{2}=0$ . This additional current produces a proportional change in the magnetic field at the boundary; thus using (2.20), we obtain the following
So, we can obtain the shape gradient from the adjoint solution
Note that the computation of the shape gradient of the rotational transform on a single surface, $\unicode[STIX]{x1D713}_{m}$ , with the adjoint approach would require a delta-function current perturbation, $\unicode[STIX]{x1D6FF}I_{T,2}=I_{\unicode[STIX]{x1D6E5}}\unicode[STIX]{x1D6FF}(\unicode[STIX]{x1D713}-\unicode[STIX]{x1D713}_{m})$ . As this type of perturbation is difficult to resolve in a numerical computation, the use of the Gaussian envelope allows the shape gradient of the rotational transform in a localized region of $\unicode[STIX]{x1D713}_{m}$ to be computed.
To demonstrate, we use the NCSX stellarator LI383 equilibrium. The toroidal current profile was perturbed with $I_{\unicode[STIX]{x1D6E5}}=715$ A, $\unicode[STIX]{x1D713}_{m}=0.1\unicode[STIX]{x1D713}_{0}$ and $\unicode[STIX]{x1D713}_{w}=0.05\unicode[STIX]{x1D713}_{0}$ . The shape gradient obtained with the adjoint solution and with the direct approach are shown in figure 2(a). For the direct approach, the shape gradient is computed from parameter derivatives ( $\unicode[STIX]{x2202}f_{\unicode[STIX]{x1D704}}/\unicode[STIX]{x2202}R_{mn}^{c}$ and $\unicode[STIX]{x2202}f_{\unicode[STIX]{x1D704}}/\unicode[STIX]{x2202}Z_{mn}^{s}$ ) using an 8-point stencil with $m\leqslant 18$ and $|n|\leqslant 12$ . The fractional difference, $S_{\text{residual}}$ , between the two approaches is shown in figure 2(c), with a surface-averaged value of $2.7\times 10^{-2}$ .
The direct approach used 7401 calls to VMEC, while the adjoint only required two. Again, it is apparent that the adjoint method allows the same derivative information to be computed at much lower computational cost.
We find that over much of the surface, the shape gradient is close to zero. A region of large negative shape gradient occurs in the concave region of the plasma surface with adjacent regions of large positive shape gradient. This indicates that ‘pinching’ the surface in this region, making it more concave, would increase $\unicode[STIX]{x1D704}$ near the axis. The relationship between shaping and rotational transform is generally quite complex. Further analysis is needed to interpret the shape gradient for $f_{\unicode[STIX]{x1D704}}$ .
3.3 Coil shape gradient for rotational transform
The shape gradient of $f_{\unicode[STIX]{x1D704}}$ can also be computed with a free boundary approach. We can again choose a toroidal current profile $\unicode[STIX]{x1D6FF}I_{T,2}=I_{\unicode[STIX]{x1D6E5}}w(\unicode[STIX]{x1D713})$ for the adjoint problem, where $w(\unicode[STIX]{x1D713})$ is given by (3.12). Using (3.13) and (2.18) and noting that $\unicode[STIX]{x1D6FF}I_{T,2}$ vanishes at the boundary, we find
Using (2.24), this can be written in terms of changes in the positions of coils in the vacuum region,
When computing the coil shape gradient, the current in each coil is fixed. In arriving at (3.17), we assume that $\unicode[STIX]{x1D6FF}I_{C_{1,k}}=0$ . The coil shape gradient is thus
As anticipated, $\boldsymbol{S}_{k}$ has no component in the direction tangent to the coil. Evaluating the shape gradient requires computing the adjoint magnetic field at the unperturbed coil locations in the vacuum region. This can be performed with the DIAGNO code (Gardner Reference Gardner1990; Lazerson Reference Lazerson2012), which employs the virtual casing principle to efficiently compute the fields in the vacuum region due to the plasma current.
To demonstrate, we use the NCSX stellarator LI383 equilibrium. The toroidal current profile was perturbed with $I_{\unicode[STIX]{x1D6E5}}=5.7\times 10^{5}$ A, $\unicode[STIX]{x1D713}_{m}=0.1\unicode[STIX]{x1D713}_{0}$ and $\unicode[STIX]{x1D713}_{w}=0.05\unicode[STIX]{x1D713}_{0}$ . The shape gradient is computed for each of the three unique modular coils per half-period (Williamson et al. Reference Williamson, Brooks, Brown, Chrzanowski, Cole, Fan, Freudenberg, Fogarty, Hargrove and Heitzenroeder2005), keeping the planar coils fixed. The result obtained with the adjoint solution, $\boldsymbol{S}_{\text{adjoint},k}$ , is shown in figure 3. The shape gradient is also computed with the direct approach, $\boldsymbol{S}_{\text{direct},k}$ . For the direct approach, the Cartesian components of each coil are Fourier discretized ( $X_{m}$ , $Y_{m}$ , $Z_{m}$ ). The numerical derivative with respect to these parameters are computed for $m\leqslant 45$ using an 8-point stencil. In figure 4(a) the Cartesian components of the shape gradient computed with the adjoint approach, $S_{\text{adjoint},k}^{l}$ , and with the direct approach, $S_{\text{direct},k}^{l}$ , are shown for each coil. Here $l\in \{x,y,z\}$ . The arrows indicate the direction and magnitude of $\boldsymbol{S}_{k}$ such that if a coil were deformed in the direction of $\boldsymbol{S}_{k}$ , $f_{\unicode[STIX]{x1D704}}$ would increase according to (1.2). The direct approach used 6553 calls to VMEC, while the adjoint only required two. In figure 4(b) the fractional difference between the results obtained with the two methods,
is plotted. The line-averaged values of $S_{\text{residual}}^{l}$ are $6.1\times 10^{-2}$ for coil 1, $3.8\times 10^{-2}$ for coil 2 and $4.8\times 10^{-2}$ for coil 3.
From figure 3 we see that the sensitivity of $f_{\unicode[STIX]{x1D704}}$ to coil displacements is much higher in regions where the coils are close to the plasma surface. The shape gradient points toward the plasma surface in the concave region of the plasma surface, while on the outboard side the sensitivity is significantly lower, again indicating the ‘pinching’ effect seen in figure 2.
4 Conclusions
We have obtained a relationship between 3-D perturbations of MHD equilibria that is a consequence of the self-adjoint property of the MHD force operator. The relation allows for the efficient computation of shape gradients for either the outer plasma surface using the fixed boundary adjoint relation (2.20) or for coil shapes using the free boundary adjoint relation (2.18). The computation of the shape gradient of several stellarator figures of merit has been demonstrated with both the adjoint and direct approach. The application of the adjoint relation provides an $O(N)$ reduction in CPU hours required in comparison with the direct method of computing the shape gradient, where $N$ is the number of parameters used to describe the shape of the outer boundary or the coils. For a fully 3-D geometry, $N$ can be $10^{2}$ . Thus, the application of adjoint methods can significantly reduce the cost of computing the shape gradient for gradient-based optimization or local sensitivity analysis.
To compute the adjoint equilibria in this work, the full nonlinear MHD equilibrium equations are solved using VMEC with the addition of a small perturbation to the current profile, characterized by $I_{\unicode[STIX]{x1D6E5}}$ , or a small perturbation to the pressure profile, characterized by $\unicode[STIX]{x1D6E5}$ . These parameters must be tuned carefully such that the perturbation is large enough that the result is not dominated by round-off error, but small enough that nonlinear effects do not become important. If a perturbed equilibrium code were instead used, these difficulties could be avoided. However, it is convenient to use the same code for both the unperturbed and adjoint equilibrium.
It should be noted that the adjoint approach we have outlined can not yield an exact analytic shape gradient. Throughout we have assumed the existence of magnetic surfaces as the 3-D equilibrium is perturbed. Therefore a code such as VMEC, which minimizes an energy subject to the constraint that surfaces exist, is suitable. Generally VMEC solutions do not satisfy (2.3) exactly (Nührenberg, Boozer & Hudson Reference Nührenberg, Boozer and Hudson2009) as they do not account for the formation of islands or current singularities associated with rational surfaces. Furthermore, the parameters $\unicode[STIX]{x1D6E5}$ and $I_{\unicode[STIX]{x1D6E5}}$ introduce additional numerical noise. We have demonstrated that the typical difference between the shape gradient obtained with the adjoint method and that computed directly from numerical derivatives is ${\lesssim}5\,\%$ . These errors should not be significant for applying the shape gradient to an analysis of engineering tolerances. The discrepancy between the true shape gradient and that obtained numerically, with the adjoint approach or with finite difference derivatives, may become problematic as one nears a local minimum during gradient-based optimization, as the resulting shape gradient may not provide a true descent direction.
For the figure of merit considered in § 3.1, the additional force ( $\unicode[STIX]{x1D6FF}\boldsymbol{F}_{2}$ in (2.14)) applied to the adjoint problem can be expressed as a gradient of a scalar pressure; thus an existing equilibrium code could be utilized without modification. This approach could be applied to other quantities of interest. For example, consider a figure of merit which quantifies the vacuum magnetic well,
where $w(\unicode[STIX]{x1D713})$ is a weighting function and $V^{\prime \prime }(\unicode[STIX]{x1D713})=(\unicode[STIX]{x2202}/\unicode[STIX]{x2202}\unicode[STIX]{x1D713})(\int _{0}^{2\unicode[STIX]{x03C0}}\,\text{d}\unicode[STIX]{x1D703}\int _{0}^{2\unicode[STIX]{x03C0}}\,\text{d}\unicode[STIX]{x1D701}\,\sqrt{g})$ . The presence of a magnetic well ( $V^{\prime \prime }(\unicode[STIX]{x1D713})<0$ ) has a stabilizing effect on MHD modes (Greene Reference Greene1997; Helander Reference Helander2014) and has been considered in stellarator design (Hirshman et al. Reference Hirshman, Spong, Whitson, Nelson, Batchelor, Lyon, Sanchez, Brooks, Y.-Fu and Goldston1999; Drevlak et al. Reference Drevlak, Beidler, Geiger, Helander and Turkin2018). Computing the shape gradient of $f_{W}$ requires solving an adjoint force balance equation with a perturbation to the scalar pressure, similar to the calculation in § 3.1.
For many interesting figures of merit the spatial dependence of the required force is more complicated. Thus, an equilibrium code that allows for an arbitrary force perturbation is needed. One possibility is a generalization of the code ANIMEC (Cooper et al. Reference Cooper, Hirshman, Merkel, Graves, Kisslinger, Wobig, Narushima, Okamura and Watanabe2009) that currently treats anisotropic pressure tensors in the form of a bi-Maxwellian distribution. For example, the shape gradient of the neoclassical particle flux in the $1/\unicode[STIX]{x1D708}$ regime (Nemov et al. Reference Nemov, Kasilov, Kernbichler and Heyn1999) can be computed with the addition of a bulk force that takes the form $\unicode[STIX]{x1D6FF}\boldsymbol{F}_{2}=-\unicode[STIX]{x1D735}\boldsymbol{\cdot }\mathbb{P}_{2}$ to the adjoint force balance equation. Here $\mathbb{P}_{2}$ is a pressure tensor that has arbitrary spatial dependence. For several figures of merit, the required additional bulk force does not take the form of the divergence of a pressure tensor. Consider the following figure of merit, which quantifies the departure from quasi-symmetry,
where $w(\unicode[STIX]{x1D713})$ is a weighting function and
Here $M$ and $N$ are the mode numbers of the desired quasi-symmetry such that if $B(\unicode[STIX]{x1D713},M\unicode[STIX]{x1D703}-N\unicode[STIX]{x1D701})$ for Boozer angles $\unicode[STIX]{x1D703}$ and $\unicode[STIX]{x1D701}$ , then $f_{QS}=0$ (Helander Reference Helander2014). The Boozer covariant components are $G(\unicode[STIX]{x1D713})=2I_{P}/c$ and $I(\unicode[STIX]{x1D713})=2I_{T}/c$ , where $I_{P}$ is the poloidal current outside the $\unicode[STIX]{x1D713}$ surface. Computing the shape gradient of $f_{QS}$ requires the addition of a bulk force to the adjoint force balance equation which does not take the form of the divergence of a tensor pressure. These calculations will be reported in a separate publication. We anticipate there will be numerous additional applications of this technique for efficient optimization of MHD equilibria.
Acknowledgements
This work was supported by the US Department of Energy through grants DE-FG02-93ER-54197 and DE-FC02-08ER-54964. The computations presented in this paper have used resources at the National Energy Research Scientific Computing Center (NERSC).
Appendix A. Derivation of adjoint relation
The quantity $U_{P}=U_{P_{1}}+U_{P_{2}}$ consists of two terms, accounting for changes to the vector potential due to MHD perturbations
and changes to the rotational transform,
The quantity $U_{P_{1}}$ can be expressed by using (2.14) and applying the divergence theorem to the pressure gradient terms,
We will define $\unicode[STIX]{x1D6FF}\hat{\boldsymbol{B}}_{1,2}=\unicode[STIX]{x1D735}\times (\unicode[STIX]{x1D743}_{1,2}\times \boldsymbol{B})$ such that $\unicode[STIX]{x1D6FF}\boldsymbol{B}_{1,2}=\unicode[STIX]{x1D6FF}\hat{\boldsymbol{B}}_{1,2}-\unicode[STIX]{x1D735}\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}_{1,2}\times \unicode[STIX]{x1D735}\unicode[STIX]{x1D701}$ . The terms in (A 3) due to $\unicode[STIX]{x1D6FF}\hat{\boldsymbol{B}}_{1,2}$ can be evaluated using $\boldsymbol{J}=J_{||}\boldsymbol{b}+c\boldsymbol{b}\times \unicode[STIX]{x1D735}p/B$ and (2.3),
The first term in (A 4) can be simplified using $\unicode[STIX]{x1D735}\boldsymbol{\cdot }\boldsymbol{J}=0$ and noting that the perturbation can be written as $\unicode[STIX]{x1D743}_{1,2}=\unicode[STIX]{x1D709}_{1,2}^{\unicode[STIX]{x1D713}}\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}+\unicode[STIX]{x1D709}_{1,2}^{\bot }\boldsymbol{b}\times \unicode[STIX]{x1D735}\unicode[STIX]{x1D713}$ . Applying the identity $\boldsymbol{b}\boldsymbol{\cdot }\unicode[STIX]{x1D6FF}\hat{\boldsymbol{B}}_{1,2}=-B^{2}\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D743}_{1,2}-\unicode[STIX]{x1D743}_{1,2}\boldsymbol{\cdot }\unicode[STIX]{x1D735}B^{2}-4\unicode[STIX]{x03C0}\unicode[STIX]{x1D743}_{1,2}\boldsymbol{\cdot }\unicode[STIX]{x1D735}p$ to the second term, the following expression can be obtained,
Hence we obtain the following expression for $U_{P_{1}}$ ,
We now consider $U_{P_{2}}$ defined in (A 2). Applying (2.10) for the change in toroidal current, integrating by parts in $\unicode[STIX]{x1D713}$ and combining the expressions for $U_{P_{1}}$ (A 3) and $U_{P_{2}}$ (A 2), we obtain
Next we combine $U_{P}$ (A 7) with $U_{B}$ (2.16) and add $U_{c}$ (2.17) to obtain the free boundary adjoint relation (2.18).
To obtain the fixed-boundary adjoint relation, the integral over the plasma volume (2.13) can be related to a surface integral by applying the divergence theorem, (2.19). Using (2.8) and applying several vector identities,
Using the expression for $U_{P}$ , (A 7), expressing the second term in (A 8) as a perturbed current using (2.10), the fixed-boundary adjoint relation (2.20) is obtained.
Appendix B. Alternate derivation of fixed-boundary adjoint relation
The MHD force operator,
possesses the following self-adjointness property (Bernstein et al. Reference Bernstein, Frieman, Kruskal and Kulsrud1958; Goedbloed & Poedts Reference Goedbloed and Poedts2004),
where $\unicode[STIX]{x1D6FF}\hat{\boldsymbol{B}}_{1,2}=\unicode[STIX]{x1D735}\times (\unicode[STIX]{x1D743}_{1,2}\times \boldsymbol{B})$ is the perturbed field corresponding to the MHD perturbations. For perturbations described by (2.8)–(2.10) and (2.14), we have the following force operator,
Using (B 3) and several vector identities, the left-hand side of (B 2) can be written as
In arriving at (B 4), we use $\boldsymbol{J}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}=0$ . Using (2.10) to re-express the first two terms on the right-hand side,
Appendix C. Interpretation of the displacement vector
For MHD perturbations such that $\unicode[STIX]{x1D6FF}\boldsymbol{B}=\unicode[STIX]{x1D735}\times (\unicode[STIX]{x1D743}\times \boldsymbol{B})$ the displacement can be interpreted as a vector describing the motion of a field lines. Thus a normal perturbation to the surface of the plasma as in (1.1) can be expressed in terms of the displacement vector,
For perturbations that allow for changes in the rotational transform it remains to be shown that a similar relation can be found.
As we require that $\unicode[STIX]{x1D713}$ remain a flux surface label in the perturbed equilibrium, the Lagrangian perturbation to $\unicode[STIX]{x1D713}$ at fixed position is
The perturbed magnetic field, $\boldsymbol{B}^{\prime }=\boldsymbol{B}+\unicode[STIX]{x1D6FF}\boldsymbol{B}$ must remain tangent to $\unicode[STIX]{x1D713}^{\prime }=\unicode[STIX]{x1D713}+\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D713}$ surfaces; thus to first order in the perturbation,
Applying the form for the perturbed field allowing for changes in the rotational transform, $\unicode[STIX]{x1D6FF}\boldsymbol{B}=\unicode[STIX]{x1D735}\times (\unicode[STIX]{x1D743}\times \boldsymbol{B}-\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}\unicode[STIX]{x1D735}\unicode[STIX]{x1D701})$ , and using several vector identities, the following condition is obtained
This implies that $\unicode[STIX]{x1D6FF}\boldsymbol{r}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}=\unicode[STIX]{x1D743}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D713}+F(\unicode[STIX]{x1D713})$ , where $F(\unicode[STIX]{x1D713})$ is some flux function which can be determined by requiring that the perturbation to the toroidal flux as a function of $\unicode[STIX]{x1D713}$ vanishes, $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}_{T}(\unicode[STIX]{x1D713})=0$ .
The perturbed toroidal flux through a surface labelled by $\unicode[STIX]{x1D713}$ contains two terms, corresponding to the flux of the unperturbed field through the perturbed surface and the perturbed field through the unperturbed surface,
Using the form for $\unicode[STIX]{x1D6FF}\boldsymbol{B}$ , applying the divergence theorem and noting that $\boldsymbol{B}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\unicode[STIX]{x1D701}=\sqrt{g}^{-1}$ , the following condition is obtained,
By requiring that $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D6F7}_{T}(\unicode[STIX]{x1D713})=0$ , we find that $F(\unicode[STIX]{x1D713})=0$ . Thus we can express shape gradients in the form of (C 1) even when the rotational transform is allowed to vary.