
# Reduction of the statistical error in electromagnetic gyrokinetic particle-in-cell simulations

Published online by Cambridge University Press:  20 February 2019


## Abstract

The $\unicode[STIX]{x1D6FF}f$-PIC method is widely used for electrostatic particle-in-cell (PIC) simulations. Its basic idea is the ansatz $f=f_{0}+\unicode[STIX]{x1D6FF}f$ ($\unicode[STIX]{x1D6FF}f$-ansatz), where the particle distribution function $f$ is split into a usually time-independent background $f_{0}$ and a time-dependent perturbation $\unicode[STIX]{x1D6FF}f$. In principle, it can also be used for electromagnetic gyrokinetic PIC simulations, but the required number of markers can be so large that PIC simulations become impractical. The reason is the decreasing efficiency of the $\unicode[STIX]{x1D6FF}f$-ansatz in the so-called ‘Hamiltonian formulation’, which uses $p_{\Vert }$ as a dynamic variable. As a result, the density and current moments of the distribution function develop large statistical errors. To overcome this obstacle we propose to solve the potential equations in the symplectic formulation, which uses $v_{\Vert }$ as a dynamic variable. The distribution function itself is still evolved in the Hamiltonian formulation, which is better suited for the numerical integration of the parallel dynamics. The contributions from the full Jacobian of phase space, a finite velocity sphere of the simulation domain and a shifted Maxwellian background are considered. Special care has been taken at the discretisation level to make simulations of damped magnetohydrodynamic (MHD) modes within a realistic gyrokinetic model feasible. This includes devices such as large tokamaks with a small aspect ratio.

Type: Research Article
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

## 1 Introduction

The $\unicode[STIX]{x1D6FF}f$ -PIC method (Dimits & Lee Reference Dimits and Lee1993; Parker & Lee Reference Parker and Lee1993) is widely used for electrostatic gyrokinetic particle-in-cell (PIC) simulations to discretise the gyro-centre particle number distribution function  $f_{s}$ of a species  $s$ . It is based on the $\unicode[STIX]{x1D6FF}f$ -ansatz

(1.1) $$\begin{eqnarray}f_{s}=f_{0s}+\unicode[STIX]{x1D6FF}f_{s},\end{eqnarray}$$

where the distribution function $f_{s}$ is split into a usually time-independent background $f_{0s}$ and a time-dependent perturbation $\unicode[STIX]{x1D6FF}f_{s}$. As long as the perturbed part $\unicode[STIX]{x1D6FF}f_{s}$ remains small in comparison to the background part $f_{0s}$, which is usually chosen to be a Maxwellian, the $\unicode[STIX]{x1D6FF}f$-method reduces the statistical noise, in some situations dramatically.

Although there is no approximation involved in the $\unicode[STIX]{x1D6FF}f$-ansatz, its basic idea stems from a physically motivated perturbative approach in which a small deviation from an equilibrium state is to be calculated. Hence, the $\unicode[STIX]{x1D6FF}f$-method for PIC was originally derived by inserting the $\unicode[STIX]{x1D6FF}f$-ansatz into the Vlasov–Maxwell equations, which has the disadvantage that an additional evolution equation for the perturbed weights of the Monte Carlo particles (markers) appears. For a long time it was not clear that, in the nonlinear case, this equation is a dummy equation resulting from its derivation via the $\unicode[STIX]{x1D6FF}f$-ansatz. In principle, its solution is trivial in the nonlinear case and a time integration can be avoided (Allfrey & Hatzky Reference Allfrey and Hatzky2003). Nevertheless, in the linear case, the evolution equation for the weights does need to be solved. In addition, the question of the quantitative reduction of the statistical error for a given $\unicode[STIX]{x1D6FF}f$ remained unanswered. Both problems show that a solid mathematical framework for a rigorous description was lacking. Finally, Aydemir (Reference Aydemir1994) showed that the conventional $\unicode[STIX]{x1D6FF}f$-method for PIC can be traced back to a standard Monte Carlo technique, the control variates method, which is widely used in the Monte Carlo community for variance reduction. Usually, the variance, and with it the statistical error, can be computed numerically within the simulation, so that it is possible to quantify and to monitor the error reduction achieved by the $\unicode[STIX]{x1D6FF}f$-method as a control variates method.
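The control variates idea behind the $\unicode[STIX]{x1D6FF}f$-method can be illustrated with a one-dimensional toy problem. The sketch below is an illustrative assumption, not the gyrokinetic system: the distribution $f=f_{0}(1+\unicode[STIX]{x1D700}z)$ with a standard normal background $f_{0}$, the moment $z$ and all parameter values are invented for the demonstration. The known background moment is added analytically and only the small $\unicode[STIX]{x1D6FF}f$ part is sampled:

```python
import numpy as np

rng = np.random.default_rng(0)

eps = 0.05            # amplitude of delta f relative to f0 (illustrative assumption)
N = 100_000

# markers sampled from the background itself, g = f0 = N(0, 1)
z = rng.standard_normal(N)

# toy distribution f = f0*(1 + eps*z): full-f and delta-f weights
w_full = 1.0 + eps * z      # w = f/g
w_delta = eps * z           # w = (f - f0)/g

phi = z                     # moment phi(z) = z; the exact answer is eps

I_full = np.mean(phi * w_full)        # conventional Monte Carlo estimate
I_cv = 0.0 + np.mean(phi * w_delta)   # control variates: analytic f0 moment (= 0)
                                      # plus the sampled delta-f part

err_full = np.std(phi * w_full, ddof=1) / np.sqrt(N)
err_cv = np.std(phi * w_delta, ddof=1) / np.sqrt(N)
```

For $\unicode[STIX]{x1D700}\ll 1$ the statistical error of the control variates estimate is smaller by a factor roughly proportional to $\unicode[STIX]{x1D700}$, while both estimators target the same exact moment.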

In principle, it is possible to use the $\unicode[STIX]{x1D6FF}f$-method for electromagnetic gyrokinetic PIC simulations (Mishchenko, Hatzky & Könies Reference Mishchenko, Hatzky and Könies2004a ), but the time-step size can be quite restrictive and the required number of markers can be so large that PIC simulations become unfeasible. Hence, over the past three decades, many attempts have been made to improve electromagnetic PIC algorithms in two respects: increasing the time-step size (Kleiber et al. Reference Kleiber, Hatzky, Könies, Mishchenko and Sonnendrücker2016) and reducing the statistical error (Mishchenko et al. Reference Mishchenko, Bottino, Hatzky, Sonnendrücker, Kleiber and Könies2017). In the following, we will focus on the latter in the context of simulations of damped magnetohydrodynamic (MHD) modes.

The symplectic ( $v_{\Vert }$ -) formulation of the gyrokinetic Vlasov–Maxwell system has the disadvantage that a partial time derivative of the perturbed parallel magnetic potential $\unicode[STIX]{x1D6FF}A_{\Vert }$ occurs in the parallel dynamics. Hence, it is very difficult to integrate the parallel dynamics of the markers numerically in time. In contrast, the so-called ‘ $p_{\Vert }$ -formulation’ (Hahm, Lee & Brizard Reference Hahm, Lee and Brizard1988), also called the Hamiltonian formulation, of the gyrokinetic Vlasov–Maxwell system does not have this problem, but it suffers from a large statistical variance in the evaluation of the moments of the distribution function.

The evolution of the gyrokinetic distribution function depends on the chosen formulation: the gyro-centre distribution function in the Hamiltonian ( $p_{\Vert }$ -) formulation, $f^{\text{h}}$ , evolves differently from the distribution function in the symplectic ( $v_{\Vert }$ -) formulation, $f^{\text{s}}$ . In particular, the $p_{\Vert }$ -coordinate is shifted relative to the $v_{\Vert }$ -coordinate by a linear coordinate transformation. The distribution function is shifted accordingly, with the result that the background $f_{0}$ in the $\unicode[STIX]{x1D6FF}f$ -ansatz of the Hamiltonian formulation becomes misaligned. As a consequence, the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ , develops a so-called ‘adiabatic (Boltzmann) part’ in comparison to $\unicode[STIX]{x1D6FF}f^{\text{s}}$ . The adiabatic part can be so pronounced that its contribution to the variance becomes dominant. In such a case, its complement, the physically relevant non-adiabatic part of the distribution function, is exceeded by the adiabatic response to the perturbed magnetic potential $\unicode[STIX]{x1D6FF}A_{\Vert }$ . This produces a severe signal-to-noise problem, especially in high-density, i.e. high-beta, cases and/or at small perpendicular wavenumbers $k_{\bot }$ , where the relevant signal is so small that it is usually swamped by the statistical noise.

As the evaluation of the moments of the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{s}}$ , does not suffer from the spurious adiabatic part, it makes sense to evaluate the moments in the symplectic formulation instead. Therefore, we propose a hybrid scheme which evolves the equations of motion, and thus the distribution function, in the Hamiltonian formulation, but evaluates the potential equations in the symplectic formulation. For this, a transformation from $\unicode[STIX]{x1D6FF}f^{\text{h}}$ to $\unicode[STIX]{x1D6FF}f^{\text{s}}$ becomes necessary. In the linear case, the derived scheme is equivalent to an enhanced control variates method (Hatzky, Könies & Mishchenko Reference Hatzky, Könies and Mishchenko2007) which was introduced to solve this inaccuracy problem.

Although the basic idea of the proposed algorithm is straightforward, the required accuracy of the numerical implementation is very high. This is a consequence of the fact that in the MHD limit ( $k_{\bot }\unicode[STIX]{x1D70C}\rightarrow 0$ ) the ratio between the adiabatic and non-adiabatic parts of the distribution function can exceed three orders of magnitude. Therefore, the implemented transformation of the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ to $\unicode[STIX]{x1D6FF}f^{\text{s}}$ , can only work if the numerical error of $\unicode[STIX]{x1D6FF}f^{\text{h}}$ is significantly smaller than the magnitude of $\unicode[STIX]{x1D6FF}f^{\text{s}}$ . Small inconsistencies at the level of the discretisation and implementation, which are usually unimportant, have to be avoided to be able to run an MHD mode simulation within a gyrokinetic model. More complex geometries, e.g. large tokamaks with a small aspect ratio, are especially challenging.

This situation is aggravated by the fact that we also want to simulate damped modes. For a damped mode the inherent statistical noise accumulates over time whilst for a sufficiently unstable mode the simulation can keep the noise at bay. In addition, the MHD modes become marginally stable in the MHD limit so that the damping rates become arbitrarily small. Therefore, the resolving power of the PIC code must be high enough to yield quantitatively correct damping rates. This is one of the most challenging tasks for PIC codes which demands a precise discretisation of the model equations and the reduction of all sources of error to a minimum. To be able to reproduce our approach it is indispensable to describe the numerics in great detail.

In this context, we will address the following topics:

(i) Shifted Maxwellian as background distribution function $f_{0}$ .

(ii) Effect of the full phase-space Jacobian.

(iii) Effect of a limited velocity sphere.

(iv) Solving the potential equations in the symplectic formulation.

(v) Consistent finite element approach for the discretisation of the parallel electric field perturbation in the parallel dynamics.

## 2 The statistical error in Monte Carlo integration

The evaluation of the moments of the distribution function $f$ via a Monte Carlo approach always involves a statistical error. Such moments are derived by evaluating integrals over the phase-space volume $\unicode[STIX]{x1D6FA}$ :

(2.1) $$\begin{eqnarray}I(\tilde{\unicode[STIX]{x1D6EC}})\stackrel{\text{def}}{=}\int _{\unicode[STIX]{x1D6FA}}\tilde{\unicode[STIX]{x1D6EC}}(\boldsymbol{z})f(\boldsymbol{z})\,{\mathcal{J}}_{z}\,\text{d}^{6}z,\end{eqnarray}$$

where $\tilde{\unicode[STIX]{x1D6EC}}(\boldsymbol{z})$ is a general function of the phase-space coordinates $\boldsymbol{z}$ with the Jacobian  ${\mathcal{J}}_{z}$ . For example, $I(\tilde{\unicode[STIX]{x1D6EC}})$ would be the total particle number if $\tilde{\unicode[STIX]{x1D6EC}}=1$ , and the integral is evaluated over the whole phase space. If $\boldsymbol{z}$ is a continuous random variable in $\mathbb{R}^{n}$ with a probability density  $g(\boldsymbol{z})$ in the phase-space volume $\unicode[STIX]{x1D6FA}$ such that

(2.2a,b ) $$\begin{eqnarray}\int _{\unicode[STIX]{x1D6FA}}g(\boldsymbol{z}){\mathcal{J}}_{z}\,\text{d}^{6}z=1\quad \text{and}\quad g(\boldsymbol{z})>0,\end{eqnarray}$$

we can draw an independent random sample ( $\boldsymbol{z}_{1},\boldsymbol{z}_{2},\ldots ,\boldsymbol{z}_{N_{\text{p}}}$ ) to place our $N_{\text{p}}$ Monte Carlo sampling points (markers) within phase space. Then the integral for $I(\tilde{\unicode[STIX]{x1D6EC}})$ can be written as an expected value with respect to the distribution $g(\boldsymbol{z})$

(2.3) $$\begin{eqnarray}I(\tilde{\unicode[STIX]{x1D6EC}})=\text{E}[X]\stackrel{\text{def}}{=}\int _{\unicode[STIX]{x1D6FA}}X(\boldsymbol{z})\,g(\boldsymbol{z}){\mathcal{J}}_{z}\,\text{d}^{6}z\quad \text{with }X(\boldsymbol{z})\stackrel{\text{def}}{=}\frac{\tilde{\unicode[STIX]{x1D6EC}}(\boldsymbol{z})f(\boldsymbol{z})}{g(\boldsymbol{z})},\end{eqnarray}$$

where $\text{E}[X]$ is the expected value of the random variable $X$ . In addition, we define the variance of $X$ by

(2.4) $$\begin{eqnarray}\text{Var}_{g}[X]\stackrel{\text{def}}{=}\text{E}[(X-\text{E}[X])^{2}].\end{eqnarray}$$

Thus, in contrast to the expected value, which is in our case (see  (2.3)) independent of the marker distribution function  $g(\boldsymbol{z})$ , the choice of $g(\boldsymbol{z})$ strongly influences the value of the variance.

An unbiased Monte Carlo estimator for the integral $I(\tilde{\unicode[STIX]{x1D6EC}})$ , which approximates $\text{E}[X]$ up to the statistical error $\unicode[STIX]{x1D700}_{\text{E}}$ , is given by the sum over all marker weights  $w_{p}$ :

(2.5) $$\begin{eqnarray}I(\tilde{\unicode[STIX]{x1D6EC}})=\text{E}[X]=\frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\tilde{\unicode[STIX]{x1D6EC}}(\boldsymbol{z}_{p})\,w_{p}\,\pm \,\unicode[STIX]{x1D700}_{\text{E}}\quad \text{where }w_{p}\stackrel{\text{def}}{=}\frac{f(\boldsymbol{z}_{p})}{g(\boldsymbol{z}_{p})}.\end{eqnarray}$$

Accordingly, an unbiased estimator for the variance can be given:

(2.6) $$\begin{eqnarray}\text{Var}_{g}[X]=\text{E}[X^{2}]-(\text{E}[X])^{2}=\frac{1}{N_{\text{p}}-1}\left(\mathop{\sum }_{p=1}^{N_{\text{p}}}[\tilde{\unicode[STIX]{x1D6EC}}(\boldsymbol{z}_{p})\,w_{p}]^{2}-\frac{1}{N_{\text{p}}}\left[\mathop{\sum }_{p=1}^{N_{\text{p}}}\tilde{\unicode[STIX]{x1D6EC}}(\boldsymbol{z}_{p})\,w_{p}\right]^{2}\right)\,\pm \unicode[STIX]{x1D700}_{\text{Var}},\end{eqnarray}$$

where $\unicode[STIX]{x1D700}_{\text{Var}}$ denotes the statistical error of the variance.

Finally, the statistical error of $\text{E}[X]$ for a number of $N_{\text{p}}$ markers is given as

(2.7) $$\begin{eqnarray}\unicode[STIX]{x1D700}_{\text{E}}\simeq \frac{\unicode[STIX]{x1D70E}}{\sqrt{N_{\text{p}}}},\end{eqnarray}$$

with the definition of the standard deviation $\unicode[STIX]{x1D70E}=\sqrt{\text{Var}_{g}[X]}$ . One sees that the convergence rate of $1/\sqrt{N_{\text{p}}}$ is quite poor, i.e. to halve the error one needs four times more markers. Hence, it would be much more efficient if the standard deviation $\unicode[STIX]{x1D70E}$ could be reduced. Fortunately, for the Monte Carlo approach there exist variance reduction methods, e.g. the control variates method (see appendix A), which pursue exactly this aim. The effectiveness of such a method can easily be monitored by calculating the variance before and after applying it.
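The estimators (2.5)–(2.7) can be sketched in a few lines of Python. The one-dimensional distribution $f$ (carrying total particle number 2), the Gaussian marker distribution $g$ and the unit Jacobian are illustrative assumptions, not the gyrokinetic phase space:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(z):
    """toy 1-D distribution: a Maxwellian carrying total particle number 2"""
    return 2.0 * np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def estimate(Np):
    z = rng.normal(0.0, 2.0, Np)       # markers drawn from g = N(0, 2^2)
    g = np.exp(-0.5 * (z / 2.0)**2) / (2.0 * np.sqrt(2.0 * np.pi))
    w = f(z) / g                       # marker weights, eq. (2.5)
    X = w                              # Lambda = 1: estimate the total particle number
    var = X.var(ddof=1)                # unbiased sample variance, eq. (2.6)
    return X.mean(), np.sqrt(var / Np) # estimate and statistical error, eq. (2.7)

I1, e1 = estimate(10_000)
I2, e2 = estimate(40_000)   # four times more markers should roughly halve the error
```

Comparing `e1` and `e2` exhibits the $1/\sqrt{N_{\text{p}}}$ convergence discussed above.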

### 2.1 Some useful properties of the weights

Using weights gives a high degree of flexibility. With just one marker distribution function  $g$ different distribution functions $f$ and $\tilde{f}$ can be expressed by just rescaling the weights:

(2.8) $$\begin{eqnarray}\tilde{w}_{p}\stackrel{\text{def}}{=}\frac{\tilde{f}(\boldsymbol{z}_{p})}{g(\boldsymbol{z}_{p})}=\frac{\tilde{f}(\boldsymbol{z}_{p})}{f(\boldsymbol{z}_{p})}\frac{f(\boldsymbol{z}_{p})}{g(\boldsymbol{z}_{p})}=\frac{\tilde{f}_{p}}{f_{p}}\,w_{p}.\end{eqnarray}$$

Here we have introduced the abbreviation $f_{p}=f(\boldsymbol{z}_{p})$ to indicate that a quantity is evaluated at the marker position $\boldsymbol{z}_{p}$ in phase space.

If we express the marker distribution function  $g$ itself by the weights, the weights become equal to one. In such a case, we can easily check that the normalisation condition of the marker distribution function, equation (2.2), is fulfilled by our unbiased estimator for the expected value:

(2.9) $$\begin{eqnarray}\frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\left.w_{p}\right|_{f_{p}=g_{p}}=\frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}1=1.\end{eqnarray}$$

In addition, we can define a phase-space volume  $\unicode[STIX]{x1D6FA}_{p}$ for each marker:

(2.10) $$\begin{eqnarray}\unicode[STIX]{x1D6FA}\simeq \frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\left.w_{p}\right|_{f_{p}=1}=\frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\frac{1}{g_{p}}=\mathop{\sum }_{p=1}^{N_{\text{p}}}\frac{1}{N_{\text{p}}g_{p}}=\mathop{\sum }_{p=1}^{N_{\text{p}}}\unicode[STIX]{x1D6FA}_{p}\quad \text{where }\unicode[STIX]{x1D6FA}_{p}\stackrel{\text{def}}{=}\frac{1}{N_{\text{p}}g_{p}}.\end{eqnarray}$$

Note that the phase-space volume  $\unicode[STIX]{x1D6FA}_{p}$ as defined here is a constant of motion along the trajectories as long as $\text{d}g/\text{d}t=0$ , which is the case for a Hamiltonian flow.
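The weight identities (2.9) and (2.10) can be checked directly. The uniform one-dimensional marker distribution below is an illustrative assumption; with uniform $g$ the volume estimate is exact up to rounding:

```python
import numpy as np

rng = np.random.default_rng(2)

# markers uniform on a 1-D "phase space" [0, Omega], so g = 1/Omega
Omega = 5.0
Np = 50_000
z = rng.uniform(0.0, Omega, Np)     # marker positions (unused by the weights here)
g = np.full(Np, 1.0 / Omega)

# f = g: every weight is exactly one and eq. (2.9) holds trivially
w = g / g

# f = 1: the per-marker volumes Omega_p sum to the phase-space volume, eq. (2.10)
Omega_p = 1.0 / (Np * g)
Omega_est = Omega_p.sum()
```

For a non-uniform $g$ the same sum would reproduce $\unicode[STIX]{x1D6FA}$ only up to the statistical error (2.7).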

## 3 The gyrokinetic model

### 3.1 The model equations in the symplectic formulation

The gyrokinetic equations in the symplectic formulation ( $v_{\Vert }$ -formulation) as postulated by Kleiber et al. (Reference Kleiber, Hatzky, Könies, Mishchenko and Sonnendrücker2016) are used to calculate the time evolution of the gyro-centre distribution function $f^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)$ of each species:

(3.1) $$\begin{eqnarray}\frac{\text{d}f^{\text{s}}}{\text{d}t}=\frac{\unicode[STIX]{x2202}f^{\text{s}}}{\unicode[STIX]{x2202}t}+\unicode[STIX]{x1D735}f^{\text{s}}\boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}}{\text{d}t}+\frac{\unicode[STIX]{x2202}f^{\text{s}}}{\unicode[STIX]{x2202}v_{\Vert }}\frac{\text{d}v_{\Vert }}{\text{d}t}+\frac{\unicode[STIX]{x2202}f^{\text{s}}}{\unicode[STIX]{x2202}\tilde{\unicode[STIX]{x1D707}}}\frac{\text{d}\tilde{\unicode[STIX]{x1D707}}}{\text{d}t}=0,\end{eqnarray}$$

where $\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}}$ are the gyro-centre position, parallel velocity and magnetic moment per unit mass. The gyro-centre distribution function is gyrotropic, i.e. it is independent of the gyro-phase angle  $\unicode[STIX]{x1D6FC}$ of the gyration. Without loss of generality we restrict ourselves to a model which includes just ions and electrons.

The key idea of the Monte Carlo method is that the markers follow the characteristics of (3.1), i.e. the trajectories of the physical particles of the plasma. Therefore, the marker distribution  $g^{\text{s}}$ has to evolve in the same way as the phase-space distribution function  $f^{\text{s}}$ :

(3.2) $$\begin{eqnarray}\frac{\text{d}g^{\text{s}}}{\text{d}t}=\frac{\unicode[STIX]{x2202}g^{\text{s}}}{\unicode[STIX]{x2202}t}+\unicode[STIX]{x1D735}g^{\text{s}}\boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}}{\text{d}t}+\frac{\unicode[STIX]{x2202}g^{\text{s}}}{\unicode[STIX]{x2202}v_{\Vert }}\frac{\text{d}v_{\Vert }}{\text{d}t}+\frac{\unicode[STIX]{x2202}g^{\text{s}}}{\unicode[STIX]{x2202}\tilde{\unicode[STIX]{x1D707}}}\frac{\text{d}\tilde{\unicode[STIX]{x1D707}}}{\text{d}t}=0.\end{eqnarray}$$

The equations of motion for the gyro-centre trajectories in reduced phase space $(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})$ are

(3.3) $$\begin{eqnarray}\displaystyle \frac{\text{d}\boldsymbol{R}}{\text{d}t} & = & \displaystyle \,v_{\Vert }\boldsymbol{b}+\frac{1}{B_{\Vert }^{\ast \text{s}}}\boldsymbol{b}\times \unicode[STIX]{x1D735}(\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\rangle -v_{\Vert }\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle )\nonumber\\ \displaystyle & & \displaystyle +\,\frac{m}{q}\left[\frac{\tilde{\unicode[STIX]{x1D707}}B+v_{\Vert }^{2}}{BB_{\Vert }^{\ast \text{s}}}\boldsymbol{b}\times \unicode[STIX]{x1D735}B+\frac{v_{\Vert }^{2}}{BB_{\Vert }^{\ast \text{s}}}(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }\right]+\frac{v_{\Vert }}{B_{\Vert }^{\ast \text{s}}}\boldsymbol{b}\times \unicode[STIX]{x1D73F}\,\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle ,\end{eqnarray}$$
(3.4) $$\begin{eqnarray}\displaystyle \frac{\text{d}v_{\Vert }}{\text{d}t} & = & \displaystyle -\frac{q}{m}\left(\frac{\unicode[STIX]{x2202}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }{\unicode[STIX]{x2202}t}+\boldsymbol{b}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\rangle +\frac{1}{B_{\Vert }^{\ast \text{s}}}\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \boldsymbol{\cdot }\boldsymbol{b}\times \unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\rangle \right)\nonumber\\ \displaystyle & & \displaystyle -\,\frac{\tilde{\unicode[STIX]{x1D707}}}{B_{\Vert }^{\ast \text{s}}}\left[\boldsymbol{b}\times \unicode[STIX]{x1D735}B\boldsymbol{\cdot }\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle +\frac{1}{B}\unicode[STIX]{x1D735}B\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right]\nonumber\\ \displaystyle & & \displaystyle -\,\frac{1}{B_{\Vert }^{\ast \text{s}}}\!\left(v_{\Vert }+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \!\right)\boldsymbol{b}\times \unicode[STIX]{x1D73F}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\rangle -\tilde{\unicode[STIX]{x1D707}}\,\unicode[STIX]{x1D735}B\boldsymbol{\cdot }\!\left[\boldsymbol{b}+\frac{m}{q}\frac{v_{\Vert }}{BB_{\Vert }^{\ast \text{s}}}(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }\right]\!,\quad\end{eqnarray}$$
(3.5) $$\begin{eqnarray}\frac{\text{d}\tilde{\unicode[STIX]{x1D707}}}{\text{d}t}=0,\end{eqnarray}$$

with $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}$ the perturbed electrostatic potential, $\unicode[STIX]{x1D6FF}A_{\Vert }$ the perturbed parallel magnetic potential and $q$ and $m$ the charge and mass of the species. Furthermore, we have $\boldsymbol{b}\times \unicode[STIX]{x1D73F}=[\boldsymbol{b}\times \unicode[STIX]{x1D735}B+(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }]/B$ , $\boldsymbol{B}^{\ast \text{s}}=\unicode[STIX]{x1D735}\times \boldsymbol{A}^{\ast \text{s}}$ , $\boldsymbol{A}^{\ast \text{s}}=\boldsymbol{A}_{0}+(mv_{\Vert }/q+\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle )\boldsymbol{b}$ the so-called ‘modified vector potential’,  $\boldsymbol{A}_{0}$ the background magnetic vector potential corresponding to the equilibrium magnetic field $\boldsymbol{B}=\unicode[STIX]{x1D735}\times \boldsymbol{A}_{0}$ and $\boldsymbol{b}=\boldsymbol{B}/B$ its unit vector. The Jacobian of phase space ${\mathcal{J}}_{z}^{\text{s}}$ is given in the symplectic formulation by

(3.6) $$\begin{eqnarray}{\mathcal{J}}_{z}^{\text{s}}=B_{\Vert }^{\ast \text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})=\boldsymbol{b}\boldsymbol{\cdot }\boldsymbol{B}^{\ast \text{s}}=B+\left(\frac{m}{q}v_{\Vert }+\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b}).\end{eqnarray}$$

The corresponding gyro-averaged potentials are defined as

(3.7a,b ) $$\begin{eqnarray}\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\rangle (\boldsymbol{R},\tilde{\unicode[STIX]{x1D707}})\stackrel{\text{def}}{=}\frac{1}{2\unicode[STIX]{x03C0}}\mathop{\oint }_{0}^{2\unicode[STIX]{x03C0}}\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}(\boldsymbol{R}+\unicode[STIX]{x1D746})\,\text{d}\unicode[STIX]{x1D6FC},\quad \langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle (\boldsymbol{R},\tilde{\unicode[STIX]{x1D707}})\stackrel{\text{def}}{=}\frac{1}{2\unicode[STIX]{x03C0}}\mathop{\oint }_{0}^{2\unicode[STIX]{x03C0}}\unicode[STIX]{x1D6FF}A_{\Vert }(\boldsymbol{R}+\unicode[STIX]{x1D746})\,\text{d}\unicode[STIX]{x1D6FC},\end{eqnarray}$$

where $\unicode[STIX]{x1D746}$ is the gyroradius vector perpendicular to $\boldsymbol{b}$ which can be parametrised by the gyro-phase angle  $\unicode[STIX]{x1D6FC}$

(3.8a,b ) $$\begin{eqnarray}\unicode[STIX]{x1D746}(\boldsymbol{R},\tilde{\unicode[STIX]{x1D707}},\unicode[STIX]{x1D6FC})\stackrel{\text{def}}{=}\unicode[STIX]{x1D70C}[\cos (\unicode[STIX]{x1D6FC})\boldsymbol{e}_{\bot 1}-\sin (\unicode[STIX]{x1D6FC})\boldsymbol{e}_{\bot 2}]\quad \text{and}\quad \unicode[STIX]{x1D70C}\stackrel{\text{def}}{=}\frac{\sqrt{2B\tilde{\unicode[STIX]{x1D707}}}}{\unicode[STIX]{x1D6FA}_{\text{c}}},\end{eqnarray}$$

with $\unicode[STIX]{x1D6FA}_{\text{c}}=|q|B/m$ the cyclotron frequency. Note that $\unicode[STIX]{x1D6FC}$ also corresponds to the angular coordinate in velocity space, but we are not interested here in the actual direction of $\unicode[STIX]{x1D746}$ , which is determined by the fast gyromotion. In the equations of motion the gradients of the gyro-averaged potentials, $\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\rangle$ and $\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ , have to be known. They can be calculated analytically as shown in appendix B. Whenever gyro-averaging is done, we approximate the orientation of the gyro-ring to lie in the poloidal plane of the plasma device. Furthermore, we assume a reflecting boundary condition for the gyro-ring, i.e. the part of the gyro-ring leaving the simulation domain is reflected back into the domain.
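In PIC codes the gyro-average (3.7) is commonly discretised by a sum over a few points on the gyro-ring. The sketch below is an illustrative assumption (planar geometry, a single Fourier-mode test field, no boundary reflection): it compares the common 4-point average with a well-resolved reference, which for this field approaches $J_{0}(k\unicode[STIX]{x1D70C})\cos (kR_{x})$ :

```python
import numpy as np

def gyro_average(phi, R, rho, n_alpha):
    """discrete version of eq. (3.7): average phi over n_alpha points
    on the gyro-ring of radius rho around the gyro-centre R"""
    alpha = 2.0 * np.pi * np.arange(n_alpha) / n_alpha
    x = R[0] + rho * np.cos(alpha)
    y = R[1] - rho * np.sin(alpha)
    return np.mean(phi(x, y))

k = 1.3                              # perpendicular wavenumber (assumption)
phi = lambda x, y: np.cos(k * x)     # single perpendicular Fourier mode

R, rho = np.array([0.7, 0.0]), 0.9
avg4 = gyro_average(phi, R, rho, 4)     # common 4-point gyro-average
ref = gyro_average(phi, R, rho, 512)    # converged reference
```

Because the integrand is periodic and smooth in the gyro-phase, the equidistant quadrature converges spectrally, which is why a handful of ring points already suffices for moderate $k_{\bot }\unicode[STIX]{x1D70C}$ .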

In slab and cylindrical geometry it is possible to use the flux label as the radial coordinate. In tokamak geometry, the toroidal canonical momentum $\unicode[STIX]{x1D6F9}_{0}$ is a conserved quantity and can instead be used as the radial coordinate. However, this has the disadvantage that the temperature and density profiles are usually given as functions of the toroidal flux $\unicode[STIX]{x1D6F9}$ (see appendix C), which makes their conversion to functions of the toroidal canonical momentum $\unicode[STIX]{x1D6F9}_{0}$ quite complex (see Angelino et al. Reference Angelino, Bottino, Hatzky, Jolliet, Sauter, Tran and Villard2006). In general geometry, i.e. non-quasi-symmetric stellarator geometry, there exists, apart from the energy, no other constant of motion. Furthermore, a more complete physical model would include collisions. At lowest order, we can introduce a collision operator ${\mathcal{C}}[f_{0}]$ which acts only on the background distribution function. As a result, the local Maxwellian $f_{\text{M}}$ is a solution of the equation

(3.9) $$\begin{eqnarray}\frac{\text{d}f_{0}}{\text{d}t}={\mathcal{C}}[f_{0}].\end{eqnarray}$$

Hence, we define the local Maxwellian for species $s=\text{i},\text{e}$ (ions and electrons) as a function of the toroidal flux $\unicode[STIX]{x1D6F9}$

(3.10) $$\begin{eqnarray}f_{\text{M}s}\stackrel{\text{def}}{=}\frac{\hat{n}_{0s}(\unicode[STIX]{x1D6F9})}{(2\unicode[STIX]{x03C0})^{3/2}\,v_{\text{th}s}^{3}(\unicode[STIX]{x1D6F9})}\exp \left[-\frac{\tilde{E}_{s}}{v_{\text{th}s}^{2}(\unicode[STIX]{x1D6F9})}\right],\end{eqnarray}$$

where the particle energy per unit mass $\tilde{E}_{s}$ and thermal velocity $v_{\text{th}s}$ are defined as

(3.11a,b ) $$\begin{eqnarray}\tilde{E}_{s}\stackrel{\text{def}}{=}\tilde{\unicode[STIX]{x1D707}}B(\boldsymbol{R})+\frac{[v_{\Vert }-\hat{u} _{0s}(\unicode[STIX]{x1D6F9})]^{2}}{2},\quad v_{\text{th}s}(\unicode[STIX]{x1D6F9})\stackrel{\text{def}}{=}\sqrt{\frac{k_{\text{B}}T_{s}(\unicode[STIX]{x1D6F9})}{m_{s}}}\end{eqnarray}$$

and where $k_{\text{B}}$ is the Boltzmann constant and $T_{s}$ the temperature. The profiles $\hat{n}_{0s}$ and $\hat{u} _{0s}$ are the density and mean velocity of the local Maxwellian in guiding centre coordinates.

The parallel background and parallel perturbed magnetic potential $A_{0\Vert }=\boldsymbol{b}\boldsymbol{\cdot }\boldsymbol{A}_{0}$ and $\unicode[STIX]{x1D6FF}A_{\Vert }$ add up to

(3.12) $$\begin{eqnarray}A_{\Vert }\stackrel{\text{def}}{=}A_{0\Vert }+\unicode[STIX]{x1D6FF}A_{\Vert }.\end{eqnarray}$$

The particle and current number density are defined as

(3.13) $$\begin{eqnarray}\displaystyle & \displaystyle n_{s}^{\text{s}}(\boldsymbol{x})\stackrel{\text{def}}{=}\int f_{s}^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{s}}\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}, & \displaystyle\end{eqnarray}$$
(3.14) $$\begin{eqnarray}\displaystyle & \displaystyle j_{\Vert s}^{\text{s}}(\boldsymbol{x})\stackrel{\text{def}}{=}q_{s}\int v_{\Vert }\,f_{s}^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{s}}\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}. & \displaystyle\end{eqnarray}$$

The integration is performed over phase space ${\mathcal{J}}_{z}^{\text{s}}\,\text{d}^{6}z=B_{\Vert }^{\ast \text{s}}\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}$ . The gyro-averaged particle and current number density are defined as

(3.15) $$\begin{eqnarray}\displaystyle & \displaystyle \bar{n}_{s}^{\text{s}}(\boldsymbol{x})\stackrel{\text{def}}{=}\int f_{s}^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{s}}\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}, & \displaystyle\end{eqnarray}$$
(3.16) $$\begin{eqnarray}\displaystyle & \displaystyle \bar{j}_{\Vert s}^{\text{s}}(\boldsymbol{x})\stackrel{\text{def}}{=}q_{s}\int v_{\Vert }\,f_{s}^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{s}}\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}. & \displaystyle\end{eqnarray}$$

The same definitions are applied to the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f$ , which leads to the quantities: $\unicode[STIX]{x1D6FF}n^{\text{s}}$ , $\unicode[STIX]{x1D6FF}j_{\Vert }^{\text{s}}$ and $\unicode[STIX]{x1D6FF}\bar{n}^{\text{s}}$ , $\unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{s}}$ .

At an equilibrium state we have $\unicode[STIX]{x1D6FF}A_{\Vert }(t_{0})=0$ and thus the coordinates between the symplectic and Hamiltonian ( $p_{\Vert }$ -)formulation do not differ at the beginning of the simulation, i.e. at $t_{0}$ (compare with § 3.2). In such a case, the symplectic Jacobian of phase space $B_{\Vert }^{\ast \text{s}}$ reduces to the unperturbed one, $B_{\Vert }^{\ast \text{h}}$ (see (3.56)). Hence, the background particle and current number density become the same for the symplectic and Hamiltonian formulation at $t_{0}$ . They are defined as

(3.17) $$\begin{eqnarray}\displaystyle & \displaystyle n_{0s}(\boldsymbol{x})=n_{0s}^{\text{h}}\stackrel{\text{def}}{=}\int f_{0s}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}(\boldsymbol{R},v_{\Vert })\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}, & \displaystyle\end{eqnarray}$$
(3.18) $$\begin{eqnarray}\displaystyle & \displaystyle j_{0\Vert s}(\boldsymbol{x})=j_{0\Vert s}^{\text{h}}\stackrel{\text{def}}{=}q_{s}\int v_{\Vert }\,f_{0s}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}(\boldsymbol{R},v_{\Vert })\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}. & \displaystyle\end{eqnarray}$$

Consistently, we define the particle and current number density of the background $f_{0}$ with gyro-averaging: $\bar{n}_{0s}$ and $\bar{j}_{0\Vert s}$ . The average speed $u_{0s}$ of the background distribution function $f_{0s}$ is given by

(3.19) $$\begin{eqnarray}j_{0\Vert s}(\boldsymbol{x})=q_{s}n_{0s}u_{0s}\quad \text{where }u_{0s}\stackrel{\text{def}}{=}\frac{j_{0\Vert s}}{q_{s}n_{0s}}.\end{eqnarray}$$

If the Jacobian of phase space is approximated in (3.17) and (3.18) by $B_{\Vert }^{\ast \text{h}}\simeq B$ and if we use $f_{0s}=f_{\text{M}s}$ , we can identify $\hat{n}_{0s}$ and $\hat{u} _{0s}$ with the background particle number density  $n_{0s}$ and the average speed $u_{0s}$ :

(3.20a,b ) $$\begin{eqnarray}\hat{n}_{0s}\simeq n_{0s},\quad \hat{u} _{0s}\simeq u_{0s}.\end{eqnarray}$$

However, when using the complete Jacobian of phase space the relation between these quantities becomes more complex:

(3.21) $$\begin{eqnarray}\displaystyle & \displaystyle n_{0s}(\boldsymbol{x})=\hat{n}_{0s}\left[1+\frac{m_{s}}{q_{s}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\hat{u} _{0s}\right], & \displaystyle\end{eqnarray}$$
(3.22) $$\begin{eqnarray}\displaystyle & \displaystyle u_{0s}(\boldsymbol{x})=\hat{u} _{0s}\left[1+\frac{m_{s}}{q_{s}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\frac{v_{\text{th}s}^{2}+\hat{u} _{0s}^{2}}{\hat{u} _{0s}}\right]\frac{\hat{n}_{0s}}{n_{0s}}, & \displaystyle\end{eqnarray}$$
(3.23) $$\begin{eqnarray}\displaystyle & \displaystyle j_{0\Vert s}(\boldsymbol{x})=q_{s}\hat{n}_{0s}\left[\hat{u} _{0s}+\frac{m_{s}}{q_{s}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}(v_{\text{th}s}^{2}+\hat{u} _{0s}^{2})\right]. & \displaystyle\end{eqnarray}$$

From (3.23) we can conclude that, depending on the background magnetic field, it is possible to have a contribution to the background current  $j_{0\Vert }$ even for an unshifted Maxwellian, i.e.  $\hat{u} _{0}=0$ . In other words, the unshifted Maxwellian in gyro-centre phase space corresponds to a shifted Maxwellian in real phase space.
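As a cross-check of (3.21)–(3.23), the following minimal Python sketch (all parameter values are arbitrary normalised test numbers, not taken from the text) integrates a shifted Maxwellian against the reduced Jacobian $B+(m/q)v_{\Vert }\,\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})$ at a fixed point and recovers the corrected background density and current:

```python
import numpy as np

# Numerical check of (3.21) and (3.23): velocity moments of a shifted
# Maxwellian taken with the full Jacobian B*_par = B + (m/q) v_par K,
# where K = b.(curl b) is evaluated at a fixed point.  All values are
# arbitrary normalised test numbers (q_s = 1 for simplicity).
m_over_q = 2.0                        # m_s / q_s (assumed test value)
B = 1.0                               # background field strength
K = 0.3                               # b . (curl b) at the point (assumed)
n_hat, u_hat, v_th = 1.0, 0.4, 1.0    # Maxwellian parameters (assumed)

v = np.linspace(-12.0, 12.0, 200001)  # parallel-velocity grid
dv = v[1] - v[0]
f_M = n_hat / (np.sqrt(2.0 * np.pi) * v_th) \
      * np.exp(-(v - u_hat)**2 / (2.0 * v_th**2))  # shifted Maxwellian
B_star = B + m_over_q * v * K         # reduced phase-space Jacobian

# density and parallel current with the full Jacobian
n0 = np.sum(f_M * B_star) * dv / B
j0 = np.sum(v * f_M * B_star) * dv / B

# analytic predictions (3.21) and (3.23)
n0_ref = n_hat * (1.0 + m_over_q * K / B * u_hat)
j0_ref = n_hat * (u_hat + m_over_q * K / B * (v_th**2 + u_hat**2))

print(n0, n0_ref)   # agree to quadrature accuracy
print(j0, j0_ref)
```

Setting `u_hat = 0` in the sketch still yields `j0 = n_hat * m_over_q * (K / B) * v_th**2`, illustrating the non-vanishing background current of an unshifted Maxwellian discussed above.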

The quasi-neutrality equation and parallel Ampère’s law close the self-consistent gyrokinetic Vlasov–Maxwell system. In the derivation of the potential equations, we linearise the $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}$ -terms by approximating the distribution function  $f$ by the background distribution function  $f_{0}$ , which is assumed to be a shifted Maxwellian $f_{\text{M}s}$ . Furthermore, a long-wavelength approximation is applied to the quasi-neutrality condition, where the gyroradius-dependent quantities of the ion gyro-average are expanded up to $\text{O}((k_{\bot }\unicode[STIX]{x1D70C}_{0\text{i}})^{2})$ . Finite gyroradius effects are neglected for the electrons. In addition, we use the fact that, in a state of equilibrium, quasi-neutrality imposes

(3.24) $$\begin{eqnarray}\mathop{\sum }_{s=\text{i},\text{e}}q_{s}\bar{n}_{0s}=0,\end{eqnarray}$$

where we have assumed a vanishing background electrostatic potential $\unicode[STIX]{x1D719}_{0}=0$ . Ampère’s law is given by

(3.25) $$\begin{eqnarray}-\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D735}_{\bot }A_{0\Vert }=\unicode[STIX]{x1D707}_{0}\mathop{\sum }_{s=\text{i},\text{e}}\bar{j}_{0\Vert s}.\end{eqnarray}$$

Finally, we assume that the Maxwellian of the ions is unshifted, i.e.  $\hat{u} _{0\text{i}}=0$ .

Thus, by using the $\unicode[STIX]{x1D6FF}f$ -ansatz and taking into account that the Jacobian $B_{\Vert }^{\ast \text{s}}$ is a perturbed quantity in  $\unicode[STIX]{x1D6FF}A_{\Vert }$ (see also (4.14) and (4.15)) the quasi-neutrality condition (see Bottino Reference Bottino2004, p. 159) and Ampère’s law take the following form:

(3.26) $$\begin{eqnarray}\displaystyle & & \displaystyle -\unicode[STIX]{x1D735}\boldsymbol{\cdot }\left(\frac{q_{\text{i}}^{2}n_{0\text{i}}}{k_{\text{B}}T_{\text{i}}}\unicode[STIX]{x1D70C}_{0\text{i}}^{2}\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\right)=q_{\text{i}}\unicode[STIX]{x1D6FF}\bar{n}_{\text{i}}^{\text{s}}+q_{\text{e}}\unicode[STIX]{x1D6FF}n_{\text{e}}^{\text{s}}\nonumber\\ \displaystyle & & \displaystyle \qquad +\,q_{\text{i}}\int \frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}f_{\text{Mi}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}q_{\text{e}}\hat{n}_{0\text{e}}\,\unicode[STIX]{x1D6FF}A_{\Vert },\nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$
(3.27) $$\begin{eqnarray}-\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}A_{\Vert }-\unicode[STIX]{x1D707}_{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}q_{\text{e}}\hat{n}_{0\text{e}}\hat{u} _{0\text{e}}\,\unicode[STIX]{x1D6FF}A_{\Vert }=\unicode[STIX]{x1D707}_{0}(\unicode[STIX]{x1D6FF}\bar{j}_{\Vert \text{i}}^{\text{s}}+\unicode[STIX]{x1D6FF}j_{\Vert \text{e}}^{\text{s}}),\end{eqnarray}$$

where $\unicode[STIX]{x1D70C}_{0s}=\sqrt{m_{s}k_{\text{B}}T_{s}}/(|q_{s}|B)$ is the thermal gyroradius. Due to the quasi-neutrality condition, the terms in the second line of (3.26) would vanish if the gyro-averaging of the ions were neglected. Thus, for large-scale modes the contribution of these terms is usually small.

### 3.2 Transformation between the symplectic and Hamiltonian formulation

Before we describe the gyrokinetic model in the Hamiltonian formulation we want to introduce the coordinate transformation

(3.28) $$\begin{eqnarray}\tilde{p}_{\Vert }(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)=v_{\Vert }+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }(\boldsymbol{R},t)\rangle\end{eqnarray}$$

and its inverse

(3.29) $$\begin{eqnarray}v_{\Vert }(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)=\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }(\boldsymbol{R},t)\rangle ,\end{eqnarray}$$

which links the symplectic with the Hamiltonian formulation. It leaves the spatial coordinate $\boldsymbol{R}=\boldsymbol{R}^{\text{s}}=\boldsymbol{R}^{\text{h}}$ untouched but changes the coordinate $v_{\Vert }$ from the symplectic formulation to $\tilde{p}_{\Vert }$ of the Hamiltonian one. The difference between the $v_{\Vert }$ - and the $\tilde{p}_{\Vert }$ -coordinates depends strongly on the charge-to-mass ratio and is much more pronounced for the electrons than for the ions.
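As a minimal illustration of (3.28) and (3.29) for a single marker (the charge-to-mass ratios below are the SI values for a proton and an electron; the value of the gyro-averaged potential is an arbitrary test number), the round trip and the species dependence of the coordinate shift can be sketched as:

```python
import numpy as np

# Sketch of the coordinate transformation (3.28) and its inverse (3.29).
q_over_m_ion = 9.58e7        # proton charge-to-mass ratio (C/kg)
q_over_m_ele = -1.76e11      # electron charge-to-mass ratio (C/kg)
delta_A = 1.0e-5             # gyro-averaged delta A_par (assumed, T m)

def to_p_par(v_par, q_over_m, dA):
    """Symplectic -> Hamiltonian coordinate, equation (3.28)."""
    return v_par + q_over_m * dA

def to_v_par(p_par, q_over_m, dA):
    """Hamiltonian -> symplectic coordinate, equation (3.29)."""
    return p_par - q_over_m * dA

v = 1.0e5
p = to_p_par(v, q_over_m_ion, delta_A)
assert np.isclose(to_v_par(p, q_over_m_ion, delta_A), v)  # round trip

# The coordinate shift (q/m) <delta A_par> is larger for the electrons
# by the proton-to-electron mass ratio (~1836):
shift_ratio = abs(q_over_m_ele) / abs(q_over_m_ion)
print(shift_ratio)
```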

The transformation of the first-order partial derivatives of a function $h^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)$ in the symplectic formulation to its equivalent $h^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)$ in the Hamiltonian formulation is given by the Jacobian matrix. From this, it follows that the gradient transforms as

(3.30) $$\begin{eqnarray}\unicode[STIX]{x1D735}h^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)=\unicode[STIX]{x1D735}h^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)-\frac{q}{m}\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}A_{\Vert }(\boldsymbol{R},t)\rangle \frac{\unicode[STIX]{x2202}h^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)}{\unicode[STIX]{x2202}v_{\Vert }},\end{eqnarray}$$

where we used the fact that the partial velocity derivative does not change

(3.31) $$\begin{eqnarray}\frac{\unicode[STIX]{x2202}h^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)}{\unicode[STIX]{x2202}\tilde{p}_{\Vert }}=\frac{\unicode[STIX]{x2202}h^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)}{\unicode[STIX]{x2202}v_{\Vert }}.\end{eqnarray}$$

In addition, the total time derivative of $\tilde{p}_{\Vert }$ transforms as

(3.32) $$\begin{eqnarray}\frac{\text{d}\tilde{p}_{\Vert }}{\text{d}t}=\frac{\text{d}v_{\Vert }}{\text{d}t}+\frac{q}{m}\left(\frac{\unicode[STIX]{x2202}\langle \unicode[STIX]{x1D6FF}A_{\Vert }(\boldsymbol{R},t)\rangle }{\unicode[STIX]{x2202}t}+\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}A_{\Vert }(\boldsymbol{R},t)\rangle \boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}}{\text{d}t}\right).\end{eqnarray}$$

Furthermore, the coordinate transformation implies various dependencies between quantities in the symplectic and Hamiltonian formulation. First, the phase-space and marker distribution functions are connected in the following way:

(3.33) $$\begin{eqnarray}\displaystyle & \displaystyle f^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)=f^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert }(v_{\Vert }),\tilde{\unicode[STIX]{x1D707}},t)=f^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t), & \displaystyle\end{eqnarray}$$
(3.34) $$\begin{eqnarray}\displaystyle & \displaystyle g^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)=g^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert }(v_{\Vert }),\tilde{\unicode[STIX]{x1D707}},t)=g^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t). & \displaystyle\end{eqnarray}$$

This expresses the fact that the distribution functions are shifted by $-q/m\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ in the $\tilde{p}_{\Vert }$ -coordinate frame compared to the $v_{\Vert }$ -coordinate frame.

For completeness, we note that the Jacobian of phase space, equations (3.6) and (3.56), has the property:

(3.35) $$\begin{eqnarray}B_{\Vert }^{\ast \text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },t)=B_{\Vert }^{\ast \text{h}}(\boldsymbol{R},\tilde{p}_{\Vert }(v_{\Vert }),t)=B_{\Vert }^{\ast \text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t).\end{eqnarray}$$

As a direct consequence of (3.33) and (3.34) the weights are invariant under the coordinate transformation (3.28):

(3.36) $$\begin{eqnarray}w^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})=w^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert }(v_{\Vert }),\tilde{\unicode[STIX]{x1D707}})=w^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}}),\end{eqnarray}$$

with the definitions

(3.37a,b ) $$\begin{eqnarray}w^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})\stackrel{\text{def}}{=}\frac{f^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)}{g^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)}\quad \text{and}\quad w^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}})\stackrel{\text{def}}{=}\frac{f^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)}{g^{\text{s}}(\boldsymbol{R},v_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)},\end{eqnarray}$$

i.e. at any time the phase-space positions of the markers can be shifted from one representation to the other. As the weights are not modified by the transformation, particle number conservation is automatically fulfilled.

#### 3.2.1 Use of the $\unicode[STIX]{x1D6FF}f$ -ansatz for both formulations

Although the coordinates $v_{\Vert }$ and $\tilde{p}_{\Vert }$ are shifted with respect to each other due to the transformation (3.28) the same $\unicode[STIX]{x1D6FF}f$ -ansatz is used for both the symplectic and Hamiltonian formulation:

(3.38) $$\begin{eqnarray}\displaystyle & \displaystyle f^{\text{s}}(v_{\Vert })=f_{0}(v_{\Vert })+\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert }), & \displaystyle\end{eqnarray}$$
(3.39) $$\begin{eqnarray}\displaystyle & \displaystyle f^{\text{h}}(\,\tilde{p}_{\Vert })=f_{0}(\,\tilde{p}_{\Vert })+\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert }). & \displaystyle\end{eqnarray}$$

Hence, the background $f_{0}$ follows the coordinate frame and does not satisfy the relation between  $f^{\text{s}}$ and $f^{\text{h}}$ expressed by (3.33):

(3.40) $$\begin{eqnarray}f_{0}(\,\tilde{p}_{\Vert })=f_{0}\left(v_{\Vert }+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)\neq f_{0}(v_{\Vert }).\end{eqnarray}$$

In other words, the background distribution function  $f_{0}(\,\tilde{p}_{\Vert })$ stays fixed while the distribution function  $f^{\text{h}}(\,\tilde{p}_{\Vert })=f^{\text{s}}(v_{\Vert }(\,\tilde{p}_{\Vert }))$ is shifted in the $\tilde{p}_{\Vert }$ -coordinate frame due to the coordinate transformation (3.28). For $f^{\text{h}}(\,\tilde{p}_{\Vert })$ this means that it becomes less and less aligned with the background  $f_{0}(\,\tilde{p}_{\Vert })$ as the $v_{\Vert }$ - and $\tilde{p}_{\Vert }$ -coordinates move apart (see figure 1). This has to be compensated by $\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })$ and causes an additional contribution to it, which is usually large compared to $\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })$ , so that we have $|\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })|>|\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })|$ . As a result, a severe statistical-error problem occurs in $\unicode[STIX]{x1D6FF}f$ -PIC codes (see § 4.6). In addition, as soon as $\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \neq 0$ there is no simple relation between the perturbations $\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })$ and  $\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })$ such as the one given by (3.33) for the full distribution functions $f^{\text{s}}(v_{\Vert })$ and $f^{\text{h}}(\,\tilde{p}_{\Vert })$ . Therefore, the perturbed weights are not invariant under the coordinate transformation (3.28), i.e.  $\unicode[STIX]{x1D6FF}w^{\text{s}}(v_{\Vert })\neq \unicode[STIX]{x1D6FF}w^{\text{h}}(\,\tilde{p}_{\Vert })$ .
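The growth of the Hamiltonian perturbation can be made concrete with a small numerical sketch (normalised test parameters, 1-D reduced, unshifted Maxwellian background): starting from an unperturbed symplectic state, $\unicode[STIX]{x1D6FF}f^{\text{s}}=0$ , a finite coordinate shift $a=(q/m)\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ produces via (3.44) a non-zero, asymmetric $\unicode[STIX]{x1D6FF}f^{\text{h}}$ whose first moment is the adiabatic current:

```python
import numpy as np

# Even when the symplectic perturbation vanishes (f^s = f_0), a finite
# <delta A_par> yields a non-zero delta f^h in the p_par frame, cf. (3.44).
# Normalised test values; a = (q/m) <delta A_par> is the coordinate shift.
v_th, n0, a = 1.0, 1.0, 0.2

p = np.linspace(-10.0, 10.0, 100001)
dp = p[1] - p[0]

def f0(v):
    # unshifted Maxwellian background (1-D reduced, variance v_th**2)
    return n0 / (np.sqrt(2.0 * np.pi) * v_th) * np.exp(-v**2 / (2.0 * v_th**2))

# delta f^h(p_par) = f^s(v_par(p_par)) - f0(p_par) = f0(p - a) - f0(p)
df_h = f0(p - a) - f0(p)

print(np.max(np.abs(df_h)))          # > 0 although delta f^s = 0
j_adiabatic = np.sum(p * df_h) * dp  # first moment: the 'adiabatic current'
print(j_adiabatic, n0 * a)           # equals n0 * a (with q = 1 here)
```

The first moment of the shifted-minus-unshifted Maxwellian is exactly `n0 * a`, i.e. the adiabatic current grows linearly with the coordinate shift.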

Figure 1. (a) The distribution function  $f^{\text{s}}$ (solid red line), the background distribution function  $f_{0}$ (dashed blue line) and the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{s}}$ , (dotted green line) as a function of  $v_{\Vert }$ normalised to the thermal velocity $v_{\text{th}}$ . (b) The distribution function  $f^{\text{h}}$ (solid red line), the background distribution function  $f_{0}$ (dashed blue line) and the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ , (dotted green line) as a function of $\tilde{p}_{\Vert }$ normalised to the thermal velocity $v_{\text{th}}$ . Due to the coordinate transformation (3.28) the distribution function  $f^{\text{h}}$ is shifted compared to $f^{\text{s}}$ (arrow at maximum). As a consequence, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ becomes asymmetric which leads to a non-physical current in the Hamiltonian formulation, the so-called ‘adiabatic current’.

From (3.33) and (3.40) we can derive the following three representations of the perturbation to the distribution functions, $\unicode[STIX]{x1D6FF}f^{\text{s}}$ and $\unicode[STIX]{x1D6FF}f^{\text{h}}$ :

(3.41) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })=f^{\text{h}}(\,\tilde{p}_{\Vert })-f_{0}(v_{\Vert })=f^{\text{h}}(\,\tilde{p}_{\Vert })-f_{0}\left(\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right), & \displaystyle\end{eqnarray}$$
(3.42) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })=\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })+f_{0}(\,\tilde{p}_{\Vert })-f_{0}\left(\,\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right), & \displaystyle\end{eqnarray}$$
(3.43) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })=\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })-\mathop{\sum }_{n=1}^{\infty }\frac{1}{n!}\left.\frac{\unicode[STIX]{x2202}^{n}f_{0}}{\unicode[STIX]{x2202}v_{\Vert }^{n}}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}\left(-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)^{n}. & \displaystyle\end{eqnarray}$$
(3.44) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })=f^{\text{s}}(v_{\Vert })-f_{0}(\,\tilde{p}_{\Vert })=f^{\text{s}}(v_{\Vert })-f_{0}\left(v_{\Vert }+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right), & \displaystyle\end{eqnarray}$$
(3.45) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })=\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })+f_{0}(v_{\Vert })-f_{0}\left(v_{\Vert }+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right), & \displaystyle\end{eqnarray}$$
(3.46) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })=\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })-\mathop{\sum }_{n=1}^{\infty }\frac{1}{n!}\left.\frac{\unicode[STIX]{x2202}^{n}f_{0}}{\unicode[STIX]{x2202}\tilde{p}_{\Vert }^{n}}\right|_{\tilde{p}_{\Vert }=v_{\Vert }}\left(\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)^{n}. & \displaystyle\end{eqnarray}$$

In particular (3.42) and (3.45) show that the difference between $\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })$ and $\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })$ stems from the misalignment of the two background distribution functions $f_{0}(v_{\Vert })$ and $f_{0}(\,\tilde{p}_{\Vert })$ in the two coordinate frames.

In general, we have three different ways to derive the perturbed symplectic weight  $\unicode[STIX]{x1D6FF}w^{\text{s}}$ from the perturbed Hamiltonian weight $\unicode[STIX]{x1D6FF}w^{\text{h}}$ . The first form is suited for the full- $f$ method and is called ‘direct $\unicode[STIX]{x1D6FF}f$ -method’ (see Allfrey & Hatzky (Reference Allfrey and Hatzky2003) and § 5), the second is suited for an algorithm which uses a separate evolution equation for $\unicode[STIX]{x1D6FF}f$ , and the third is to derive e.g. a linearised transformation equation:

(3.47) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}w^{\text{s}}(v_{\Vert })=w^{\text{h}}(\,\tilde{p}_{\Vert })-\frac{1}{g^{\text{h}}(\,\tilde{p}_{\Vert })}f_{0}\left(\,\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right), & \displaystyle\end{eqnarray}$$
(3.48) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}w^{\text{s}}(v_{\Vert })=\unicode[STIX]{x1D6FF}w^{\text{h}}(\,\tilde{p}_{\Vert })+\frac{1}{g^{\text{h}}(\,\tilde{p}_{\Vert })}\left[f_{0}(\,\tilde{p}_{\Vert })-f_{0}\left(\,\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)\right], & \displaystyle\end{eqnarray}$$
(3.49) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}w^{\text{s},\text{lin}}(v_{\Vert })=\unicode[STIX]{x1D6FF}w^{\text{h},\text{lin}}(\,\tilde{p}_{\Vert })+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \frac{1}{g^{\text{h}}(\,\tilde{p}_{\Vert })}\left.\frac{\unicode[STIX]{x2202}f_{0}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}, & \displaystyle\end{eqnarray}$$

with the definitions

(3.50a,b ) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}w^{\text{s}}(v_{\Vert })\stackrel{\text{def}}{=}\frac{\unicode[STIX]{x1D6FF}f^{\text{s}}(v_{\Vert })}{g^{\text{s}}(v_{\Vert })}\quad \text{and}\quad \unicode[STIX]{x1D6FF}w^{\text{h}}(\,\tilde{p}_{\Vert })\stackrel{\text{def}}{=}\frac{\unicode[STIX]{x1D6FF}f^{\text{h}}(\,\tilde{p}_{\Vert })}{g^{\text{h}}(\,\tilde{p}_{\Vert })}.\end{eqnarray}$$
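A quick numerical consistency check of (3.48) and (3.49) for a single marker (a sketch with assumed, normalised test distributions: a Maxwellian background $f_{0}$ and a broader Maxwellian marker distribution $g^{\text{h}}$ ) confirms that the linearised weight transformation deviates from the exact one only at second order in the coordinate shift $a=(q/m)\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ :

```python
import numpy as np

# Exact weight transformation (3.48) versus its linearisation (3.49).
v_th = 1.0

def f0(v):
    # Maxwellian background (normalised test shape)
    return np.exp(-v**2 / (2.0 * v_th**2)) / (np.sqrt(2.0 * np.pi) * v_th)

def g_h(p):
    # marker loading: a wider Maxwellian (assumed shape)
    return np.exp(-p**2 / (2.0 * (2.0 * v_th)**2)) / (np.sqrt(2.0 * np.pi) * 2.0 * v_th)

def dw_s_exact(dw_h, p, a):
    """Equation (3.48)."""
    return dw_h + (f0(p) - f0(p - a)) / g_h(p)

def dw_s_lin(dw_h, p, a):
    """Equation (3.49): keep only the term of first order in a."""
    df0 = -p / v_th**2 * f0(p)          # d f0 / d v_par at v_par = p
    return dw_h + a * df0 / g_h(p)

p, dw_h = 0.7, 0.05
errs = []
for a in (1e-1, 1e-2, 1e-3):
    err = abs(dw_s_exact(dw_h, p, a) - dw_s_lin(dw_h, p, a))
    errs.append(err)
    print(a, err)   # the error shrinks like a**2, as expected
```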

### 3.3 The model equations in the Hamiltonian formulation

In § 3.1 we introduced the first-order gyrokinetic model in the symplectic formulation. Why do we need the same model equations in the Hamiltonian formulation as well? The answer is purely numerical: the advantage of the Hamiltonian formulation over the symplectic one is that the partial time derivative of  $\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ vanishes in the parallel dynamics (compare (3.4) and (3.54)). This is a result of the transformation of the total time derivative of  $v_{\Vert }$ (see (3.32)): the second term on the right-hand side of (3.32) cancels with the partial time derivative of $\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ – the first term on the right-hand side of (3.4) – which is part of the parallel dynamics in the symplectic formulation. This is a major advantage, as on the numerical level such a partial time derivative makes it very difficult to integrate the coupled set of equations of motion and potential equations in time. Therefore, we will use the Hamiltonian formulation to push the markers along the characteristics while using the symplectic moments in the potential equations.

The equations of motion in the Hamiltonian formulation can be derived from the symplectic ones by using the transformations described in (3.28)–(3.32). Alternatively, one can transform the symplectic Lagrangian from $v_{\Vert }$ - to $\tilde{p}_{\Vert }$ -coordinates. The equivalent Lagrangian is of second order in $\unicode[STIX]{x1D6FF}A_{\Vert }$ . It can be used to derive the equations of motion in the Hamiltonian formulation (Kleiber et al. Reference Kleiber, Hatzky, Könies, Mishchenko and Sonnendrücker2016).

The gyrokinetic equations in the Hamiltonian formulation ( $p_{\Vert }$ -formulation) are used to calculate the time evolution of the gyro-centre distribution function $f^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)$ of each species:

(3.51) $$\begin{eqnarray}\frac{\text{d}f^{\text{h}}}{\text{d}t}=\frac{\unicode[STIX]{x2202}f^{\text{h}}}{\unicode[STIX]{x2202}t}+\unicode[STIX]{x1D735}f^{\text{h}}\boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}}{\text{d}t}+\frac{\unicode[STIX]{x2202}f^{\text{h}}}{\unicode[STIX]{x2202}\tilde{p}_{\Vert }}\frac{\text{d}\tilde{p}_{\Vert }}{\text{d}t}+\frac{\unicode[STIX]{x2202}f^{\text{h}}}{\unicode[STIX]{x2202}\tilde{\unicode[STIX]{x1D707}}}\frac{\text{d}\tilde{\unicode[STIX]{x1D707}}}{\text{d}t}=0,\end{eqnarray}$$

where $\tilde{p}_{\Vert }$ is the parallel momentum per unit mass.

The marker distribution  $g^{\text{h}}$ has to evolve in the same way as the phase-space distribution function  $f^{\text{h}}$ :

(3.52) $$\begin{eqnarray}\frac{\text{d}g^{\text{h}}}{\text{d}t}=\frac{\unicode[STIX]{x2202}g^{\text{h}}}{\unicode[STIX]{x2202}t}+\unicode[STIX]{x1D735}g^{\text{h}}\boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}}{\text{d}t}+\frac{\unicode[STIX]{x2202}g^{\text{h}}}{\unicode[STIX]{x2202}\tilde{p}_{\Vert }}\frac{\text{d}\tilde{p}_{\Vert }}{\text{d}t}+\frac{\unicode[STIX]{x2202}g^{\text{h}}}{\unicode[STIX]{x2202}\tilde{\unicode[STIX]{x1D707}}}\frac{\text{d}\tilde{\unicode[STIX]{x1D707}}}{\text{d}t}=0.\end{eqnarray}$$

The equations of motion for the gyro-centre trajectories in reduced phase space $(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})$ are

(3.53) $$\begin{eqnarray}\displaystyle \frac{\text{d}\boldsymbol{R}}{\text{d}t} & = & \displaystyle \left(\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)\boldsymbol{b}+\frac{1}{B_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D735}\left(\langle \unicode[STIX]{x1D713}_{\text{eff}}\rangle +\frac{q}{2m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle ^{2}\right)\nonumber\\ \displaystyle & & \displaystyle +\,\frac{m}{q}\left[\frac{\tilde{\unicode[STIX]{x1D707}}B+\tilde{p}_{\Vert }^{2}}{BB_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D735}B+\frac{\tilde{p}_{\Vert }^{2}}{BB_{\Vert }^{\ast \text{h}}}(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }\right]-\frac{\tilde{p}_{\Vert }}{B_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D73F}\,\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle ,\end{eqnarray}$$
(3.54) $$\begin{eqnarray}\displaystyle \frac{\text{d}\tilde{p}_{\Vert }}{\text{d}t} & = & \displaystyle -\frac{q}{m}\boldsymbol{b}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\left(\langle \unicode[STIX]{x1D713}_{\text{eff}}\rangle +\frac{q}{2m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle ^{2}\right)-\frac{q}{m}\frac{\tilde{p}_{\Vert }}{B_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D73F}\boldsymbol{\cdot }\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \nonumber\\ \displaystyle & & \displaystyle -\,\frac{\tilde{p}_{\Vert }}{B_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D73F}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D713}_{\text{eff}}\rangle -\tilde{\unicode[STIX]{x1D707}}\,\unicode[STIX]{x1D735}B\boldsymbol{\cdot }\left[\boldsymbol{b}+\frac{m}{q}\frac{\tilde{p}_{\Vert }}{BB_{\Vert }^{\ast \text{h}}}(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }\right],\end{eqnarray}$$
(3.55) $$\begin{eqnarray}\frac{\text{d}\tilde{\unicode[STIX]{x1D707}}}{\text{d}t}=0,\end{eqnarray}$$

where we have $\boldsymbol{B}^{\ast \text{h}}=\unicode[STIX]{x1D735}\times \boldsymbol{A}^{\ast \text{h}}$ and $\boldsymbol{A}^{\ast \text{h}}=\boldsymbol{A}_{0}+m\tilde{p}_{\Vert }/q\boldsymbol{b}$ . The Jacobian of phase space  ${\mathcal{J}}_{z}^{\text{h}}$ is given in the Hamiltonian formulation by

(3.56) $$\begin{eqnarray}{\mathcal{J}}_{z}^{\text{h}}=B_{\Vert }^{\ast \text{h}}(\boldsymbol{R},\tilde{p}_{\Vert })=\boldsymbol{b}\boldsymbol{\cdot }\boldsymbol{B}^{\ast \text{h}}=B+\frac{m}{q}\tilde{p}_{\Vert }\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b}).\end{eqnarray}$$

The gyro-averaged effective potential is defined as

(3.57) $$\begin{eqnarray}\langle \unicode[STIX]{x1D713}_{\text{eff}}\rangle \stackrel{\text{def}}{=}\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}-\tilde{p}_{\Vert }\unicode[STIX]{x1D6FF}A_{\Vert }\rangle =\langle \unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\rangle -\tilde{p}_{\Vert }\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle .\end{eqnarray}$$

The particle and current number density are defined as

(3.58) $$\begin{eqnarray}\displaystyle & \displaystyle n_{s}^{\text{h}}(\boldsymbol{x})\stackrel{\text{def}}{=}\int f_{s}^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}, & \displaystyle\end{eqnarray}$$
(3.59) $$\begin{eqnarray}\displaystyle & \displaystyle j_{\Vert s}^{\text{h}}(\boldsymbol{x})\stackrel{\text{def}}{=}q_{s}\int \tilde{p}_{\Vert }\,f_{s}^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}. & \displaystyle\end{eqnarray}$$

The integration is performed over phase space ${\mathcal{J}}_{z}^{\text{h}}\,\text{d}^{6}z=B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}$ . The gyro-averaged particle and current number density are defined as

(3.60) $$\begin{eqnarray}\displaystyle & \displaystyle \bar{n}_{s}^{\text{h}}(\boldsymbol{x})\stackrel{\text{def}}{=}\int f_{s}^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}, & \displaystyle\end{eqnarray}$$
(3.61) $$\begin{eqnarray}\displaystyle & \displaystyle \bar{j}_{\Vert s}^{\text{h}}(\boldsymbol{x})\stackrel{\text{def}}{=}q_{s}\int \tilde{p}_{\Vert }\,f_{s}^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}. & \displaystyle\end{eqnarray}$$

The same definitions are applied to the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f$ , which leads to the quantities: $\unicode[STIX]{x1D6FF}n^{\text{h}}$ , $\unicode[STIX]{x1D6FF}j_{\Vert }^{\text{h}}$ and $\unicode[STIX]{x1D6FF}\bar{n}^{\text{h}}$ , $\unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{h}}$ . The perturbed magnetic potential averaged over the gyro-phase and over the distribution function is defined as

(3.62) $$\begin{eqnarray}\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }_{s}^{\text{h}}\stackrel{\text{def}}{=}\frac{1}{n_{s}^{\text{h}}}\int f_{s}^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})\,\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle (\mathbf{R},\tilde{\unicode[STIX]{x1D707}})\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}.\end{eqnarray}$$

As in the symplectic case, a long-wavelength approximation is applied to the quasi-neutrality condition. Thus, the quasi-neutrality condition and Ampère’s law take the following form in the Hamiltonian formulation:

(3.63) $$\begin{eqnarray}\displaystyle & \displaystyle \underbrace{-\unicode[STIX]{x1D735}\boldsymbol{\cdot }\left(\frac{q_{\text{i}}^{2}n_{0\text{i}}}{k_{\text{B}}T_{\text{i}}}\unicode[STIX]{x1D70C}_{0\text{i}}^{2}\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\right)}_{\text{ion polarisation density}}=q_{\text{i}}\unicode[STIX]{x1D6FF}\bar{n}_{\text{i}}^{\text{h}}+q_{\text{e}}\unicode[STIX]{x1D6FF}n_{\text{e}}^{\text{h}}, & \displaystyle\end{eqnarray}$$
(3.64) $$\begin{eqnarray}\displaystyle & \displaystyle -\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}A_{\Vert }+\underbrace{\frac{\unicode[STIX]{x1D6FD}_{\text{i}}^{\text{h}}}{\unicode[STIX]{x1D70C}_{0\text{i}}^{2}}\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }_{\text{i}}^{\text{h}}+\frac{\unicode[STIX]{x1D6FD}_{\text{e}}^{\text{h}}}{\unicode[STIX]{x1D70C}_{0\text{e}}^{2}}\unicode[STIX]{x1D6FF}A_{\Vert }}_{\text{skin terms}}=\unicode[STIX]{x1D707}_{0}\left(\unicode[STIX]{x1D6FF}\bar{j}_{\Vert \text{i}}^{\text{h}}+\unicode[STIX]{x1D6FF}j_{\Vert \text{e}}^{\text{h}}\right), & \displaystyle\end{eqnarray}$$

where the beta parameter is $\unicode[STIX]{x1D6FD}_{s}^{\text{h}}=\unicode[STIX]{x1D707}_{0}n_{s}^{\text{h}}k_{\text{B}}T_{s}/B^{2}$ (note that it is $\unicode[STIX]{x1D6FD}_{0s}=\unicode[STIX]{x1D707}_{0}n_{0s}k_{\text{B}}T_{s}/B^{2}$ which depends only on the background density $n_{0s}$ ). The skin terms in Ampère’s law are a result of the Hamiltonian formulation (see § 4.2).

#### 3.3.1 The long-wavelength approximation of the ion skin term

In our model equations we have neglected the gyroradius effects of the electrons which simplifies the electron skin term significantly. However, for the ions we have to take the gyro-averaging into account. In principle, the discretisation of the gyro-averaging of $\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }_{\text{i}}^{\text{h}}$ as part of the ion skin term has to match the discretisation of the gyro-averaging of the perturbed current density $\unicode[STIX]{x1D6FF}\bar{j}_{\Vert \text{i}}^{\text{h}}$ exactly. Otherwise there will be no exact cancellation of the ion skin term and the gyro-averaged adiabatic current term of the ions (see § 4.7). Nevertheless, there are many situations where the long-wavelength approximation of the ion skin term is appropriate (see e.g. Tronko, Bottino & Sonnendrücker Reference Tronko, Bottino and Sonnendrücker2016):

(3.65) $$\begin{eqnarray}\frac{\unicode[STIX]{x1D6FD}_{\text{i}}^{\text{h}}}{\unicode[STIX]{x1D70C}_{0\text{i}}^{2}}\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }_{\text{i}}^{\text{h}}\simeq \unicode[STIX]{x1D735}\boldsymbol{\cdot }(\unicode[STIX]{x1D6FD}_{\text{i}}^{\text{h}}\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}A_{\Vert })+\frac{\unicode[STIX]{x1D6FD}_{\text{i}}^{\text{h}}}{\unicode[STIX]{x1D70C}_{0\text{i}}^{2}}\unicode[STIX]{x1D6FF}A_{\Vert }.\end{eqnarray}$$

In particular, the approximation is very accurate for small $k_{\bot }\unicode[STIX]{x1D70C}_{\text{i}}$ , which is precisely where the cancellation problem is most pronounced. Some caution is required, however: owing to the gyro-averaged ion skin term, Ampère’s law (3.64) is an integro-differential equation, whereas after the long-wavelength approximation it becomes a second-order differential equation. This also simplifies the treatment of the radial boundary condition (see also Dominski et al. Reference Dominski, McMillan, Brunner, Merlo, Tran and Villard2017). To avoid problems with the skin term we favour solving Ampère’s law in the symplectic formulation, equation (3.27). Details are discussed for the linear case in § 8.1.3 and for the nonlinear case in § 8.2.1.
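The quality of the long-wavelength approximation (3.65) can be illustrated per Fourier mode. For a Maxwellian background, the double gyro-average acts in Fourier space as multiplication by $\unicode[STIX]{x1D6E4}_{0}(b)=\text{I}_{0}(b)\text{e}^{-b}$ with $b=(k_{\bot }\unicode[STIX]{x1D70C}_{\text{i}})^{2}$ ; this Fourier representation is a standard gyrokinetic result and is used here as an assumption, since it is not derived in this section. For uniform $\unicode[STIX]{x1D6FD}_{\text{i}}^{\text{h}}$ , (3.65) then corresponds to $\unicode[STIX]{x1D6E4}_{0}(b)\simeq 1-b$ . A minimal numerical check:

```python
import numpy as np
from scipy.special import ive  # ive(0, b) = I_0(b) * exp(-b)

def gamma0(b):
    """Double gyro-average factor for a Maxwellian, Gamma_0(b) = I_0(b) e^{-b}."""
    return ive(0, b)

def long_wavelength(b):
    """Fourier-space form of the long-wavelength approximation: Gamma_0(b) ~ 1 - b."""
    return 1.0 - b

b = np.array([1e-3, 1e-2, 1e-1, 1.0])  # b = (k_perp rho_i)^2
rel_err = np.abs(gamma0(b) - long_wavelength(b)) / gamma0(b)
```

The relative error scales as $b^{2}$ for small $b$ , while at $b\sim 1$ the two expressions differ at order unity, consistent with restricting the approximation to small $k_{\bot }\unicode[STIX]{x1D70C}_{\text{i}}$ .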

#### 3.3.2 The linear case

For linearised PIC simulations an extra evolution equation for $\unicode[STIX]{x1D6FF}f^{\text{h}}$ is derived by inserting the $\unicode[STIX]{x1D6FF}f$ -ansatz

(3.66) $$\begin{eqnarray}f^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)=f_{0}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}})+\unicode[STIX]{x1D6FF}f^{\text{h}}(\boldsymbol{R},\tilde{p}_{\Vert },\tilde{\unicode[STIX]{x1D707}},t)\end{eqnarray}$$

into the evolution equation (3.51) of the gyro-centre distribution function. As part of the linearisation we assume $\unicode[STIX]{x1D6FF}f^{\text{h},\text{lin}}$ to evolve along the unperturbed gyro-centre trajectories of each species, i.e. by

(3.67) $$\begin{eqnarray}\left.\frac{\text{d}}{\text{d}t}\right|_{0}\unicode[STIX]{x1D6FF}f^{\text{h},\text{lin}}=-\unicode[STIX]{x1D735}f_{0}\boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}^{(1)}}{\text{d}t}-\frac{\unicode[STIX]{x2202}f_{0}}{\unicode[STIX]{x2202}\tilde{p}_{\Vert }}\frac{\text{d}\tilde{p}_{\Vert }^{(1)}}{\text{d}t}.\end{eqnarray}$$

The operator $\left.\text{d}/\text{d}t\right|_{0}$ denotes the derivative along the unperturbed orbits, $\text{d}\boldsymbol{R}^{(1)}/\text{d}t$ and $\text{d}\tilde{p}_{\Vert }^{(1)}/\text{d}t$ contain only those terms of the time derivative which depend on $\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}$ and $\unicode[STIX]{x1D6FF}A_{\Vert }$ .

Although $f_{0}$ might not be an exact solution of the unperturbed, collisionless Vlasov equation $\text{d}f_{0}/\text{d}t|_{0}=0$ , we assume here that it is a kinetic equilibrium if a collision operator ${\mathcal{C}}[f_{0}]$ on the right-hand side of (3.51) is taken into account. Thus, on the right-hand side of (3.67) the term $-\text{d}f_{0}/\text{d}t|_{0}$ cancels with the collision operator ${\mathcal{C}}[f_{0}]$ (see (3.9)).

Consistently, we choose the evolution equation for $g^{\text{h}}$ in such a way that the markers follow the unperturbed trajectories

(3.68) $$\begin{eqnarray}\frac{\text{d}g^{\text{h}}}{\text{d}t}=\frac{\unicode[STIX]{x2202}g^{\text{h}}}{\unicode[STIX]{x2202}t}+\unicode[STIX]{x1D735}g^{\text{h}}\boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}^{(0)}}{\text{d}t}+\frac{\unicode[STIX]{x2202}g^{\text{h}}}{\unicode[STIX]{x2202}\tilde{p}_{\Vert }}\frac{\text{d}\tilde{p}_{\Vert }^{(0)}}{\text{d}t}=0.\end{eqnarray}$$

The equations of motion, equations (3.53) and (3.54), have been split into an unperturbed and a perturbed part:

(3.69) $$\begin{eqnarray}\displaystyle & \displaystyle \frac{\text{d}\boldsymbol{R}^{(0)}}{\text{d}t}=\tilde{p}_{\Vert }\boldsymbol{b}+\frac{m}{q}\left[\frac{\tilde{\unicode[STIX]{x1D707}}B+\tilde{p}_{\Vert }^{2}}{BB_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D735}B+\frac{\tilde{p}_{\Vert }^{2}}{BB_{\Vert }^{\ast \text{h}}}(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }\right], & \displaystyle\end{eqnarray}$$
(3.70) $$\begin{eqnarray}\displaystyle & \displaystyle \frac{\text{d}v_{\Vert }^{(0)}}{\text{d}t}=\frac{\text{d}\tilde{p}_{\Vert }^{(0)}}{\text{d}t}=-\tilde{\unicode[STIX]{x1D707}}\unicode[STIX]{x1D735}B\boldsymbol{\cdot }\left[\boldsymbol{b}+\frac{m}{q}\frac{\tilde{p}_{\Vert }}{BB_{\Vert }^{\ast \text{h}}}(\unicode[STIX]{x1D735}\times \boldsymbol{B})_{\bot }\right], & \displaystyle\end{eqnarray}$$
(3.71) $$\begin{eqnarray}\displaystyle & \displaystyle \frac{\text{d}\boldsymbol{R}^{(1)}}{\text{d}t}=-\frac{q}{m}\left(\boldsymbol{b}+\frac{m}{q}\frac{\tilde{p}_{\Vert }}{B_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D73F}\right)\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle +\frac{1}{B_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D713}_{\text{eff}}\rangle , & \displaystyle\end{eqnarray}$$
(3.72) $$\begin{eqnarray}\displaystyle & \displaystyle \frac{\text{d}\tilde{p}_{\Vert }^{(1)}}{\text{d}t}=-\frac{q}{m}\left(\boldsymbol{b}+\frac{m}{q}\frac{\tilde{p}_{\Vert }}{B_{\Vert }^{\ast \text{h}}}\boldsymbol{b}\times \unicode[STIX]{x1D73F}\right)\boldsymbol{\cdot }\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D713}_{\text{eff}}\rangle . & \displaystyle\end{eqnarray}$$

If we choose the background to be a shifted Maxwellian (3.10), i.e.  $f_{0}=f_{\text{M}}$ , we can derive:

(3.73) $$\begin{eqnarray}\displaystyle \frac{\unicode[STIX]{x1D735}f_{\text{M}}}{f_{\text{M}}} & = & \displaystyle \frac{1}{f_{\text{M}}}\left(\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}\unicode[STIX]{x1D6F9}}\unicode[STIX]{x1D735}\unicode[STIX]{x1D6F9}+\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}\tilde{E}}\unicode[STIX]{x1D735}\tilde{E}\right)\nonumber\\ \displaystyle & = & \displaystyle \left[\left(\frac{\tilde{E}}{v_{\text{th}}^{2}}-\frac{3}{2}\right)\frac{1}{T}\frac{\text{d}T}{\text{d}\unicode[STIX]{x1D6F9}}+\frac{1}{n_{0}}\frac{\text{d}n_{0}}{\text{d}\unicode[STIX]{x1D6F9}}+\frac{\tilde{p}_{\Vert }-\hat{u} _{0}}{v_{\text{th}}^{2}}\frac{\text{d}\hat{u} _{0}}{\text{d}\unicode[STIX]{x1D6F9}}\right]\unicode[STIX]{x1D735}\unicode[STIX]{x1D6F9}-\frac{\tilde{\unicode[STIX]{x1D707}}}{v_{\text{th}}^{2}}\unicode[STIX]{x1D735}B,\end{eqnarray}$$
(3.74) $$\begin{eqnarray}\frac{1}{f_{\text{M}}}\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}\tilde{p}_{\Vert }}=-\frac{\tilde{p}_{\Vert }-\hat{u} _{0}}{v_{\text{th}}^{2}}.\end{eqnarray}$$

By integrating (3.67) in time we can evolve the perturbed weights $\unicode[STIX]{x1D6FF}w^{\text{h},\text{lin}}$ which are used for the charge and current assignment. The set of potential equations used for the linearised PIC simulation is identical with (3.63) and (3.64) except that we use the linearised skin terms in Ampère’s law. This is a consequence of the linearised first moment which neglects the nonlinear skin term (see § 4.5).
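As an illustration of how (3.67) together with (3.74) translates into a marker-weight update, the following sketch evolves the perturbed weights in a strongly reduced one-dimensional setting: only the perturbed parallel-acceleration term of (3.67) is kept, all geometry is dropped, the marker distribution $g$ is chosen equal to $f_{\text{M}}$ , and the acceleration `dp1_dt` is a hypothetical stand-in for $-(q/m)\,\boldsymbol{b}\boldsymbol{\cdot }\unicode[STIX]{x1D735}\langle \unicode[STIX]{x1D713}_{\text{eff}}\rangle$ . It is a sketch, not the paper’s implementation:

```python
import numpy as np

# Illustrative slab parameters (not from the text)
vth, u0, dt, nsteps = 1.0, 0.2, 1e-3, 100

rng = np.random.default_rng(0)
p = rng.normal(u0, vth, size=10_000)   # markers loaded from the shifted Maxwellian
w = np.zeros_like(p)                   # perturbed weights, zero at t = 0
fM_over_g = np.ones_like(p)            # f_M/g = 1 for the choice g = f_M

def dp1_dt(p, t):
    """Hypothetical perturbed parallel acceleration (stand-in field)."""
    return 1e-3 * np.sin(t)

t = 0.0
for _ in range(nsteps):
    # (3.67) reduced to its parallel-acceleration term, with (3.74) for df_M/dp:
    # d(dw)/dt = (f_M/g) * (p - u0)/vth^2 * dp1/dt
    w += dt * fM_over_g * (p - u0) / vth**2 * dp1_dt(p, t)
    t += dt
```

An explicit Euler step is used only for brevity; in practice a higher-order integrator would be employed, and the weights would subsequently enter the charge and current assignment.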

## 4 The moments of the distribution function

### 4.1 Definition of the moments

Usually, the moments of the distribution function are defined as infinite integrals over velocity space. In the symplectic and Hamiltonian formulation, we define the $k$ th moment by

(4.1) $$\begin{eqnarray}\displaystyle & \displaystyle {\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}]\stackrel{\text{def}}{=}\int _{0}^{2\unicode[STIX]{x03C0}}\int _{0}^{\infty }\int _{-\infty }^{\infty }v_{\Vert }^{k}\,f^{\text{s}}B_{\Vert }^{\ast \text{s}}\,\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}=\int v_{\Vert }^{k}\,f^{\text{s}}B_{\Vert }^{\ast \text{s}}\,\text{d}^{3}v, & \displaystyle\end{eqnarray}$$
(4.2) $$\begin{eqnarray}\displaystyle & \displaystyle {\mathcal{M}}_{k}^{\text{h}}[f^{\text{h}}]\stackrel{\text{def}}{=}\int _{0}^{2\unicode[STIX]{x03C0}}\int _{0}^{\infty }\int _{-\infty }^{\infty }\tilde{p}_{\Vert }^{k}\,f^{\text{h}}B_{\Vert }^{\ast \text{h}}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}=\int \tilde{p}_{\Vert }^{k}\,f^{\text{h}}B_{\Vert }^{\ast \text{h}}\,\text{d}^{3}p, & \displaystyle\end{eqnarray}$$

where we denote $\text{d}^{3}v=\text{d}v_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}$ and $\text{d}^{3}p=\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}$ .

The zeroth moment, i.e.  $k=0$ , is identical with the particle number density and the first moment, i.e.  $k=1$ , is the current number density divided by $q$ .

### 4.2 Transformation of the moments of the distribution function

The moments  ${\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}]$ have to be evaluated for the potential equations in the symplectic formulation. However, when we do a PIC simulation in the Hamiltonian formulation we only have  $f^{\text{h}}(\,\tilde{p}_{\Vert })$ at the marker positions at our disposal. Hence, our aim is to compose ${\mathcal{M}}_{k}^{\text{s}}$ out of ${\mathcal{M}}_{k}^{\text{h}}$ . Unfortunately, $f^{\text{s}}(v_{\Vert })$ is a function of  $v_{\Vert }$ and not of $\tilde{p}_{\Vert }$ . We will need to transform $f^{\text{s}}(v_{\Vert })$ to a form that can be evaluated directly at the $\tilde{p}_{\Vert }$ -position. For that purpose we will use the inverse of the so-called ‘pull-back transformation’ ${\mathcal{T}}^{\ast -1}$ of $f^{\text{s}}(v_{\Vert })$ to define ${\mathcal{T}}^{\ast -1}[f^{\text{s}}](\,\tilde{p}_{\Vert })$ . The pull-back transformation for a function  ${\hat{h}}^{\text{h}}$ is defined by the coordinate transformation (3.28) as

(4.3) $$\begin{eqnarray}{\mathcal{T}}^{\ast }[{\hat{h}}^{\text{h}}](v_{\Vert })\stackrel{\text{def}}{=}{\hat{h}}^{\text{h}}(\,\tilde{p}_{\Vert }(v_{\Vert }))\end{eqnarray}$$

and its inverse for a function  ${\hat{h}}^{\text{s}}$ is

(4.4) $$\begin{eqnarray}{\mathcal{T}}^{\ast -1}[{\hat{h}}^{\text{s}}](\,\tilde{p}_{\Vert })\stackrel{\text{def}}{=}{\hat{h}}^{\text{s}}(v_{\Vert }(\,\tilde{p}_{\Vert })).\end{eqnarray}$$

If ${\hat{h}}^{\text{s}}(v_{\Vert })$ has a finite domain, i.e.  $v_{\Vert }\in [v_{\Vert \text{min}},v_{\Vert \text{max}}]$ , ${\hat{h}}^{\text{h}}(\,\tilde{p}_{\Vert })$ has to be defined consistently in the interval $\tilde{p}_{\Vert }\in [\tilde{p}_{\Vert }(v_{\Vert \text{min}}),\tilde{p}_{\Vert }(v_{\Vert \text{max}})]$ . In our PIC simulation we have to discretise the integrals of the moments with a Monte Carlo discretisation using markers distributed according to the marker distribution  $g^{\text{h}}$ . Hence, the weights are located at $\tilde{p}_{\Vert }$ -positions and should cover the correct domain.

With the definition of the pull-back transformation we can rewrite (3.33) in the following way:

(4.5a,b ) $$\begin{eqnarray}f^{\text{s}}(v_{\Vert })={\mathcal{T}}^{\ast }[f^{\text{h}}](v_{\Vert })\quad \text{and}\quad f^{\text{h}}(\,\tilde{p}_{\Vert })={\mathcal{T}}^{\ast -1}[f^{\text{s}}](\,\tilde{p}_{\Vert }).\end{eqnarray}$$

The same holds for $g^{\text{s}}$ , $g^{\text{h}}$ , see (3.34), and $B_{\Vert }^{\ast \text{s}}$ , $B_{\Vert }^{\ast \text{h}}$ , see  (3.35).

In general, given a mapping $T$ : ${\mathcal{V}}\,\rightarrow \,{\mathcal{P}}$  between spaces ${\mathcal{V}},{\mathcal{P}}$ with their corresponding Jacobians ${\mathcal{J}}_{{\mathcal{V}}},{\mathcal{J}}_{{\mathcal{P}}}$ , the pull-back ${\mathcal{T}}^{\ast }$ of a function  ${\hat{h}}(p)$ is defined as ${\mathcal{T}}^{\ast }[{\hat{h}}](v)={\hat{h}}(p(v))$ with $p\in {\mathcal{P}},v\in {\mathcal{V}}$ . For the integral over a pull-back one has the relations (see e.g. Frankel Reference Frankel2012):

(4.6) $$\begin{eqnarray}\displaystyle & \displaystyle \int _{T^{-1}(\unicode[STIX]{x1D70E})}{\mathcal{T}}^{\ast }[{\hat{h}}_{{\mathcal{P}}}J_{{\mathcal{P}}}]\,\text{d}v=\int _{\unicode[STIX]{x1D70E}}{\hat{h}}_{{\mathcal{P}}}J_{{\mathcal{P}}}\,\text{d}p, & \displaystyle\end{eqnarray}$$
(4.7) $$\begin{eqnarray}\displaystyle & \displaystyle \int _{T(\tilde{\unicode[STIX]{x1D70E}})}{\mathcal{T}}^{\ast -1}[{\hat{h}}_{{\mathcal{V}}}J_{{\mathcal{V}}}]\,\text{d}p=\int _{\tilde{\unicode[STIX]{x1D70E}}}{\hat{h}}_{{\mathcal{V}}}J_{{\mathcal{V}}}\,\text{d}v, & \displaystyle\end{eqnarray}$$

where $\unicode[STIX]{x1D70E}$ is a subset of ${\mathcal{P}}$ and $\tilde{\unicode[STIX]{x1D70E}}$ a subset of ${\mathcal{V}}$ (integration domain) and ${\mathcal{T}}^{\ast }$ has been assumed to be invertible. Note that (4.6) and (4.7), which are an abstract way of performing the integration by substitution, include the integration boundaries.
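For our one-dimensional coordinate transformation (3.28), the content of (4.6) and (4.7) is simply integration by substitution with unit Jacobian, $\text{d}v_{\Vert }/\text{d}\tilde{p}_{\Vert }=1$ , including the transformed integration boundaries. A minimal numerical sketch, where the constant shift `a` stands in for $(q/m)\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ (which in the model depends on $\boldsymbol{R}$ and $t$ ) and the integrand `h` is arbitrary:

```python
import numpy as np
from scipy.integrate import quad

a = 0.3                       # stand-in for (q/m)<dA_par>, constant for this check
def T(v):    return v + a     # coordinate map p(v), cf. (3.28); dv/dp = 1
def Tinv(p): return p - a     # inverse map v(p)

def h(v):                     # arbitrary integrand playing the role of h^s = f^s B*_s
    return np.exp(-v**2) * (1.0 + 0.1 * v)

vmin, vmax = -2.0, 2.0
lhs, _ = quad(h, vmin, vmax)                           # integral in v-space
rhs, _ = quad(lambda p: h(Tinv(p)), T(vmin), T(vmax))  # pulled back to p-space, (4.7)
```

The two integrals agree to quadrature accuracy, which is exactly the statement of (4.7) with the shifted boundaries.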

Using (3.33) and (3.35) together with integration by substitution, we get a relation between the moments ${\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}]$ and the moments ${\mathcal{M}}_{k}^{\text{h}}[f^{\text{h}}]$ :

(4.8) $$\begin{eqnarray}\displaystyle {\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}] & = & \displaystyle \int v_{\Vert }^{k}\,f^{\text{s}}B_{\Vert }^{\ast \text{s}}\,\text{d}^{3}v\nonumber\\ \displaystyle & = & \displaystyle \int _{\tilde{p}_{\Vert }(-\infty )}^{\tilde{p}_{\Vert }(\infty )}v_{\Vert }^{k}(\,\tilde{p}_{\Vert })\,h^{\text{s}}(v_{\Vert }(\,\tilde{p}_{\Vert }))\,\frac{\text{d}v_{\Vert }}{\text{d}\tilde{p}_{\Vert }}\,\text{d}^{3}p\quad \text{with }\frac{\text{d}v_{\Vert }}{\text{d}\tilde{p}_{\Vert }}=1\nonumber\\ \displaystyle & = & \displaystyle \int \left(\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)^{k}\,h^{\text{h}}(\,\tilde{p}_{\Vert })\,\text{d}^{3}p=\int \left(\,\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)^{k}f^{\text{h}}B_{\Vert }^{\ast \text{h}}\,\text{d}^{3}p,\end{eqnarray}$$

where

(4.9a,b ) $$\begin{eqnarray}h^{\text{s}}(v_{\Vert })\stackrel{\text{def}}{=}f^{\text{s}}(v_{\Vert })B_{\Vert }^{\ast \text{s}}(v_{\Vert })\quad \text{and}\quad h^{\text{h}}(\,\tilde{p}_{\Vert })\stackrel{\text{def}}{=}f^{\text{h}}(\,\tilde{p}_{\Vert })B_{\Vert }^{\ast \text{h}}(\,\tilde{p}_{\Vert }).\end{eqnarray}$$

Alternatively, we can perform the same calculation by using the inverse pull-back transformation, equation (4.7):

(4.10) $$\begin{eqnarray}\displaystyle {\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}] & = & \displaystyle \int _{\tilde{\unicode[STIX]{x1D70E}}}v_{\Vert }^{k}\,f^{\text{s}}(v_{\Vert })B_{\Vert }^{\ast \text{s}}\,\text{d}^{3}v=\int _{T(\tilde{\unicode[STIX]{x1D70E}})}{\mathcal{T}}^{\ast -1}[v_{\Vert }^{k}\,f^{\text{s}}B_{\Vert }^{\ast \text{s}}]\,\text{d}^{3}p\nonumber\\ \displaystyle & = & \displaystyle \int _{\unicode[STIX]{x1D70E}}\left(\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)^{k}f^{\text{h}}B_{\Vert }^{\ast \text{h}}\,\text{d}^{3}p.\end{eqnarray}$$

Using (4.8), we get the particle number density (zeroth moment) and the current number density (first moment) of the distribution function as moments in the Hamiltonian formulation (neglecting gyro-averaging to keep the derivation simple):

(4.11) $$\begin{eqnarray}\displaystyle & \displaystyle n^{\text{s}}={\mathcal{M}}_{0}^{\text{s}}[f^{\text{s}}]={\mathcal{M}}_{0}^{\text{h}}[f^{\text{h}}]=n^{\text{h}}, & \displaystyle\end{eqnarray}$$
(4.12) $$\begin{eqnarray}\displaystyle & \displaystyle j_{\Vert }^{\text{s}}=q{\mathcal{M}}_{1}^{\text{s}}[f^{\text{s}}]=q{\mathcal{M}}_{1}^{\text{h}}[f^{\text{h}}]-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }{\mathcal{M}}_{0}^{\text{h}}[f^{\text{h}}]=j_{\Vert }^{\text{h}}-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }n^{\text{h}}. & \displaystyle\end{eqnarray}$$

Note that we were able to derive this result only by using $h^{\text{s}}(v_{\Vert })=h^{\text{h}}(\,\tilde{p}_{\Vert })$ in (4.8).
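The relations (4.11) and (4.12) carry over directly to the Monte Carlo level: since $v_{\Vert }=\tilde{p}_{\Vert }-(q/m)\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle$ holds for every marker individually, the symplectic first moment composed from the Hamiltonian moments satisfies (4.12) exactly, not just on average. A sketch with illustrative values (`q_over_m`, `dA` and the weight distribution are arbitrary and not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
q_over_m, dA = 0.7, 0.05                  # illustrative (q/m) and dA_par
p = rng.normal(0.0, 1.0, size=1_000)      # marker p_par positions
w = rng.uniform(0.5, 1.5, size=1_000)     # marker weights (f^h B*_h / g at the markers)

M0_h = np.sum(w)                          # Monte Carlo estimate of M_0^h
M1_h = np.sum(w * p)                      # Monte Carlo estimate of M_1^h
M1_s = np.sum(w * (p - q_over_m * dA))    # symplectic first moment via v = p - (q/m) dA
# (4.12) divided by q: M_1^s = M_1^h - (q/m) dA M_0^h, exact marker by marker
```

Because the identity holds term by term in the sums, no additional statistical error is introduced by the composition itself.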

Since the previous relations are not valid for the background $f_{0}$ , we separately consider the time-dependent moments of $f_{0}$ in the symplectic formulation:

(4.13) $$\begin{eqnarray}\displaystyle {\mathcal{M}}_{k}^{\text{s}}[f_{0}] & = & \displaystyle \int v_{\Vert }^{k}\,f_{0}B_{\Vert }^{\ast \text{s}}\,\text{d}^{3}v\nonumber\\ \displaystyle & = & \displaystyle \int v_{\Vert }^{k}\,f_{0}\left(B+\frac{m}{q}v_{\Vert }\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})\right)\text{d}^{3}v+\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})\int v_{\Vert }^{k}\,f_{0}\unicode[STIX]{x1D6FF}A_{\Vert }\,\text{d}^{3}v\nonumber\\ \displaystyle & = & \displaystyle \int \tilde{p}_{\Vert }^{k}\,f_{0}B_{\Vert }^{\ast \text{h}}\text{d}^{3}p+\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})\int v_{\Vert }^{k}\,f_{0}\unicode[STIX]{x1D6FF}A_{\Vert }\,\text{d}^{3}v\nonumber\\ \displaystyle & = & \displaystyle {\mathcal{M}}_{k}^{\text{h}}[f_{0}]+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }(t)\int v_{\Vert }^{k}\,f_{0}B\,\text{d}^{3}v,\end{eqnarray}$$

where we have just renamed the integration variable of the first integral on the second line ( $v_{\Vert }\rightarrow \tilde{p}_{\Vert }$ ). In the case $f_{0}=f_{\text{M}}$ , we obtain for the zeroth and first moment

(4.14) $$\begin{eqnarray}\displaystyle & \displaystyle n_{0}^{\text{s}}\stackrel{\text{def}}{=}{\mathcal{M}}_{0}^{\text{s}}[f_{0}]=n_{0}+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }\hat{n}_{0}, & \displaystyle\end{eqnarray}$$
(4.15) $$\begin{eqnarray}\displaystyle & \displaystyle j_{0\Vert }^{\text{s}}\stackrel{\text{def}}{=}q{\mathcal{M}}_{1}^{\text{s}}[f_{0}]=j_{0\Vert }+q\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }\hat{n}_{0}\hat{u} _{0}, & \displaystyle\end{eqnarray}$$

where we used the definitions for the particle and current number density of the background (see (3.17) and (3.18)). Thus, the moments of the background $f_{0}$ differ in the symplectic and Hamiltonian formulation due to the differing Jacobians in symplectic and Hamiltonian phase space.

Finally, the corresponding relation for the first two moments of the perturbation to the distribution function  $\unicode[STIX]{x1D6FF}f^{\text{s}}$ is

(4.16) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}n^{\text{s}}={\mathcal{M}}_{0}^{\text{s}}[\unicode[STIX]{x1D6FF}f^{\text{s}}]=n^{\text{s}}-n_{0}^{\text{s}}=\unicode[STIX]{x1D6FF}n^{\text{h}}-\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }\hat{n}_{0}, & \displaystyle\end{eqnarray}$$
(4.17) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}j_{\Vert }^{\text{s}}=q{\mathcal{M}}_{1}^{\text{s}}[\unicode[STIX]{x1D6FF}f^{\text{s}}]=j_{\Vert }^{\text{s}}-j_{0\Vert }^{\text{s}}=\unicode[STIX]{x1D6FF}j_{\Vert }^{\text{h}}-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }n^{\text{h}}-q\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }\hat{n}_{0}\hat{u} _{0}, & \displaystyle\end{eqnarray}$$

where we used (4.11), (4.12) and (4.14), (4.15). For the linearisation of (4.17) we have to replace $n^{\text{h}}\simeq n_{0}$ . Taking the gyro-averaging into account, (4.16) and (4.17) become

(4.18) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\bar{n}^{\text{s}} & = & \displaystyle \unicode[STIX]{x1D6FF}\bar{n}^{\text{h}}-\int \frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p,\end{eqnarray}$$
(4.19) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{s}} & = & \displaystyle \unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{h}}-\frac{q^{2}}{m}\int f^{\text{h}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\nonumber\\ \displaystyle & & \displaystyle -\,q\int \hat{u} _{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p.\end{eqnarray}$$

The linearised form of (4.18) and (4.19) can be written as (see also § 4.3.2)

(4.20) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\bar{n}^{\text{s},\text{lin}} & = & \displaystyle \unicode[STIX]{x1D6FF}\bar{n}^{\text{h},\text{lin}}-\int \frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p,\end{eqnarray}$$
(4.21) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{s},\text{lin}} & = & \displaystyle \unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{h},\text{lin}}-\frac{q^{2}}{m}\int \frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\nonumber\\ \displaystyle & & \displaystyle -\,q\int \hat{u} _{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p,\end{eqnarray}$$

where we used (3.21) and

(4.22) $$\begin{eqnarray}\int \left(f_{0}+\tilde{p}_{\Vert }\left.\frac{\unicode[STIX]{x2202}f_{0}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}\right)\,\text{d}\tilde{p}_{\Vert }=\Big[\tilde{p}_{\Vert }\,f_{0}\Big]_{-\infty }^{\infty }=0\quad \text{with }\left.\frac{\unicode[STIX]{x2202}f_{0}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}=-\frac{\tilde{p}_{\Vert }-\hat{u} _{0}}{v_{\text{th}}^{2}}f_{\text{M}}.\end{eqnarray}$$
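The identity (4.22) can be checked numerically for a one-dimensional shifted Maxwellian consistent with (3.74). The normalisation is arbitrary for this purpose, and the infinite limits are truncated at $\pm 20\,v_{\text{th}}$ , where $f_{\text{M}}$ is negligible:

```python
import numpy as np
from scipy.integrate import quad

u0, vth = 0.3, 1.0

def fM(p):
    # 1-D shifted Maxwellian consistent with (3.74): d(ln f_M)/dp = -(p - u0)/vth^2
    return np.exp(-0.5 * (p - u0)**2 / vth**2) / (np.sqrt(2.0 * np.pi) * vth)

# Integrand of (4.22): f_0 + p * df_0/dv evaluated at v = p
integrand = lambda p: fM(p) - p * (p - u0) / vth**2 * fM(p)
val, _ = quad(integrand, -20.0, 20.0)
```

Analytically, the first term integrates to the normalisation and the second to the same value with opposite sign, so `val` vanishes to quadrature accuracy, in agreement with the boundary-term argument in (4.22).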

### 4.3 Effect of the finite velocity sphere on the moments

The markers in reduced phase space are distributed by using the marker distribution function  $g$ (marker loading). For more details see § 10. In a real simulation, we are always limited by finite compute resources, which implies that the simulation is restricted to a finite velocity sphere. An exception might be a Maxwellian marker loading, which by definition distributes the markers over an infinite velocity sphere, but at the price of a resolution problem at higher velocities (see also § 6.1).

If we assume that the perturbed particle and current number densities are zero at the beginning of the simulation, it follows that the perturbed potentials are also zero and thus we have $\tilde{p}_{\Vert }(t_{0})=v_{\Vert }(t_{0})$ for the markers. Hence, the marker loading is done in $(v_{\Vert },v_{\bot })$ -coordinates where the perpendicular velocity coordinate is defined by $v_{\bot }=\sqrt{2B\tilde{\unicode[STIX]{x1D707}}}$ . Initially, the markers are distributed in reduced velocity space within a semicircle in these coordinates. For each species  $s$ the semicircle is centred around $(\hat{u} _{0s},0)$ and its radius is defined by the discretisation parameter $\unicode[STIX]{x1D705}_{\text{v}s}$ :

(4.23) $$\begin{eqnarray}v_{\text{max}s}(\unicode[STIX]{x1D6F9})\stackrel{\text{def}}{=}\unicode[STIX]{x1D705}_{\text{v}s}v_{\text{th}s}(\unicode[STIX]{x1D6F9}).\end{eqnarray}$$
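A minimal sketch of the initial loading described above: markers are placed in the $(v_{\Vert },v_{\bot })$ semicircle of radius $v_{\text{max}s}=\unicode[STIX]{x1D705}_{\text{v}s}v_{\text{th}s}$ centred at $(\hat{u} _{0s},0)$ by rejection sampling. Uniform loading inside the semicircle is only one possible choice of the marker distribution $g$ , which is discussed in § 10:

```python
import numpy as np

def load_markers(n, u0, vth, kappa_v, seed=0):
    """Uniformly load n markers in the (v_par, v_perp) semicircle of radius
    kappa_v * vth centred at (u0, 0) with v_perp >= 0 (one possible choice
    of the marker distribution g)."""
    rng = np.random.default_rng(seed)
    vmax = kappa_v * vth                     # (4.23)
    out = np.empty((0, 2))
    while len(out) < n:
        # candidates in the bounding rectangle, then keep those inside the semicircle
        v_par = rng.uniform(u0 - vmax, u0 + vmax, size=2 * n)
        v_perp = rng.uniform(0.0, vmax, size=2 * n)
        keep = (v_par - u0)**2 + v_perp**2 <= vmax**2
        out = np.vstack([out, np.column_stack([v_par, v_perp])[keep]])
    return out[:n]

markers = load_markers(10_000, u0=0.2, vth=1.0, kappa_v=5.0)
```

Every accepted marker lies inside the semicircle, so the initial $\tilde{p}_{\Vert }$ - and $v_{\Vert }$ -velocity spheres coincide, as stated above.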

For nonlinear simulations there are two effects. First, an individual marker can be accelerated and thus change its kinetic energy and potentially leave its initial velocity sphere. Second, the $v_{\Vert }$ - and $\tilde{p}_{\Vert }$ -coordinates evolve differently due to the coordinate transformation (3.28). Depending on the value of  $\unicode[STIX]{x1D6FF}A_{\Vert }(t)$ the $\tilde{p}_{\Vert }$ -velocity sphere oscillates around the $v_{\Vert }$ -velocity sphere, which stays fixed as a whole. This has an effect when we calculate the moments of the distribution function within a finite velocity sphere. For a finite velocity sphere the moments ${\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}]$ and ${\mathcal{M}}_{k}^{\text{h}}[f^{\text{h}}]$ are approximated by the integrals:

(4.24) $$\begin{eqnarray}\displaystyle & \displaystyle \widetilde{{\mathcal{M}}}_{k}^{\text{s}}[f^{\text{s}}]\stackrel{\text{def}}{=}\int _{0}^{2\unicode[STIX]{x03C0}}\int _{0}^{v_{\bot \text{max}}}\int _{v_{\Vert \text{min}}}^{v_{\Vert \text{max}}}v_{\Vert }^{k}\,f^{\text{s}}\frac{B_{\Vert }^{\ast \text{s}}}{B}v_{\bot }\,\text{d}v_{\Vert }\,\text{d}v_{\bot }\,\text{d}\unicode[STIX]{x1D6FC}, & \displaystyle\end{eqnarray}$$
(4.25) $$\begin{eqnarray}\displaystyle & \displaystyle \widetilde{{\mathcal{M}}}_{k}^{\text{h}}[f^{\text{h}}]\stackrel{\text{def}}{=}\int _{0}^{2\unicode[STIX]{x03C0}}\int _{0}^{v_{\bot \text{max}}}\int _{\tilde{p}_{\Vert \text{min}}}^{\tilde{p}_{\Vert \text{max}}}\tilde{p}_{\Vert }^{k}\,f^{\text{h}}\frac{B_{\Vert }^{\ast \text{h}}}{B}v_{\bot }\,\text{d}\tilde{p}_{\Vert }\,\text{d}v_{\bot }\,\text{d}\unicode[STIX]{x1D6FC} & \displaystyle\end{eqnarray}$$

with the limits:

(4.26a-c ) $$\begin{eqnarray}v_{\Vert \text{min}}=-v_{\text{max}}+\hat{u} _{0},\quad v_{\Vert \text{max}}=v_{\text{max}}+\hat{u} _{0}\quad \text{and}\quad v_{\bot \text{max}}=v_{\text{max}},\end{eqnarray}$$
(4.27) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{p}_{\Vert \text{min}}=\tilde{p}_{\Vert }(v_{\Vert \text{min}})=-\sqrt{v_{\max }^{2}-v_{\bot }^{2}}+\hat{u} _{0}+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle , & \displaystyle\end{eqnarray}$$
(4.28) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{p}_{\Vert \text{max}}=\tilde{p}_{\Vert }(v_{\Vert \text{max}})=\sqrt{v_{\max }^{2}-v_{\bot }^{2}}+\hat{u} _{0}+\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle . & \displaystyle\end{eqnarray}$$

The moments of the background distribution function $\widetilde{{\mathcal{M}}}_{k}^{\text{s}}[f_{0}]$ and $\widetilde{{\mathcal{M}}}_{k}^{\text{h}}[f_{0}]$ can be calculated analytically for the finite sphere if we suppose a shifted Maxwellian (3.10). This will be done for the particle number density (zeroth moment) and the current number density (first moment) in the following sections. The difference between the results of the finite and infinite integrals can be expressed by correction factors $\tilde{C}_{\text{v}}(\unicode[STIX]{x1D705}_{\text{v}s},\unicode[STIX]{x1D6FE}_{s})$ defined in appendix D. Only in the Hamiltonian formulation do they depend on the parameter

(4.29) $$\begin{eqnarray}\unicode[STIX]{x1D6FE}_{s}(\boldsymbol{R},t)\stackrel{\text{def}}{=}\frac{q_{s}}{m_{s}}\frac{\langle \unicode[STIX]{x1D6FF}A_{\Vert }(\boldsymbol{R},t)\rangle }{v_{\text{th}s}(\unicode[STIX]{x1D6F9})},\end{eqnarray}$$

which denotes the shift of the $\tilde{p}_{\Vert }$ -velocity sphere. In the limit of $\unicode[STIX]{x1D6FE}_{s}\rightarrow 0$ , we approach the symplectic case and in the limit of $\unicode[STIX]{x1D705}_{\text{v}s}\rightarrow \infty$ the case of an infinite velocity sphere.

Although the description by a finite velocity sphere is formally correct, one might argue that it is of minor importance, since the velocity sphere can always be chosen sufficiently large in a numerical simulation to approximate the case of an infinite velocity sphere. However, the correct description by a finite velocity sphere results in a faster convergence rate with the parameter $\unicode[STIX]{x1D705}_{\text{v}}$ , which has the benefit of requiring a smaller number of markers $N_{\text{p}}$ (Hatzky et al. Reference Hatzky, Könies and Mishchenko2007, figure 2) and allowing a larger time-step size  $\unicode[STIX]{x0394}t$ .

#### 4.3.1 The nonlinear case

Again, we can express the moments $\widetilde{{\mathcal{M}}}_{k}^{\text{s}}[f^{\text{s}}]$ by the moments $\widetilde{{\mathcal{M}}}_{k}^{\text{h}}[f^{\text{h}}]$ using the following relation (compare with (4.8) and (4.10)):

(4.30) $$\begin{eqnarray}\displaystyle \widetilde{{\mathcal{M}}}_{k}^{\text{s}}[f^{\text{s}}] & = & \displaystyle \int _{0}^{2\unicode[STIX]{x03C0}}\int _{0}^{v_{\bot \text{max}}}\int _{v_{\Vert \text{min}}}^{v_{\Vert \text{max}}}v_{\Vert }^{k}h^{\text{s}}(v_{\Vert })\,\text{d}v_{\Vert }\,\frac{v_{\bot }}{B}\,\text{d}v_{\bot }\,\text{d}\unicode[STIX]{x1D6FC}\nonumber\\ \displaystyle & = & \displaystyle \int _{0}^{2\unicode[STIX]{x03C0}}\int _{0}^{v_{\bot \text{max}}}\int _{\tilde{p}_{\Vert }(v_{\Vert \text{min}})}^{\tilde{p}_{\Vert }(v_{\Vert \text{max}})}\left(\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)^{k}h^{\text{h}}(\,\tilde{p}_{\Vert })\,\text{d}\tilde{p}_{\Vert }\,\frac{v_{\bot }}{B}\,\text{d}v_{\bot }\,\text{d}\unicode[STIX]{x1D6FC}\nonumber\\ \displaystyle & = & \displaystyle \int _{0}^{2\unicode[STIX]{x03C0}}\int _{0}^{v_{\bot \text{max}}}\int _{\tilde{p}_{\Vert \text{min}}}^{\tilde{p}_{\Vert \text{max}}}\left(\tilde{p}_{\Vert }-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \right)^{k}f^{\text{h}}\frac{B_{\Vert }^{\ast \text{h}}}{B}\,\text{d}\tilde{p}_{\Vert }\,v_{\bot }\,\text{d}v_{\bot }\,\text{d}\unicode[STIX]{x1D6FC}.\end{eqnarray}$$

The particle and current number density, now denoted with a tilde, can be written as (neglecting gyro-averaging to keep the derivation simple)

(4.31) $$\begin{eqnarray}\displaystyle & \displaystyle {\tilde{n}}^{\text{s}}=\widetilde{{\mathcal{M}}}_{0}^{\text{s}}[f^{\text{s}}]=\widetilde{{\mathcal{M}}}_{0}^{\text{h}}[f^{\text{h}}]={\tilde{n}}^{\text{h}}, & \displaystyle\end{eqnarray}$$
(4.32) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{j}_{\Vert }^{\text{s}}=q\widetilde{{\mathcal{M}}}_{1}^{\text{s}}[f^{\text{s}}]=q\widetilde{{\mathcal{M}}}_{1}^{\text{h}}[f^{\text{h}}]-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }\widetilde{{\mathcal{M}}}_{0}^{\text{h}}[f^{\text{h}}]=\tilde{j}_{\Vert }^{\text{h}}-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }{\tilde{n}}^{\text{h}}. & \displaystyle\end{eqnarray}$$

Therefore, equations (4.31) and (4.32) are a generalisation of (4.11) and (4.12).

The moments of the background $f_{0}=f_{\text{M}}$ can be derived analogously to (4.14) and (4.15):

(4.33) $$\begin{eqnarray}\displaystyle & \displaystyle {\tilde{n}}_{0}^{\text{s}}=\widetilde{{\mathcal{M}}}_{0}^{\text{s}}[f_{0}]=\,C_{\text{v}2}n_{0}+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}2}\hat{n}_{0}, & \displaystyle\end{eqnarray}$$
(4.34) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{j}_{0\Vert }^{\text{s}}=q\widetilde{{\mathcal{M}}}_{1}^{\text{s}}[f_{0}]=\,q\hat{n}_{0}\left[C_{\text{v}2}\hat{u} _{0}+\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}C_{\text{v}3}(v_{\text{th}}^{2}+\hat{u} _{0}^{2})+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}2}\hat{u} _{0}\right]. & \displaystyle \nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$

We used the correction factors $\tilde{C}_{v}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})$ (see appendix D). In the symplectic case, where we have $\unicode[STIX]{x1D6FE}=0$ , we will use the abbreviations $C_{\text{v}1}=\tilde{C}_{\text{v}1}(\unicode[STIX]{x1D705}_{\text{v}},0)$ , $C_{\text{v}2}=\tilde{C}_{\text{v}2}(\unicode[STIX]{x1D705}_{\text{v}},0)$ and $C_{\text{v}3}=\tilde{C}_{\text{v}3}(\unicode[STIX]{x1D705}_{\text{v}},0)$ .

As the $\unicode[STIX]{x1D6FF}f$ -ansatz separates the background part  $f_{0}$ from the perturbed part  $\unicode[STIX]{x1D6FF}f$ , we can further simplify these equations by choosing an infinite velocity sphere for the analytic calculation of the unperturbed quantities:

(4.35) $$\begin{eqnarray}\displaystyle & \displaystyle {\tilde{n}}_{0}^{\text{s}}=n_{0}+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}2}\hat{n}_{0}, & \displaystyle\end{eqnarray}$$
(4.36) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{j}_{0\Vert }^{\text{s}}=j_{0\Vert }+q\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}2}\hat{n}_{0}\hat{u} _{0}. & \displaystyle\end{eqnarray}$$

Taking the gyro-averaging into account, (4.35) and (4.36) become:

(4.37) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{\bar{n}}_{0}^{\text{s}}=\bar{n}_{0}+q\int _{v_{\text{fin}}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}v, & \displaystyle\end{eqnarray}$$
(4.38) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{\bar{j}}_{0\Vert }^{\text{s}}=\bar{j}_{0\Vert }+q\int _{v_{\text{fin}}}\hat{u} _{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}v. & \displaystyle\end{eqnarray}$$

With the help of (4.35)–(4.38) one sees that the quasi-neutrality condition and Ampère’s law take the following form in the symplectic formulation for a finite velocity sphere (compare with (3.26) and (3.27)):

(4.39) $$\begin{eqnarray}\displaystyle \hspace{-30.0pt} & & \displaystyle -\unicode[STIX]{x1D735}\boldsymbol{\cdot }\left(\frac{q_{\text{i}}^{2}n_{0\text{i}}}{k_{\text{B}}T_{\text{i}}}\unicode[STIX]{x1D70C}_{0\text{i}}^{2}\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\right)=q_{\text{i}}\unicode[STIX]{x1D6FF}\tilde{\bar{n}}_{\text{i}}^{\text{s}}+q_{\text{e}}\unicode[STIX]{x1D6FF}{\tilde{n}}_{\text{e}}^{\text{s}}\nonumber\\ \displaystyle \hspace{-30.0pt} & & \displaystyle \quad +\,q_{\text{i}}\int _{v_{\text{fin}}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}f_{\text{Mi}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}v+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}q_{\text{e}}C_{\text{v}2\text{e}}\hat{n}_{0\text{e}}\,\unicode[STIX]{x1D6FF}A_{\Vert },\nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$
(4.40) $$\begin{eqnarray}-\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}A_{\Vert }-\unicode[STIX]{x1D707}_{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}q_{\text{e}}C_{\text{v}2\text{e}}\hat{n}_{0\text{e}}\hat{u} _{0\text{e}}\,\unicode[STIX]{x1D6FF}A_{\Vert }=\unicode[STIX]{x1D707}_{0}(\unicode[STIX]{x1D6FF}\tilde{\bar{j}}_{\Vert \text{i}}^{\text{s}}+\unicode[STIX]{x1D6FF}\tilde{j}_{\Vert \text{e}}^{\text{s}}).\end{eqnarray}$$

Note that we did not neglect the gyro-averaging here and that the symplectic set of potential equations holds for both the linear and nonlinear case.

Unfortunately, a simple relation for the moments of $\unicode[STIX]{x1D6FF}f^{\text{s}}$ , expressing them in the Hamiltonian formulation (compare with (4.18) and (4.19)), cannot be derived for the case of a finite velocity sphere. However, it is possible in the linear case, which will be addressed in the next section.

#### 4.3.2 The linear case

Only in the linear case is it feasible to express the symplectic moments by the Hamiltonian ones. Within a linear simulation the markers are pushed along the unperturbed trajectories, e.g. for the parallel dynamics by  (3.70). Hence, $v_{\Vert }(t)=\tilde{p}_{\Vert }(t)$ holds during the whole simulation. In addition, the total energy, and hence the speed, of each marker is conserved. Therefore, the limits of the initial velocity sphere do not change over time and the markers cannot leave the initial loading sphere.

Due to the oscillating limits of the finite velocity sphere the moments of the background $f_{0}=f_{\text{M}}$ in the Hamiltonian formulation are complicated expressions (compare with (3.21) and (3.23)):

(4.41) $$\begin{eqnarray}\displaystyle \hspace{-24.0pt}{\tilde{n}}_{0}^{\text{h}}=\widetilde{{\mathcal{M}}}_{0}^{\text{h}}[f_{0}] & = & \displaystyle \hat{n}_{0}\left\{\tilde{C}_{\text{v}2}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})+\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}[\tilde{C}_{\text{v}2}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})\hat{u} _{0}+\tilde{C}_{\text{v}4}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})]\right\},\end{eqnarray}$$
(4.42) $$\begin{eqnarray}\displaystyle \hspace{-24.0pt}\tilde{j}_{0\Vert }^{\text{h}}=q\widetilde{{\mathcal{M}}}_{1}^{\text{h}}[f_{0}] & = & \displaystyle q\hat{n}_{0}\left\{\tilde{C}_{\text{v}2}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})\hat{u} _{0}+\tilde{C}_{\text{v}4}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})\right.\nonumber\\ \displaystyle \hspace{-24.0pt} & & \displaystyle \left.+\,\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\left[\tilde{C}_{\text{v}3}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})(v_{\text{th}}^{2}+\hat{u} _{0}^{2})+2\tilde{C}_{\text{v}4}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})\hat{u} _{0}\right]\right\}.\end{eqnarray}$$

After linearisation it is simple to split these equations into a background part and a perturbed part. Hence, we obtain the linearised background particle number and current densities in the Hamiltonian formulation:

(4.43) $$\begin{eqnarray}\displaystyle {\tilde{n}}_{0}^{\text{h},\text{lin}}=\widetilde{{\mathcal{M}}}_{0}^{\text{h},\text{lin}}[f_{0}] & = & \displaystyle \hat{n}_{0}\left\{C_{\text{v}2}+\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}[C_{\text{v}2}\hat{u} _{0}+\tilde{C}_{\text{v}4}^{\text{lin}}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})]\right\},\end{eqnarray}$$
(4.44) $$\begin{eqnarray}\displaystyle \tilde{j}_{0\Vert }^{\text{h},\text{lin}}=q\widetilde{{\mathcal{M}}}_{1}^{\text{h},\text{lin}}[f_{0}] & = & \displaystyle q\hat{n}_{0}\bigg\{C_{\text{v}2}\hat{u} _{0}+\tilde{C}_{\text{v}4}^{\text{lin}}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})\nonumber\\ \displaystyle & & \displaystyle +\,\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\left[C_{\text{v}3}(v_{\text{th}}^{2}+\hat{u} _{0}^{2})+2\tilde{C}_{\text{v}4}^{\text{lin}}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})\hat{u} _{0}\right]\!\bigg\}.\end{eqnarray}$$

Due to the $\unicode[STIX]{x1D6FF}f$ -ansatz we can again choose an infinite velocity sphere for the analytic calculation of the unperturbed quantities. As a result, the background particle number and current densities take the following form:

(4.45) $$\begin{eqnarray}\displaystyle {\tilde{n}}_{0}^{\text{h},\text{lin}} & = & \displaystyle \hat{n}_{0}\left\{1+\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}[\hat{u} _{0}+\tilde{C}_{\text{v}4}^{\text{lin}}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})]\right\}\nonumber\\ \displaystyle & = & \displaystyle n_{0}+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}1}\hat{n}_{0},\end{eqnarray}$$
(4.46) $$\begin{eqnarray}\displaystyle \tilde{j}_{0\Vert }^{\text{h},\text{lin}} & = & \displaystyle q\hat{n}_{0}\bigg\{\hat{u} _{0}+\tilde{C}_{\text{v}4}^{\text{lin}}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})+\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\left[v_{\text{th}}^{2}+\hat{u} _{0}^{2}+2\tilde{C}_{\text{v}4}^{\text{lin}}(\unicode[STIX]{x1D705}_{\text{v}},\unicode[STIX]{x1D6FE})\hat{u} _{0}\right]\!\bigg\}\nonumber\\ \displaystyle & = & \displaystyle j_{0\Vert }+\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}1}n_{0}+q\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}1}\hat{n}_{0}\hat{u} _{0},\end{eqnarray}$$

where we neglected the gyro-averaging and used (3.21), (3.23) and (D 12). We can see that new terms, which depend linearly on $\unicode[STIX]{x1D6FF}A_{\Vert }$ , appear. They originate from the fact that the moments of the background distribution function  $f_{0}$ have been integrated over boundaries depending linearly on  $\unicode[STIX]{x1D6FF}A_{\Vert }$ . As all these additional terms depend on $C_{\text{v}1}$ , they vanish for an infinite velocity sphere.

Next, we have to linearise the zeroth and first moment, equations (4.31) and (4.32):

(4.47) $$\begin{eqnarray}\displaystyle & \displaystyle {\tilde{n}}^{\text{s},\text{lin}}={\tilde{n}}^{\text{h},\text{lin}}, & \displaystyle\end{eqnarray}$$
(4.48) $$\begin{eqnarray}\displaystyle & \displaystyle \tilde{j}_{\Vert }^{\text{s},\text{lin}}=\tilde{j}_{\Vert }^{\text{h},\text{lin}}-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}2}n_{0}, & \displaystyle\end{eqnarray}$$

where we used (4.43).

So with equations (4.35)–(4.36), (4.45)–(4.48) and (D 1) we get the corresponding relation for the first two moments of $\unicode[STIX]{x1D6FF}f^{\text{s}}$ :

(4.49) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}{\tilde{n}}^{\text{s},\text{lin}}=\widetilde{{\mathcal{M}}}_{0}^{\text{s}}[\unicode[STIX]{x1D6FF}f^{\text{s},\text{lin}}]={\tilde{n}}^{\text{s},\text{lin}}-{\tilde{n}}_{0}^{\text{s}}=\unicode[STIX]{x1D6FF}{\tilde{n}}^{\text{h},\text{lin}}-\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}3}\hat{n}_{0}, & \displaystyle\end{eqnarray}$$
(4.50) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}\tilde{j}_{\Vert }^{\text{s},\text{lin}}=q\widetilde{{\mathcal{M}}}_{1}^{\text{s}}[\unicode[STIX]{x1D6FF}f^{\text{s},\text{lin}}]=\tilde{j}_{\Vert }^{\text{s},\text{lin}}-\tilde{j}_{0\Vert }^{\text{s}}=\unicode[STIX]{x1D6FF}\tilde{j}_{\Vert }^{\text{h},\text{lin}}-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}3}n_{0}-q\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}3}\hat{n}_{0}\hat{u} _{0}. & \displaystyle \nonumber\\ \displaystyle & & \displaystyle\end{eqnarray}$$

Unfortunately, it is not obvious how to generalise (4.49) and (4.50) to include the gyro-averaging. For this purpose, it is much more convenient to use the linearised pull-back transformation, which will be done in the next section. However, the result should already be stated here (compare with (4.20) and (4.21)):

(4.51) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\tilde{\bar{n}}^{\text{s},\text{lin}} & = & \displaystyle \unicode[STIX]{x1D6FF}\tilde{\bar{n}}^{\text{h},\text{lin}}-\int _{p_{\text{fin}}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p,\quad\end{eqnarray}$$
(4.52) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\tilde{\bar{j}}_{\Vert }^{\text{s},\text{lin}} & = & \displaystyle \unicode[STIX]{x1D6FF}\tilde{\bar{j}}_{\Vert }^{\text{h},\text{lin}}-\frac{q^{2}}{m}\int _{p_{\text{fin}}}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\nonumber\\ \displaystyle & & \displaystyle -\,q\int _{p_{\text{fin}}}\hat{u} _{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p.\end{eqnarray}$$

And finally, our linear model equations for the quasi-neutrality condition and Ampère’s law in the Hamiltonian formulation take the following form for a finite velocity sphere (compare with (3.63) and (3.64)):

(4.53) $$\begin{eqnarray}\displaystyle & & \displaystyle -\unicode[STIX]{x1D735}\boldsymbol{\cdot }\left(\frac{q_{\text{i}}^{2}n_{0\text{i}}}{k_{\text{B}}T_{\text{i}}}\unicode[STIX]{x1D70C}_{0\text{i}}^{2}\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}\unicode[STIX]{x1D719}\right)=q_{\text{i}}\unicode[STIX]{x1D6FF}\tilde{\bar{n}}_{\text{i}}^{\text{h},\text{lin}}+q_{\text{e}}\unicode[STIX]{x1D6FF}{\tilde{n}}_{\text{e}}^{\text{h},\text{lin}}\nonumber\\ \displaystyle & & \displaystyle \quad +\,q_{\text{i}}\int _{p_{\text{fin}}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\left(1-\frac{\tilde{p}_{\Vert }^{2}}{v_{\text{thi}}^{2}}\right)f_{\text{Mi}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\nonumber\\ \displaystyle & & \displaystyle \quad +\,\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}q_{\text{e}}C_{\text{v1e}}\hat{n}_{0\text{e}}\,\unicode[STIX]{x1D6FF}A_{\Vert },\end{eqnarray}$$
(4.54) $$\begin{eqnarray}\displaystyle & & \displaystyle -\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}A_{\Vert }+\frac{\unicode[STIX]{x1D6FD}_{0\text{i}}}{\unicode[STIX]{x1D70C}_{0\text{i}}^{2}}\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }_{\text{i}}^{\text{h},\text{lin}}+C_{\text{v3e}}\frac{\unicode[STIX]{x1D6FD}_{0\text{e}}}{\unicode[STIX]{x1D70C}_{0\text{e}}^{2}}\unicode[STIX]{x1D6FF}A_{\Vert }\nonumber\\ \displaystyle & & \displaystyle \quad -\,\unicode[STIX]{x1D707}_{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}q_{\text{e}}C_{\text{v1e}}\hat{n}_{0\text{e}}\hat{u} _{0\text{e}}\,\unicode[STIX]{x1D6FF}A_{\Vert }=\unicode[STIX]{x1D707}_{0}\left(\unicode[STIX]{x1D6FF}\tilde{\bar{j}}_{\Vert \text{i}}^{\text{h},\text{lin}}+\unicode[STIX]{x1D6FF}\tilde{j}_{\Vert \text{e}}^{\text{h},\text{lin}}\right),\end{eqnarray}$$

where we used (4.49)–(4.52). The significance of the correction factor  $C_{\text{v3}}$ as part of the electron skin term was already pointed out in Mishchenko et al. (2004a).

The perturbed magnetic potential averaged with the weighting factor $\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0s})/v_{\text{th}s}^{2}$ over the gyro-phase and over the background distribution function is defined as

(4.55) $$\begin{eqnarray}\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }_{s}^{\text{h},\text{lin}}\stackrel{\text{def}}{=}\frac{1}{n_{0s}}\int _{p_{\text{fin}}}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0s})}{v_{\text{th}s}^{2}}f_{\text{M}s}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert s}^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p.\end{eqnarray}$$

Note that the averaging differs from the nonlinear case (compare with (3.62)).

For large scale modes it is usually appropriate to use a long-wavelength approximation for the first term of the second line of (4.39) and (4.53). Thus, we get for the two terms of the second and third line of (4.53):

(4.56) $$\begin{eqnarray}\unicode[STIX]{x1D735}\boldsymbol{\cdot }\left[\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}q_{\text{i}}C_{\text{v1i}}\hat{n}_{0\text{i}}\unicode[STIX]{x1D70C}_{0\text{i}}^{2}\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}A_{\Vert }\right]+\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }\mathop{\sum }_{s=\text{i},\text{e}}q_{s}C_{\text{v}1s}\hat{n}_{0s}.\end{eqnarray}$$

Furthermore, due to the quasi-neutrality condition, the terms on the second line of (4.39) and on the second and third lines of (4.53) would vanish if the gyro-averaging of the ions were neglected and if the velocity spheres of the ions and electrons were of equal size, i.e.  $C_{\text{v1i}}=C_{\text{v1e}}$ and $C_{\text{v}2\text{e}}=C_{\text{v2i}}$ . Therefore, for large-scale modes, the contribution of these terms is usually small. In addition, the terms on the second and third lines of (4.53) are much smaller than the terms on the second line of  (4.39) due to $C_{\text{v}1s}\ll C_{\text{v}2s}$ .

### 4.4 The linearised pull-back transformation of the weights

For later reference we rederive (4.49) and (4.50) using the pull-back transformation. The pull-back transformation (see also § 4.2) of the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ , is defined by (3.43). In the linear case with a shifted Maxwellian (3.10) as background  $f_{0}$ , this gives

(4.57) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}f^{\text{s},\text{lin}}(v_{\Vert })=\unicode[STIX]{x1D6FF}f^{\text{h},\text{lin}}(\,\tilde{p}_{\Vert })-\unicode[STIX]{x1D6FF}f^{\text{ad}}(\,\tilde{p}_{\Vert }),\end{eqnarray}$$

where we have introduced the so-called ‘adiabatic part’ of the perturbation to the distribution function in the Hamiltonian formulation:

(4.58) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}f^{\text{ad}}(\,\tilde{p}_{\Vert })\stackrel{\text{def}}{=}-\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \left.\frac{\unicode[STIX]{x2202}f_{0}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}=\frac{q}{m}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \frac{\tilde{p}_{\Vert }-\hat{u} _{0}}{v_{\text{th}}^{2}}f_{\text{M}}.\end{eqnarray}$$
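For illustration, the identity used in (4.58), $-\unicode[STIX]{x2202}f_{0}/\unicode[STIX]{x2202}v_{\Vert }=(v_{\Vert }-\hat{u} _{0})/v_{\text{th}}^{2}\,f_{\text{M}}$ for a shifted Maxwellian, can be verified numerically. The sketch below keeps only the one-dimensional parallel-velocity dependence in normalised units; the function names are hypothetical and not part of any code referenced in the text.

```python
import numpy as np

def f_maxwell(v_par, n0, u0, v_th):
    """1-D shifted Maxwellian in the parallel velocity (perpendicular part omitted)."""
    return n0 / (np.sqrt(2.0 * np.pi) * v_th) * np.exp(-(v_par - u0) ** 2 / (2.0 * v_th ** 2))

def delta_f_ad(p_par, dA_par, q, m, n0, u0, v_th):
    """Adiabatic part (4.58): (q/m) <dA_par> (p_par - u0)/v_th^2 * f_M at v_par = p_par."""
    return q / m * dA_par * (p_par - u0) / v_th ** 2 * f_maxwell(p_par, n0, u0, v_th)
```

Comparing `delta_f_ad` with a central finite difference of $-({q}/{m})\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x2202}f_{0}/\unicode[STIX]{x2202}v_{\Vert }$ reproduces the second equality in (4.58) to round-off accuracy.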

The corresponding pull-back transformation of the perturbed weights is (see (3.49))

(4.59) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}w^{\text{s,lin}}(v_{\Vert })=\unicode[STIX]{x1D6FF}w^{\text{h},\text{lin}}(\,\tilde{p}_{\Vert })-\unicode[STIX]{x1D6FF}w^{\text{ad}}(\,\tilde{p}_{\Vert }),\end{eqnarray}$$

where we have introduced the so-called ‘adiabatic weight’

(4.60) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}w^{\text{ad}}(\,\tilde{p}_{\Vert })\stackrel{\text{def}}{=}\frac{\unicode[STIX]{x1D6FF}f^{\text{ad}}(\,\tilde{p}_{\Vert })}{g(\,\tilde{p}_{\Vert })}.\end{eqnarray}$$

Again, we neglect the gyro-averaging so as not to make the derivation unnecessarily cumbersome. The first two moments of the adiabatic part are:

(4.61) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}{\tilde{n}}^{\text{ad}} & = & \displaystyle \widetilde{{\mathcal{M}}}_{0}^{\text{h}}[\unicode[STIX]{x1D6FF}f^{\text{ad}}]=-\frac{q}{m}\unicode[STIX]{x1D6FF}A_{\Vert }\int _{p_{\text{fin}}}\left.\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}B_{\Vert }^{\ast \text{h}}\,\text{d}^{3}p\nonumber\\ \displaystyle & = & \displaystyle -\frac{q}{m}\unicode[STIX]{x1D6FF}A_{\Vert }\int _{p_{\text{fin}}}\left(\left.\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}+\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\tilde{p}_{\Vert }\left.\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}\right)B\,\text{d}^{3}p\nonumber\\ \displaystyle & = & \displaystyle \frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v}3}\hat{n}_{0},\end{eqnarray}$$
(4.62) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\tilde{j}_{\Vert }^{\text{ad}} & = & \displaystyle q\widetilde{{\mathcal{M}}}_{1}^{\text{h}}[\unicode[STIX]{x1D6FF}f^{\text{ad}}]=-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }\int _{p_{\text{fin}}}\tilde{p}_{\Vert }\left.\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}B_{\Vert }^{\ast \text{h}}\,\text{d}^{3}p\nonumber\\ \displaystyle & = & \displaystyle -\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }\int _{p_{\text{fin}}}\left(\tilde{p}_{\Vert }\left.\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}+\frac{m}{q}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\tilde{p}_{\Vert }^{2}\left.\frac{\unicode[STIX]{x2202}f_{\text{M}}}{\unicode[STIX]{x2202}v_{\Vert }}\right|_{v_{\Vert }=\tilde{p}_{\Vert }}\right)B\,\text{d}^{3}p\nonumber\\ \displaystyle & = & \displaystyle \frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v3}}n_{0}+q\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\unicode[STIX]{x1D6FF}A_{\Vert }C_{\text{v3}}\hat{n}_{0}\hat{u} _{0},\end{eqnarray}$$

where we used (3.21). As expected, this leads to the same result as (4.49) and (4.50):

(4.63) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}{\tilde{n}}^{\text{s},\text{lin}}=\unicode[STIX]{x1D6FF}{\tilde{n}}^{\text{h},\text{lin}}-\unicode[STIX]{x1D6FF}{\tilde{n}}^{\text{ad}}, & \displaystyle\end{eqnarray}$$
(4.64) $$\begin{eqnarray}\displaystyle & \displaystyle \unicode[STIX]{x1D6FF}\tilde{j}_{\Vert }^{\text{s},\text{lin}}=\unicode[STIX]{x1D6FF}\tilde{j}_{\Vert }^{\text{h},\text{lin}}-\unicode[STIX]{x1D6FF}\tilde{j}_{\Vert }^{\text{ad}}. & \displaystyle\end{eqnarray}$$

Taking the gyro-averaging into account, (4.61) and (4.62) can be generalised:

(4.65) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\tilde{\bar{n}}^{\text{ad}} & = & \displaystyle \int _{p_{\text{fin}}}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p,\end{eqnarray}$$
(4.66) $$\begin{eqnarray}\displaystyle \unicode[STIX]{x1D6FF}\tilde{\bar{j}}_{\Vert }^{\text{ad}} & = & \displaystyle \frac{q^{2}}{m}\int _{p_{\text{fin}}}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\nonumber\\ \displaystyle & & \displaystyle +\,q\int _{p_{\text{fin}}}\hat{u} _{0}\frac{\boldsymbol{b}\boldsymbol{\cdot }(\unicode[STIX]{x1D735}\times \boldsymbol{b})}{B}\frac{\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0})}{v_{\text{th}}^{2}}f_{\text{M}}\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle \,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\,B\,\text{d}\boldsymbol{R}\,\text{d}^{3}p.\end{eqnarray}$$

### 4.5 Marker representation of the symplectic moments

For our PIC algorithm we need the moments in the symplectic formulation to be discretised by the weights (we neglect gyro-averaging):

(4.67) $$\begin{eqnarray}{\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}]=\int v_{\Vert }^{k}\frac{f^{\text{s}}}{g^{\text{s}}}\,g^{\text{s}}B_{\Vert }^{\ast \text{s}}\,\text{d}^{3}v\simeq \frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}v_{\Vert p}^{k}w_{p}^{\text{s}}(v_{\Vert p}).\end{eqnarray}$$

Taking the coordinate transformation (3.28) and the invariance of the weights, equation (3.36), into account this can be rewritten as

(4.68) $$\begin{eqnarray}{\mathcal{M}}_{k}^{\text{s}}[f^{\text{s}}]\simeq \frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\left(\,\tilde{p}_{\Vert p}-\frac{q}{m}\unicode[STIX]{x1D6FF}A_{\Vert p}\right)^{k}w_{p}^{\text{h}}(\,\tilde{p}_{\Vert p}).\end{eqnarray}$$

For simplicity we have neglected the spatial binning of the weights by a test function or shape function (see § 7.1). This would be needed to resolve the spatial dependence of the moments within a Monte Carlo approach.

The first moment contains an extra term depending on $\unicode[STIX]{x1D6FF}A_{\Vert }$ which is the so-called ‘skin term’:

(4.69) $$\begin{eqnarray}{\mathcal{M}}_{1}^{\text{s}}[f^{\text{s}}]\simeq \frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\left[\tilde{p}_{\Vert p}w_{p}^{\text{h}}(\,\tilde{p}_{\Vert p})-\frac{q}{m}\unicode[STIX]{x1D6FF}A_{\Vert p}w_{p}^{\text{h}}(\,\tilde{p}_{\Vert p})\right]=\frac{1}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\left(\tilde{p}_{\Vert p}-\frac{q}{m}\unicode[STIX]{x1D6FF}A_{\Vert p}\right)w_{p}^{\text{h}}(\,\tilde{p}_{\Vert p}).\end{eqnarray}$$

This equation is of key importance. It shows that the two terms, each with a relatively large statistical error, are highly correlated, so that the error of their difference is much smaller. We can thus speak of a cancellation of the statistical error. The purpose of the skin term is therefore twofold: it is needed to keep the first moment invariant under the coordinate transformation (3.28) and, in its marker representation, to cancel statistical error.
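This cancellation can be made concrete with a small Monte Carlo experiment. In the sketch below (normalised units $v_{\text{th}}=n_{0}=1$, uniform marker loading, a constant shift $a=(q/m)\,\unicode[STIX]{x1D6FF}A_{\Vert }$; all names hypothetical), the skin term is subtracted once using the density estimate from the same markers, as in (4.69), and once using a density estimate from an independent marker set. Only the former benefits from the correlation of the two sums.

```python
import numpy as np

rng = np.random.default_rng(0)

# Normalised units: v_th = 1, n0 = 1; a = (q/m) * dA_par is the parallel-velocity shift.
a = 5.0          # large shift, so the skin term dominates each individual sum
N = 100_000      # markers per sum
L = 8.0          # half-width of the uniform loading interval around the shift

def marker_sums(rng):
    """One Monte Carlo estimate of the marker sums in (4.69) with uniform loading."""
    p = rng.uniform(a - L, a + L, N)                         # markers in p_par
    f = np.exp(-(p - a) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)   # f^h: Maxwellian centred at a
    w = f / (1.0 / (2.0 * L))                                # weights w = f/g, g uniform
    return np.mean(p * w), np.mean(w)                        # first-moment sum, density sum

trials = 200
same, indep = [], []
for _ in range(trials):
    m1, n = marker_sums(rng)
    _, n_other = marker_sums(rng)       # density estimate from an independent marker set
    same.append(m1 - a * n)             # skin term from the SAME markers: errors cancel
    indep.append(m1 - a * n_other)      # skin term from independent markers: errors add

# The symplectic first moment vanishes exactly here; compare the residual spreads.
print(np.std(same), np.std(indep))
```

With the same markers the residual spread is set by the width of $f^{\text{h}}$ alone, whereas with independent markers the large fluctuations of the two sums no longer cancel.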

For simplicity we assume an infinite velocity sphere and choose the background to be a shifted Maxwellian (3.10), i.e.  $f_{0}=f_{\text{M}}$ . In addition, we use the $\unicode[STIX]{x1D6FF}f$ -ansatz to diminish the statistical error of the first moment (compare with (4.17)):

(4.70) $$\begin{eqnarray}q{\mathcal{M}}_{1}^{\text{s}}[f^{\text{s}}]\simeq j_{0\Vert }-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }n_{0}+\frac{q}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\left[\tilde{p}_{\Vert p}\unicode[STIX]{x1D6FF}w_{p}^{\text{h}}(\,\tilde{p}_{\Vert p})-\frac{q}{m}\unicode[STIX]{x1D6FF}A_{\Vert p}\unicode[STIX]{x1D6FF}w_{p}^{\text{h}}(\,\tilde{p}_{\Vert p})\right].\end{eqnarray}$$

On the one hand, the statistical error is reduced by separating out the background current  $j_{0\Vert }$ as an analytic expression; on the other hand, the cancellation of statistical noise by the now analytic background part of the skin term (second term on the right-hand side) is lost. Hence, the $\unicode[STIX]{x1D6FF}f$ -ansatz is not fully compatible with the cancellation of the statistical error of the skin term (see § 3.2.1). This is the key problem of the Hamiltonian formulation.

The linearised form of (4.70) neglects the nonlinear skin term (last term on the right-hand side):

(4.71) $$\begin{eqnarray}q{\mathcal{M}}_{1}^{\text{s},\text{lin}}[f^{\text{s}}]\simeq j_{0\Vert }-\frac{q^{2}}{m}\unicode[STIX]{x1D6FF}A_{\Vert }n_{0}+\frac{q}{N_{\text{p}}}\mathop{\sum }_{p=1}^{N_{\text{p}}}\tilde{p}_{\Vert p}\,\unicode[STIX]{x1D6FF}w_{p}^{\text{h}}(\,\tilde{p}_{\Vert p})\end{eqnarray}$$

and should only be used in linearised PIC simulations.

### 4.6 Statistical error of the moments of the distribution function

In this section, we focus again on the statistical error made when evaluating the first two moments of the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ . In the following, we calculate approximately the statistical error of both moments as if there were just the adiabatic part of the distribution function. To simplify the calculation, we approximate $B_{\Vert }^{\ast \text{h}}\simeq B$ and neglect the gyro-averaging. Relying on the definitions given in § 2, the integration is performed just over velocity space (using ${\mathcal{J}}_{p}\,\text{d}^{3}p\simeq B\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}$ ) and can be formulated as calculating an expected value with respect to the marker probability density $g_{\text{M}s}$ , which simplifies to

(4.72) $$\begin{eqnarray}g_{\text{M}s}=\frac{f_{\text{M}s}}{n_{0s}}.\end{eqnarray}$$

Thus, the expected value of the adiabatic density is

(4.73) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}n_{s}^{\text{ad}}=\text{E}_{g_{\text{M}}}[\unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}}]=\int \unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}}\,g_{\text{M}s}{\mathcal{J}}_{p}\,\text{d}^{3}p=0,\end{eqnarray}$$

where

(4.74) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}}\stackrel{\text{def}}{=}\frac{q_{s}n_{0s}\unicode[STIX]{x1D6FF}A_{\Vert }}{k_{\text{B}}T_{s}}(\,\tilde{p}_{\Vert }-\hat{u} _{0s}).\end{eqnarray}$$

The variance is given by (see (2.4)):

(4.75) $$\begin{eqnarray}\text{Var}_{g_{\text{M}}}[\unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}}]=\int (\unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}}-\text{E}_{g_{\text{M}}}[\unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}}])^{2}\,g_{\text{M}s}{\mathcal{J}}_{p}\,\text{d}^{3}p=\int (\unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}})^{2}\,g_{\text{M}s}{\mathcal{J}}_{p}\,\text{d}^{3}p.\end{eqnarray}$$

The standard deviation, and thus the statistical error, follows directly as

(4.76) $$\begin{eqnarray}\unicode[STIX]{x1D700}_{\text{n}}^{\text{ad}}\propto \unicode[STIX]{x1D70E}_{\text{n}}^{\text{ad}}=\sqrt{\mathop{\sum }_{s=\text{i},\text{e}}\text{Var}_{g_{\text{M}}}[\unicode[STIX]{x1D6FF}Y_{\text{n}s}^{\text{ad}}]}=\unicode[STIX]{x1D6FF}A_{\Vert }\sqrt{\mathop{\sum }_{s=\text{i},\text{e}}\left(\frac{q_{s}n_{0s}}{\sqrt{m_{s}k_{\text{B}}T_{s}}}\right)^{2}}.\end{eqnarray}$$

This means that, although the expected value of the adiabatic density is zero, there is a total statistical error  $\unicode[STIX]{x1D700}_{\text{n}}^{\text{ad}}$ which can be quantified.

In a similar way the expected value of the adiabatic current can be formulated with a redefined random variable

(4.77) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}j_{\Vert s}^{\text{ad}}=\text{E}_{g_{\text{M}}}[\unicode[STIX]{x1D6FF}Y_{\text{j}s}^{\text{ad}}]=\int \unicode[STIX]{x1D6FF}Y_{\text{j}s}^{\text{ad}}\,g_{\text{M}s}{\mathcal{J}}_{p}\,\text{d}^{3}p=\frac{q_{s}^{2}n_{0s}\unicode[STIX]{x1D6FF}A_{\Vert }}{m_{s}},\end{eqnarray}$$

where

(4.78) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}Y_{\text{j}s}^{\text{ad}}\stackrel{\text{def}}{=}\frac{q_{s}^{2}n_{0s}\unicode[STIX]{x1D6FF}A_{\Vert }}{k_{\text{B}}T_{s}}\tilde{p}_{\Vert }(\,\tilde{p}_{\Vert }-\hat{u} _{0s}).\end{eqnarray}$$

Since the expected value does not vanish this time, it gives a contribution to the variance:

(4.79) $$\begin{eqnarray}\unicode[STIX]{x1D700}_{\text{j}}^{\text{ad}}\propto \unicode[STIX]{x1D70E}_{\text{j}}^{\text{ad}}=\sqrt{\mathop{\sum }_{s=\text{i},\text{e}}\text{Var}_{g_{\text{M}}}[\unicode[STIX]{x1D6FF}Y_{\text{j}s}^{\text{ad}}]}=\unicode[STIX]{x1D6FF}A_{\Vert }\sqrt{\mathop{\sum }_{s=\text{i},\text{e}}\left[2+\left(\frac{\hat{u} _{0s}}{v_{\text{th}s}}\right)^{2}\right]\left(\frac{q_{s}^{2}n_{0s}}{m_{s}}\right)^{2}}.\end{eqnarray}$$
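Both error formulas can be checked per species by direct sampling. A minimal sketch in normalised units (single species with $q=m=n_{0}=k_{\text{B}}T=1$, hence $v_{\text{th}}=1$; all names hypothetical) draws markers from $g_{\text{M}}$ and compares the sample standard deviations of the random variables (4.74) and (4.78) with the predictions (4.76) and (4.79):

```python
import numpy as np

rng = np.random.default_rng(1)

# Normalised, single-species parameters: q = m = n0 = kB*T = 1, hence v_th = 1.
u0 = 0.2        # small parallel flow, u0 << v_th
dA = 0.01       # perturbed parallel magnetic potential
N = 2_000_000   # number of markers

p = rng.normal(u0, 1.0, N)      # markers loaded from the shifted Maxwellian g_M

# Random variables (4.74) and (4.78) for the adiabatic density and current
Y_n = dA * (p - u0)
Y_j = dA * p * (p - u0)

# Analytic predictions: sigma_n = dA * q n0 / sqrt(m kB T)            -> dA
#                       sigma_j = dA * sqrt(2 + u0^2) * q^2 n0 / m    -> dA * sqrt(2 + u0^2)
sigma_n_mc, sigma_j_mc = np.std(Y_n), np.std(Y_j)
print(sigma_n_mc, dA)                           # ~ (4.76)
print(sigma_j_mc, dA * np.sqrt(2.0 + u0 ** 2))  # ~ (4.79)
print(np.mean(Y_j), dA)                         # expected value (4.77): q^2 n0 dA / m
```

The sample means and standard deviations converge to the analytic values at the usual Monte Carlo rate $\propto 1/\sqrt{N}$.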

From a comparison of (4.76) and (4.79) it follows that the total statistical errors are both proportional to $\unicode[STIX]{x1D6FF}A_{\Vert }$ and thus the relative errors are independent of $\unicode[STIX]{x1D6FF}A_{\Vert }$ . However, there are also differences. The dependence of the errors $\unicode[STIX]{x1D700}_{\text{n}s}^{\text{ad}}$ and $\unicode[STIX]{x1D700}_{\text{j}s}^{\text{ad}}$ on the temperature profile is as follows: $\unicode[STIX]{x1D700}_{\text{n}s}^{\text{ad}}$ depends on it, $\unicode[STIX]{x1D700}_{\text{j}s}^{\text{ad}}$ does not as long as $\hat{u} _{0s}=0$ , and $\unicode[STIX]{x1D700}_{\text{j}s}^{\text{ad}}$ depends on it only weakly as soon as $\hat{u} _{0s}\neq 0$ (note that $\hat{u} _{0s}\ll v_{\text{th}s}$ ). In addition, the dependence on the mass ratio is less pronounced for the error  $\unicode[STIX]{x1D700}_{\text{n}s}^{\text{ad}}$ than for $\unicode[STIX]{x1D700}_{\text{j}s}^{\text{ad}}$ . The latter has the consequence that the contribution of the ions to the total error is larger for the density moment ( $\unicode[STIX]{x1D700}_{\text{ni}}^{\text{ad}}/\unicode[STIX]{x1D700}_{\text{ne}}^{\text{ad}}\propto \sqrt{m_{\text{e}}/m_{\text{i}}}$ ) than for the current moment ( $\unicode[STIX]{x1D700}_{\text{ji}}^{\text{ad}}/\unicode[STIX]{x1D700}_{\text{je}}^{\text{ad}}\propto m_{\text{e}}/m_{\text{i}}$ ). Hence, variance reduction of the ion contribution is more important for the evaluation of the density moment than for the current moment.
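The mass-ratio scalings quoted above follow from (4.76) and (4.79) when both species share $q$, $n_{0}$ and $T$: $\unicode[STIX]{x1D700}_{\text{n}s}^{\text{ad}}\propto 1/\sqrt{m_{s}}$ and $\unicode[STIX]{x1D700}_{\text{j}s}^{\text{ad}}\propto 1/m_{s}$. A trivial numeric check, assuming a deuterium plasma for the mass ratio:

```python
import math

# Per-species error scalings from (4.76) and (4.79) for equal q, n0 and T:
# eps_n ~ 1/sqrt(m) and eps_j ~ 1/m, so the ion/electron error ratios are:
m_e, m_i = 1.0, 2.0 * 1836.15          # electron and deuteron masses in electron-mass units

ratio_density = math.sqrt(m_e / m_i)   # eps_ni / eps_ne, ~1.7e-2
ratio_current = m_e / m_i              # eps_ji / eps_je, ~2.7e-4
print(ratio_density, ratio_current)
```

The ion contribution to the density error is thus roughly sixty times more significant, relative to the electrons, than its contribution to the current error.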

For an arbitrary marker distribution function $g(\boldsymbol{z})$ we can evaluate the variance of the Hamiltonian distribution function by using the unbiased estimator of the variance from (2.6). Especially for diagnostic purposes (see figure 2), we can derive the error of the total particle number

(4.80) $$\begin{eqnarray}\unicode[STIX]{x1D700}_{\text{n}_{\text{tot}}}\propto \unicode[STIX]{x1D70E}_{\text{n}_{\text{tot}}}=\sqrt{\mathop{\sum }_{s=\text{i},\text{e}}\frac{1}{N_{\text{p}s}-1}\left[\mathop{\sum }_{p=1}^{N_{\text{p}s}}(\unicode[STIX]{x1D6FF}w_{s,p})^{2}-\frac{1}{N_{\text{p}s}}\left(\mathop{\sum }_{p=1}^{N_{\text{p}s}}\unicode[STIX]{x1D6FF}w_{s,p}\right)^{2}\right]}\end{eqnarray}$$

and the error of the total current

(4.81) $$\begin{eqnarray}\unicode[STIX]{x1D700}_{\text{j}_{\text{tot}}}\propto \unicode[STIX]{x1D70E}_{\text{j}_{\text{tot}}}=\sqrt{\mathop{\sum }_{s=\text{i},\text{e}}\frac{q_{s}^{2}}{N_{\text{p}s}-1}\left[\mathop{\sum }_{p=1}^{N_{\text{p}s}}(\tilde{p}_{\Vert }\unicode[STIX]{x1D6FF}w_{s,p})^{2}-\frac{1}{N_{\text{p}s}}\left(\mathop{\sum }_{p=1}^{N_{\text{p}s}}\tilde{p}_{\Vert }\unicode[STIX]{x1D6FF}w_{s,p}\right)^{2}\right]}.\end{eqnarray}$$
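The per-species sum under the square root in (4.80) is just the unbiased sample variance of the perturbed weights. A minimal Python sketch (function name hypothetical; the proportionality prefactors are omitted as in the text) could read:

```python
import numpy as np

def sigma_n_tot(delta_w_by_species):
    """Unnormalised total-particle-number error, eq. (4.80).

    delta_w_by_species: one array of perturbed weights per species.
    For the current error (4.81) each species term would additionally
    carry q_s**2 and the weights would be multiplied by p_par.
    """
    total = 0.0
    for dw in delta_w_by_species:
        n = dw.size
        total += (np.sum(dw**2) - np.sum(dw)**2 / n) / (n - 1)
    return np.sqrt(total)

# Synthetic perturbed weights for two species (illustrative only)
rng = np.random.default_rng(0)
dw_i = rng.normal(0.0, 1e-3, 100_000)
dw_e = rng.normal(0.0, 2e-3, 100_000)
sigma = sigma_n_tot([dw_i, dw_e])
print(f"sigma_n_tot = {sigma:.3e}")
```

Each species term reduces to the unbiased sample variance, so the result agrees with `np.var(dw, ddof=1)` summed over species.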

### 4.7 The cancellation problem

In the previous section, we have seen that the adiabatic part of the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ , contributes to the statistical error in the evaluation of both the particle and the current number density. In some simulations, the statistical error becomes the dominant part of the perturbation to the distribution function. This was pointed out by Hatzky et al. (Reference Hatzky, Könies and Mishchenko2007) and the term ‘inaccuracy problem’ was coined. Historically, however, the problem was viewed more narrowly as the so-called ‘problem of inexact cancellation’ or ‘cancellation problem’ in Ampère’s law (Chen & Parker Reference Chen and Parker2003). It was not realised that the problem is of a more general nature and also affects the evaluation of the quasi-neutrality equation. Although the inaccuracy problem should be discussed in terms of the statistical error, the understanding of the cancellation problem gives a qualitative measure of the ratio between the adiabatic and non-adiabatic part of the current. At least in the case of a Maxwellian marker loading, the adiabatic part of the current for a given species is proportional to its statistical error (see (4.79)).

If we approximate $B_{\Vert }^{\ast \text{h}}\simeq B$ and $C_{\text{v}3}=1$ , the linearised skin term is proportional to the adiabatic current term (see (4.62)):

(4.82) $$\begin{eqnarray}\frac{\unicode[STIX]{x1D6FD}_{0}}{\unicode[STIX]{x1D70C}_{0}^{2}}\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }^{\text{h},\text{lin}}=\frac{\unicode[STIX]{x1D707}_{0}q^{2}n_{0}}{m}\overline{\langle \unicode[STIX]{x1D6FF}A_{\Vert }\rangle }^{\text{h},\text{lin}}\simeq \unicode[STIX]{x1D707}_{0}\unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{ad}}.\end{eqnarray}$$

So the skin terms cancel the adiabatic current terms, and Ampère’s law (3.64) in the Hamiltonian formulation simplifies to the form

(4.83) $$\begin{eqnarray}-\unicode[STIX]{x1D735}\boldsymbol{\cdot }\unicode[STIX]{x1D735}_{\bot }\unicode[STIX]{x1D6FF}A_{\Vert }\simeq \unicode[STIX]{x1D707}_{0}\left(\unicode[STIX]{x1D6FF}\bar{j}_{\Vert \text{i}}^{\text{nonad}}+\unicode[STIX]{x1D6FF}j_{\Vert \text{e}}^{\text{nonad}}\right),\end{eqnarray}$$

where the splitting $\unicode[STIX]{x1D6FF}\bar{j}_{\Vert }=\unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{ad}}+\unicode[STIX]{x1D6FF}\bar{j}_{\Vert }^{\text{nonad}}$ was used. Note that in the Hamiltonian formulation the weights express the time evolution of both the adiabatic and non-adiabatic part of the distribution function, which is the reason why we need e.g. the control variates method to separate both parts.

Applying the long-wavelength approximation and neglecting the finite gyroradius effects for the electrons, we now compare $\unicode[STIX]{x1D6FF}j_{\Vert }^{\text{ad}}$ to $\unicode[STIX]{x1D6FF}j_{\Vert }^{\text{nonad}}$ :

(4.84) $$\begin{eqnarray}\frac{\unicode[STIX]{x1D6FF}\bar{j}_{\Vert \text{i}}^{\text{ad}}+\unicode[STIX]{x1D6FF}j_{\Vert \text{e}}^{\text{ad}}}{\unicode[STIX]{x1D6FF}\bar{j}_{\Vert \text{i}}^{\text{nonad}}+\unicode[STIX]{x1D6FF}j_{\Vert \text{e}}^{\text{nonad}}}\stackrel{m_{\text{i}}\gg m_{\text{e}}}{{\approx}}\frac{\unicode[STIX]{x1D6FF}j_{\Vert \text{e}}^{\text{ad}}}{\unicode[STIX]{x1D6FF}\bar{j}_{\Vert \text{i}}^{\text{nonad}}+\unicode[STIX]{x1D6FF}j_{\Vert \text{e}}^{\text{nonad}}}\stackrel{\unicode[STIX]{x1D6FD}_{0\text{i}}\ll 1}{{\approx}}\frac{\unicode[STIX]{x1D707}_{0}q_{\text{e}}^{2}n_{0\text{e}}\unicode[STIX]{x1D6FF}A_{\Vert }}{m_{\text{e}}\unicode[STIX]{x1D6FB}_{\bot }^{2}\unicode[STIX]{x1D6FF}A_{\Vert }}\stackrel{\text{cylinder}}{=}\frac{\unicode[STIX]{x1D707}_{0}q_{\text{e}}^{2}n_{0\text{e}}}{m_{\text{e}}k_{\bot }^{2}}.\end{eqnarray}$$

The ratio scales with $n_{0}$ ; moreover, the non-adiabatic part can become quite small depending on

(4.85) $$\begin{eqnarray}k_{\bot }^{2}=k_{r}^{2}+k_{\unicode[STIX]{x1D717}}^{2}=\left\{\begin{array}{@{}ll@{}}\left({\displaystyle \frac{\unicode[STIX]{x03C0}}{2a}}\right)^{2}\quad \quad & \text{for }m=0\\ \frac{\unicode[STIX]{x03C0}^{2}+\left({\displaystyle \frac{m}{r/a}}\right)^{2}}{a^{2}}\quad \quad & \text{for }m\neq 0,\end{array}\right.\end{eqnarray}$$

where $m$ is the poloidal mode number and $a$ the minor radius. We have assumed the minimal value of  $k_{\bot }$ within a cylinder, giving $k_{r}=\unicode[STIX]{x03C0}/(2a)$ for $m=0$ and $k_{r}=\unicode[STIX]{x03C0}/a$ for $m\neq 0$ . It is obvious that  $k_{\bot }^{2}$ scales with $\propto 1/a^{2}$ , i.e. for larger minor radii the non-adiabatic part becomes very small. Especially in the MHD limit $k_{\bot }\unicode[STIX]{x1D70C}\rightarrow 0$ the non-adiabatic part becomes arbitrarily small, which is what makes it so hard to simulate MHD modes within the gyrokinetic model. In such a case, the non-adiabatic part, the signal, is quite small compared to the adiabatic part, which gives no contribution to the evaluation of $\unicode[STIX]{x1D6FF}A_{\Vert }$ in Ampère’s law as it cancels on both sides. Nevertheless, it produces a statistical error, the noise, as we have seen in the previous section. Hence, one can speak of a problem of inexact cancellation as the noise remains.
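To illustrate the size of the last expression in (4.84), the following sketch evaluates it with the minimal cylinder $k_{\bot }$ from (4.85); the minor radius and density are illustrative ITER-like numbers, not taken from the text:

```python
import numpy as np

MU0 = 4.0e-7 * np.pi       # vacuum permeability [H/m]
Q_E = 1.602176634e-19      # elementary charge [C]
M_E = 9.1093837015e-31     # electron mass [kg]

def k_perp_sq(m, r_over_a, a):
    """Minimal k_perp^2 within a cylinder, eq. (4.85)."""
    if m == 0:
        return (np.pi / (2.0 * a)) ** 2
    return (np.pi**2 + (m / r_over_a) ** 2) / a**2

a, n0e = 2.0, 1.0e20       # illustrative minor radius [m] and density [1/m^3]
ratio = MU0 * Q_E**2 * n0e / (M_E * k_perp_sq(0, 0.5, a))
print(f"adiabatic/non-adiabatic current ratio (m=0): {ratio:.1e}")
```

For these parameters the $m=0$ ratio comes out of the order of a million, i.e. the non-adiabatic signal sits under an adiabatic part many orders of magnitude larger.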

Figure 2. A simulation of a shear Alfvén wave in a slab using exclusively electrons. (a,b) The standard deviation of the total particle number $\unicode[STIX]{x1D70E}_{\text{n}_{\text{tot}}}$ and the standard deviation of the total current  $\unicode[STIX]{x1D70E}_{\text{j}_{\Vert \text{tot}}}$ as a function of time. The standard deviation of the perturbed electron markers in the Hamiltonian formulation (dashed red line) and in the symplectic formulation (solid blue line). The simulation has been performed with $N_{\text{pe}}=10^{6}$ electron markers and $N_{z}=16$ B-splines in the parallel field direction (see § 12.1).

Figure 3. The quantity $1/(ak_{\bot })^{2}$ for various poloidal modes $m$ as a function of the normalised minor radius $r/a$ .

How pronounced the cancellation problem is for a given configuration depends on the poloidal Fourier mode $m$ and the radial position $r$ under consideration. A quantitative estimate can be given by the factor $1/(ak_{\bot })^{2}$ . It can be of the order of a million for an ITER-like configuration. In figure 3, we depict $1/(ak_{\bot })^{2}$ for various poloidal modes as a function of the normalised minor radius. The cancellation problem is most pronounced for the $m=0$ mode since $k_{\unicode[STIX]{x1D717}}$ vanishes. As the $m=0$ mode plays a major role in the formation of the zonal flow, it is of key importance to diminish the statistical error when evaluating the density and current terms. For the other poloidal modes, with $m\neq 0$ , the situation is less severe. In contrast to the $m=0$ case, where $k_{\bot }$ is independent of the radial position, $1/k_{\bot }^{2}$ is zero at the centre and increases to its largest value at the edge.

However, for the quasi-neutrality equation the contribution of the adiabatic part vanishes and the inaccuracy problem is not as obvious as for Ampère’s law where the skin terms are proportional to the statistical error.

## 5 The control variates method in gyrokinetic PIC simulation

The $\unicode[STIX]{x1D6FF}f$ -PIC method is a control variates method (see Aydemir (Reference Aydemir1994) and appendix A) with the restriction $\tilde{\unicode[STIX]{x1D6FC}}=1$ in (A 1), when it comes to the evaluation of the moments of the distribution function. Whenever a small deviation from an equilibrium state is to be calculated, the knowledge of the initial state $f(t_{0})=f_{0}$ of the system can be used to construct an effective control variate as long as the system does not evolve too far from its initial state, i.e.  $|\unicode[STIX]{x1D6FF}f|\ll |f_{0}|$ . For such situations the use of a control variate is a valuable enhancement of the full- $f$ PIC method, which naturally has difficulties resolving relatively small changes of the system. In such a case, the discretisation of the perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f$ , is achieved by the perturbed weights $\unicode[STIX]{x1D6FF}w=(f-f_{0})/g=\unicode[STIX]{x1D6FF}f/g$ . For nonlinear PIC simulation without collisions (with collisions see Sonnendrücker et al. (Reference Sonnendrücker, Wacher, Hatzky and Kleiber2015)) and in the absence of sinks and sources, the perturbed weight $\unicode[STIX]{x1D6FF}w_{p}$ can easily be calculated by the direct $\unicode[STIX]{x1D6FF}f$ -method (Allfrey & Hatzky Reference Allfrey and Hatzky2003)

(5.1) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}w_{p}(\boldsymbol{z}_{p}(t))=w_{p}-\frac{f_{0}(\boldsymbol{z}_{p}(t))}{g_{p}}\end{eqnarray}$$

as $w_{p}$ and $g_{p}$ do not change in time. This holds for both the symplectic and the Hamiltonian formulation.

Note that the local Maxwellian described in (3.10) corresponds to a kinetic equilibrium with collisions but it is not necessarily a solution of the unperturbed Vlasov equation. Therefore, when a local Maxwellian is used, the zonal components of the perturbed electrostatic potential could evolve strongly in the initial phase of the simulation (Idomura, Tokuda & Kishimoto Reference Idomura, Tokuda and Kishimoto2005). In the worst case, these fields could become so strong that they would prevent e.g. ion temperature gradient (ITG) mode growth (see Angelino et al. Reference Angelino, Bottino, Hatzky, Jolliet, Sauter, Tran and Villard2006) and would complicate the interpretation of the physical results. There are two possibilities to simplify matters. First, one could use an equilibrium distribution function which does not correspond to a real neoclassical equilibrium e.g. in a tokamak a ‘canonical Maxwellian’ (Angelino et al. Reference Angelino, Bottino, Hatzky, Jolliet, Sauter, Tran and Villard2006). Second, one could introduce a source term  ${\mathcal{S}}$ on the right-hand side of our gyrokinetic evolution equations, equations (3.1) and (3.51). In the following, we choose the second approach using the source (see Kleiber et al. Reference Kleiber, Hatzky, Könies, Kauffmann and Helander2011, (31))

(5.2) $$\begin{eqnarray}{\mathcal{S}}=\left.\frac{\text{d}}{\text{d}t}\right|_{0}f_{0}=\unicode[STIX]{x1D735}f_{0}\boldsymbol{\cdot }\frac{\text{d}\boldsymbol{R}^{(0)}}{\text{d}t}+\frac{\unicode[STIX]{x2202}f_{0}}{\unicode[STIX]{x2202}v_{\Vert }}\frac{\text{d}v_{\Vert }^{(0)}}{\text{d}t},\end{eqnarray}$$

which compensates for the contribution from $f_{0}$ not being a kinetic equilibrium. This results in a generalised form of (5.1):

(5.3) $$\begin{eqnarray}\unicode[STIX]{x1D6FF}w_{p}(\boldsymbol{z}_{p}(t))=w_{p}+w_{p}^{0}-\frac{f_{0}(\boldsymbol{z}_{p}(t))}{g_{p}},\end{eqnarray}$$

where the weights $w_{p}^{0}$ are given by integrating the equation

(5.4) $$\begin{eqnarray}\frac{\text{d}w_{p}^{0}}{\text{d}t}=\frac{{\mathcal{S}}_{p}}{g_{p}}\quad \text{with }w_{p}^{0}(t_{0})=0\end{eqnarray}$$

along the perturbed trajectories of the markers in time.
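The weight evolution (5.3)–(5.4) can be sketched as follows. Function names are hypothetical, the trajectory integration and the evaluation of $f_{0}$ and ${\mathcal{S}}$ at the marker position are left abstract, and a plain explicit Euler step stands in for whatever time integrator advances the markers:

```python
def delta_w(w, w0, f0_at_marker, g):
    """Generalised perturbed weight of a marker, eq. (5.3):
    delta_w_p = w_p + w0_p - f0(z_p(t)) / g_p."""
    return w + w0 - f0_at_marker / g

def advance_w0(w0, source_at_marker, g, dt):
    """One explicit Euler step of dw0/dt = S_p / g_p, eq. (5.4).

    In practice this would be integrated with the same scheme as the
    marker trajectories; Euler is used here only for brevity.
    """
    return w0 + dt * source_at_marker / g

# Usage: w and g are fixed in time, w0 starts at zero (eq. (5.4))
w, g, w0 = 1.0, 2.0, 0.0
w0 = advance_w0(w0, source_at_marker=0.5, g=g, dt=0.1)
dw = delta_w(w, w0, f0_at_marker=1.5, g=g)
```

Note that with ${\mathcal{S}}=0$ the weight $w_{p}^{0}$ stays zero and (5.3) reduces to the direct $\unicode[STIX]{x1D6FF}f$ -method of (5.1).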

## 6 Sources of errors in PIC simulation

### 6.1 The sampling error vs. the statistical error

We would like to point out that there are further sources of error besides the statistical error inherent to the PIC method. One source of error comes from the sampling of the distribution function, which is done at the positions of the markers. Initially, the markers are distributed in phase space according to the marker distribution function $g(\boldsymbol{z}(t_{0}))$ . This is equivalent to assigning each marker a certain phase-space volume $\unicode[STIX]{x1D6FA}_{p}$ (see (2.10)), which does not change in time as long as the simulation is collisionless (Liouville’s theorem). Thus, one should favour volume-preserving time integration schemes to implement this property at the numerical level. Nevertheless, the shape of  $\unicode[STIX]{x1D6FA}_{p}$ can evolve and be deformed into very elongated structures, so-called ‘filaments’. However, the motion of the whole filament is evolved just by its marker position, although parts of the filament might be too remote to be correctly represented by the motion of the marker position. In such a case of phase mixing, the distribution function can become under-resolved.

Thus, the markers serve two essential purposes. First, they sufficiently resolve the dynamics of the evolution of the distribution function (sampling of the characteristics) and second, they reduce the statistical error when evaluating the moments of the distribution function. These two purposes do not necessarily lead to the same optimisation strategy for the marker distribution function. Although a certain marker distribution might be optimal to minimise the statistical variance, and hence the statistical error, of a particular moment, it might not be optimal for another one. In addition, it could lead to under- or over-resolving the dynamics of the distribution function in some parts of phase space.

Although the gyrokinetic simulation resolves only the two-dimensional reduced velocity space, the integral of e.g. the particle number density, equation (3.13), is a three-dimensional integral over velocity space. To minimise the statistical error it is advantageous also to sample the coordinate  $\unicode[STIX]{x1D6FC}$ by markers. Otherwise the statistical variance of the zeroth moment would become large. However, from the point of view of resolving the dynamics, it makes little sense to resolve a velocity coordinate which has no dynamics at all. So it is not obvious what would be optimal in such a situation, especially as there is also a big difference between the dynamic evolution of a linear and a nonlinear simulation.

### 6.2 The accuracy of the equilibrium quantities

Special care has to be taken when the quantities of the equilibrium magnetic field are provided. These quantities are either calculated from an ad hoc equilibrium, which is usually costly, or they are imported from the mesh of an equilibrium solver. Hence, it is common to precalculate these quantities and to store them on a grid. Whenever needed, they can be interpolated to the positions of the markers. However, problems can occur e.g. in a tokamak when magnetic coordinates (see appendix C) are used to discretise the potential solver. In this case, the mesh of the potential solver will resolve very fine structures close to the origin, as the magnetic coordinates behave there more or less like polar coordinates. Thus, it is important that the mesh of the equilibrium quantities also uses magnetic coordinates to be consistent with the discretisation of the potential equations (see § 7.2). Only then will both meshes show the same high resolution close to the origin. In addition, the mesh size of the equilibrium grid has to be adjusted in a convergence study.

### 6.3 Further sources of error

Another source of error is the discretisation error when integrating the equations of motion of the markers, equations (3.53) and (3.54), in time (marker pushing). Electromagnetic PIC simulations are already quite complex, which makes time integrators of higher order, like the fourth-order Runge–Kutta method, appropriate.
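A generic fourth-order Runge–Kutta step for the marker coordinates can be sketched as follows; the right-hand side `rhs` stands in for the actual equations of motion (3.53) and (3.54), which are not reproduced here, and the harmonic oscillator serves only as a verification problem:

```python
import numpy as np

def rk4_step(z, t, dt, rhs):
    """Advance marker coordinates z by one RK4 step; rhs(t, z) = dz/dt."""
    k1 = rhs(t, z)
    k2 = rhs(t + 0.5 * dt, z + 0.5 * dt * k1)
    k3 = rhs(t + 0.5 * dt, z + 0.5 * dt * k2)
    k4 = rhs(t + dt, z + dt * k3)
    return z + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Verification on a harmonic oscillator, z = (x, v): dx/dt = v, dv/dt = -x
rhs = lambda t, z: np.array([z[1], -z[0]])
z, dt = np.array([1.0, 0.0]), 0.01
for n in range(100):
    z = rk4_step(z, n * dt, dt, rhs)
# after t = 1 the numerical solution should be close to (cos 1, -sin 1)
```

The fourth-order accuracy means the discretisation error of the pusher falls off as $\unicode[STIX]{x0394}t^{4}$ , which is why the extra right-hand-side evaluations per step are usually worthwhile in electromagnetic runs.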

Last but not least there is the discretisation error due to the discretisation of the potential equations which could be e.g. a finite difference or finite element discretisation (see § 7).

## 7 Finite element discretisation of the potential equations

In the following, we want to consider the discretisation of a function $h(\boldsymbol{x},t)$ with $K$ finite elements by using Galerkin’s method (for a more detailed introduction see Höllig (Reference Höllig2003)). The function $h(\boldsymbol{x},t)$ is approximated by $\tilde{h}(\boldsymbol{x},t)$ which is a linear combination of finite element basis functions $\unicode[STIX]{x1D6EC}(\boldsymbol{x})$ . The discretisation of $h(\boldsymbol{x},t)$ is given by the ansatz

(7.1) $$\begin{eqnarray}h(\boldsymbol{x},t)\simeq \tilde{h}(\boldsymbol{x},t)=\mathop{\sum }_{j=1}^{K}\tilde{c}_{j}(t)\,\unicode[STIX]{x1D6EC}_{j}(\boldsymbol{x}),\end{eqnarray}$$

where the finite element basis consists of the elements $\unicode[STIX]{x1D6EC}_{j}(\boldsymbol{x})$ (where $j$ is a multi-index). In order to solve the equation ${\mathcal{L}}h=f$ with ${\mathcal{L}}$ being a linear operator one defines a scalar product by

(7.2) $$\begin{eqnarray}(u,v)\stackrel{\text{def}}{=}\int u(\boldsymbol{x})v(\boldsymbol{x})\,{\mathcal{J}}_{x}\,\text{d}^{3}x,\end{eqnarray}$$

and the Galerkin approach requires

(7.3) $$\begin{eqnarray}({\mathcal{L}}\tilde{h}-f,\unicode[STIX]{x1D6EC}_{j})=0\quad \Rightarrow \quad ({\mathcal{L}}\tilde{h},\unicode[STIX]{x1D6EC}_{j})=(f,\unicode[STIX]{x1D6EC}_{j})\quad \forall \unicode[STIX]{x1D6EC}_{j}.\end{eqnarray}$$

Note that the geometrical information is included in the Jacobian ${\mathcal{J}}_{x}$ which makes the method simple to use especially for complex coordinate systems.

One obtains the B-spline coefficients $\tilde{c}_{j}$ by solving the matrix equation

(7.4) $$\begin{eqnarray}\mathop{\sum }_{j=1}^{K}l_{kj}\tilde{c}_{j}=\tilde{b}_{k},\quad k=1,\ldots ,K,\end{eqnarray}$$

where

(7.5) $$\begin{eqnarray}l_{kj}\stackrel{\text{def}}{=}({\mathcal{L}}\unicode[STIX]{x1D6EC}_{j},\unicode[STIX]{x1D6EC}_{k})\end{eqnarray}$$

and the load vector $\tilde{\boldsymbol{b}}$ (the projection of $f(\boldsymbol{x})$ onto the finite element basis functions):

(7.6) $$\begin{eqnarray}\tilde{b}_{k}\stackrel{\text{def}}{=}(f,\unicode[STIX]{x1D6EC}_{k}).\end{eqnarray}$$

In the special case of ${\mathcal{L}}$ being the identity, the matrix of (7.4) is called the mass matrix labelled by  $\unicode[STIX]{x1D648}$ . As the support of the finite elements is finite, the mass matrix $\unicode[STIX]{x1D648}$ has a sparse structure. Accordingly, the matrix equations can usually be solved by sparse linear algebra packages.
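A minimal one-dimensional sketch of (7.4)–(7.6) with ${\mathcal{L}}$ the identity, i.e. an $L^{2}$ projection onto linear B-splines (hat functions) on a uniform grid with unit Jacobian. The tridiagonal mass matrix is solved densely here for brevity, although in production one would exploit its sparse structure:

```python
import numpy as np

K = 33                     # number of basis functions
h = 1.0 / (K - 1)          # uniform grid spacing on [0, 1]
x = np.linspace(0.0, 1.0, K)

# Tridiagonal mass matrix of hat functions (order-1 B-splines)
M = np.zeros((K, K))
M[np.arange(K), np.arange(K)] = 2.0 * h / 3.0
M[0, 0] = M[-1, -1] = h / 3.0                 # boundary half-hats
M[np.arange(K - 1), np.arange(1, K)] = h / 6.0
M[np.arange(1, K), np.arange(K - 1)] = h / 6.0

# Load vector b_k = (f, Lambda_k) by midpoint quadrature, eq. (7.6)
f = lambda x: np.sin(np.pi * x)
xq = (np.arange(4000) + 0.5) / 4000.0
dxq = 1.0 / 4000.0
b = np.array([np.sum(f(xq) * np.clip(1.0 - np.abs(xq - xk) / h, 0.0, None)) * dxq
              for xk in x])

c = np.linalg.solve(M, b)          # spline coefficients, eq. (7.4)
err = np.max(np.abs(c - f(x)))     # coefficients ~ nodal values, O(h^2)
```

Because the hat functions sum to one, the entries of $\unicode[STIX]{x1D648}$ sum to the domain length, a handy consistency check for any assembly routine.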

After solving the Galerkin matrix equation for $\tilde{\boldsymbol{c}}$ , the function  $\tilde{h}(\boldsymbol{x})$ is well defined at every point in configuration space. Operators like integrals and derivatives acting on  $h(\boldsymbol{x})$ can be performed by letting them act on the elements  $\unicode[STIX]{x1D6EC}_{j}(\boldsymbol{x})$ of the finite element basis. Complex geometrical issues are easily handled by the corresponding calculus of the finite elements  $\unicode[STIX]{x1D6EC}_{j}(\boldsymbol{x})$ . As mentioned in § A.1, an advantage of the finite element discretisation in combination with a variational principle is that it gives an energy-conserving scheme (Lewis Reference Lewis1970) for the full- $f$ method.

For our discretisation, we choose B-splines as finite elements. The three-dimensional B-spline $\unicode[STIX]{x1D6EC}_{j}^{d}(\boldsymbol{x})$ is a product of one-dimensional (1-D) B-splines of order $d$ (tensor product B-splines, see e.g. Höllig Reference Höllig2003). The B-spline sequence $(\unicode[STIX]{x1D6EC}_{j}^{d})$ consists of non-negative functions which sum to one, i.e. in mathematical terms, $(\unicode[STIX]{x1D6EC}_{j}^{d})$ provides a partition of unity (DeBoor Reference DeBoor1978, p. 111), which is important for the conservation of the particle and current number density.
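The partition-of-unity property can be verified directly. A sketch with order-1 B-splines (hat functions) on a uniform 1-D grid, where the property holds exactly up to rounding; higher-order B-splines share it:

```python
import numpy as np

K = 9
h = 1.0 / (K - 1)
knots = np.linspace(0.0, 1.0, K)
x = np.linspace(0.0, 1.0, 101)

# Sum of all hat functions Lambda_k(x); should be identically one
total = np.zeros_like(x)
for xk in knots:
    total += np.clip(1.0 - np.abs(x - xk) / h, 0.0, None)

deviation = np.max(np.abs(total - 1.0))
print(f"max |sum of basis - 1| = {deviation:.1e}")
```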

### 7.1 Discretisation of the charge and current assignments

In the charge and current assignments the weights of the markers of species  $s$ are projected onto the finite element basis. Using the unbiased Monte Carlo estimator of the expected value, equation (2.5), one can derive the charge- and current-assignment vectors without gyro-averaging

(7.7) $$\begin{eqnarray}\displaystyle n_{s,k}^{\text{h}} & \stackrel{\text{def}}{=} & \displaystyle q_{s}\int f_{s}^{\text{h}}\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{x})B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\,\text{d}\boldsymbol{x}\nonumber\\ \displaystyle & = & \displaystyle q_{s}\int \frac{f_{s}^{\text{h}}}{g_{s}^{\text{h}}}\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{R})\,g_{s}^{\text{h}}B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\simeq \frac{q_{s}}{N_{\text{p}s}}\mathop{\sum }_{p=1}^{N_{\text{p}s}}w_{s,p}\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{R}_{p}),\end{eqnarray}$$
(7.8) $$\begin{eqnarray}\displaystyle j_{\Vert s,k}^{\text{h}} & \stackrel{\text{def}}{=} & \displaystyle \unicode[STIX]{x1D707}_{0}q_{s}\int \tilde{p}_{\Vert }\,f_{s}^{\text{h}}\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}-\boldsymbol{x})\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{x})B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}^{3}p\,\text{d}\boldsymbol{x}\nonumber\\ \displaystyle & \simeq & \displaystyle \frac{\unicode[STIX]{x1D707}_{0}q_{s}}{N_{\text{p}s}}\mathop{\sum }_{p=1}^{N_{\text{p}s}}\tilde{p}_{\Vert p}w_{s,p}\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{R}_{p})\end{eqnarray}$$

and with gyro-averaging (Fivaz et al. Reference Fivaz, Brunner, de Ridder, Sauter, Tran, Vaclavik and Villard1998; Hatzky et al. Reference Hatzky, Tran, Könies, Kleiber and Allfrey2002)

(7.9) $$\begin{eqnarray}\displaystyle \bar{n}_{s,k}^{\text{h}} & \stackrel{\text{def}}{=} & \displaystyle q_{s}\int f_{s}^{\text{h}}\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{x})B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}\,\text{d}\boldsymbol{x}\nonumber\\ \displaystyle & = & \displaystyle q_{s}\int \frac{f_{s}^{\text{h}}}{g_{s}^{\text{h}}}\left(\frac{1}{2\unicode[STIX]{x03C0}}\mathop{\oint }_{0}^{2\unicode[STIX]{x03C0}}\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{R}+\unicode[STIX]{x1D746})\,\text{d}\unicode[STIX]{x1D6FC}\right)\,g_{s}^{\text{h}}B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\hat{\unicode[STIX]{x1D6FC}}\nonumber\\ \displaystyle & \simeq & \displaystyle \frac{q_{s}}{N_{\text{p}s}}\mathop{\sum }_{p=1}^{N_{\text{p}s}}w_{s,p}\,\frac{1}{2\unicode[STIX]{x03C0}}\mathop{\oint }_{0}^{2\unicode[STIX]{x03C0}}\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{R}_{p}+\unicode[STIX]{x1D746}_{s,p})\,\text{d}\unicode[STIX]{x1D6FC},\end{eqnarray}$$
(7.10) $$\begin{eqnarray}\displaystyle \bar{j}_{\Vert s,k}^{\text{h}} & \stackrel{\text{def}}{=} & \displaystyle \unicode[STIX]{x1D707}_{0}q_{s}\int \tilde{p}_{\Vert }\,f_{s}^{\text{h}}\,\unicode[STIX]{x1D6FF}(\boldsymbol{R}+\unicode[STIX]{x1D746}-\boldsymbol{x})\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{x})B_{\Vert }^{\ast \text{h}}\,\text{d}\boldsymbol{R}\,\text{d}\tilde{p}_{\Vert }\,\text{d}\tilde{\unicode[STIX]{x1D707}}\,\text{d}\unicode[STIX]{x1D6FC}\,\text{d}\boldsymbol{x}\nonumber\\ \displaystyle & \simeq & \displaystyle \frac{\unicode[STIX]{x1D707}_{0}q_{s}}{N_{\text{p}s}}\mathop{\sum }_{p=1}^{N_{\text{p}s}}\tilde{p}_{\Vert p}w_{s,p}\,\frac{1}{2\unicode[STIX]{x03C0}}\mathop{\oint }_{0}^{2\unicode[STIX]{x03C0}}\unicode[STIX]{x1D6EC}_{k}^{d}(\boldsymbol{R}_{p}+\unicode[STIX]{x1D746}_{s,p})\,\text{d}\unicode[STIX]{x1D6FC},\end{eqnarray}$$

which results in the finite element load vectors $\boldsymbol{n}_{s}^{\text{h}}$ , $\boldsymbol{j}_{\Vert s}^{\text{h}}$ and $\bar{\boldsymbol{n}}_{s}^{\text{h}}$ , $\bar{\boldsymbol{j}}_{\Vert s}^{\,\text{h}}$ . Analogously the charge- and current-assignment vectors for the background $f_{0}$ and perturbation to the distribution function, $\unicode[STIX]{x1D6FF}f^{\text{h}}$ , can be defined. Finally, all these definitions can be extended to the symplectic case.

The integral over the gyro-phase angle $\unicode[STIX]{x1D6FC}$ is usually discretised with an