A posteriori error estimates for the virtual element method

An a posteriori error analysis for the virtual element method (VEM) applied to general elliptic problems is presented. The resulting error estimator is of residual type and is applicable on very general polygonal/polyhedral meshes. The estimator is fully computable, as it relies only on quantities available from the VEM solution, namely its degrees of freedom and element-wise polynomial projection. Upper and lower bounds of the error estimator with respect to the VEM approximation error are proven. The error estimator is used to drive adaptive mesh refinement in a number of test problems. Mesh adaptation is particularly simple to implement, since elements with consecutive co-planar edges/faces are allowed and, therefore, locally adapted meshes do not require any local mesh post-processing.


Introduction
The virtual element method (VEM) is a numerical framework introduced in [7] for the approximation of solutions of partial differential equations (PDEs). Key attributes of VEM are its ability to permit the use of meshes with very general polygonal/polyhedral elements [2,4,8,9,12,18,21] and the seamless incorporation of approximation spaces with arbitrary global regularity [12]. There has been a strong interest in recent years in the development of numerical methods on general polygonal/polyhedral meshes [7,10,19,20,22,25-27,30,39,44], not least due to the potential appeal of such mesh generality in the context of Lagrangian and/or adaptive refinement/coarsening algorithms. The virtual element method revolves around a virtual element space of trial and test functions, defined implicitly through local PDE problems on each element. The local spaces are designed to contain a space of (physical frame) polynomials, ultimately responsible for the accuracy of the method, as well as a complementary space of more general non-polynomial functions. In this respect, VEM belongs to the wide family of Generalised FEM [5], as do other approaches to general meshes such as the Polygonal FEM [39], numerical multiscale methods (see, e.g. [1,28,33] and the references therein), and, so-called, Trefftz-type methods in general [6,29,31,38]. Quite differently from all the above approaches, though, the virtual element method does not require the evaluation of non-polynomial functions, even in a rough fashion. Instead, to produce fully computable and accurate VEM formulations on general meshes, the method's degrees of freedom are carefully chosen so that relevant projections of the local virtual element functions onto the local subspace of polynomials are computable. A crucial consequence of this approach is that the VEM computed solution is not available in the form of a (virtual element) function.
Rather, the solution is represented via the values of its degrees of freedom, from which we can access, for instance, the piecewise polynomial projection of the corresponding complete virtual element function.
Given the virtual nature of the method, the design and analysis of fully computable a posteriori error bounds for VEM is a challenging task. In [13], a posteriori error bounds for the C^1-conforming VEM for the two-dimensional Poisson problem are proven. The C^1-continuity of the VEM space was employed to circumvent the fact that the inter-element normal fluxes of the virtual basis functions are not computable in the more standard C^0-conforming method. Furthermore, the analysis of [13] relies on a Clément-type interpolant construction requiring quadratic (or higher-order) virtual element spaces. To the best of our knowledge, [13] is the only a posteriori error analysis for VEM currently available in the literature.
In this work, we present a new residual-type a posteriori error analysis for the C^0-conforming VEM introduced in [21] for the discretization of second order linear elliptic reaction-convection-diffusion problems with non-constant coefficients in two and three dimensions. We circumvent the fact that the VEM solution normal fluxes are not computable by replacing them with a suitable projection of the fluxes, resulting in the introduction of virtual inconsistency terms in the a posteriori error estimator to account for this replacement error. Moreover, the analysis is based on a new Clément-type VEM interpolant in two and three dimensions, which, crucially, allows for minimal regularity interpolation by linear VEM functions. This new interpolant, which may be of independent interest, is constructed starting from the standard finite element Clément interpolant on a regular sub-triangulation; cf. also [35] for a related idea for a two-dimensional VEM interpolant. In two spatial dimensions, the resulting constants in the Clément interpolation estimate depend on those of the respective FEM Clément interpolant on a regular sub-triangulation, which are in principle available [40,42], along with other computable quantities. In the three-dimensional case, when general polygonal element interfaces are present in the mesh, a second, not easily computable in general, constant appears; see Remark 12 for a detailed discussion. Once equipped with the above developments, a posteriori bounds are derived by careful treatment of the inconsistency terms, whereby appropriate projection operators are introduced into the discrete problem formulation; we refer to [37] for a related general framework for a posteriori analysis of inconsistent discontinuous Galerkin methods. Lemma 18 gives a lower bound for these inconsistency terms, indicating that they are of the correct order up to data oscillation.
Although the focus of this work is the VEM introduced in [21], the proof of the a posteriori error bounds is quite general and can be adapted in a straightforward manner to other VEM approaches, such as the VEM proposed in [9], cf. the discussion in Sect. 7 below.
Adaptive mesh refinement driven by a posteriori error estimators is a well established tool for the efficient numerical solution of PDEs exhibiting local, numerically challenging, solution features. In this context, the extreme mesh flexibility allowed by the VEM approach offers a number of potential advantages. For instance, locally adapted meshes do not require any local post-processing: very general polygonal/polyhedral meshes are admissible due to the physical frame polynomial subspaces included in the VEM space, therefore removing any restrictions posed by maximum angle conditions or mesh-distortion, as is the case for standard adaptive FEM. Moreover, VEM avoids the need to introduce additional degrees of freedom for hanging node/face removal ('green refinement') during mesh refinement: hanging nodes introduced by the refinement of a neighbouring element are simply treated as new nodes since adjacent co-planar elemental interfaces are perfectly acceptable. Furthermore, in the VEM context, coarsening becomes trivial and inexpensive to implement as node removal does not necessitate any further local mesh modification. The latter is particularly attractive in the context of numerical solution of evolution PDEs where mesh-coarsening is standard practice to track evolving fronts and singularities efficiently. Indeed, apart from making mesh change straightforward to implement, the mesh flexibility offered by VEM may have the potential to provide complexity reduction with respect to standard FEM on traditional simplicial or box-type meshes. At the time of writing, no results in this direction are available.
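As an illustration of the element-marking step that typically drives such adaptive algorithms, the following is a minimal sketch of bulk-chasing (Dörfler) marking; the function name, the marking parameter theta, and the indicator values are purely illustrative and not part of the method described in this work.

```python
import numpy as np

def doerfler_mark(indicators, theta=0.5):
    """Select a set of elements whose squared indicators carry at least a
    fraction theta of the total; returns (sorted) element indices to refine.
    A generic sketch: the name and parameter are illustrative."""
    eta_sq = np.asarray(indicators, dtype=float) ** 2
    order = np.argsort(eta_sq)[::-1]           # largest indicators first
    cumulative = np.cumsum(eta_sq[order])
    n_marked = int(np.searchsorted(cumulative, theta * eta_sq.sum())) + 1
    return sorted(order[:n_marked].tolist())

marked = doerfler_mark([3.0, 1.0, 1.0, 1.0], theta=0.5)
assert marked == [0]  # the dominant element alone carries 9/12 of the total
```

In the VEM setting, the elements returned by such a marking step can be refined directly, since the resulting hanging nodes require no post-processing.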
The remainder of this work is structured as follows. In Sect. 2, we describe the model problem and in Sect. 3 we introduce the virtual element method. Some fundamental approximation results are presented in Sect. 4, which are used to prove upper and lower bounds for an a posteriori error estimator in Sect. 5. This estimator is then used in Sect. 6 within an automatic mesh adaptivity algorithm in a series of numerical examples, confirming numerically its optimality. Finally, in Sect. 7 we give some concluding remarks.
Below, we shall use standard notation for the relevant function spaces. For a Lipschitz domain ω ⊂ R^d, d = 2, 3, we denote by H^s(ω) the Hilbertian Sobolev space of index s ≥ 0 of real-valued functions defined on ω, endowed with the seminorm |·|_{s,ω} and norm ‖·‖_{s,ω}; further, (·,·)_ω stands for the standard L^2 inner-product. Finally, |ω| denotes the d-dimensional Hausdorff measure of ω.

The continuous problem
Let Ω ⊂ R^d be a polygonal domain for d = 2 or a polyhedral domain for d = 3 and consider the linear second order elliptic boundary value problem

−∇·(κ∇u) + β·∇u + γu = f in Ω, u = 0 on ∂Ω, (2.1)

where f ∈ L^2(Ω), β ∈ [W^{1,∞}(Ω)]^d, γ ∈ L^∞(Ω), and κ ∈ [L^∞(Ω)]^{d×d} is a strongly elliptic symmetric diffusion tensor, i.e. there exist κ_*, κ^* > 0, independent of v and x, such that

κ_* |v|^2 ≤ v^T κ(x) v ≤ κ^* |v|^2,

for almost every x ∈ Ω and for any v ∈ R^d, with |·| denoting the standard Euclidean norm on R^d. Finally, we assume that there exists a constant μ_0 ≥ 0 such that, for almost every x ∈ Ω,

μ(x) := γ(x) − (1/2)∇·β(x) ≥ μ_0.

Problem (2.1) can be written in variational form: find u ∈ H^1_0(Ω) such that

(κ∇u, ∇v) + (β·∇u, v) + (γu, v) = (f, v) for all v ∈ H^1_0(Ω), (2.4)

with (·,·) denoting the (standard) L^2 inner-product over Ω. Following [21], we split the bilinear form on the left-hand side of (2.4) into its symmetric and skew-symmetric parts

a(u, v) := (κ∇u, ∇v) + (μu, v), (2.5a)
b(u, v) := (1/2)[(β·∇u, v) − (β·∇v, u)], (2.5b)
and we consider the problem written in the equivalent form: find u ∈ H^1_0(Ω) such that

A(u, v) := a(u, v) + b(u, v) = (f, v) for all v ∈ H^1_0(Ω). (2.6)

Rewriting the bilinear form in this fashion is a useful step in view of preserving the coercivity of A at the virtual (discrete) level, independently of the mesh size. An alternative VEM based on the original variational form (2.4) and without assuming coercivity is presented in [9]; its well-posedness relies on selecting a sufficiently small mesh size. The a posteriori error analysis presented below can also be applied to the method of [9], with minor modifications, cf. Sect. 7.
The virtual element method

Let T_h be a family of partitions of Ω into non-overlapping polygonal (for d = 2) or polyhedral (for d = 3) elements E, with h_E denoting the diameter of E; the set of mesh interfaces, that is edges when d = 2 and faces when d = 3, is denoted by S_h. We note that, in particular, partitions including non-convex elements are allowed, as are elements with consecutive co-planar edges/faces, such as those typical of locally refined meshes with hanging nodes. We also make the following mesh regularity assumptions, which are standard in this context, cf. [2,21].

Assumption 1 (Mesh regularity)
We assume the existence of a constant ρ > 0 such that:
1. every element E of T_h is star-shaped with respect to a ball of radius ρh_E;
2. for every element E of T_h and every interface s of E, h_s ≥ ρh_E;
3. for d = 3, every interface s ∈ S_h, viewed as a 2-dimensional element, satisfies assumptions 1 and 2 above.
Remark 2 (Global shape regularity) As in the a priori setting [7,21], the a posteriori error analysis presented below extends in a straightforward fashion to the case of polygonal/polyhedral elements given by a simply connected finite union of sub-elements, each satisfying Assumption 1. Moreover, the extension of the VEM a priori error analysis recently presented in [11] for the d = 2 case indicates that it may be possible to relax the condition on the size of the interfaces. This hypothesis is explored numerically in Sect. 6. Therein, by not imposing any restrictions on the size of the edges in the mesh, we show that neither the performance of the method nor that of the estimators is affected in practice. An immediate consequence of the above, simplifying, mesh regularity assumptions is that each element E admits a sub-triangulation T^E_h, that is, a partition of E into triangles when d = 2 and tetrahedra when d = 3, in such a way that the resulting global triangulation T̃_h := ∪_{E∈T_h} T^E_h is shape regular. For d = 2 this is obtained by joining each vertex of E with a point with respect to which E is star-shaped. For d = 3 the same procedure can be applied starting from the corresponding triangulation of each face.
Throughout the paper, we denote by Π^0_ℓ : L^2(E) → P_ℓ(E) the L^2(E)-orthogonal projection onto the space P_ℓ(E) of polynomials of total degree at most ℓ, for any E ∈ T_h and ℓ ∈ N ∪ {0}.
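Explicitly, Π^0_ℓ v is characterised by the orthogonality condition

```latex
\bigl(\Pi^0_{\ell} v - v,\, q\bigr)_E = 0 \qquad \text{for all } q \in \mathcal{P}_{\ell}(E),
```

so that Π^0_ℓ v is computable whenever the moments of v against a basis of P_ℓ(E) are available; this is precisely what the internal degrees of freedom introduced below provide.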

Virtual element spaces
We begin by recalling the construction of the conforming virtual element space from [21]. Given the mesh T_h and p ∈ N, we shall construct a virtual element space V_h ⊂ H^1_0(Ω) of order p in an element-wise fashion; V_h will be of order p ∈ N if, for each element E ∈ T_h, the space V^E_h := V_h|_E contains the space P_p(E) of polynomials of degree at most p on E. In general, the space V^E_h will also contain non-polynomial functions. However, the distinctive idea of VEM is that of computability based on degrees of freedom, which stems from the view that the complement of P_p(E) in V^E_h is made up of functions which are deemed expensive to evaluate.

Definition 3 (Computability)
A term is computable if it may be evaluated using the data of the problem, the degrees of freedom, and the polynomial component of the virtual element space only.
We shall consider two types of degrees of freedom: nodal values and polynomial moments.
The local virtual element space is constructed recursively in the space dimension. We first consider the case d = 2. On each edge interface s ∈ ∂E we take V^s_h := P_p(s) and we define the auxiliary space W^E_h as

W^E_h := { v_h ∈ H^1(E) : v_h|_{∂E} ∈ C^0(∂E), v_h|_s ∈ V^s_h for all s ∈ ∂E, Δv_h ∈ P_p(E) }. (3.1)

The elements of W^E_h can be uniquely identified by the following set of degrees of freedom [2,21]: the values at the vertices of E; on each edge s ∈ ∂E, the moments against a basis of P_{p−2}(s); and the internal moments against a basis of P_p(E), denoted by M^E_p. These degrees of freedom make the terms Π^0_{p−1}∇v_h and Π^0_p v_h computable [21]; for instance, the projection Π^0_p v_h is given directly by the internal degrees of freedom M^E_p. The crucial property P_p(E) ⊂ W^E_h would still be satisfied by the smaller space obtained by requiring Δv_h ∈ P_{p−2}(E) instead of Δv_h ∈ P_p(E) in (3.1). This is, indeed, the original virtual element space introduced in [7]. However, since the elemental projection Π^0_p is not computable in the original VEM space of [7], a different subspace of W^E_h with the same dimension as the original space of [7] was introduced in [2]. In the latter subspace definition, the L^2-projection onto P_p(E) is computable using the extra higher-order moments.
The crucial observation is that any projection operator Π*_p : W^E_h → P_p(E) which is computable using only the reduced set of degrees of freedom DoF(V^E_h) of (3.2), namely the vertex values, the edge moments, and the internal moments of degree up to p − 2, may be employed in what follows. In particular, and following [21], this can simply be taken as the projection corresponding to the Euclidean inner product on DoF(V^E_h). This is, indeed, the choice used in all numerical tests presented in Sect. 6.
Given any such projection operator Π*_p, we can define a local virtual element space V^E_h ⊂ W^E_h by clamping the internal higher-order moments:

V^E_h := { v_h ∈ W^E_h : (v_h − Π*_p v_h, q)_E = 0 for all q ∈ P_p(E) \ P_{p−2}(E) }. (3.3)

By construction, the space V^E_h is identified by the degrees of freedom DoF(V^E_h) of (3.2), used to define Π*_p. Representative examples are illustrated in Fig. 1. Further, we note that it is also possible to compute Π^0_p v_h and Π^0_{p−1}∇v_h for each v_h ∈ V^E_h just from the reduced set of degrees of freedom, since we can access the higher-order moments through Π*_p; we refer to [21, Section 4.1] for the details.

Fig. 1 The VEM degrees of freedom for a hexagonal mesh element for p = 1, 2, 3, 4; nodal values and edge moments are marked by a circle; internal moments are marked by a square. The number of moments per edge is given by N_{1,p−2} = 0, 1, 2, 3 and the number of internal moments is given by N_{2,p−2} = 0, 1, 3, 6 for p = 1, 2, 3, 4, respectively.

For the case d = 3, we first (re-)define V^s_h on each face s ∈ ∂E to be the 2-dimensional virtual element space given by (3.3). The construction of the local virtual element space on E now follows by defining the auxiliary space W^E_h of (3.1) and the final space V^E_h of (3.3) in exactly the same way as in the 2-dimensional case. In the 3-dimensional case, V^E_h is identified by the degrees of freedom given by the vertex values, the edge and face moments, and the internal moments of degree up to p − 2 [21]. The dimension of the local space for d = 3 follows from a similar counting argument, with ν_E and ε_E denoting, respectively, the number of vertices and edges of E, cf. [21]. Representative examples are illustrated in Fig. 2. Finally, the global space is constructed from these local spaces as

V_h := { v_h ∈ H^1_0(Ω) : v_h|_E ∈ V^E_h for all E ∈ T_h }, (3.5)

and the global degrees of freedom are obtained by collecting the local ones, with the nodal and interface degrees of freedom corresponding to internal entities counted only once. Those on the boundary are fixed to be equal to zero in accordance with the ambient space H^1_0(Ω).
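The moment counts quoted in the caption of Fig. 1 are simply dimensions of polynomial spaces; a quick sketch to reproduce them (the helper name is illustrative):

```python
from math import comb

def poly_space_dim(d, k):
    """Dimension of the space of polynomials of total degree <= k in d
    variables, i.e. binom(k + d, d); by convention 0 when k < 0 (no moments)."""
    return comb(k + d, d) if k >= 0 else 0

# Reproduce the per-edge and internal moment counts of Fig. 1:
assert [poly_space_dim(1, p - 2) for p in (1, 2, 3, 4)] == [0, 1, 2, 3]
assert [poly_space_dim(2, p - 2) for p in (1, 2, 3, 4)] == [0, 1, 3, 6]
```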

Discrete formulation
We shall now recall the VEM for (2.6) introduced in [21]. For every E ∈ T_h, let a^E and b^E be the elemental continuous forms obtained by restriction of the forms in (2.5a) and (2.5b) onto the element E, respectively. A virtual bilinear form A^E_h is then constructed, which is split into the symmetric and skew-symmetric discrete bilinear forms a^E_h and b^E_h corresponding to the continuous forms a^E and b^E, respectively. To define A^E_h precisely, we begin by introducing the concept of admissible stabilising forms.
Definition 5 (Admissible stabilising forms) Let E ∈ T_h. Two computable (in the sense of Definition 3), symmetric, and positive definite bilinear forms S^E_1 and S^E_0 are said to be local admissible bilinear forms for stabilising the diffusion and reaction terms in (2.5a), respectively, if each is uniformly equivalent to the corresponding continuous form on the relevant subspace of V^E_h, for some constant C_stab independent of E and h. A practical choice of admissible stabilising bilinear forms is given in (6.1). We note here a trivial consequence of the above definition which will be useful in the analysis, namely the stability bound (3.6), in which κ^E_*, μ^E_0 are the local counterparts of κ_*, μ_0, respectively. A virtual element stabilising term S^E may then be defined as the linear combination of any pair of admissible diffusion and reaction stabilising forms,

S^E(u_h, v_h) := s_1 S^E_1(u_h, v_h) + s_0 S^E_0(u_h, v_h),

with S^E_1, S^E_0 admissible stabilising bilinear forms and s_1, s_0 positive constants. We prefer to keep the dependence of the stabilising form on the constants s_1 and s_0 explicit, to be able to study their influence on the constants in the a posteriori bounds below.
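A common practical choice in the VEM literature (cf. [7]) is the so-called 'dofi-dofi' stabilisation, namely the Euclidean inner product of the degree-of-freedom vectors, possibly scaled by a local coefficient value; a minimal sketch, with the function name and scaling purely illustrative (the choice actually used in Sect. 6 is the one given in (6.1)):

```python
import numpy as np

def dofi_dofi_stabilisation(dofs_u, dofs_v, scale=1.0):
    """'dofi-dofi' stabilising form: Euclidean inner product of DoF vectors.

    A sketch of one standard admissible choice; `scale` stands in for a
    problem-dependent local scaling (e.g. an estimate of kappa on E), and
    the names here are illustrative, not taken from any VEM library.
    """
    return scale * float(np.dot(dofs_u, dofs_v))

# Symmetry and positive definiteness on a toy DoF vector:
u = np.array([1.0, -2.0, 0.5])
v = np.array([0.0, 1.0, 3.0])
assert dofi_dofi_stabilisation(u, v) == dofi_dofi_stabilisation(v, u)
assert dofi_dofi_stabilisation(u, u) > 0
```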
For every E ∈ T_h, the local symmetric and skew-symmetric discrete bilinear forms a^E_h and b^E_h are then defined by replacing each argument of a^E and b^E with a computable polynomial projection and by adding the stabilising term S^E acting on the projection remainders; see [21] for the precise definitions.
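For orientation only, the symmetric discrete form has the following schematic structure, in which each argument is replaced by a computable projection and the remainder is stabilised (the precise projections applied to the variable coefficients are as in [21]; Π below stands for the relevant computable projection):

```latex
a^E_h(u_h, v_h) \;=\; \bigl(\kappa\,\Pi^0_{p-1}\nabla u_h,\; \Pi^0_{p-1}\nabla v_h\bigr)_E
 \;+\; \bigl(\mu\,\Pi^0_{p} u_h,\; \Pi^0_{p} v_h\bigr)_E
 \;+\; S^E\bigl((I-\Pi)u_h,\; (I-\Pi)v_h\bigr),
```

so that the stabilising term vanishes on polynomial arguments and stability follows from Definition 5.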

Remark 6 (Polynomial consistency and stability) If one of the two arguments of A^E_h is a polynomial of degree at most p, the stabilising term vanishes and the discrete form acts on its arguments exactly through their computable projections. This property is referred to as polynomial consistency in the VEM literature [7]. Furthermore, Definition 5 ensures the following stability property: there exists a positive constant C_stab, independent of h and the mesh element E, such that the symmetric discrete form a^E_h is uniformly equivalent to a^E; summing over E ∈ T_h then yields the coercivity of A_h. We refer to [21] for the details. The virtual element method (VEM) then reads: find u_h ∈ V_h such that

A_h(u_h, v_h) = (f, Π^0_p v_h) for all v_h ∈ V_h. (3.10)

Approximation properties
The conforming virtual element space introduced above satisfies optimal approximation properties for sufficiently smooth functions. In particular, the theory in [17] for star-shaped domains may be used to prove the following theorem regarding the approximation properties of the L^2(E)-orthogonal projection onto polynomials.

Theorem 7 (Approximation using polynomials) Suppose that Assumption 1 is satisfied. Then, for every E ∈ T_h and every v ∈ H^s(E) with 1 ≤ s ≤ p + 1, it holds

|v − Π^0_p v|_{r,E} ≤ C_proj h_E^{s−r} |v|_{s,E}, r = 0, 1.

The positive constant C_proj depends only on the polynomial degree and the mesh regularity.
We shall make use of standard bubble functions on polygons/polyhedra below. A bubble function ψ E ∈ H 1 0 (E) for a polygon/polyhedron E can be constructed piecewise as the sum of the (polynomial) barycentric bubble functions (cf. [3,41]) on each d-simplex of the shape-regular sub-triangulation of the mesh element E discussed in Remark 2.
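On a single 2-simplex, the barycentric bubble referred to above is the scaled product of the barycentric coordinates, vanishing on the boundary and equal to one at the barycentre; a minimal sketch for a triangle (helper names illustrative):

```python
import numpy as np

def barycentric(tri, x):
    """Barycentric coordinates of point x in triangle tri (3x2 vertex array)."""
    T = np.column_stack((tri[1] - tri[0], tri[2] - tri[0]))
    l12 = np.linalg.solve(T, np.asarray(x, dtype=float) - tri[0])
    return np.array([1.0 - l12.sum(), l12[0], l12[1]])

def interior_bubble(tri, x):
    """Cubic interior bubble 27*l1*l2*l3: vanishes on the boundary of tri
    and equals 1 at the barycentre."""
    return 27.0 * np.prod(barycentric(tri, x))

tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
centroid = tri.mean(axis=0)
assert abs(interior_bubble(tri, centroid) - 1.0) < 1e-12
assert abs(interior_bubble(tri, [0.5, 0.0])) < 1e-12  # vanishes on an edge
```

Summing such bubbles over the sub-triangulation of E, as described above, gives a bubble function for the whole polygon/polyhedron.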

Lemma 8 (Interior bubble functions) Let E ∈ T_h and let ψ_E be the corresponding bubble function. There exists a constant C_bub, independent of h_E, such that for all q ∈ P_p(E)

‖q‖²_{0,E} ≤ C_bub (q, ψ_E q)_E and |ψ_E q|_{1,E} ≤ C_bub h_E^{−1} ‖q‖_{0,E}.
Lemma 9 (Edge bubble functions) For E ∈ T_h, let s ⊂ ∂E be a mesh interface and let ψ_s be the corresponding interface bubble function. There exists a constant C_bub, independent of h_E, such that for all q ∈ P_p(s)

‖q‖²_{0,s} ≤ C_bub (q, ψ_s q)_s, ‖ψ_s q‖_{0,E} ≤ C_bub h_E^{1/2} ‖q‖_{0,s}, and |ψ_s q|_{1,E} ≤ C_bub h_E^{−1/2} ‖q‖_{0,s}.

Here, with slight abuse of notation, the symbol q is also used to denote the constant prolongation of q in the direction normal to s.
We shall first use the above two results to prove an inverse inequality for virtual element functions, made possible by the fact that functions in W E h and V E h have polynomial Laplacians.
Lemma 10 (Inverse inequality) Suppose that Assumption 1 is satisfied. Let E ∈ T_h and let w ∈ H^1(E) be such that Δw ∈ P_p(E). There exists a constant C_inv, independent of w, h and E, such that

‖Δw‖_{0,E} ≤ C_inv h_E^{−1} |w|_{1,E}.

Proof We first require the auxiliary polynomial inverse inequality

‖q‖_{0,E} ≤ C h_E^{−1} ‖q‖_{−1,E} for all q ∈ P_p(E), (4.1)

where ‖·‖_{−1,E} denotes the norm of the dual space of H^1_0(E). This may be proven by selecting v = qψ_E in the definition of the dual norm, viz.

‖q‖_{−1,E} = sup_{v ∈ H^1_0(E)\{0}} (q, v)_E / |v|_{1,E},

and using Lemma 8. Applying this to Δw ∈ P_p(E), we find that ‖Δw‖_{0,E} ≤ C h_E^{−1} ‖Δw‖_{−1,E}. Now, using (4.1), along with an integration by parts, we deduce (Δw, v)_E = −(∇w, ∇v)_E for all v ∈ H^1_0(E). The result then follows by applying the Cauchy-Schwarz inequality.
The above inverse estimate will be used to prove an approximation theorem (Theorem 11 below) for the virtual element spaces considered in this work. The proof of Theorem 11 is inspired by [35, Prop. 4.2], where a related result is obtained in the much simpler setting of the original virtual element space of [7] for d = 2 only. As the construction in [35, Prop. 4.2] does not appear to generalize to d = 3, we use a different construction for the Clément-type interpolant below.
We begin by recalling some classical polynomial interpolation results on simplicial triangulations. Assumption 1 implies the existence of a globally shape-regular sub-triangulation T̃_h of T_h, cf. Remark 2. We use this to define v_c as the classical Clément interpolant [24] of v of degree p over the sub-triangulation T̃_h. Then, the following approximation estimates hold [24]: for all T ∈ T̃_h,

|v − v_c|_{r,T} ≤ Ĉ_Clem h_T^{1−r} |v|_{1,ω̃(T)}, r = 0, 1, (4.2)

with Ĉ_Clem a positive constant depending only on the polynomial degree p and on the mesh regularity. Here, ω̃(T) denotes the usual finite element patch relative to T.
Theorem 11 (Approximation using virtual element functions) Suppose that Assumption 1 is satisfied and let V_h denote the virtual element space (3.5). Then, for every v ∈ H^1_0(Ω) there exists v_I ∈ V_h such that, for all E ∈ T_h,

‖v − v_I‖_{0,E} + h_E |v − v_I|_{1,E} ≤ C_Clem h_E |v|_{1,ω(E)},

where ω(E) denotes a suitable patch of elements around E, and C_Clem is a positive constant depending only on the polynomial degree p and the mesh regularity.
Proof We denote by v_c the Clément interpolant defined over the sub-triangulation T̃_h and satisfying (4.2). It is assumed that all edges of the polygonal/polyhedral mesh T_h are also edges of the sub-triangulation T̃_h, cf. Remark 2.
Case d = 2. We start by interpolating v c into the enlarged virtual element space W h . More specifically, we define w I elementwise as the solution of the problem and v c is a polynomial of degree p on each edge of E, we may conclude that Arguing as in [35,Proposition 4.2], we may show that and, therefore, Now, w I allows us to construct an interpolant v I ∈ V h using the definition of V E h [given in (3.3)] on each E ∈ T h . By definition, the two interpolants v I and w I are equal on the mesh skeleton S h and for all Integration by parts yields Identity (4.6) can then be rewritten as since v I and w I have the same moments of up to degree p − 2, while v I and Π * p w I share the same moments of degree p and p − 1. The Cauchy-Schwarz inequality then implies that Further, from the stability of the L 2 projection we get where I denotes the identity operator on the space P p (E). Thus, by Lemma 10. Further, adding and subtracting Π 0 p w I and using the stability of Π * p and then using the Poincaré inequality (either on each 2-simplex of the shape-regular sub-triangulation E or directly on E, cf. [40]), we obtain for some uniform constants C * 0 and C P > 0 which depend on the shape regularity constant. Then, the triangle inequality, the stability of Π 0 p with constant, say, C 0 , and (4.5) imply that Finally, the triangle inequality, the above bound, and (4.5), imply with Since v I and v c are equal on ∂ E, we may apply the Poincaré inequality to this to obtain a bound on v c − v I 0,E , with an extra power of h E .
The required bounds of |v − v I | r,E , r = 0, 1, now follow by the triangle inequality, adding and subtracting v and Π 0 p v to the right-hand side of (4.7), using once again the triangle inequality, and applying the bounds (4.2) and Theorem 7.
Case d = 3. The proof in this case is based on using, on each face s ∈ ∂E, the construction just considered for d = 2 and then extending it inside E.
Let w^s_I denote, on each face s ∈ ∂E, the interpolant into the auxiliary face space constructed as in the 2-dimensional case. Further, from w^s_I we may use the 2-dimensional construction to obtain v^s_I ∈ V^s_h. This face interpolant satisfies (4.7) with s in place of E, namely (4.8), with Π^{0,s}_p v_c denoting the L^2-projection of the restriction of v_c to s. Collecting the face-wise definitions, we obtain a continuous interpolant v^{∂E}_I on ∂E. With this, we first construct w_I on E as the solution of a local problem with Δw_I ∈ P_p(E) and boundary datum v^{∂E}_I, so that w_I ∈ W^E_h by definition, as in the case d = 2 (cf. (3.1)).

In view of bounding |w_I − v_c|_{1,E}, we proceed as follows.
Recall that, for all s ∈ ∂E, we have (v^s_I − v_c)|_{∂s} = 0. Moreover, by Assumption 1, over s we may construct a shape-regular pyramid P_s ⊂ E with |P_s| ≥ ρ|E|. By the Trace Theorem applied to s ∈ ∂P_s, there exists ϕ_s ∈ H^1(P_s) with ϕ_s|_{∂P_s\s} = 0, whose trace on s coincides with v^s_I − v_c, and a constant C_T > 0 controlling its H^1(P_s)-norm. The constant C_T can be bounded uniformly over all s by a generalised scaling argument, cf. [21] and the references therein. Hence, defining ϕ = Σ_{s∈∂E} ϕ_s + v_c − Π^0_p v_c, where each ϕ_s should be interpreted as its extension by zero to E, we obtain the required boundary behaviour by construction. Thus, as in the case d = 2, we arrive at a bound in which it just remains to control the first term on the right-hand side. To this end, we use the Sobolev Interpolation Theorem and the Poincaré inequality (face-wise, cf. the case d = 2 above), for some constant C_S > 0 which, again, can be bounded uniformly over all s by a generalised scaling argument. To obtain the last bound above we used (4.8) applied to s ∈ ∂E. The interface terms above are further bounded by applying Theorem 7. Using the resulting bound in (4.10) and the latter in (4.9), we finally obtain (4.11), with C_2 > 0 depending on the (uniformly bounded) number ν_E of interfaces of E and on the constants C_T, C_P, C_S, C_inv, and C_proj. Now, given w_I, we can construct an interpolant v_I ∈ V_h exactly as in the 2-dimensional case and, following the same (dimension-independent) argument, derive the bound (4.7). This latter bound, combined with (4.11), yields (4.12) for some C_3 > 0 depending on C_1 and C_2.
From (4.12) we can derive the required bound in the L^2-norm by resorting to the scaled Poincaré-Friedrichs inequality [16] and recalling (4.12). The interface terms on the right-hand side can be further bounded using the Poincaré inequality once more and (4.8). Finally, combining this bound with (4.13) yields the desired estimate, and the statement of the theorem follows, as in the case d = 2.
Remark 12 For d = 3, the proof of the above VEM approximation result makes use of both the Trace Theorem and the Sobolev Interpolation Theorem applied to each mesh interface. This was necessitated by the hierarchical construction of the local virtual element spaces with respect to the spatial dimension. The associated constants are uniformly bounded but depend on the polygonal shape of the mesh interfaces, and as such are not easily accessible in general. However, if the mesh interfaces are triangular, or if the method is constructed on the sub-triangulation of each mesh interface, the proof only depends on easily computable quantities.

A posteriori error analysis
We shall now derive a residual-type a posteriori error bound for the error in the standard energy norm

|||v|||_ω := ( ‖κ^{1/2}∇v‖²_{0,ω} + ‖μ^{1/2}v‖²_{0,ω} )^{1/2},

for v ∈ H^1(ω), for any ω ⊆ Ω; the subscript is dropped when ω = Ω. The coercivity and continuity of the bilinear form A in this norm follow from the assumptions on the coefficients κ and μ, which ensure that

‖v‖_{1,Ω} ≤ C_equiv |||v||| and |||v||| ≤ C̄_equiv ‖v‖_{1,Ω}

for v ∈ H^1_0(Ω), where C_equiv := √((1 + C_PF)/κ_*), with C_PF the Poincaré-Friedrichs constant, and C̄_equiv := √(max{κ^*, ‖μ‖_∞}), cf. [21]. The coercivity and continuity of A_h in this norm are then inherited from A through the virtual element stability property (3.9).
To account for the effects of data oscillation, we introduce the following piecewise polynomial approximations of the PDE coefficients: on each E ∈ T_h, each coefficient is approximated by its L^2(E)-orthogonal projection onto polynomials of degree q, for some fixed q ∈ N ∪ {0}. (5.2) For quantities which may be discontinuous across the mesh skeleton, we define the jump operator [[·]] across a mesh interface s ∈ S_h as follows. If s ∈ S^int_h, then there exist E^+ and E^- such that s ⊂ ∂E^+ ∩ ∂E^-. Denote by v^± the trace of the vector-valued function v|_{E^±} on s from within E^± and by n^±_s the unit outward normal on s from E^±. Then, [[v]] := v^+·n^+_s + v^-·n^-_s. If, on the other hand, s ∈ S^bdry_h, then [[v]] := v·n_s, with v representing the trace of v from within the element E having s as an interface and n_s the unit outward normal on s from E.
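On interior interfaces the jump reduces to the difference of normal components, since n^-_s = −n^+_s; a minimal numerical sketch (names illustrative):

```python
import numpy as np

def normal_jump(v_plus, v_minus, n_plus):
    """Jump [[v]] = v+ . n+ + v- . n- of a vector field across an interior
    interface, with n- = -n+; this reduces to (v+ - v-) . n+."""
    n_plus = np.asarray(n_plus, dtype=float)
    return float(np.dot(v_plus, n_plus) + np.dot(v_minus, -n_plus))

# A continuous normal component gives zero jump:
assert normal_jump([1.0, 2.0], [1.0, 2.0], [0.0, 1.0]) == 0.0
assert normal_jump([0.0, 3.0], [0.0, 1.0], [0.0, 1.0]) == 2.0
```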

The residual equation
Define e := u − u_h ∈ H^1_0(Ω) and let v ∈ H^1_0(Ω). Then, for any χ ∈ V_h, we obtain the residual equation (5.3), using that u satisfies the weak form of the PDE problem and that u_h is the virtual element solution. Notice that, in contrast to a posteriori bounds for standard finite element approximations, additional terms appear in the virtual element residual equation. These terms represent the virtual inconsistency of the VEM.

A posteriori error bound
We shall estimate each term on the right-hand side of (5.3) separately, to arrive at a computable error bound. To this end, an integration by parts and a straightforward manipulation yield an identity valid for any w ∈ H^1_0(Ω). Using this and the data approximations introduced in (5.2), (5.3) may be rewritten in the form (5.4), featuring the element and edge residuals, the element and edge data oscillation terms, and the 'virtual' residual accounting for the inconsistency of the method.

Theorem 13 (Upper bound) Let u_h ∈ V_h be the virtual element solution to problem (3.10). Then, there exists a constant C, independent of h, u and u_h, such that the energy norm error |||u − u_h||| is bounded by C times the sum over E ∈ T_h of the contributions η_E, Θ_E, S_E and Ψ_E, where η_E and Θ_E collect the element and edge residual terms and the data oscillation terms of (5.4), respectively, S_E denotes the elemental stabilising term evaluated at u_h, and Ψ_E encompasses the virtual inconsistency terms, defined as the sum of the terms Ψ^E_i, i = 1, ..., 4, appearing in the proof below.

Proof Let e_I ∈ V_h be the interpolant of e satisfying the bounds of Theorem 11. Then, upon setting v = e and χ = e_I in (5.4), coercivity yields a decomposition of |||e|||² into terms I-VII, which we bound in turn. For I and II, we use the Cauchy-Schwarz inequality and the bounds of Theorem 11. For III, we use the properties of the L^2-projection, where ω_s = E^+ ∪ E^- with E^+ and E^- the elements meeting at the edge s.

Bounding the edge terms VI and VII requires the use of the scaled trace inequality, with which we can also bound IV. Using now (3.6), and introducing the mesh Péclet number Pe_E := h_E ‖β‖_{∞,E}/κ^E_*, we arrive at the corresponding estimate. Focussing on V, we begin by observing an identity due to the properties of the L^2-projection, with the last bound resulting from the Cauchy-Schwarz inequality and Theorem 7. From Definition 5 and Theorem 7, we may bound the final term by the stabilising term. Combining the last two bounds, along with (3.6), we conclude the estimate for V. The skew-symmetric terms can be treated completely analogously, using the fact that the relevant projection error has zero average, together with (5.9). The stability bound (3.6) further controls the remaining stabilisation contributions. Combining the bounds for the symmetric and skew-symmetric terms above, we deduce the claimed estimate. The result then follows by combining the individual bounds above and using the equivalence of the energy and H^1 norms.
The terms of η E echo the standard element and edge residual terms while Θ E are data oscillation terms, familiar from the residual a posteriori analysis of finite element methods [3,41,43]. In the present virtual context, however, these terms involve only the polynomial part of u h , as they would not be computable for u h itself. As a result, remainder terms also appear in the estimator, collected in Ψ E . The term S E , on the other hand, provides a computable estimate for the quality of the approximation Π 0 p u h of u h .
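Assembling a single global quantity from the elementwise contributions is then a plain sum of squares over the mesh; a minimal sketch, in which the arrays and the exact combination are illustrative rather than the precise estimator of Theorem 13:

```python
import numpy as np

def total_estimator(eta, psi, s_term, theta):
    """Combine per-element contributions eta_E, Psi_E, S_E, Theta_E (1D arrays
    over the elements) into one global error estimator by summing squares,
    as in a residual-type bound. The exact combination is illustrative."""
    local_sq = (np.asarray(eta) ** 2 + np.asarray(psi) ** 2
                + np.asarray(s_term) ** 2 + np.asarray(theta) ** 2)
    return float(np.sqrt(local_sq.sum()))

est = total_estimator([3.0, 0.0], [0.0, 4.0], [0.0, 0.0], [0.0, 0.0])
assert abs(est - 5.0) < 1e-12  # sqrt(9 + 16)
```

In an adaptive loop, the array `local_sq` also supplies the per-element indicators used for marking.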

Remark 14
We note that the term Ψ^E_3 does not vanish when the PDE coefficients are constant, as Ψ^E_i, i = 1, 2, 4, do. It is possible to circumvent this quite easily by modifying the skew-symmetric bilinear form b^E_h to use the degree p projection of the gradient. The resulting method and the respective (modified) estimators are still computable in just the same manner as the current method (cf. [21]), since the virtual element functions are polynomials of degree p on each edge.
The estimator of Theorem 13 is also an estimator for the error between u and the projection of u h , and we have the following result.

Corollary 15 (Bound for the projected solution) Let η_E, Ψ_E, S_E and Θ_E be the terms of the estimator in Theorem 13. Then, the bound of Theorem 13 also holds with u − u_h replaced by u − Π^0_p u_h on the left-hand side.

Proof Using the triangle inequality, we have |||u − Π^0_p u_h||| ≤ |||u − u_h||| + |||u_h − Π^0_p u_h|||, and the second term is controlled by the definition of the stabilising term. The result follows by Theorem 13.

Lower bound
We now prove local lower bounds of the error in the energy norm by the a posteriori error estimate. To this end, we make use of element and edge bubble functions satisfying the bounds of Lemmas 8 and 9 respectively.

Theorem 16 (Local lower bound)
Let η E , S E and Θ E be as in Theorem 13. Then, where is the patch consisting of the element E and its neighbours, and μ d−1 denotes the (d − 1)-dimensional measure.
The constant C depends on C stab , C equiv , C bub , ρ, Ω and the PDE coefficients, but is independent of h, u and u h .
Proof First observe that R E ∈ P p+q (E) for some q ∈ N ∪{0} representing the degree of the polynomials used for the data approximations in (5.2). From (5.4) with χ = 0 and the fact that ψ E | ∂ E = 0, we deduce Arguing as in (5.10), with ψ E R E in place of e − e I , we find that and consequently, using the properties of the interior bubble functions given in Lemma 8, Using Lemma 8 again this becomes and therefore we arrive at For the face residual, we start by extending J s into ω s through a constant prolongation in the direction normal to the face s, Arguing as before and using Lemma 9, we find that Applying Lemma 9 again, using the bound for the element residual, and multiplying by h Using Assumption 1 and putting these bounds together completes the proof.
This local lower bound immediately provides a corresponding global lower bound, by simply summing the local estimates over the whole of T_h. Furthermore, Theorem 16 and the triangle inequality also provide us with the following lower bound on the error between the solution u and the projected virtual element solution Π^0_p u_h.

Corollary 17 (Lower bound for the projected solution)
Let η E , S E and Θ E be defined as in Theorem 13. Then, where is the patch consisting of the element E and its neighbours, and μ d−1 denotes the (d − 1)-dimensional measure.
In addition to a lower bound for the residual part of the estimator, η E , we have the following control on the virtual inconsistency terms Ψ E indicating that these are also of optimal order up to data oscillation.

Lemma 18 (Lower bound for the inconsistency terms) We have
Proof We have, respectively, using the stability of the L 2 projection operator and (2.2). Using (5.9) and (3.6) the final term can be controlled by the stabilising term, resulting in the required bound. A completely analogous argument can be applied to each of the remaining terms of Ψ E .

Numerical results
We present a series of numerical experiments aimed at testing the practical behaviour of the estimator derived in Theorem 13. In addition, we propose an adaptive algorithm based on the estimator, which is applied to a variety of test problems. The above analysis only requires a set of abstract assumptions on the stabilisation forms S^E_1, S^E_0 and on the projector Π^*_p [which in turn defines the space V^E_h through (3.3)], giving rise to a number of possibilities. Here, we focus on a specific scheme by providing precise choices for Π^*_p, S^E_1 and S^E_0. Define the bilinear form with dof_r(w_h) denoting the value of the r-th local degree of freedom of w_h with respect to an arbitrary but fixed ordering of the degrees of freedom on the element E. This bilinear form corresponds to the Euclidean inner product on the space R^{N_E} consisting of vectors of degrees of freedom. Following [21, Section 4.1], we define Π^*_p to be the orthogonal projection onto the polynomial space P_p(E) with respect to I^E(·, ·), and we fix the stabilising forms accordingly, where κ_E and μ_E are constant approximations of κ and μ over E (e.g., local averages), respectively.
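As an illustration, the dofi-dofi stabilisation built from I^E can be realised directly on degree-of-freedom vectors. The following is an assumption-laden sketch, not the paper's code: `P_star` stands for a matrix representation of Π^*_p acting on dof vectors, and `kappa_E` for the constant local diffusion approximation.

```python
import numpy as np

def dofi_dofi_stabilisation(P_star, kappa_E):
    """Return the matrix of S^E_1(v, w) = kappa_E * I^E((I - Pi*_p) v, (I - Pi*_p) w).

    Since I^E(v, w) is the Euclidean inner product of dof vectors, the
    stabilisation matrix is kappa_E * (I - P)^T (I - P) for the projector
    matrix P = P_star.
    """
    N = P_star.shape[0]
    I_minus_P = np.eye(N) - P_star  # (I - Pi*_p) acting on dof vectors
    return kappa_E * I_minus_P.T @ I_minus_P
```

For dof vectors `v`, `w` of two virtual functions on E, the stabilisation value is then `v @ S @ w` with `S = dofi_dofi_stabilisation(P_star, kappa_E)`; note that, `P_star` being a projector, `S` vanishes on projected dof vectors, so only the non-polynomial part is penalised.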

Remark 19
Note that the internal degrees of freedom of (I − Π^0_p)v_h are equal to zero, and hence the above stabilising term reduces to a term active only on the mesh skeleton.

Uniformly generated meshes
As a first test to verify the asymptotic behaviour of the estimator, we consider a test problem on Ω = (0, 1)^2, with f fixed such that the exact solution is given by u(x, y) = sin(πx) sin(πy), on a uniformly generated sequence of meshes consisting of non-convex polygonal elements. The first two meshes in the uniform sequence are shown in Fig. 3. Figure 4 depicts the convergence history of the H^1(Ω)-seminorm error and of the estimator on this sequence of meshes, indicating that both converge at the optimal rate for polynomial degrees p = 1, 2 and 3. The effectivity of the estimator is defined in terms of η_E, Θ_E, S_E and Ψ_E as in Theorem 13. Asymptotically, the effectivity becomes constant throughout the mesh sequence, tending to approximately 5.7 for p = 1, 3 for p = 2, and 1.84 for p = 3.

Fig. 3 The first two (in general non-convex) meshes used in the uniform sequence described in Sect. 6.1. Vertices are marked with a dot, and may separate coplanar edges. a The first mesh, with 25 elements. b The second mesh, with 100 elements.
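The effectivity reported above is the ratio of the total estimator to the true H^1(Ω)-seminorm error. A minimal sketch of its computation, assuming per-element arrays of the four indicator terms (the array names are illustrative):

```python
import numpy as np

def effectivity(eta, Theta, S, Psi, h1_error):
    """Effectivity index: total estimator over the H^1-seminorm error.

    eta, Theta, S, Psi are arrays of element-wise indicator values;
    the total estimator is the square root of the sum of their squares.
    """
    total_estimator = np.sqrt(np.sum(eta**2 + Theta**2 + S**2 + Psi**2))
    return total_estimator / h1_error
```

An effectivity tending to a constant, as observed in the experiments, indicates that the estimator decays at the same rate as the error.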

Adaptive refinement
We shall use a typical adaptive algorithm for elliptic problems, reading: solve → estimate → mark → refine. In this context, given a polygonal subdivision of Ω, one solves the VEM problem, estimates the error using the a posteriori error bound (Theorem 13), marks a subset of elements for refinement and, subsequently, refines. The Dörfler/bulk marking strategy is used below for the mark step, marking the subset of mesh elements M ⊂ T_h with the largest estimated errors such that the estimated error over M accounts for at least the fraction θ of the total estimated error, for some θ ∈ (0, 1). Here, we pick θ = 0.4.
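One common realisation of this marking step can be sketched as follows (the exact combination of indicator terms used in the experiments is not prescribed here): sort elements by their local indicator and greedily accumulate squared indicators until the fraction θ of the total is reached.

```python
def dorfler_mark(indicators, theta=0.4):
    """Dörfler/bulk marking: return indices of a set of elements, taken in
    decreasing order of indicator, whose squared indicators sum to at least
    theta times the total."""
    order = sorted(range(len(indicators)), key=lambda i: -indicators[i])
    total = sum(eta**2 for eta in indicators)
    marked, acc = [], 0.0
    for i in order:
        marked.append(i)
        acc += indicators[i]**2
        if acc >= theta * total:
            break
    return marked
```

Taking elements in decreasing order of indicator makes the marked set as small as possible for the given θ, concentrating refinement where the estimated error is largest.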
To refine a polygonal element, we divide it by connecting the midpoint of each planar face to its barycentre; see Fig. 5 for an illustration on a hexagon. Note that this strategy simply reduces to the standard refinement strategy for a square element. By refining in this fashion, hanging nodes may be introduced. Nevertheless, this is trivially accounted for in the VEM setting, as the method is able to handle polygonal elements with an arbitrary number of faces. This is a flexibility which we take advantage of in these examples by imposing no restriction on the number of hanging nodes allowed on each face. Given this extreme mesh flexibility, more exotic refinement strategies are certainly possible, but we leave their development for future work.
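A minimal sketch of this refinement rule, assuming a polygon with vertices listed counter-clockwise and using the vertex average as a simple stand-in for the barycentre; the handling of consecutive coplanar edges (treating them as a single planar face, as in Fig. 5) is omitted.

```python
def refine_polygon(vertices):
    """Split a polygon into quadrilateral children by joining each edge
    midpoint to the (vertex-average) barycentre. Each child consists of a
    vertex, the midpoints of its two adjacent edges, and the barycentre."""
    n = len(vertices)
    cx = sum(x for x, _ in vertices) / n
    cy = sum(y for _, y in vertices) / n
    mids = [((vertices[i][0] + vertices[(i + 1) % n][0]) / 2,
             (vertices[i][1] + vertices[(i + 1) % n][1]) / 2)
            for i in range(n)]
    children = []
    for i in range(n):
        # counter-clockwise child: previous midpoint, vertex, next midpoint, centre
        children.append([mids[i - 1], vertices[i], mids[i], (cx, cy)])
    return children
```

For instance, refining the unit square with this rule yields four congruent sub-squares, consistent with the observation that the strategy reduces to standard refinement on squares.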
Remark 20 (On the mesh assumptions) By imposing no restriction on the number of hanging nodes per face, we are at risk of violating Assumption 1 by producing meshes which contain very small faces. However, this requirement does not seem to be necessary for the virtual element method to remain accurate and stable in practice. This is demonstrated in Sect. 6.3, where the effect of limiting the number of hanging nodes allowed per edge is also studied, and the results in either case are found to be very similar.

Fig. 5 An illustration of the refinement strategy used for polygonal elements. Edges and vertices of the original element are shown by solid lines and points. To refine the element, the midpoint of each planar face of its boundary is connected to its barycentre. Note that the two edges P Q and Q R are not bisected, as the refinement treats P R as a single planar face, adding only a new edge from Q to the barycentre. Consequently, the result of refining two neighbours in the mesh is independent of the order in which they are refined.
We consider the general convection-reaction-diffusion problem, with forcing function f chosen in accordance with two different benchmark solutions; see Figs. 8 and 9. We first observe that, once the asymptotic regime is reached in each case, the error measured in the H^1(Ω)-seminorm (shown in Fig. 6a for Problem 1 and in Fig. 8a for Problem 2) converges with the theoretical optimal rate of N^{-p/2}, despite the low regularity of the true solution around the reentrant corner for Problem 1.
The initial rapid drop-off in error for Problem 1 is explained by examining the magnitudes of the various components of the estimator for p = 1, given in Fig. 6d. In particular, it is clear that the data oscillation term initially dominates the estimator and, comparing with the mesh after 28 iterations, shown in Fig. 7b, it appears to be driving the refinement around the Gaussian centred at (0.5, 0.5). Once this is sufficiently resolved, the element and face residual terms begin to dominate, resulting in the expected refinement around the singularity at the reentrant corner. This is shown in Fig. 7c, after 40 iterations. The key difficulty of Problem 2 is the presence of an interior sharp layer which is completely unresolved by the initial mesh. To test the resilience of the estimator in this challenging context, the initial mesh is chosen to consist of warped hexagons which are not aligned with the interior layer; see Fig. 9a for an illustration. As with Problem 1, the data oscillation terms initially dominate the estimator until the mesh begins to resolve the layer.

Jumping diffusion coefficient
We now consider the Kellogg problem [32], in which the diffusion coefficient κ is piecewise constant across the domain Ω = (0, 1)^2, determined by fixed parameters 0 < a < 1 and b > 0, with no reaction or convection terms. This problem has the weak solution u(r, θ) = r^α g(θ), where (r, θ) denote the polar coordinates centred at the point (a, a). The parameters σ, α and b are required to satisfy a certain set of nonlinear relations [32] and, following [15], we take the approximate values σ = 5.49778714378214, α = 0.25, b = 25.27414236908818. The Kellogg problem is a common example used to test a posteriori estimators on a problem with pathological coefficients and a known weak solution. Typically, this problem is studied in the case when κ is piecewise constant with respect to the initial mesh; see, e.g. [14,23,34,36]. Recently, the case in which the diffusion jumps are not aligned with the initial mesh has been studied in [15] in the context of adaptive FEM. To really test the applicability of our estimator, we consider both cases here on a variety of different meshes.
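For concreteness, a sketch of such a piecewise constant coefficient under the common Kellogg-type checkerboard convention (an assumption for illustration, since the paper's precise formula is omitted here): κ takes the value b in the quadrants relative to (a, a) where (x − a) and (y − a) have the same sign, and 1 elsewhere.

```python
def kappa(x, y, a=0.5, b=25.27414236908818):
    """Checkerboard diffusion coefficient centred at (a, a): value b where
    (x - a) and (y - a) share a sign, 1 in the other two quadrants."""
    return b if (x - a) * (y - a) > 0 else 1.0
```

The jump of size b across the lines x = a and y = a is what produces the strong solution singularity u = r^α g(θ) at (a, a).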
Whether the mesh is aligned with the problem or not is dictated by the parameter a. We first consider a = 2/5 on a square mesh, so that the discontinuities of κ are matched by the initial mesh. The behaviour of the error and estimator for this problem is shown in Fig. 10. Moreover, for this problem, we also compare the effect of limiting the mesh to have just one hanging node per edge with allowing an unlimited number of hanging nodes to be produced. In both cases, we use the Dörfler strategy from (6.4) with θ = 0.6 to select the subset of elements to be refined. See Fig. 11 for an illustration of the initial mesh and the final adapted meshes for both limited and unlimited hanging nodes per edge. In either case, the error under adaptive refinement eventually decays at the theoretical optimal rate of N^{-1/2}, where N is the number of degrees of freedom. It may also be seen that the H^1-seminorm error is slightly lower for the case of a limited number of hanging nodes, although the estimated error is approximately the same for both cases. Consequently, the effectivity of the estimator is slightly better for the method with no limit on the number of hanging nodes.
Next, we consider a = √2/2, so that the discontinuities of κ cannot align with any mesh, on a Voronoi mesh and a randomised quadrilateral mesh alongside a more standard square mesh. For brevity, we only report here the results when an unlimited number of hanging nodes was allowed in the mesh, as limiting the number of hanging nodes leads to almost identical results in terms of convergence. There are, however, differences in the final meshes obtained in each case: illustrations of the initial meshes and the final meshes for both limited and unlimited hanging nodes are given in Fig. 13. Figure 12 shows the behaviour of the error and estimator under adaptive refinement on the three sequences of meshes: the error in the H^1(Ω)-seminorm (Fig. 12a) appears to reach the theoretical optimal convergence rate of N^{-1/2} on the square and randomised quadrilateral meshes, and maintains a near-optimal rate of approximately N^{-0.35} on the Voronoi mesh. These rates are also reflected by the convergence of the estimator, shown in Fig. 12b, resulting in good effectivities (Fig. 12c) which remain roughly constant on the Voronoi and randomised quadrilateral meshes.

Fig. 12 The behaviour of the estimator for the Kellogg problem (see Sect. 6.3) when the discontinuities in κ cannot align with any mesh in the sequence, using p = 1 and three different types of mesh, depicted in Fig. 13. a The error ‖∇(u − Π^0_p u_h)‖_{0,Ω}. b Convergence of the estimator. c Effectivity of the estimator. d The non-zero components of the estimator on the square mesh.
We note the sudden jump in the magnitude of the estimated error after 7 iterations of adaptive refinement starting from the square mesh in Fig. 13a. Comparing with Fig. 12d, which shows the relative magnitudes of the various terms comprising the estimator on the square mesh, it is apparent that this jump is caused by a jump in the value of the data oscillation term Θ. Noting that for p = 1 the coefficient approximation κ_h is piecewise constant, we conclude that it is in fact only the edge data oscillation term which is non-zero and thus responsible for this effect. Further investigation indicates that this jump occurs in situations such as that illustrated in Fig. 14: although no mesh can exactly align with the discontinuities of κ, it is possible for it to get arbitrarily close. This is a highly desirable trait from the point of view of generating a well-adapted mesh. Nonetheless, the standard (isotropic) refinement strategy used on squares produces a mesh with edges close to the diffusion discontinuity, such as the ones depicted in Fig. 14b, only if the previous iteration contains elements as in Fig. 14a with the lines of discontinuity of κ passing close to its centre. This is problematic because the roughly equal distribution of the central element in Fig. 14a and its four neighbours among the different zones of κ means that the approximation κ_h will be very similar on each of the five elements, and thus the edge term of the data oscillation indicator will be very small. However, once this parent is refined, each child is almost entirely in a single zone of κ, so the approximations κ_h will be very different on each of the children. This will, then, cause the reported error to dramatically increase. Moreover, since the discontinuities of κ lie along lines with irrational coordinates, it is clear that this situation could occur an arbitrary number of times in the refinement sequence, causing problems with the effectivity of the estimator. Clearly the real culprit here is the symmetry of the situation and, consequently, a way to prevent such problems occurring is to use unstructured meshes. This claim can be substantiated by the fact that the same difficulty does not occur with the randomised quadrilateral or Voronoi meshes.

Fig. 13 The initial and final adapted meshes when solving the Kellogg problem (see Sect. 6.3) when the discontinuity in κ cannot align with any mesh in the sequence. a The initial square mesh. b The initial Voronoi mesh. c The initial randomised quadrilateral mesh. d The final square mesh with no limit on the number of hanging nodes. e The final Voronoi mesh with no limit on the number of hanging nodes. f The final randomised quadrilateral mesh with no limit on the number of hanging nodes. g The final square mesh with one hanging node per edge. h The final Voronoi mesh with one hanging node per edge. i The final randomised quadrilateral mesh with one hanging node per edge.

Fig. 14 The element of the mesh containing the intersection of the lines along which κ is discontinuous, before and after refinement. Solid lines indicate edges in the mesh, dotted lines indicate the lines along which κ jumps. a The lines along which κ is discontinuous pass close to the centre of the element. b After refinement, the discontinuities of κ are suddenly very close to mesh edges.

Conclusions and extensions
We have derived and analysed a residual a posteriori error estimate for the C^0-conforming virtual element method of [21] applied to general second order elliptic problems with nonconstant coefficients. This analysis has given rise to a fully computable a posteriori error estimator which we have shown to be equivalent to the error between the true solution and the virtual element approximation, measured in the energy norm. The analysis rests crucially on a new Clément-type interpolation result. We have also presented an extensive set of numerical results to demonstrate the behaviour of this estimator when used to drive an adaptive algorithm on a variety of problems, using several families of meshes consisting of general polygonal elements. We stress that the analysis above can also be applied to other related virtual element formulations of the same problem subject to only minor modifications. For instance, the same a posteriori analysis can be applied to the corresponding VEM obtained by discretising problem (2.4) directly, without splitting the differential operator into its symmetric and skew-symmetric parts; the resulting local discrete bilinear form is modified accordingly (see [9] for a similar approach). The analysis would provide the same a posteriori error estimator presented in Theorem 13, but without the term Ψ E 3.