Low-Cost Unattended Design of Miniaturized 4 × 4 Butler Matrices with Nonstandard Phase Differences

Design of Butler matrices dedicated to Internet of Things and 5th generation (5G) mobile systems, where small size and high performance are of primary concern, is a challenging task that often exceeds the capabilities of conventional techniques. Lack of appropriate, unified design approaches is a serious bottleneck for the development of Butler structures for contemporary applications. In this work, a low-cost bottom-up procedure for rigorous and unattended design of miniaturized 4 × 4 Butler matrices is proposed. The presented approach exploits numerical algorithms (governed by a set of suitable objective functions) to control synthesis, implementation, optimization, and fine-tuning of the structure and its individual building blocks. The framework is demonstrated using two miniaturized matrices with nonstandard output-port phase differences. Numerical results indicate that the computational cost of the design process using the presented framework is over 80% lower compared to the conventional approach. The footprints of the optimized matrices are only 696 and 767 mm², respectively. Small size and an operating frequency of around 2.6 GHz make the circuits of potential use for mobile devices working within the sub-6 GHz 5G spectrum. Both structures have been benchmarked against state-of-the-art designs from the literature in terms of performance and size. Measurements of the fabricated Butler matrix prototype are also provided.


Introduction
Antenna arrays are the key components of modern communication devices. Their popular applications include long-term evolution and 5th generation (5G) cellular technology, where good performance is crucial for sustaining high data-transfer rates. Apart from the radiators, the feeding network is an integral component of the antenna array. Its role is to provide appropriate excitation of the radiators (both magnitude- and phase-wise) so as to ensure the desired beamforming capability of the array [1][2][3][4][5][6].
Common array feeding realizations include variants of corporate [7][8][9] and series networks [10][11][12]. More complex structures are based on a combination of series-parallel feeds [13], networks with tunable power dividers [14], as well as structures dedicated to operating in the mm-wave spectrum [15]. Butler matrices (BMs) and their derivatives belong to another class of feeding networks [16][17][18][19][20]. Their characteristic feature is the beamforming capability resulting from the ability to provide different phase shifts between the output ports depending on the selected input [17,20]. The conventional 4 × 4 BM (i.e., with four input and four output ports), which offers output-port phase shifts of ±45° and ±135°, is a composite structure comprising four 90° hybrid couplers, two crossovers, and two 45° phase shifters (PSs) [21][22][23][24]. The circuit is characterized by large dimensions and lack of flexibility in terms of attainable phase differences. Consequently, the usefulness of standard BMs for modern mobile systems, including sensor networks and/or Internet of Things (IoT) devices interconnected using the 5G backbone, is limited.
The following paragraphs contain more details concerning design of BM structures for contemporary applications.
In recent years, the design of Butler matrices for space-limited applications has gained significant attention from the research community [16,25-33]. Popular miniaturization techniques are oriented toward replacing BM building blocks with their miniaturized counterparts [29,30]. The latter are often implemented in the form of composite cells, representing an appropriate combination of high- and low-impedance transmission lines (TLs) [29,34]. The application of left/right-handed TLs for reduction of the BM footprint has also been reported [28]. Miniaturization of conventional crossovers (realized as a composition of two branch line couplers (BLCs) interconnected through a 90° TL) is one of the leading approaches to the design of small BMs [28][29][30]. Notwithstanding, substitution of conventional BM building blocks with their miniaturized counterparts results in modest miniaturization rates [28,30]. From this perspective, replacement of standard crossovers with alternative topologies featuring either small footprints or improved phase-related functionality seems to be an interesting and preferable alternative for compact BM design [19,35]. Other miniaturization methods depart from standard BM topologies. Instead, they eliminate certain matrix subcomponents or exploit multilayer substrates for structure implementation [26,27,32]. On the other hand, miniaturization resulting from application of the mentioned techniques is often insufficient to increase the usefulness of BMs for space-limited systems [28,30]. Furthermore, the discussed compact topologies do not address the problem of improving BM performance.
The design of Butler matrices with increased flexibility in terms of achievable output-port phase differences has also been investigated in the literature [19,23,25,36]. The considered methods involve redesign of conventional structures, either through modification of their building blocks (i.e., couplers or delay lines) [19,23] or introduction of additional sub-circuits affecting BM performance [25,36]. In [23], enhanced control over beamforming capabilities has been achieved by replacing conventional 90° hybrid couplers with structures featuring adjustable phase shifts. In [19], a similar effect has been achieved using PSs with unequal electrical lengths. The reasoning behind both methods is that improved control of phase can be achieved by increasing the number of relevant design parameters. An alternative technique, proposed in [36], boils down to enhancement of the conventional BM through connection of phase-reconfigurable TLs to its output ports. The mentioned methods proved to be useful for performance enhancement. However, they neglect the need for circuit miniaturization, which is important for application of the matrices in contemporary systems.
The design solutions discussed above indicate that the difficulties related to improving BMs' functionality and their miniaturization are perceived as separate problems. In other words, size-reduction strategies considered in the literature are dedicated to conventional topologies [28,29], whereas BM designs featuring improved performance suffer from large dimensions [23,36]. The development of BMs characterized by both small size and improved functionality is important for contemporary, space-limited applications such as IoT and sensor networks. Another problem is that the discussion of methods used to obtain specific solutions of miniaturized and/or performance-enhanced Butler matrices is often neglected in the literature. Instead, the emphasis is put on structure synthesis or analysis of the specific case study. In practice, however, synthesis (or generation of BM components) is just the beginning of the design procedure, which has to be followed by circuit assembly and careful adjustment of parameters oriented toward determination of the desired performance.
Due to high complexity, Butler matrix design is realized as a multistage process. The structure is first divided into individual building blocks, which are subject to a set of simplified design tasks oriented toward obtaining the desired performance [29]. Once the design parameters of all subcomponents are found, the matrix is assembled. The main benefit of such a strategy is that each step gradually approximates a satisfactory design solution. On the other hand, conventional design approaches involve time-consuming and repetitive analysis of the structure (and/or its building blocks). These often involve modifications of geometry parameters followed by visual inspection of structure responses [34,37]. Although the concept of semi-manual design proved to be fairly successful for circuits characterized by a relatively low number of variables and simple responses [37][38][39], its applicability to more complex problems, such as tuning of corporate feeds or BM topologies, is limited. One of the reasons is that the discussed structures are characterized by complex electrical responses that cannot be reliably tracked using manual techniques. From this perspective, Butler matrix design stages should include synthesis, development of BM components, and fine-tuning of the assembled circuit oriented toward maximization of performance. For the sake of reliability, each step should be controlled by an appropriate algorithm and a rigorously defined objective function. Yet another problem, rarely considered in the literature in terms of BM design [29], is the large design cost associated with BM optimization. The latter stems from the necessity of using numerically expensive electromagnetic (EM) simulation models in order to obtain accurate responses of the matrix, and from the large number of evaluations required by the optimization algorithm to converge.
Analysis of the available literature indicates that the design of BM structures for IoT and 5G applications is pertinent to several challenges. First of all, the development of portable electronics is associated with steadily increasing requirements concerning both miniaturization and performance. However, for BM structures, these problems are treated separately, which results in the development of either relatively small components with standard (or close-to-standard) performance or bulky circuits with improved functionality [19,23,25,36]. Another problem involves the lack of a unified and well-established procedure dedicated to BM design, ranging from component synthesis to final tuning of the assembled circuit. Moreover, conventional approaches to the development of BM structures are unreliable and inefficient due to the large number of figures representing matrix performance (all of which have to be accounted for at a time) [27,28,30]. Although numerical optimization seems to be appropriate for addressing the difficulties pertinent to parameter adjustments, it suffers from high cost related to the large number of CPU-heavy evaluations required by the algorithm to converge. Consequently, maintaining a low computational budget (e.g., using surrogate methods [40][41][42]) is of high importance in increasing the usefulness of algorithm-based approaches to BM design. The motivation of this work is to address the discussed challenges pertinent to the design of modern BM circuits.
In this work, a bottom-up procedure for low-cost unattended design of miniaturized 4 × 4 Butler matrices with nonstandard output-port phase differences is presented. The proposed method consists of two main steps involving synthesis and optimization of BM building blocks, followed by two-stage EM-driven tuning of the assembled matrix. The main contributions of the work include: (i) development of an automated methodology for sequential design of BM components; (ii) determination of rigorously defined objective functions that facilitate handling the components/matrix design using robust numerical optimization algorithms; (iii) integration of all BM design stages (i.e., component synthesis, implementation, integration, and EM-driven tuning of the assembled matrix) into a design framework; and (iv) validation of the presented method using two compact BMs with nonstandard output-port phases. The considered case studies involve design of structures featuring phase shifts of {−30°, 150°, −120°, 60°} and {−20°, 160°, −110°, 70°}, respectively. The footprints of the optimized circuits are only 696 and 767 mm². To the best of the authors' knowledge, this is the first work that thoroughly discusses the problem of automated BM design oriented toward both miniaturization and enhancement of electrical performance. According to benchmark results, the computational cost of circuit design using the presented framework is over 80% lower compared to a more conventional strategy based on direct EM-driven optimization of the assembled matrix. The optimized structures have been compared against state-of-the-art circuits from the literature in terms of size and performance. A comparison of simulation and measurement results is also provided.

Design Problem and Models of Butler Matrix Components
The automated design process presented here is governed by numerical optimization algorithms. This section contains formulation of the design problem and definition of models used by the proposed framework. For the sake of consistency, a brief discussion of numerical algorithms used in the work is also included. The details concerning mechanisms embedded into the proposed design framework are provided in Section 3.

Design Problem
The design problem can be formulated as a nonlinear minimization task of the form [40]

$$x^* = \arg\min_{x} U(R(x)) \tag{1}$$

where $R(x)$ is the response of the structure (be it the BM or its sub-circuit) under design. The vectors $x$ and $x^*$ represent adjustable parameters of the structure and the optimal design to be found, whereas $U$ denotes an objective function. The goal of (1) is to find $x^*$ through minimization of $U$. The latter "translates" the structure response into a scalar value, which is used by the numerical algorithm to steer the design process toward the optimal solution. In other words, the function $U$ represents the structure performance, calculated based on responses of the circuit, at the given design.

Models of Butler Matrix Components
In the proposed bottom-up design framework, the determination of the final BM geometry is preceded by synthesis, implementation, and optimization of its individual components. The responses of BM sub-circuits can be obtained through evaluation of the ideal transmission-line $R_T(x_e)$, equivalent-circuit $R_C(x_g)$, or electromagnetic $R_E(x_g)$ models, respectively. Here, $x_e$ denotes the electrical parameters (characteristic impedance and electrical length of the transmission line), whereas $x_g$ is the vector of geometry variables. Conceptual illustrations of the discussed structure representations are shown in Figure 1. It should be emphasized that the type of structure model matches the complexity of the design task and the size of the search space in terms of computational cost and accuracy. In other words, the model fidelity increases along with narrowing the search space to the region of interest. Here, the synthesis and correction/refinement of BM components at the level of their electrical parameters is realized using the $R_T$ model, whereas $R_C$ and $R_E$ are applied for miniaturization-oriented development of building-block topologies and to obtain accurate responses for optimization and fine-tuning of the components.
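To make the ideal transmission-line representation more concrete, the following minimal sketch evaluates the S-parameters of a lossless line from its electrical parameters (characteristic impedance and electrical length) using the textbook ABCD-matrix relations. The function name `tl_s_parameters` and the 50-ohm reference impedance are assumptions made for this illustration; they are not taken from the paper.

```python
import math


def tl_s_parameters(z, theta_deg, z0=50.0):
    """Return (S11, S21) of a lossless line with impedance z and electrical
    length theta_deg, referenced to z0 (assumed 50 ohm)."""
    theta = math.radians(theta_deg)
    # ABCD parameters of a lossless transmission-line section
    a = d = math.cos(theta)
    b = 1j * z * math.sin(theta)
    c = 1j * math.sin(theta) / z
    # Standard ABCD-to-S conversion for equal port impedances
    den = a + b / z0 + c * z0 + d
    s11 = (a + b / z0 - c * z0 - d) / den
    s21 = 2.0 / den
    return s11, s21


# A matched quarter-wave line: no reflection, 90-degree phase delay
s11_matched, s21_matched = tl_s_parameters(50.0, 90.0)
# A quarter-wave line with z = 50*sqrt(2) between 50-ohm ports
s11_mismatch, _ = tl_s_parameters(50.0 * math.sqrt(2.0), 90.0)
```

Such a model is essentially free to evaluate, which is why it suits the synthesis stage where the search space is still wide.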

Figure 1. Illustration of structure representations (here, a phase shifter) supported by the design method: (a) ideal transmission line (TL) model $R_T(x_e)$ with $x_e = [Z\ \theta]^T$, (b) equivalent-circuit model $R_C(x_g)$, and (c) high-fidelity electromagnetic (EM) model $R_E(x_g)$. The vector of geometry parameters $x_g = [w\ l_1\ l_2]^T$ is the same for the $R_C$ and $R_E$ models.

Butler Matrix Representations
The Butler matrix models used here are defined as follows. Let $R_{Be}(y)$ and $R_{Bc}(z)$ denote the high-fidelity EM model of the assembled BM (i.e., the one where all BM subcomponents are implemented and physically interconnected in the form of a single full-wave EM design) and the composite representation of the structure, respectively. The variables $y$ and $z$ represent their design parameters. The composite model responses are obtained using transmission-line theory from the S-parameter characteristics of individual BM sub-circuits [43,44]. A conceptual illustration of the $R_{Be}$ and $R_{Bc}$ models is shown in Figure 2. It should be noted that evaluation of the $R_{Bc}$ model has to be preceded by simulations of its subcomponents (cf. Figure 2b). Nonetheless, the composite model is characterized by a much lower evaluation cost compared to $R_{Be}$. The vector $z$ can be set equal to $y$ but can also represent a composition of electrical and/or geometry variables of the sub-circuits selected for optimization (cf. Figure 2b). Example responses obtained from evaluation of the $R_{Be}$ and $R_{Bc}$ models for $z = y$ are shown in Figure 3. Despite reduced accuracy w.r.t. $R_{Be}$ (especially phase-wise), resulting from neglecting the coupling between adjacent components and the loss at their interconnections, the composite model is useful for narrowing the search space to the region of interest before the fine-tuning stage.
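The idea of composing a response from the S-parameters of separately characterized blocks can be illustrated with a deliberately simplified sketch. It cascades two-port blocks by converting each S-matrix to ABCD form, multiplying, and converting back. The actual composite BM model connects four-port couplers and crossovers, so this is only an analogy under stated assumptions: the helper names, the 50-ohm reference, and the matched lines used as stand-ins for simulated sub-circuits are all hypothetical.

```python
import cmath
import math

Z0 = 50.0  # reference impedance in ohms (assumed)


def line_s(theta_deg):
    """S-matrix of a matched lossless line, standing in for a simulated block."""
    t = cmath.exp(-1j * math.radians(theta_deg))
    return ((0j, t), (t, 0j))


def s_to_abcd(s, z0=Z0):
    """Convert a two-port S-matrix to ABCD parameters (equal port impedances)."""
    (s11, s12), (s21, s22) = s
    a = ((1 + s11) * (1 - s22) + s12 * s21) / (2 * s21)
    b = z0 * ((1 + s11) * (1 + s22) - s12 * s21) / (2 * s21)
    c = ((1 - s11) * (1 - s22) - s12 * s21) / (2 * s21 * z0)
    d = ((1 - s11) * (1 + s22) + s12 * s21) / (2 * s21)
    return ((a, b), (c, d))


def abcd_to_s(m, z0=Z0):
    """Convert ABCD parameters back to a two-port S-matrix."""
    (a, b), (c, d) = m
    den = a + b / z0 + c * z0 + d
    return (((a + b / z0 - c * z0 - d) / den, 2 * (a * d - b * c) / den),
            (2 / den, (-a + b / z0 - c * z0 + d) / den))


def cascade(*blocks):
    """Chain two-port blocks in order by multiplying their ABCD matrices."""
    total = ((1, 0), (0, 1))
    for s in blocks:
        (a1, b1), (c1, d1) = total
        (a2, b2), (c2, d2) = s_to_abcd(s)
        total = ((a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
                 (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2))
    return abcd_to_s(total)


# Two blocks of 30 and 60 degrees behave as a single 90-degree line
composite = cascade(line_s(30.0), line_s(60.0))
```

Because the blocks are combined purely at the network level, coupling between neighbouring components is ignored, which mirrors the accuracy limitation of the composite model noted above.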



Figure 2. A conceptual illustration of the Butler matrix (BM) models: (a) high-fidelity EM model of the assembled matrix where (i)-(iii) denote branch line coupler (BLC), crossover, and phase shifter, respectively, and (b) a composite representation of the structure. Frequency characteristics of the composite BM are obtained from the responses of individual components. In this illustration, $R_{Bc}$ is calculated using EM model responses of the crossover (design $x_{g.c}$) and BLCs (two designs: $x_{g.1}$ and $x_{g.2}$), as well as the ideal TL model of the phase shifters (two designs: $x_{e.1}$ and $x_{e.2}$).

Optimization Algorithms
As already mentioned, the proposed design method is governed by numerical optimization algorithms. The framework presented here exploits two routines: a trust-region-based gradient method and an unconstrained variant of the bisection algorithm. To make the work self-contained, both routines are briefly discussed below.


Trust-Region-Based Optimization
The main optimization engine is a gradient algorithm embedded in a trust-region framework. The method generates a series of approximations $x^{(i)}$ (i = 1, 2, 3, ...) to the final design by solving [45]

$$x^{(i+1)} = \arg\min_{x:\ \|x - x^{(i)}\| \le r^{(i)}} U\big(G^{(i)}(x)\big) \tag{2}$$

where $G^{(i)}$ is the first-order Taylor surrogate constructed from the S-parameter responses of the structure at hand. The model is given as [45]

$$G^{(i)}(x) = R(x^{(i)}) + J(x^{(i)})\,(x - x^{(i)}) \tag{3}$$

Here, $R(x^{(i)})$ is the response of the structure at the design $x^{(i)}$ and $J$ is its Jacobian. The perturbations for generation of the Jacobian $J$ are obtained using large-step finite differentiation [45]. Note that the model $G^{(i)}$ may be constructed using a combination of the responses determined from evaluations of the low- and high-fidelity representations of the circuit under design [46]. The parameter $r^{(i)}$ represents the trust-region radius, which is iteratively updated based on the calculated gain ratio, i.e., the obtained versus predicted change of the objective function. The radius is updated using standard rules [45]. A more detailed discussion of the algorithm can be found in [45,46].
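The iteration described above can be sketched in a few lines. The example below is not the implementation used in this work: it assumes a sum-of-squares objective so that the linearized sub-problem can be solved approximately by shortening the least-squares step to the trust-region radius, it uses a small forward-difference step rather than large-step differentiation, and its gain-ratio thresholds (0.25 and 0.75) and radius factors are common textbook choices. The response `toy_response` is a hypothetical stand-in for an EM simulation.

```python
import numpy as np


def finite_difference_jacobian(response, x, step=1e-6):
    """Forward-difference Jacobian of the response vector at x."""
    r0 = response(x)
    jac = np.zeros((r0.size, x.size))
    for k in range(x.size):
        shifted = x.copy()
        shifted[k] += step
        jac[:, k] = (response(shifted) - r0) / step
    return r0, jac


def trust_region_search(response, x0, radius=1.0, max_iter=50, tol=1e-9):
    """Minimize U(R(x)) = ||R(x)||^2 using a linear surrogate in a trust region."""
    objective = lambda r: float(r @ r)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r0, jac = finite_difference_jacobian(response, x)
        # Minimizer of the linearized objective ||r0 + J s||^2 ...
        step, *_ = np.linalg.lstsq(jac, -r0, rcond=None)
        length = np.linalg.norm(step)
        if length > radius:
            step *= radius / length  # ... shortened to stay inside the region
        predicted = objective(r0) - objective(r0 + jac @ step)
        if predicted <= 0.0:
            break
        candidate = x + step
        # Gain ratio: obtained versus predicted change of the objective
        gain = (objective(r0) - objective(response(candidate))) / predicted
        if gain > 0.0:
            x = candidate  # accept only if the true objective improved
        if gain > 0.75:
            radius *= 2.0
        elif gain < 0.25:
            radius /= 3.0
        if np.linalg.norm(step) < tol or radius < tol:
            break
    return x


def toy_response(x):
    """Hypothetical residual vector standing in for (response - target)."""
    return np.array([x[0] ** 2 - 1.0, x[1] - 2.0, x[0] * x[1] - 2.0])


x_opt = trust_region_search(toy_response, [2.0, 1.0])
```

Each iteration costs one response evaluation per design variable for the Jacobian plus one for the candidate design, which is why the number of variables and the cost of a single simulation dominate the overall budget.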

Bisection-Based Heuristic
Another algorithm used in this work is a simple unconstrained variant of the bisection method [47]. Let $x_{0.1}$ and $x_{0.2}$ represent the starting points for the algorithm. Here, the vector $x_{0.1}$ is obtained as a result of structure synthesis, whereas $x_{0.2}$ represents a perturbation of all design parameters w.r.t. $x_{0.1}$. The algorithm flow is as follows: go to step 2; otherwise go to step 4.

7. If $|x^{(i)} - x^{(i-1)}| \le \varepsilon$, set $x^* = x^{(i)}$ and END; otherwise go to step 4.

It should be noted that $\alpha$ and $\varepsilon$ are user-defined parameters. Here, $U(x) = U(R(x))$ represents the objective function value calculated based on the model response at the design $x$. The discussed bisection-based algorithm is useful for approximating dimensions of the miniaturized BM components.

Methodology
The framework presented involves two main design stages: (i) synthesis and design of individual BM components and (ii) optimization of the composite BM model followed by fine-tuning of the assembled structure. Here, a detailed discussion of each design step and a summary of the presented methodology are provided. The numerical and experimental validation of the framework is considered in Section 4.3.

Synthesis of Butler Matrix and Its Components
A conceptual illustration of the considered 4 × 4 Butler matrix with nonstandard output-port phase differences is shown in Figure 4. The structure comprises two pairs of hybrid branch line couplers with adjustable phase, as well as two crossovers and four phase shifters. Excitation of the matrix through the selected input port $P_j$ (j = 1, 2, 3, 4) allows for obtaining the phase differences $\Delta\theta_j$ between its output ports $P_{5-8}$. The relation between output phase shifts and electrical lengths of the structure components is given by (4)-(6) [23]. Here, $\beta_1$ and $\beta_2$ represent phase shifts introduced by the first and second pair of BLCs, whereas $\beta_3$ is the electrical length of the phase shifter. As shown in Figures 4 and 5a, the component $\beta_c$ represents the electrical length of the crossovers. Based on (4)-(6), one can infer that phase differences at the output ports of the BM are a function of $\beta_2$. Consequently, $\Delta\theta_1 = 0.5\beta_2$, $\Delta\theta_2 = 0.5\beta_2 + \pi$, $\Delta\theta_3 = 0.5\beta_2 - 0.5\pi$, and $\Delta\theta_4 = 0.5\beta_2 + 0.5\pi$ [23]. The feasible ranges of phase shifts $\beta_1$ and $\beta_2$ (i.e., the ones for which realizable topologies of compact BLCs can be obtained) vary from −30° to −90°. Therefore, the attainable output-port phase differences are $\Delta\theta_1 \in$ [−45°; −15°], $\Delta\theta_2 \in$ [135°; 165°], $\Delta\theta_3 \in$ [−135°; −105°], and $\Delta\theta_4 \in$ [45°; 75°], respectively. The parameters $\beta_{1-3}$ are used as the starting point for design of individual matrix components.
The illustration of the BLC structure capable of obtaining the desired phase and its comparison with the conventional 90° hybrid is shown in Figure 5b,c. Given the phase difference $\beta_k$ (k = 1, 2), the kth coupler consists of a pair of 90° TL sections with characteristic impedance $z_{1.k}$ and equal electrical length of $\phi_{1.k}$, as well as a pair of TLs with impedance $z_{2.k}$ and electrical lengths $\phi_{2.k}$ and $\phi_{3.k}$, respectively. Electrical parameters of the circuit can be obtained from (7)-(9) [48]. Here, (8) is solved for $\phi_{2.k}$ with $z_{2.k} = 0.5 \times 2^{0.5}$ to provide the closed-form estimation (10) of BLC parameters.
Although solving (7), (9), and (10) using the considered $z_{2.k}$ provides a good estimation of $\phi_{2.k}$ and $\phi_{3.k}$ w.r.t. the required phase shift, it does introduce a power-split error at the operating frequency. To address the problem, the ideal model of the coupler is optimized using the algorithm of Section 2.4.1 and the aggregated objective function (11). Here, (11) is minimized based on the $R_T(x_e)$ model responses, where $x_e = x_{e.k} = [z_{1.k}\ z_{2.k}\ \phi_{2.k}\ \phi_{3.k}]^T$ represents the vector of electrical parameters of the kth BLC (note that $\phi_{1.k}$ = 90°). The parameter $\Delta C = ||S_{31}| - |S_{21}||$ denotes the power-split imbalance at the center frequency $f_0$, whereas $M = \max(|S_{11}|, |S_{41}|)$ is the in-band performance of the BLC over the frequency range of interest defined around $f_0$ [46]. The figures $\phi_1 = \angle(S_{21}/S_{31})$ and $\phi_2 = \angle(S_{24}/S_{34})$ are phase shifts at $f_0$. The parameters $M_0$ = −20 dB, $\phi_{0.1} = \beta_k$, and $\phi_{0.2} = \beta_k - \pi$ represent target values. The scaling coefficients $[\alpha_1\ \alpha_2\ \alpha_3]$ = [1000 500 1000] are determined so as to ensure balanced contribution of the design requirements to the aggregated objective function (11) [49,50]. In other words, they maintain similar relative importance of the sub-elements in (11) during the optimization process. The selected values are appropriate for BM circuit components synthesized using (7)-(10).
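The dependence of the output-port phase differences on $\beta_2$ stated in this section can be checked with a short worked example. The sketch below simply evaluates the four relations in degrees; the function name `output_phase_differences` is an assumption made for illustration. For $\beta_2$ = −60° and $\beta_2$ = −40°, it reproduces the two sets of phase differences listed for the case studies in the Introduction.

```python
def output_phase_differences(beta2_deg):
    """Return [dtheta_1, dtheta_2, dtheta_3, dtheta_4] in degrees for a given
    phase shift beta_2 (in degrees) of the second pair of couplers."""
    half = 0.5 * beta2_deg
    return [
        half,          # dtheta_1 = 0.5 * beta_2
        half + 180.0,  # dtheta_2 = 0.5 * beta_2 + pi
        half - 90.0,   # dtheta_3 = 0.5 * beta_2 - 0.5 * pi
        half + 90.0,   # dtheta_4 = 0.5 * beta_2 + 0.5 * pi
    ]


case_1 = output_phase_differences(-60.0)  # [-30.0, 150.0, -120.0, 60.0]
case_2 = output_phase_differences(-40.0)  # [-20.0, 160.0, -110.0, 70.0]
```

Evaluating the same function at the limits of the feasible range ($\beta_2$ = −90° and −30°) gives the end points of the attainable intervals of each phase difference.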

Sequential Design of BM Components
The design of individual BM components is realized as a sequential process that involves determination of BLC and crossover geometries, as well as tuning of the phase shifters. For the given center frequency $f_0$, the crossover (see Figure 5a for illustration) can be considered a "static" component of the matrix. In other words, its dimensions do not change when the BM is redesigned for another set of output phase differences. Consequently, the same design can be used to realize a range of $\Delta\theta_j$. The crossover dimensions are adjusted in two steps. First, the selected geometry is optimized for minimization of $U_{2.1} = \max(|S_{11}|, |S_{33}|)$ at the center frequency $f_0$, where $|S_{11}|$ and $|S_{33}|$ represent reflection of the crossed TLs. Next, the design is oriented toward ensuring that the electrical lengths of the crossed transmission lines, $\beta_{c.1}$ and $\beta_{c.2}$, are equal at $f_0$. This is achieved by minimization of the objective function $U_{2.2} = (\beta_{c.1} - \beta_{c.2})^2$. In each design step, the optimization is carried out using the algorithm of Section 2.4.1 and the $R_E(x_{g.c})$ model responses. Here, the vector $x_{g.c}$ represents geometry parameters of the crossover. The final design $x_{g.c}^*$ provides equal-length lines ($\beta_c = \beta_{c.1} \approx \beta_{c.2}$) with low reflection and high isolation levels (note that the term isolation refers to attenuation of the signal and is expressed as an absolute value of the transmission, in dB, between the selected pair of ports), all important for high BM performance.
The next design stage involves development of compact BLCs. For each coupler, the initial design is synthesized as described in Section 3.1. The design of the kth BLC (cf. Section 3.1) can be summarized as follows:

1. Decompose the ideal BLC model into individual TLs.
2. Use electrical parameters of the TLs as the reference for development of miniaturized BLC sections.
3. Optimize the BLC sections to match electrical parameters of the reference TLs.
4. Construct the miniaturized BLC using the optimized cells and define the vector of its design parameters $x_{g.k}$.
5. Optimize the compact BLC using objective function (11) and the algorithm of Section 2.4.1.
Note that, in step 3, each section of the miniaturized BLC must be optimized to ensure geometrical consistency of the structure, as well as to provide sufficient flexibility for the tuning of phase shifts. In steps 2, 4, and 5, the BLC optimization is carried out using the algorithm of Section 2.4.1, whereas in step 3 the routine of Section 2.4.2 is used. It should be emphasized that R E model evaluations are used only in the last stage of the BLC development, whereas models R T and R C are used in stage 1 and stages 2-4, respectively. Shifting the optimization burden to the simplified models is important for maintaining low cost of coupler development. For more detailed discussion on design of miniaturized BLCs, see [29,34,46,51]. The optimized high-fidelity BLC designs x g.1 * and x g.2 * are used as the starting point for BM tuning.
The final stage of the sequential design process involves development of phase shifters. Conventional BM structures comprise equal-length PSs. Here, however, unequal-length shifters are used to increase flexibility of the BM in terms of control over the output-port phase differences. The electrical parameters of PSs are optimized using the composite BM model. In this step, the model integrates EM responses of optimized BLCs and crossovers (which remain fixed in the optimization process). The phase shifters are implemented in the form of ideal TLs. The composite model is optimized to minimize the objective function (12). Here, M B and M Bf0 represent the maximum value of the reflection and isolation responses between BM input ports (expressed in dB) within the frequency range of interest and at f 0 , respectively. The figure ∆C E.j denotes the magnitude (in dB) of transmission to the output ports at f 0 when the structure is fed through the jth port (N = 4). Similarly, P E.j is a normalized phase difference at the output ports for excitation through the jth port. The parameters M B0 = −15 dB, ∆C 0 = 0.2 dB, P 0 = 1.5°, and M f0 = −30 dB represent the target values for matrix design. The weights [α 1 α 2 α 3 α 4 ] = [400 1 1 10] are determined based on numerical studies.
The optimized electrical lengths of PSs are used as the target for determination of each shifter's physical dimensions. The phase shifters are implemented in the form of meandered TLs. The optimization of each PS is realized separately and is oriented toward minimization of (12). The initial dimensions of the meander lines are determined based on the transmission-line theory. The design objective for refinement of pth (p = 1, 2, 3, 4) phase shifter geometry is U 4 = (e 0.p − e p ) 2 , where e p is the electrical length of the structure under design and e 0.p represents the target length. Due to low evaluation cost and quick convergence of the algorithm (2), each meandered PS is implemented only in the form of high-fidelity model R E . The Butler matrix components determined in this stage of the design process are used for further optimization and tuning of the assembled structure.
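The starting point for a meandered PS and the objective U 4 can be sketched as follows. The sketch assumes a uniform line whose physical length follows from the target electrical length and the guided wavelength. The effective permittivity value and all function names are hypothetical, because the corresponding quantities are not listed in this section.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def guided_wavelength_mm(f0_hz: float, eps_eff: float) -> float:
    """Guided wavelength of a quasi-TEM line: lambda_g = c / (f0 * sqrt(eps_eff))."""
    return C0 / (f0_hz * math.sqrt(eps_eff)) * 1e3


def initial_length_mm(e_target_deg: float, f0_hz: float, eps_eff: float) -> float:
    """Unfolded line length that realizes the target electrical length at f0."""
    return e_target_deg / 360.0 * guided_wavelength_mm(f0_hz, eps_eff)


def u4(e_target_deg: float, e_deg: float) -> float:
    """U_4 = (e_0.p - e_p)^2: squared error of the electrical length of the pth PS."""
    return (e_target_deg - e_deg) ** 2


# Hypothetical inputs: eps_eff = 2.7 is an assumed value, not taken from the paper.
length0 = initial_length_mm(e_target_deg=90.0, f0_hz=2.6e9, eps_eff=2.7)
error = u4(e_target_deg=90.0, e_deg=88.5)
```

The resulting length only seeds the optimization. Meandering introduces bends and coupling between adjacent segments, which is why the geometry is subsequently refined at the EM level.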

Butler Matrix Optimization and Fine-Tuning
For the sake of low computational cost, the BM optimization is realized using the composite model R Bc . Here, the parameters of all BLCs and PSs are enabled for adjustment. The optimization is oriented toward minimization of (12) using algorithm of Section 2.4.1.
The main goal of this step is to further narrow down the search space to the region of interest so as to reduce the number of R Be evaluations required to find the final design. It should be reiterated that the evaluation cost of R Be is much higher (at least 3-fold) compared to the simulation cost of the composite structure.
Fine-tuning of the structure is again realized through minimization of (12). Here, the low cost of the process is maintained using a modified Taylor-expansion model (3), which exploits Jacobian J constructed based on simulations of the composite model, whereas the high-fidelity model simulations are performed only at the center design (i.e., R(x (i) ) = R Be (y (i) )) [46]. Consequently, each iteration of the tuning process requires only two EM simulations of each subcomponent (i.e., a total of four and eight simulations for couplers and phase shifters, respectively) and single evaluation of the assembled BM. The design y * obtained after the fine-tuning stage is the final solution of the presented design process.
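The idea behind the fine-tuning surrogate can be sketched as a first-order model that anchors the high-fidelity response at the current design and borrows sensitivities from the cheaper composite model. The sketch below assumes a forward finite-difference Jacobian, and the toy `coarse` and `fine` functions merely stand in for R Bc and R Be. It is not the exact formulation of model (3).

```python
def fd_jacobian(model, x, rel_step=1e-2):
    """Forward finite-difference Jacobian (rows: responses, columns: parameters)."""
    r0 = model(x)
    columns = []
    for k in range(len(x)):
        h = rel_step * abs(x[k]) if x[k] != 0 else rel_step
        xp = list(x)
        xp[k] += h
        rk = model(xp)
        columns.append([(a - b) / h for a, b in zip(rk, r0)])
    return [list(row) for row in zip(*columns)]


def make_surrogate(r_fine_center, jac_coarse, center):
    """L(x) = R_fine(center) + J_coarse(center) * (x - center)."""
    def surrogate(x):
        dx = [xi - ci for xi, ci in zip(x, center)]
        return [r + sum(j * d for j, d in zip(row, dx))
                for r, row in zip(r_fine_center, jac_coarse)]
    return surrogate


# Toy stand-ins (hypothetical): the "fine" model is the "coarse" one plus an offset.
def coarse(x):
    return [x[0] + 2.0 * x[1], x[0] * x[1]]


def fine(x):
    c = coarse(x)
    return [c[0] + 0.1, c[1] - 0.2]


center = [1.0, 2.0]
model = make_surrogate(fine(center), fd_jacobian(coarse, center), center)
prediction = model([1.1, 2.0])  # one fine evaluation was needed, at the center only
```

Because the Jacobian is estimated from the composite model, the expensive assembled-structure simulation is needed only once per iteration, which is the source of the savings discussed above.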

Summary of the Design Framework
The proposed framework for design of miniaturized Butler matrices with unconventional phase differences can be summarized as follows:
1. Define the desired performance of the Butler matrix.

7. Perform topology development of kth miniaturized BLC and optimize its EM model.
8. If k = 2, go to step 9; otherwise set k = k + 1 and go to step 7.
9. Optimize ideal models of phase shifters through minimization of (12), set p = 1.
10. Generate initial dimensions of pth PS and optimize its EM model (cf. Section 3.2).
11. If p = 4, go to step 12; otherwise set p = p + 1 and go to step 10.
12. Optimize composite model of the BM.
13. Perform fine-tuning of the BM.
It should be noted that the design process described here is automated. Consequently, once the models of components and assembled BM are prepared, the role of the user is reduced to definition of the performance requirements. The design bounds for constrained optimization stages (i.e., the ones governed by the TR algorithm) are defined as ±30% around the starting point for each step. The computational cost of the design process realized using the proposed method is comparable to around a dozen R Be model simulations. Typically, each step that involves EM model evaluations requires no more than 10 iterations of the algorithm (2) to converge. The cost associated with synthesis of BM, BLCs, and PSs is negligible as it only requires evaluations of the transmission-line or equivalent-circuit models. It is worth mentioning that the conventional matrix with phase shifts of ±45° and ±135° is just a special case for the presented methodology. Consequently, the proposed framework is applicable to the design of eight-port BMs for which the requirements concern size reduction, performance enhancement, or a combination thereof. From this perspective, the methodology represents a generalized approach to the design of the 4 × 4 Butler matrices discussed in Section 3.
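The ±30% design bounds can be generated directly from the starting point of each step. The following minimal sketch assumes box constraints applied independently to each parameter. The helper names and the sample vector are hypothetical.

```python
def relative_bounds(x0, rel=0.30):
    """Lower/upper bounds at +/- rel (default 30%) around the starting point x0."""
    lower = [min(v * (1.0 - rel), v * (1.0 + rel)) for v in x0]
    upper = [max(v * (1.0 - rel), v * (1.0 + rel)) for v in x0]
    return lower, upper


def clip_to_bounds(x, lower, upper):
    """Project a candidate design onto the box defined by the bounds."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]


# Hypothetical starting point (e.g., two line dimensions in mm).
x_start = [10.0, 0.5]
lb, ub = relative_bounds(x_start)
x_candidate = clip_to_bounds([14.0, 0.2], lb, ub)
```

The min/max pair keeps the bounds ordered even if a parameter happens to be negative, so the same helper can be reused for every constrained stage.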

Numerical Results and Experiment
In this section, the proposed bottom-up design framework is demonstrated based on two examples of compact 4 × 4 Butler matrices. Both structures are implemented on a dielectric substrate with ε r = 3.48, h = 0.168 mm, and tan δ = 0.0037. The center frequency for the considered BMs is set to f 0 = 2.6 GHz, whereas the desired operational bandwidth is from 2.5 to 2.7 GHz. The considered range falls within the sub-6 GHz spectrum used by the 5G technology. The presented framework has been benchmarked against a conventional approach to BM design. Furthermore, the considered matrices have been compared against the state-of-the-art designs from the literature. The measurement results obtained for one of the matrices have also been included and discussed.
The presented framework is implemented in MATLAB. The latter controls synthesis of BM components, integration of circuits, optimization, as well as bidirectional communication with external simulation packages. Evaluations of the equivalent-circuit models are performed using Keysight ADS software, whereas simulations of the EM models are handled using CST Studio packages.
The geometry of the assembled Butler matrix is shown in Figure 9b. The design parameters of the composite model R Bc used for optimization are z = [x g.1 x g.2 l 1 l 2 l 3 l 4 …

Figure 10. Each row shows responses of the structure "seen" from a different input port (P 1…4 ). Gray and black lines represent magnitude and phase responses, respectively.
The simulation results indicate that optimization of the composite model is important for improving transmission responses of the matrix, whereas fine-tuning provides correction of the output-port phase differences. At the final design y * , the R Be model response features in-band matching and isolation above the level of 15.5 dB. Furthermore, at the center frequency it offers insertion loss imbalance below 0.5 dB and phase shift errors below 2.6°. It should be noted that although the optimized design slightly violates the target values of (12), defined in Section 3.2, this is justified as the objective function comprises a composition of design requirements, balanced by the user-defined weighting factors. Table 1 provides more information on performance of the matrix in terms of reflection R BW , isolation I BW , transmission imbalance ∆M BW , and phase imbalance ∆P BW within the 2.5 to 2.7 GHz band, as well as transmission ∆M f0 and phase ∆P f0 imbalances at the center frequency. The quantities R BW and I BW refer to the worst-case in-band performance (across all considered responses), whereas imbalance represents the maximum difference between the considered groups of characteristics, either within the band of interest or at the center frequency.
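The performance figures described above can be computed from sampled responses, as in the following minimal sketch. It assumes that the phase error is evaluated between adjacent output ports for a single excitation port. All sample values and function names are hypothetical and do not reproduce the data of Table 1.

```python
def worst_case_db(curves):
    """Worst-case (largest) value over a group of in-band reflection curves, in dB."""
    return max(max(curve) for curve in curves)


def imbalance(values):
    """Maximum difference within a group of characteristics (e.g., |S| in dB at f0)."""
    return max(values) - min(values)


def wrap_deg(angle):
    """Wrap an angle to the interval [-180, 180) degrees."""
    return (angle + 180.0) % 360.0 - 180.0


def max_phase_error_deg(output_phases_deg, target_diff_deg):
    """Largest deviation of adjacent-port phase differences from the target value."""
    diffs = [wrap_deg(b - a)
             for a, b in zip(output_phases_deg, output_phases_deg[1:])]
    return max(abs(wrap_deg(d - target_diff_deg)) for d in diffs)


# Hypothetical samples for one excitation port.
reflection_curves = [[-18.0, -22.0, -16.5], [-20.0, -25.0, -17.0]]  # in-band, dB
transmission_f0 = [-6.3, -6.5, -6.1, -6.4]                          # dB at f0
phases_f0 = [10.0, -20.0, -50.0, -80.5]                             # degrees at f0

r_bw = worst_case_db(reflection_curves)
dm_f0 = imbalance(transmission_f0)
dp_f0 = max_phase_error_deg(phases_f0, target_diff_deg=-30.0)
```

Repeating the evaluation for all four input ports and taking the largest values would give worst-case figures of the kind reported for the optimized matrices.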
The proposed design procedure has been benchmarked against the method where the fine-tuning step governed by algorithm (2) involves only evaluations of the R Be model [45]. For fair comparison, it is assumed that the design z (0) , obtained through individual optimization of BM components, is used as a starting point for adjustment of the structure topology. The assumption seems justified considering that it follows the industry-wide divide-and-conquer strategy for the design of complex circuits [27,36,53,54]. The tests have been performed on an Intel Xeon machine with 32 GB RAM. For the given machine, the average cost of EM simulation of the BLC, PS, and assembled BM amounts to 4.5 min, 36 s, and 41 min, respectively. The results shown in Table 2 indicate that the proposed procedure is capable of yielding a design with competitive performance compared to the method that does not blend the R Be and R Bc models, yet at a fraction of its computational cost (13.2 h for the presented approach vs. 91.6 h for the conventional method). The number of model evaluations in Table 2 refers to all simulations of particular models required to find the final design.
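The reported savings follow from simple arithmetic on the quoted figures, as sketched below. The per-iteration estimate assumes the simulation counts stated for the fine-tuning stage (four BLC, eight PS, and one assembled-BM simulation) together with the average simulation times given above. The function names are hypothetical.

```python
def relative_saving(cost_proposed_h: float, cost_reference_h: float) -> float:
    """Fraction of the reference cost saved by the proposed procedure."""
    return 1.0 - cost_proposed_h / cost_reference_h


def iteration_cost_min(n_blc: int, n_ps: int, n_bm: int,
                       t_blc: float = 4.5, t_ps: float = 0.6,
                       t_bm: float = 41.0) -> float:
    """Wall-clock cost (minutes) of one iteration for given simulation counts.

    Defaults are the average simulation times quoted in the text:
    4.5 min per BLC, 36 s (0.6 min) per PS, and 41 min per assembled BM.
    """
    return n_blc * t_blc + n_ps * t_ps + n_bm * t_bm


saving = relative_saving(13.2, 91.6)               # about 0.856
per_iteration = iteration_cost_min(4, 8, 1)        # about 63.8 min
```

A saving of roughly 86% is consistent with the claim of a reduction exceeding 80%. For context, under the same timing assumptions an iteration that perturbed every parameter at the R Be level would require one 41-min simulation per parameter.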

Butler Matrix 2
The second design example is the Butler matrix structure featuring phase shifts […]. […] 36]ᵀ has been found after nine iterations of (2). The optimized BM is characterized by the dimensions of 25.9 mm × 29.6 mm and overall footprint of only 766.6 mm². The response characteristics of the matrix are shown in Figure 11. The BM features in-band reflection below −14 dB and isolation higher than 20 dB. Moreover, at f 0 the structure offers insertion loss imbalance below 0.55 dB and phase shift imbalance below ±2.5° w.r.t. the target values. It should be noted that the slightly worsened in-band reflection compared to the structure of Section 4.1 results from the narrow bandwidth of the BLC required for the realization of the −45° phase shift. The performance characteristics of the optimized structure are gathered in Table 3 (for explanation of used performance metrics, see Section 4.2).

Numerical and Experimental Validation
Simulation-based validation of the optimization results has been performed through evaluation of the optimized designs using a solver based on the method of moments (Keysight ADS). A comparison of the performance characteristics obtained using CST Studio and Keysight ADS is shown in Figures 12 and 13, respectively. The simulation results for BMs feature reflection below −12 dB and isolation above 13 dB within the bandwidth of interest. Regardless of the selected excitation port, the first design offers the insertion-loss imbalance below 0.9 dB and phase shift error below 8°. For the second matrix, the values are 1.5 dB and 9°, respectively. The obtained results indicate that the EM simulation results are valid.
For further verification of the results, the structure of Section 4.2 has been fabricated and measured. The photograph of a manufactured prototype is shown in Figure 14. To facilitate the measurement process, feed lines of the structure have been modified to make space for connectors. A comparison of reflection/isolation characteristics obtained from simulations and measurements is shown in Figure 15.
The misalignment between the responses (most notably the frequency shift) is the consequence of the systematic error caused by differences between the relative permittivity and thickness of the substrate used for simulations and measurements. As can be seen from Figure 16, accounting for systematic errors significantly improves alignment between the responses. Other factors affecting the discrepancies include manufacturing tolerances, as well as assembly-related (manual positioning/soldering of the connectors) and measurement-related errors (also resulting from tolerances of the matching loads) [55]. It should be emphasized that connectors alter electrical properties of the measured BM, which is mostly due to discontinuities at their interface with microstrip lines. The effect contributes to degradation of measured performance as compared to simulations. Nevertheless, the agreement between the simulations and measurements can be considered acceptable. A summary of the measured structure performance is provided in Table 4.
In many cases, resemblance between EM simulations and measurements can be further increased by excluding the effects of the fixture (here, connectors along with introduced feeding lines) on the device under test [56]. Unfortunately, high sensitivity to precision of the assembly makes the method unsuitable for the discussed setup.
Sensors 2021, 21, 851

Comparison with Benchmark Structures
The optimized designs from Sections 4.1 and 4.2 have been compared in terms of size and performance with other planar BMs from the literature [16,21,23,27–30]. Whenever possible (all benchmark designs except the one reported in [28]), the performance figures have been obtained based on the EM simulation results. The considered figures include magnitude imbalance ∆M f0 and phase shift ∆P f0 at the center frequency, as well as bandwidth BW. The latter is expressed in percent and calculated as a ratio of the difference between the upper and lower frequencies (i.e., the ones for which isolation is above 15 dB and reflection below −15 dB) to the specified f 0 . All of the considered figures represent the worst-case scenario for a series of analyses w.r.t. all input ports. For fair comparison of size, the dimensions of all matrices are expressed in terms of a guided wavelength λ g (defined for the given center frequency and electrical parameters of the substrate used to implement the circuit). The results shown in Table 5 indicate that the designs obtained using the proposed procedure are characterized by competitive performance (with particular emphasis on unconventional output-port phase differences). Moreover, at roughly 10-fold smaller size w.r.t. conventional BM, the obtained designs outperform other structures in terms of miniaturization. It should be emphasized that compact dimensions, competitive magnitude/phase performance, and the center frequency of around 2.6 GHz make the optimized BMs of potential use for IoT devices interconnected through a 5G backbone [57].
Notes to Table 5: * Worst-case performance obtained for excitation of the BM using ports P 1-4 ; # Multilayer structure implemented using substrates with different electrical parameters; $ With respect to conventional Butler matrix; ! Simulation results not available (data obtained from measurements); & Calculated for reflection below −15 dB and isolation above 15 dB.
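The bandwidth figure used in the comparison can be evaluated from a frequency sweep, as in the minimal sketch below. It assumes that the band is the contiguous set of sampled frequencies around f 0 that satisfy both criteria, without interpolation between samples. The sample responses and the function name are hypothetical.

```python
def fractional_bandwidth(freqs, refl_db, isol_db, f0, level_db=15.0):
    """Percent bandwidth around f0 with reflection < -level and isolation > level.

    Isolation is given as a positive number (absolute value of transmission in dB),
    following the convention used in the text.
    """
    ok = [r < -level_db and i > level_db for r, i in zip(refl_db, isol_db)]
    center = min(range(len(freqs)), key=lambda k: abs(freqs[k] - f0))
    if not ok[center]:
        return 0.0
    lo = hi = center
    while lo > 0 and ok[lo - 1]:          # extend toward lower frequencies
        lo -= 1
    while hi < len(freqs) - 1 and ok[hi + 1]:  # extend toward higher frequencies
        hi += 1
    return 100.0 * (freqs[hi] - freqs[lo]) / f0


# Hypothetical worst-case responses sampled in GHz and dB.
freqs = [2.4, 2.5, 2.6, 2.7, 2.8]
reflection = [-10.0, -16.0, -25.0, -17.0, -12.0]
isolation = [14.0, 16.0, 22.0, 18.0, 16.0]

bw_percent = fractional_bandwidth(freqs, reflection, isolation, f0=2.6)
```

For these sample data the criteria hold from 2.5 to 2.7 GHz, which corresponds to a bandwidth of about 7.7% relative to 2.6 GHz. A denser sweep or interpolation of the crossing points would refine the estimate.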

Conclusions
In this work, a bottom-up framework for low-cost automated design of 4 × 4 Butler matrices with nonstandard output-port phase shifts has been presented. The technique involves sequential development of BM components followed by optimization of the composite structure representation and fine-tuning of the assembled matrix. Each design step is governed by an optimization algorithm. The proposed approach has been demonstrated using two compact BM structures designed to operate at the frequency of 2.6 GHz. The first circuit, designed to provide phase shifts of {−30°, 150°, −120°, 60°}, offers simulation-based reflection below −15 dB within the 2.5-2.7 GHz band, along with low transmission and phase shift imbalance of 0.5 dB and ±2.5°. The second structure realizes phase shifts of {−20°, 160°, −110°, 70°} with in-band reflection below −14 dB, as well as magnitude and phase imbalance below 0.55 dB and ±2.5°, respectively.
The computational cost of BM design using the proposed strategy is over 80% lower compared to a more conventional approach that does not exploit composite models. The structures have been compared against state-of-the-art BMs from the literature in terms of performance and size. With the footprints of only 696 and 767 mm², the optimized circuits outclass BMs from the literature in terms of size reduction while providing competitive performance. The unique feature of the presented designs, which makes them of potential use for IoT systems interconnected through the 5G communication network, is that they address requirements concerning both size reduction and enhanced performance, whereas the benchmark designs address only one of these figures at a time. The measurements of the fabricated BM prototype are also provided.
Future work will focus on increasing the flexibility of BMs in terms of attainable output-port phase shifts, as well as on development of design methods that support algorithm-driven integration of feeding networks and radiators into a complete antenna array.