Nanowire systems: technology and design

Nanosystems are large-scale integrated systems exploiting nanoelectronic devices. In this study, we consider double independent gate, vertically stacked nanowire field effect transistors (FETs) with gate-all-around structures and a typical diameter of 20 nm. These devices, which we have successfully fabricated and evaluated, control the ambipolar behaviour of the nanostructure by selectively enabling one type of carrier. These transistors work as switches with electrically programmable polarity and thus realize an exclusive-OR (XOR) operation. The intrinsically higher expressive power of these FETs, compared with standard complementary metal oxide semiconductor (CMOS) technology, enables us to realize more efficient logic gates, which we organize as tiles to realize nanowire systems by regular arrays. This article surveys both the technology for double independent gate FETs and the physical and logic design tools needed to realize digital systems with this fabrication technology.


Introduction
Nanosystems are integrated systems exploiting nanoelectronic devices. Extreme miniaturization has multiple positive effects, including better electronic properties (e.g. performance) and lower cost. In particular, this work considers silicon nanowire (SiNW) technology as a possible replacement/enhancement of current device technologies and design issues for

Technology overview
Here, we introduce the technology of DG-SiNWFETs and the associated circuit structures.

(a) Transistors with controllable polarity
The ambipolar conduction phenomenon is observable in several nanoscale FET devices (45 nm node and below), including silicon [10], carbon nanotubes [11] and graphene [12]. The control of the ambipolarity allows us to adjust the device polarity online. Such transistors, i.e. with a controllable polarity, have been experimentally fabricated in several novel technologies, such as carbon nanotubes [13], graphene [14] and SiNWs [15,16]. To the best of our knowledge, Sacchetto et al. [17] and De Marchi et al. [18] were the first to fabricate and test successfully SiNW transistors with independent individual control. They introduced DG-SiNWFETs where one gate controls the polarity (i.e. type of carrier, n or p), whereas the other gate controls the carrier flow in the channel. The operation of these FETs is enabled by the regulation of Schottky barriers on source/drain junctions through the additional gate.
In particular, De Marchi et al. [18] fabricated vertically stacked SiNWFETs, featuring two gate-all-around (GAA) electrodes (figure 1). Vertically stacked GAA SiNWs represent a natural evolution of FinFET structures, providing better electrostatic control over the channel and consequently superior scalability properties [18].
In the device, one gate electrode, the control gate (CG), acts conventionally by turning the device on and off. The other electrode, the polarity gate (PG), acts on the side regions of the device, in proximity to the source/drain (S/D) Schottky junctions, switching the device polarity dynamically between n- and p-type (figure 2). The input and output voltage levels are compatible, resulting in directly cascadable logic gates. It should be noted that, owing to the device geometries, the two gates are not identical in size. Indeed, the PG is roughly twice as large as the CG, leading to differences in their timing responses. Such behaviour can be easily compensated at the design level by assigning the signal with the lowest frequency/switching activity to the slower gate terminal.
Thanks to their one-dimensional structure, DG-SiNWFETs demonstrate remarkable electrostatic performance. Figure 2 shows subthreshold slopes of 64 mV/dec and 70 mV/dec for the p-type and n-type parts of the characteristic, respectively, hence competing with the most advanced FinFET technologies [3]. In addition, the one-dimensional electrostatic control over the channel, coupled with the use of a Schottky barrier-based injection mechanism, enables very low off-current densities of a few pA per µm, compared with a few tens of pA per µm for low-power FinFETs [3]. These combined facts qualify the presented device technology as a high-performance, low-standby-power technology.

(b) Logic operations with double-gate field effect transistors
Digital circuits using these transistors can exploit both gates as inputs, thereby enabling the design of compact cells that implement XOR more efficiently than in CMOS. Indeed, in the context of digital operations, DG-SiNWFETs intrinsically realize an XOR characteristic, because the transistor is ON when PG = CG, i.e. PG ⊕ CG = 0, and consequently is OFF when PG ⊕ CG = 1. Figure 3 presents a pseudo-logic XOR gate. The device in the pull-down network is polarized by means of the PG. In the case of the n-type polarization, the characteristics of a pseudo-logic inverter are obtained (green). In the p-type polarization, a buffer is obtained (blue). As shown in the inset truth table, an XOR function can be implemented by a single transistor and a pull-up.
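The switching rule above can be illustrated with a small behavioural sketch. It assumes an idealized device (fully ON or fully OFF, no threshold drops) and an ideal pull-up load, and it encodes the n-type polarization as PG = 1. The function names are hypothetical and chosen only for illustration.

```python
def dg_fet_is_on(pg: int, cg: int) -> bool:
    """Idealized switch model: the device conducts when PG equals CG."""
    return pg == cg


def pseudo_logic_xor(pg: int, cg: int) -> int:
    """Single pull-down DG-FET with an (assumed ideal) pull-up load.

    When the pull-down device conducts, the output is pulled low;
    otherwise the pull-up keeps it high.
    """
    return 0 if dg_fet_is_on(pg, cg) else 1


if __name__ == "__main__":
    print("PG CG | OUT")
    for pg in (0, 1):
        for cg in (0, 1):
            print(f" {pg}  {cg} |  {pseudo_logic_xor(pg, cg)}")
```

With PG held at 1 the output is the complement of CG (inverter), and with PG held at 0 the output follows CG (buffer), which matches the two characteristics described for figure 3.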
The ability to polarize this device electrostatically was first used to build a reconfigurable logic cell [6], and later to define a static XOR-intensive logic family [7]. In particular, a full-swing two-input XOR gate can be achieved by using a complementary pull-up and parallel transistors to avoid threshold drops. The XOR and XNOR implementations, reported in figure 4, require four transistors, whereas the traditional full-swing static CMOS implementation uses eight transistors [20]. Various families of logic gates can be designed for DG-SiNWFETs. In particular, one can extend the principle shown in figure 4 to design arbitrary combinational logic functions. Alternatively, fewer transistors can be used, either by using a dynamic (or resistive) load, or by correcting the reduced swing owing to threshold drops with an output buffer. Examples of realizations of arbitrary functions are shown in figure 5.
Figure 1. Three-dimensional sketch of the SiNWFET featuring two independent gates and its associated symbol (a). Tilted SEM views of an array of fabricated devices before creation of the control gate (b) and after addition of the polarity gates (c). S/D pillars and nanowires (green), PG (violet) and CG (red) are shown [18].
Figure 3. Pseudo-logic XOR characteristic obtained using a single SiNWFET with controllable polarity [19].
Figure 4. Two-input XOR (a) and XNOR (b) gates built with DG-SiNWFETs [7].
Tiles are configured to realize logic functions that are part of a complex system such as a processor [19]. (Online version in colour.)
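To illustrate why parallel transistors give a full-swing four-transistor XOR, the following switch-level sketch uses one plausible wiring. The wiring is an assumption made here for illustration, not taken from figure 4: complemented inputs are assumed to be available, each network contains two parallel devices, and the idealized rule "ON when PG = CG" is used.

```python
def is_on(pg: int, cg: int) -> bool:
    """Idealized DG-FET: conducts when polarity gate equals control gate."""
    return pg == cg


def static_xor(a: int, b: int) -> int:
    """Four-device XOR sketch (assumed wiring, complemented inputs available)."""
    na, nb = 1 - a, 1 - b
    # Pull-up network: two parallel devices, both conducting when a != b.
    pull_up = is_on(pg=nb, cg=a) or is_on(pg=b, cg=na)
    # Pull-down network: two parallel devices, both conducting when a == b.
    pull_down = is_on(pg=b, cg=a) or is_on(pg=nb, cg=na)
    # Exactly one network conducts, so the output is always driven.
    assert pull_up != pull_down
    return 1 if pull_up else 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, static_xor(a, b))
```

In this sketch the two devices in a conducting network always have opposite polarities (their PG inputs are complementary), which is the usual reason a parallel pair can pass both logic levels without a threshold drop.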

Sea of tiles: how to deal with the routing congestion
Regular layout fabrics have the advantage of higher yield, as they maximize layout manufacturability. In this section, we describe a novel architecture, called sea of tiles (SoT), which is an array of logic tiles that are uniformly spread across the chip. The concept is illustrated in figure 6. Each tile is a template that can be wired to implement an elementary logic gate, such as a NOR, NAND, XOR, DFF or, more generally, a single-output combinational logic function. Note that functions realized in ambipolar technology are not restricted to be unate. The choice of logic tile (or tiles) to use in an array is critical: larger tiles can implement more complex functions, but waste devices on smaller functions, as in the case of gate arrays.

(a) Towards a regular gate arrangement
Layout regularity is one of the key features required to increase the yield of integrated circuits at advanced technology nodes [8]. Various regular fabrics have been proposed throughout the evolution of the semiconductor industry, with some recent approaches explained in [8,21,22]. In gate-array fabric style, a sea of prefabricated transistors is customized to obtain a desired logic gate. The customization of generic gate arrays comes at a large area cost as well as routing overhead, thereby increasing the performance gap between application-specific integrated circuits (ASICs) and gate arrays. However, strict design rules, at the 22 nm technology node and beyond, have led to ASIC cell layouts with arrays of gates with a constant gate pitch, which resemble a sea-of-gates layout style. In Bobba et al. [9], a logic tile was defined as a fixed pattern of prefabricated transistor pairs grouped together. Uncommitted tiles can then be mapped to logic cells by connecting the gates and the free S/D terminals.

(i) Dumbbell-stick diagram
Similar to CMOS stick diagrams [23], dumbbell-stick diagrams abstract the topology of logic gates in DG FET technology. They are a convenient means for designing compact layouts and for minimizing the cell routing complexity. Figure 7a shows the dumbbell-stick diagram and how it is inspired by the physical shape of the device. The suspended SiNWs between the source and drain contacts form the basic dumbbell. The CG and the PG constitute the sticks.
The personalization of the tile is reminiscent of the methods used for CMOS cells, which determine an optimum sequence of pairs with a minimum number of gaps [24]. Figure 8a shows an example of a two-input NAND gate with the PGs biased to either GND or V_DD. Figure 8b shows its equivalent dumbbell-stick diagram.
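As a behavioural illustration of the NAND gate with statically biased PGs, the sketch below assumes the conventional static topology (two parallel p-type devices in the pull-up network, two series n-type devices in the pull-down network) and the idealized rule that a device conducts when PG = CG. The exact wiring of figure 8a is not reproduced here, and the names are hypothetical.

```python
VDD, GND = 1, 0


def is_on(pg: int, cg: int) -> bool:
    """Idealized DG-FET: conducts when polarity gate equals control gate."""
    return pg == cg


def nand2(a: int, b: int) -> int:
    """Two-input NAND with the PGs tied to fixed rails (assumed topology)."""
    # PG = GND makes a device p-type: it conducts when its CG is low.
    pull_up = is_on(GND, a) or is_on(GND, b)      # parallel p-type pair
    # PG = VDD makes a device n-type: it conducts when its CG is high.
    pull_down = is_on(VDD, a) and is_on(VDD, b)   # series n-type pair
    assert pull_up != pull_down                   # output is always driven
    return VDD if pull_up else GND


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, nand2(a, b))
```

Biasing the PGs to fixed rails makes the devices behave as ordinary unipolar transistors, which is why unate gates such as NAND can reuse familiar CMOS-style topologies on the same tile.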

(b) Layout techniques
(iii) Layout technique for simple binate logic gates
An XOR gate is shown in figure 9, where gates with similar polarity are grouped together to reduce routing. From the dumbbell-stick diagram, we can observe that the pull-up network (PUN) and pull-down network (PDN) are placed next to each other, which is possible with DG-SiNWFET technology because the transistors are field controlled to make them p-type or n-type. More complex cell designs have been proposed that leverage the embedded XOR functionality of DG FETs [7,25,26].

(iv) Layout technique for sequential elements
Sequential elements can still be efficiently mapped onto a set of tiles. Indeed, sequential elements often embed transmission gates that can be grouped together. Figure 10 illustrates a D flip-flop (DFF) mapped onto an array of tiles. In this implementation, we can observe that the two transmission gates in the master (slave) stage are physically mapped onto tile 1 (tile 3), efficiently compacting the overall mapping of the circuit. The inverters in the master, slave and output stages of the DFF are mapped onto tile 2, tile 4 and tile 5, respectively. The inverting stage of the clock signal is not depicted.

Logic synthesis
Here, we summarize models and methods for performing logic synthesis and mapping onto an SoT effectively.
Transistors with controllable polarity intrinsically embed the XOR logical connective and thus enable the realization of XOR operators with the same ease as NAND/NORs. The original logic synthesis methods [27-29], which are the basis for current commercial tools, use NAND/NOR representations and tend to be less effective for XOR-rich circuits, such as arithmetic operators and data paths. Other methods (e.g. BDS [30]) use binary decision diagrams (BDDs) to fully represent, manipulate and decompose logic functions. Thanks to its BDD-based XOR-decomposition techniques, BDS efficiently synthesizes XOR-intensive circuits. In the following, we show a formalism that is directly applicable to logic circuits to be implemented with XOR primitives, such as those based on DG-SiNWFETs. In particular, we introduce a novel BDD extension, called biconditional binary decision diagrams (BBDDs), which has the advantage of directly supporting the behaviour of DG-SiNWFETs. Such a representation is canonical and demonstrates powerful properties when coupled with one-pass synthesis methodologies.

(a) Biconditional binary decision diagrams
This section summarizes BBDDs. First, it presents the core logic expansion that drives BBDDs. Then, it gives ordering and reduction rules that make reduced and ordered BBDDs (ROBBDDs) canonical. A detailed description is given in [31].
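For reference, the biconditional expansion is commonly written in the following form in the BBDD literature (see [31]). The variable names v, w and z are chosen here purely for illustration, and the overline denotes complementation.

```latex
f(v, w, \ldots, z) \;=\; (v \oplus w)\, f(\overline{w}, w, \ldots, z)
\;+\; \overline{(v \oplus w)}\, f(w, w, \ldots, z)
```

The first term covers the case v ≠ w (so v is replaced by the complement of w) and the second covers v = w (so v is replaced by w). Both cofactors therefore no longer depend on v.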
Note that the biconditional expansion is a special case of the (x_i, p)-decomposition in [32] that extends the Shannon expansion. Only functions with two or more variables can be decomposed by a biconditional expansion: in single-variable functions, the XOR and XNOR terms cannot be computed. In such a condition, the biconditional expansion of a single

(ii) Biconditional binary decision diagram structure and ordering
A BBDD is a BDD driven by the biconditional expansion in place of Shannon's expansion. Each non-terminal node in a BBDD has a branching condition that is biconditional on two variables. We call these two variables the primary variable (PV) and the secondary variable (SV).
An example of a BBDD non-terminal node is provided in figure 11. We refer hereafter to the PV ≠ SV and PV = SV edges in a BBDD node simply as the ≠-edges and =-edges, respectively.
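A minimal data-structure sketch may help to fix the idea of a node that branches on the comparison of two variables. The class and function names below are hypothetical, terminals are represented by plain Python booleans, and no ordering or reduction rule is enforced.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class BBDDNode:
    """Non-terminal node branching on whether PV and SV differ."""
    pv: str                          # primary variable name
    sv: str                          # secondary variable name
    neq: Union["BBDDNode", bool]     # child followed when PV != SV
    eq: Union["BBDDNode", bool]      # child followed when PV == SV


def evaluate(node: Union[BBDDNode, bool], assignment: Dict[str, bool]) -> bool:
    """Walk from the given node down to a terminal for one input assignment."""
    while isinstance(node, BBDDNode):
        differ = assignment[node.pv] != assignment[node.sv]
        node = node.neq if differ else node.eq
    return node


# A two-input XOR needs a single node: 1 on the !=-edge, 0 on the =-edge.
xor2 = BBDDNode(pv="a", sv="b", neq=True, eq=False)

if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, evaluate(xor2, {"a": a, "b": b}))
```

The single-node XOR is the point of the representation: a function that needs several Shannon-style decision nodes collapses to one biconditional node.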
To achieve ordered BBDDs (OBBDDs), a variable order must be imposed for the PVs and a rule for assigning the SVs must be provided. We use the following chain variable order (CVO) to address this task. Given a Boolean function f and an order π = (π_0, π_1, …, π_{n−1}) of the inputs, PVs and SVs are ordered as
$$ PV_i = \pi_i, \quad SV_i = \pi_{i+1}, \quad i = 0, 1, \ldots, n-2; \qquad PV_{n-1} = \pi_{n-1}, \quad SV_{n-1} = 1. $$
The CVO is a key factor in enabling a unique representation of ordered biconditional decision structures. We refer to BBDDs ordered by the CVO as OBBDDs.
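The CVO can be sketched in a few lines. The sketch assumes the chain rule as defined in [31], namely that each input is paired with its successor in π (PV_i = π_i, SV_i = π_{i+1}) and that the last input is paired with the constant 1. The function name is hypothetical.

```python
from typing import List, Sequence, Tuple, Union


def chain_variable_order(pi: Sequence[str]) -> List[Tuple[str, Union[str, int]]]:
    """Return the (PV, SV) pair for each level, given an input order pi.

    Levels 0..n-2 pair each variable with its successor; the last level
    pairs the final variable with the constant 1.
    """
    if not pi:
        raise ValueError("the input order must contain at least one variable")
    pairs: List[Tuple[str, Union[str, int]]] = [
        (pi[i], pi[i + 1]) for i in range(len(pi) - 1)
    ]
    pairs.append((pi[-1], 1))
    return pairs


if __name__ == "__main__":
    for level, (pv, sv) in enumerate(chain_variable_order(["a", "b", "c"])):
        print(f"level {level}: PV = {pv}, SV = {sv}")
```

Because the SV of one level is the PV of the next, fixing the input order π fixes every (PV, SV) pair, which is what makes the ordering unambiguous.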

(iii) Biconditional binary decision diagram reduction
As with OBDDs, OBBDDs can be reduced according to a set of rules to improve the representation efficiency. The straightforward extension of OBDD reduction rules [4] to OBBDDs corresponds to the iterated merging of isomorphic subgraphs.
Moreover, the OBBDD can be further reduced by eliminating levels with no nodes. Last, subgraphs that represent functions of a single variable can be collapsed into a single BDD node. Reduced OBBDDs are canonical [31].

(b) One-pass logic synthesis
One-pass synthesis (OPS) [33] is a logic synthesis methodology in which the logic optimization and technology mapping phases are combined in a single step carried out on a common data structure, e.g. BDDs. To target XOR-rich functions, we use BBDDs as the data structure. In BBDD-based OPS, logic optimization corresponds to the ROBBDD construction. Note that most of the algorithms for ROBDD construction, e.g. BUILD and APPLY [34], can be adapted to ROBBDDs so as to support the biconditional expansion in place of Shannon's expansion. Standard dynamic variable reordering algorithms can also be applied with the CVO (figure 12).
Figure 12. BBDD node, corresponding logic gate and realization in ambipolar and CMOS technologies [31]. (Online version in colour.)

System-level design issues
The combination of DG-SiNWFET technology and BBDD-based synthesis can be applied to the design of both data path and control circuits. In particular, it enables the compact implementation of arithmetic functions and opens novel horizons in terms of testing and online fault detection.

(a) Compact arithmetic operators
DG-SiNWFETs enable the efficient design of parity circuits. Besides the efficient full-swing four-transistor XOR gate realization, shown in figure 4, a three-input XOR realization [26] leverages pass-transistor logic, as depicted in figure 13a. Note that in static CMOS, the same gate has 10 devices in place of 4 here [20].
Inspired by this last structure, a four-transistor three-input majority logic gate [35] is shown in figure 13b. This gate relies on the pass-transistor implementation of the MAJ(A, B, C) function rewritten as
$$ \mathrm{MAJ}(A, B, C) = \overline{(A \oplus B)}\, A + (A \oplus B)\, C. $$
Figure 14. Full-adder implementation with eight controllable polarity devices.
Note that in static CMOS, the same gate has 10 devices in place of 4 [20]. Moreover, the four DG-SiNWFET configuration (of figure 13a) can be generalized to the MUX-like structure depicted in figure 13c. Its functionality corresponds to a multiplexer driven by an XNOR operation between A and B, selecting between two external signals F and G: F is selected when A = B, and G otherwise. With different assignments of F and G, it is possible to implement three-input MAJ (F = A, G = C), three-input MIN (F = A', G = C'), three-input XOR (F = C, G = C') and two-input XOR (F = 0, G = 1) logic gates, where a prime denotes complementation. Therefore, this four-transistor structure can be seen as a generalized arithmetic gate.
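The assignments of F and G listed above can be checked exhaustively with a short behavioural sketch. It models only the logic function of the MUX-like structure (F selected when A = B, G otherwise), not its transistor-level behaviour. Complemented inputs are written as 1 - x, and the function names are hypothetical.

```python
from itertools import product


def generalized_gate(a: int, b: int, f: int, g: int) -> int:
    """Multiplexer driven by XNOR(a, b): pass f when a == b, else g."""
    return f if a == b else g


def maj3(a: int, b: int, c: int) -> int:
    return generalized_gate(a, b, f=a, g=c)


def min3(a: int, b: int, c: int) -> int:
    return generalized_gate(a, b, f=1 - a, g=1 - c)


def xor3(a: int, b: int, c: int) -> int:
    return generalized_gate(a, b, f=c, g=1 - c)


def xor2(a: int, b: int) -> int:
    return generalized_gate(a, b, f=0, g=1)


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        assert maj3(a, b, c) == int(a + b + c >= 2)
        assert min3(a, b, c) == int(a + b + c <= 1)
        assert xor3(a, b, c) == a ^ b ^ c
        assert xor2(a, b) == a ^ b
    print("all four configurations match their reference functions")
```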
The full-adder (FA) is a widely used arithmetic circuit that supports the addition of two binary numbers. It is represented by the following three-input two-output logic function:
$$ \mathrm{Sum} = A \oplus B \oplus C_{in}, \qquad C_{out} = \mathrm{MAJ}(A, B, C_{in}). $$
Controllable polarity transistors offer an advantageous implementation for both the sum and C_out functions using two generalized arithmetic gates. Therefore, the full-adder is competitively realized by eight devices, input inverters apart, as depicted in figure 14. The corresponding static (transmission gate) CMOS version has 28 (14) transistors [20].
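A behavioural sketch of a full-adder built from two generalized arithmetic gates is given below. It assumes the usual full-adder definition (sum as three-input XOR, carry-out as three-input majority) and models each gate as a multiplexer that passes F when A = B and G otherwise. The names are hypothetical, and the sketch says nothing about device count or timing.

```python
from itertools import product
from typing import Tuple


def generalized_gate(a: int, b: int, f: int, g: int) -> int:
    """Multiplexer driven by XNOR(a, b): pass f when a == b, else g."""
    return f if a == b else g


def full_adder(a: int, b: int, c_in: int) -> Tuple[int, int]:
    """Return (sum, carry_out) using one gate per output."""
    total = generalized_gate(a, b, f=c_in, g=1 - c_in)   # three-input XOR
    c_out = generalized_gate(a, b, f=a, g=c_in)          # three-input MAJ
    return total, c_out


if __name__ == "__main__":
    for a, b, c_in in product((0, 1), repeat=3):
        total, c_out = full_adder(a, b, c_in)
        assert a + b + c_in == total + 2 * c_out
        print(a, b, c_in, "->", total, c_out)
```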

(b) Self-checking computation
Among online testing strategies, self-checking circuits offer an efficient way of testing circuits without adding redundant voter circuitry such as in triple modular redundancy [36]. The most used self-checking technique is the parity prediction scheme [37]. Parity computation relies largely on the XOR operation, and therefore its implementation with DG-SiNWFET technology can be fairly effective. The design of a self-checking ripple-carry adder has been introduced in [35] and is shown in figure 15.
The adder includes one-bit adders with complemented carry, double-rail checkers and parity generation trees. The complemented carry can be included within the existing FA structure, thanks to a compact minority operator. Indeed, only four extra transistors are required, whereas the static CMOS design style needs 12 extra transistors. The parity-generation tree includes cascaded compact two-input XORs. Unfortunately, the compact four-transistor XOR implementation enabled by DG-SiNWFETs does not provide the fault-secure property. Indeed, in the case of a fault on the PGs, there exist some conditions where all the transistors take the same polarity, therefore leading to undetermined levels at the output. For this reason, in [35], a few parts of the circuit (the double-rail checkers) are still implemented using a traditional static CMOS implementation to guarantee the self-checking property. Nevertheless, the use of DG-SiNWFETs also opens new opportunities for fault-tolerant architectures.
Figure 15. Self-checking n-bit adder using carry-checking parity-prediction scheme [36].
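The idea behind parity prediction for an adder can be illustrated behaviourally. Since each sum bit is the XOR of the two operand bits and the incoming carry, the parity of the sum equals the XOR of the operand parities and the parity of the internal carries. The sketch below checks this identity and shows that a single flipped sum bit is detected. It is not a model of the circuit in figure 15 (there are no complemented carries or double-rail checkers), and all names are hypothetical.

```python
from functools import reduce
from operator import xor
from typing import List, Sequence, Tuple


def parity(bits: Sequence[int]) -> int:
    """XOR of all bits, as computed by a tree of two-input XOR gates."""
    return reduce(xor, bits, 0)


def ripple_carry_add(
    a_bits: Sequence[int], b_bits: Sequence[int], carry_in: int = 0
) -> Tuple[List[int], List[int], int]:
    """LSB-first ripple-carry addition.

    Returns (sum_bits, carries, carry_out), where carries[i] is the
    carry entering bit position i.
    """
    sum_bits: List[int] = []
    carries: List[int] = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        carries.append(carry)
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (a & carry) | (b & carry)
    return sum_bits, carries, carry


def parity_prediction_holds(a_bits, b_bits, sum_bits, carries) -> bool:
    """Compare the observed sum parity with the predicted one."""
    predicted = parity(a_bits) ^ parity(b_bits) ^ parity(carries)
    return parity(sum_bits) == predicted


if __name__ == "__main__":
    a_bits, b_bits = [1, 0, 1, 1], [1, 1, 0, 1]   # LSB first
    sum_bits, carries, _ = ripple_carry_add(a_bits, b_bits)
    print("fault-free:", parity_prediction_holds(a_bits, b_bits, sum_bits, carries))
    faulty = list(sum_bits)
    faulty[2] ^= 1                                 # inject a single-bit error
    print("one bit flipped:", parity_prediction_holds(a_bits, b_bits, faulty, carries))
```

A single parity bit only detects an odd number of erroneous sum bits, which is one reason practical schemes also check the carries themselves.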

Conclusion
We have presented here a complete design framework for nanoelectronic computational systems that leverage DG-SiNWFET technology. This framework includes semiconductor process development, device and circuit design, models and design tool research as well as architecting overall systems. In particular, we have shown the synergy of research results coming from novel device fabrication with circuit and architectural design. This research aims at achieving scalable arrays of nanodevices within regular arrangements, as a way to mitigate wiring variability. Last but not least, we have shown the challenges in design automation for nanotechnologies at various levels of abstraction.
Funding statement. This research is supported by the ERC senior grant no. NanoSys ERC-2009-AdG-246810.