Integrated Device-Fabric Explorations and Noise Mitigation in Nanoscale Fabrics

Prithish Narayanan, Jorge Kina, Pavan Panchapakeshan, Chi On Chui, and Csaba Andras Moritz

Abstract—An integrated device-fabric methodology for evaluating and validating nanoscale computing fabrics is presented. The methodology integrates physical layer assumptions for materials and device structures with accurate 3-D simulations of device electrostatics and operations and circuit level noise and cascading validations. Electrical characteristics of six different Crossed Nanowire Field Effect Transistors (xnwFETs) are simulated and current and capacitance data obtained. Behavioral models incorporating device data are generated and used in fabric level simulations to evaluate noise implications of devices and sequencing schemes. Device characteristics are found to have different implications for logic ‘1’ and logic ‘0’ noise with faster devices being more (less) resilient to logic ‘1’ (logic ‘0’) noise. A new noise resilient dynamic sequencing scheme is presented which isolates logic ‘0’ noise events and prevents them from propagating (in cascaded circuit stages) thereby enabling faster devices. Performance implications and optimizations for fabrics incorporating the new noise resilient scheme are discussed. The scheme is also analyzed and validated against an external noise source (power supply drooping). These results show that noise resilient nano-fabrics can be designed through a combination of device engineering and fabric level optimizations of the sequencing scheme. Performance optimizations and implications of device and physical layer assumptions on manufacturing are discussed.

Index Terms—NASICs, nanodevices, nanoscale computing fabrics, noise, dynamic circuits, emerging technologies, semiconductor nanowires, nanowire crossbar, nano-architecture, nanowire FETs

I. INTRODUCTION

Emerging nanomaterials and devices such as semiconductor nanowires [1], [2], carbon nanotubes [3], [4], graphene [5], and spin waves [6] are promising alternatives to scaled CMOS for electronics applications. However, new computing fabrics incorporating novel nanomaterials need to overcome challenges at multiple design levels including manufacturing modules and sequences, device design, circuit styles and fault-tolerance. Therefore a fabric-centric mindset, where design choices and optimizations at individual design levels must be compatible with the fabric as a whole, is essential for realizing future nanoscale systems. For example, while device choices and optimizations must target key electrical parameters such as threshold voltage and intrinsic delay, they should also i) be fully validated at the circuit/fabric level for noise implications and functionality and ii) not impose insurmountable challenges for the fabric manufacturing sequence.

In this paper we present an integrated device-fabric exploration with simulations at the circuit level built on accurate 3-D physics based simulations of nanodevice electrostatics and operations. We extract device I-V characteristics, parasitic capacitances and key electrical parameters such as threshold voltage and intrinsic delay for a variety of material and structural assumptions. We then create behavioral models of the data for a circuit simulator and use these to evaluate devices in-fabric for noise resilience, signal integrity and validation of worst-case test circuits and fabric sequencing schemes. We also discuss implications of device and fabric choices for manufacturing. While this work is focused on Crossed Nanowire Field Effect Transistors (xnwFETs) for the Nanoscale Application Specific Integrated Circuits (NASIC) [7], [8], [9], [10], [11], the approach and methodology are fairly generic and may be applicable to other nanoscale fabrics.

NASICs are composed of regular grids of semiconductor nanowires with xnwFETs at certain crosspoints. In NASICs, design choices at multiple levels are tailored towards minimization of fabric and manufacturing requirements with limited nanoscale customization requirements. For example, self-assembly based alignment techniques for nanowires [12], [13], [14] favor the formation of regular structures. Furthermore, in keeping with the fabric-centric mindset, novel dynamic circuit styles are used that i) are amenable to implementation on nanowire grids, ii) would not require complementary doping for the logic transistors using innovative external sequencing schemes [8], and iii) do not require arbitrary sizing of devices or arbitrary routing between devices. However, noise resilience and cascading related issues are critical in dynamic circuits owing to high output impedance and need to be carefully addressed through an integrated device-fabric methodology.

In this paper, we extensively evaluate and validate devices and sequencing schemes for the NASIC fabric. We discuss the impact of key device level parameters on noise margins and signal integrity at the fabric level. We also present optimizations at device level and a new fabric level sequencing scheme that together achieve noise resilience and correct functionality for cascaded NASIC designs. A capacitance engineering technique for improving system-level performance and manufacturing considerations for the various optimization techniques are analyzed.

The key contributions of this paper are: i) A methodology for integrated device fabric explorations for fabric validation across multiple design levels is presented; ii) Accurate 3-D physics based simulations of new xnwFET devices are presented and their characteristics extracted; iii) The implications for noise and signal integrity at the fabric level are discussed through extensive circuit level simulations; iv) A new noise resilient fabric level sequencing scheme is presented that, in conjunction with device level optimizations, validates the NASIC fabric; and v) Fabric-friendly optimizations for improving system-level performance are presented.

The rest of the paper is organized as follows: Section II discusses the methodology for integrated device-fabric explorations. Section III presents novel device structures for the xnwFET and characterizes their electrical behavior. Section IV evaluates devices in-fabric for noise resilience. Section V presents a new noise resilient sequencing scheme and NASIC fabric validations. Section VI discusses performance and manufacturing implications as well as the effect of power supply drooping (an external source of noise). Section VII concludes the paper.

II. METHODOLOGY FOR INTEGRATED DEVICE-FABRIC EXPLORATION

The methodology for bottom-up integrated device-fabric explorations is detailed in this section. It encompasses physical layer assumptions, device level explorations and implications at higher design levels and is summarized in a flow diagram (Fig. 1).

A variety of physical layer assumptions such as choice of gate material and the structure of devices can be made targeting device metrics such as the threshold voltage, on-currents and intrinsic delay. For example, the gate material used in NASIC crossed nanowire field effect transistors (xnwFETs) could be composed of crystalline silicon, nickel silicide or metals. Similarly, the structure of the device may be a top nanowire gate or an Omega gated structure for tighter electrostatics. In accordance with the fabric centric mindset, these assumptions need to be evaluated in terms of implications for manufacturing as well as for other design levels.

The electrical properties of individual xnwFETs may be characterized using accurate 3-D physics based simulation of the nanostructures using Synopsys® Sentaurus™. Calibration of the tool against experimental data at similar dimensions is required to account for nanoscale effects such as increased surface roughness and interface trap states. These device-level simulations provide three sets of data: i) Current data for different values of drain-source (V_DS) and gate-source (V_GS) voltages, ii) Device capacitances at different values of V_GS, and iii) key device parameters/defects that determine noise margins and performance of the devices such as the on-currents (I_ON), threshold voltage (V_TH) and the intrinsic delays of the devices. These device parameters may be adjusted by changing underlying physical layer assumptions as well as the substrate bias (e.g. a higher threshold voltage may be obtained by modifying the metal work function or using a more negative back gate bias).

The current data is fitted as a function of V_GS and V_DS using regression analysis and curve fitting. This step expresses the current as a mathematical function of V_GS and V_DS. The expression for the current, in conjunction with a piecewise linear approximation for the device capacitances forms a behavioral model of the xnwFET, which may be incorporated into a standard circuit simulator such as HSPICE to carry out circuit level evaluations.
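As a rough illustration of this step, the sketch below fits a low-order polynomial in V_GS and V_DS to sampled current data by least squares and pairs it with a piecewise-linear capacitance lookup. The quadratic basis, the function names and the sample values are assumptions made purely for illustration; the fitted expression actually used is not stated here, and the numbers are synthetic rather than Sentaurus output.

```python
from bisect import bisect_right


def basis(vgs, vds):
    # Hypothetical quadratic basis; the real fitted form is not given in the text.
    return [1.0, vgs, vds, vgs * vds, vgs * vgs, vds * vds]


def solve(a, b):
    # Gaussian elimination with partial pivoting on a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_current(samples):
    # samples: list of (vgs, vds, i_d). Solves the normal equations (A^T A) w = A^T y.
    rows = [basis(vgs, vds) for vgs, vds, _ in samples]
    y = [i_d for _, _, i_d in samples]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    w = solve(ata, aty)
    return lambda vgs, vds: sum(wi * bi for wi, bi in zip(w, basis(vgs, vds)))


def piecewise_linear(points):
    # points: sorted list of (vgs, capacitance); values are clamped outside the table.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def cap(vgs):
        if vgs <= xs[0]:
            return ys[0]
        if vgs >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, vgs) - 1
        t = (vgs - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return cap


# Synthetic samples (arbitrary current units) from a known quadratic, NOT device data.
grid = [0.0, 0.2, 0.4, 0.6, 0.8]
samples = [(g, d, 2.0 + 10.0 * g * d + 5.0 * g * g) for g in grid for d in grid]
i_model = fit_current(samples)
c_model = piecewise_linear([(0.0, 1.0), (0.4, 2.0), (0.8, 2.5)])  # hypothetical values
```

In a real flow the two callables would be translated into whatever behavioral-source syntax the circuit simulator accepts; that translation is outside the scope of this sketch.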

The circuit level simulations take as inputs the behavioral models for individual devices, circuit netlists with worst-case noise scenarios as well as fabric specific control and sequencing schemes. As will be shown in the paper, different sequencing schemes have different implications while considering noise margins and signal integrity; they control the flow of data and influence capacitive interactions and glitching in between successive cascaded stages. Different cascading and noise scenarios are evaluated and output waveforms are checked for signal integrity. Circuit level delay and fabric performance implications are also quantified from these simulations. The methodology thus explores implications of physical layer and device assumptions on the fabric as a whole. While it has been explored extensively for the NASIC fabric, this integrated methodology is fairly generic and is applicable to other nano-fabrics as well.

III. PHYSICAL LAYER AND DEVICE EXPLORATIONS

A. Devices Explored

We have considered three different xnwFET structures. Fig. 2 shows an image of each nanowire transistor structure used for this study. The first structure considered is the silicon gate xnwFET. This transistor consists of a bottom nanowire that acts as the channel and a top nanowire, orthogonal to the bottom nanowire, which acts as the gate electrode. These two nanowires are separated by a thin dielectric, which acts as the gate insulator.

The second structure considered is the fully silicided (FUSI) gate xnwFET. This structure is similar to the previous one, except that the gate nanowire has been fully silicided. This eliminates some undesired effects such as gate depletion, and reduces the resistance of the gate nanowire needed for fast evaluation of the previous logic stage. Also, NiSi gives a smaller gate-substrate workfunction difference and therefore there is no need to apply large substrate biases or use large source/drain underlaps to achieve the desired threshold voltage.

The third structure considered is the Omega-gated xnwFET structure with a metal gate. This structure was chosen because it has a better gate to channel coupling than the two previous structures. Therefore it should have a better on current (I_ON) as well as a higher on-to-off current ratio (I_ON/I_OFF).

B. Methodology

Due to the complex structure of xnwFETs, a 3D simulation is mandated. To study the behavior of xnwFETs, the Synopsys Sentaurus Device simulator was used. Before any relevant simulation can be done, the simulation models have to be calibrated. To do this, experimental data from well characterized nanowire channel FETs with similar dimensions was employed [15], [16]. The calibrated models and parameters include the drift-diffusion transport models, to include effects such as carrier scattering due to surface roughness, and dielectric/channel interface trapped charges.

C. Simulation Results

For this study, six different devices have been simulated. For each of the structures mentioned before, we simulated a device with a threshold voltage of around 0.2 V and another device with a threshold voltage of around 0.3 V. The 0.2 V and 0.3 V values for $V_{TH}$ were chosen for the purposes of the noise resilience study. A lower value for $V_{TH}$ is expected to improve logic ‘1’ noise resilience, but lower the logic ‘0’ noise resilience, whereas a higher value for $V_{TH}$ will do the opposite. To achieve the desired $V_{TH}$ values, a source/drain underlap, as well as a back gate bias can be applied. Table I summarizes the basic device parameters used to achieve the desired $V_{TH}$ values. Table II summarizes key parameters such as threshold voltage ($V_{TH}$) and intrinsic delay for the different devices.

Drain current vs. gate voltage (ID - VG), drain current vs. drain voltage (ID - VDS) and capacitance vs. gate voltage characteristics were simulated and important electrical parameters such as on-current (ION) and on-to-off current ratio (ION/IOFF) were extracted. Fig. 3(a) shows ID - VG curves and Fig. 3(b) shows capacitance vs. VG curves for the 6 devices simulated at VDS = 0.8 V (VDD). Similarly, data was obtained for other values of VDS and VG to cover the operating regions of the devices.
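The intrinsic delay reported for each device is not defined explicitly in this section. A common figure of merit, assumed here for illustration only, is the CV/I metric: gate capacitance times supply voltage divided by on-current. The function name and the numeric values below are hypothetical.

```python
def intrinsic_delay(c_gate_f, v_dd, i_on_a):
    """CV/I delay estimate in seconds (assumed definition, not confirmed by the text)."""
    if i_on_a <= 0:
        raise ValueError("on-current must be positive")
    return c_gate_f * v_dd / i_on_a


# Hypothetical device: 10 aF gate capacitance, 0.8 V supply, 4 uA on-current.
tau = intrinsic_delay(10e-18, 0.8, 4e-6)
tau_ps = tau * 1e12  # convert to picoseconds
```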

D. Device Comparisons

The characteristics of the three nanowire transistor structures are compared as follows. For a given threshold voltage, the silicon gate xnwFET has the smallest ION, followed by the NiSi gate xnwFET, and the Omega-gated xnwFET has the highest ION as expected. First, the NiSi structure has a higher ION than the Si gate structure because the ΦMS value is lower in the NiSi case. Therefore a smaller source/drain underlap is needed to achieve the same VTH, which in turn reduces the effective channel length, raising the drain current level. For the Omega-gated xnwFET, the higher current level is due to the increased ability of the gate to modulate the channel conductivity. In the Si gate or NiSi gate xnwFET structure, the inversion layer needed to turn on the device is formed mostly on the top part of the channel nanowire, near the gate nanowire, whereas in the Omega-gated xnwFET, the inversion layer can be formed almost all around the channel nanowire. This can be thought of as increasing the effective channel width at the same gate voltage.

Another figure of merit for these three devices is the on-to-off current ratio. For a given threshold voltage, the Si gate xnwFET and the NiSi gate xnwFET devices have similar ION/IOFF but the Omega-gated xnwFET has a higher ION/IOFF value as expected. This is because the Omega-gated xnwFET has better gate to channel electrostatic control than either of the other two structures. In other words, the Omega gate is more effective at turning the device on and off. The Omega-gated xnwFET, therefore, should have a better sub-threshold slope than either of the other two devices, leading to a higher ION/IOFF.
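The link between sub-threshold slope and ION/IOFF can be made concrete with the standard sub-threshold swing, the gate-voltage change needed for a tenfold change in drain current. The sketch below estimates it from two points on an ID - VG curve; the function name and the sample points are hypothetical and are not taken from the simulated devices.

```python
import math


def subthreshold_swing(vg1, id1, vg2, id2):
    """Swing in mV/decade from two sub-threshold (V_G, I_D) points."""
    decades = math.log10(id2 / id1)
    return 1000.0 * (vg2 - vg1) / decades


# Hypothetical points: current rises by two decades over 140 mV of gate voltage.
swing = subthreshold_swing(0.00, 1e-12, 0.14, 1e-10)
```

A smaller swing means more decades of current change over the same gate-voltage range, which is why tighter electrostatic control translates into a higher on-to-off ratio.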

Also we can compare the capacitances for these three devices. For a given VTH specification, it can be seen that the capacitance values are usually higher for the Omega-gated xnwFET, followed by the NiSi gate device, and the Si gate xnwFET has the lowest values. For example, the NiSi gate device has a higher gate-to-source and gate-to-drain capacitance value than the Si gate device because the former has a smaller junction underlap, which will thus increase the gate coupling to the source and drain. In addition, the NiSi gate device does not have the gate semiconductor depletion issue near the oxide interface further increasing its capacitance values. For the Omega-gated xnwFET, since the gate is wrapped around the channel, it can be easily seen that the gate is located closer to the source and the drain regions than in the other two xnwFET devices. It will in turn increase the gate-to-source and gate-to-drain coupling and thus the respective capacitances.

It is important to note that there are more optimizations that can be applied to each of these devices in order to improve their performance. For instance, strain engineering can be applied to the channel silicon nanowires to improve their mobility and obtain higher current levels. Also the six devices simulated were inversion mode devices, where the doping concentrations have to change from an n-type doping of 10^{20} cm^{-3} in the source and drain to a p-type doping of about 10^{18} cm^{-3} in the channel within a few nanometers. Instead a depletion mode device can be considered [18], where no
IV. CIRCUIT LEVEL SIMULATION AND NOISE EVALUATION IN FABRIC

Behavioral models for the devices examined in the previous section were created using the methodology described in Fig. 1. This section describes a variety of circuit level simulations carried out to identify and fully evaluate the impact of internal noise and validate cascaded nanowire fabrics utilizing xnwFETs.

DC Sweep analysis was done to verify that behavioral models accurately abstract device data. For all devices, it was found that behavioral models accurately track Sentaurus™ current data within 5% error for the voltage ranges considered.
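A check of this kind can be sketched as a point-by-point comparison between the behavioral model and the reference current data. The error metric (relative error per sample), the function names and the toy data are assumptions; the exact metric behind the 5% figure is not specified.

```python
def max_relative_error(model, reference, floor=1e-12):
    # reference: iterable of (vgs, vds, i_ref); floor guards against near-zero currents.
    worst = 0.0
    for vgs, vds, i_ref in reference:
        err = abs(model(vgs, vds) - i_ref) / max(abs(i_ref), floor)
        worst = max(worst, err)
    return worst


def within_tolerance(model, reference, tol=0.05):
    # True when every sample is tracked within the given relative tolerance.
    return max_relative_error(model, reference) <= tol


# Toy data in arbitrary units: a model that overshoots the reference by 2% everywhere.
reference = [(0.8, 0.8, 0.64), (0.4, 0.8, 0.32)]
good_model = lambda vgs, vds: 1.02 * vgs * vds
bad_model = lambda vgs, vds: 1.10 * vgs * vds
```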

A single NASIC NAND stage [8] was simulated using HSPICE to verify expected functionality. Representative results are shown in Fig. 4 for the Omega 0.2 device. Other devices exhibit similar behavior. From the signal waveforms we make the following key observations: 1) the output precharges to logic ‘1’ when the pre signal is asserted. Typically a value greater than $V_{DD}$ is used for pre to achieve rail-to-rail voltage swing at the output node. 2) The output goes to ‘0’ only when all inputs are ‘1’, achieving the required NAND logic. 3) Current dissipation occurs only when the capacitances are charged or discharged, and there is no static current in NASIC designs as one of pre or eva is always off. 4) During the hold phase, the output does not change. However during this time, the output node has high output impedance which makes it susceptible to switching events in its neighborhood while considering cascaded NASIC designs. In the next set of circuit simulation experiments these internal noise sources and switching events will be investigated in detail for the different xnwFETs and two baseline control schemes.
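Observations 1), 2) and 4) can be captured in a purely logic-level abstraction of the stage. The class below is an illustrative sketch with hypothetical names; it ignores all electrical behavior, including the glitching that the following experiments study.

```python
class DynamicNandStage:
    """Logic-level abstraction of one dynamic NAND stage (no electrical detail)."""

    def __init__(self):
        self.output = None  # unknown until the first precharge

    def precharge(self):
        # pre asserted: the output node is charged to logic '1'.
        self.output = 1

    def evaluate(self, inputs):
        # eva asserted: the series pull-down conducts only if every input is '1'.
        if all(inputs):
            self.output = 0

    def hold(self):
        # pre and eva both off: the node floats and keeps its value.
        return self.output


def nand_cycle(inputs):
    # One precharge -> evaluate -> hold sequence for a given input vector.
    stage = DynamicNandStage()
    stage.precharge()
    stage.evaluate(inputs)
    return stage.hold()
```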

A. Sequencing schemes for the NASIC fabric

Fig. 5 shows one possible sequencing scheme for cascaded NASIC designs. In this baseline scheme, one stage is precharged and evaluated before the next stage with signals repeating every two stages, i.e. stages 1, 3 and 5 may use the same control signals (say pre1 and eva1) whereas stages 2, 4 would use pre2 and eva2. While any one stage is being precharged or evaluated, its neighbors are in the hold phase, with outputs implicitly latched on the nanowire for correct cascading and pipelining of datapaths.
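The assignment of control signals to stages in such a repeating scheme reduces to modular arithmetic on the stage index. The helper below is a hypothetical sketch; the `period` parameter is an assumption that generalizes the two-stage repetition described above to schemes whose signals repeat over a longer span.

```python
def control_signals(stage, period=2):
    # Stages are numbered from 1; control signals repeat every `period` stages.
    if stage < 1:
        raise ValueError("stage numbering starts at 1")
    group = (stage - 1) % period + 1
    return f"pre{group}", f"eva{group}"


# Baseline scheme: stages 1, 3, 5 share pre1/eva1 and stages 2, 4 share pre2/eva2.
baseline = [control_signals(s) for s in range(1, 6)]
```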

In general, since control signals are not driven from logic but from reliable external circuitry, they may be optimized to achieve specific targets. One example of this is driving precharge signals to voltages greater than $V_{DD}$, thereby achieving a full $V_{DD}$ voltage swing at the output node of a nanowire for maximum logic ‘1’ noise margin. In keeping with the fabric centric mindset, modifying the control schemes does not impose any new challenges at the physical layer or in terms of manufacturing requirements, since there is no additional customization requirement at the nanoscale. Furthermore, noise implications and signal integrity considerations may be very different depending on the sequencing scheme used, since the scheme decides how logic nodes are switching relative to one another.

Another sequencing scheme used for the NASIC fabric is shown in Fig. 6. This is a 3-phase sequencing scheme where signals are repeating every 3 stages. In a large scale design, this

<table>
<thead>
<tr>
<th>PARAMETERS USED FOR THE DEVICE SIMULATIONS</th>
</tr>
</thead>
<tbody>
<tr>
<td>Device</td>
</tr>
<tr>
<td>Gate Material</td>
</tr>
<tr>
<td>Gate Workfunction (eV)</td>
</tr>
<tr>
<td>Gate NW diameter (nm)</td>
</tr>
<tr>
<td>Channel NW diameter (nm)</td>
</tr>
<tr>
<td>Channel doping (cm$^{-3}$)</td>
</tr>
<tr>
<td>Gate oxide material</td>
</tr>
<tr>
<td>Gate oxide thickness (nm)</td>
</tr>
<tr>
<td>Bottom oxide material</td>
</tr>
<tr>
<td>Bottom oxide thickness (nm)</td>
</tr>
<tr>
<td>Source/Drain underlap (nm)</td>
</tr>
<tr>
<td>Back Gate Bias (V)</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>DEVICE SIMULATION OUTPUT</th>
</tr>
</thead>
<tbody>
<tr>
<td>Si Gate xnwFET</td>
</tr>
<tr>
<td>$V_{TH}$ (V)</td>
</tr>
<tr>
<td>$I_{ON}$ (A)</td>
</tr>
<tr>
<td>$I_{ON}/I_{OFF}$</td>
</tr>
<tr>
<td>Intrinsic delay (ps)</td>
</tr>
</tbody>
</table>
B. Circuit Simulation and Analysis

The six devices described in Section III were evaluated for a worst-case circuit to evaluate noise implications and functionality. Both baseline timing schemes described in the previous sub-section were considered in this analysis.

The three-stage cascaded test circuit used in these noise evaluations is shown in Fig. 7. Stage 1 generates imperfect outputs that drive input xnwFETs of stage 2. Output integrity is checked at output nodes $do21$ and $do31$. Due to high output impedance during the hold phase, the output nodes at various stages may be susceptible to noise effects across device parasitic capacitances.

For example, key sources of noise for the $do21$ node include the Miller capacitances between this node and $do11$ and $do31$ nodes. If $do11$ evaluates to ‘0’ it might cause a downward glitch (degradation of logic ‘1’) at $do21$ due to the $C_{GD}$ capacitance between $do11$ and $do21$. Similarly, if $eva3$ is asserted, a downward glitch may occur at $do21$ due to the $C_{SG}$ parasitic capacitance. Precharging of $do31$ could cause an upward glitch at the $do21$ node. Other similar parasitic effects exist between outputs and intermediate nodes in the design, leading to glitching and internal noise events.
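To first order, the size of such a glitch on a floating node can be estimated with a capacitive divider: the aggressor's voltage swing is scaled by the ratio of the coupling capacitance to the total capacitance on the victim node. This is a textbook approximation offered as an assumption, not the circuit-level simulation used for the evaluations, and the capacitance values below are hypothetical.

```python
def coupled_glitch(c_couple, c_node, dv_aggressor):
    """Voltage step on a floating node caused by a neighbour's swing.

    c_couple: coupling capacitance to the switching (aggressor) node
    c_node:   remaining capacitance from the victim node to fixed potentials
    """
    return dv_aggressor * c_couple / (c_couple + c_node)


# Hypothetical: 2 aF coupling, 8 aF to fixed potentials, aggressor falls by 0.8 V.
dv = coupled_glitch(2e-18, 8e-18, -0.8)
```

A negative result corresponds to a downward glitch (degraded logic ‘1’) and a positive result to an upward glitch (degraded logic ‘0’).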

Fig. 8 and Fig. 9 show the output waveforms for the NiSi 0.2 and Omega 0.2 devices for the basic sequencing scheme described in Fig. 5. Logic ‘1’ glitching is a very serious problem in this timing scheme. Due to parasitic coupling between the pre2 signal and do21 through the $C_{GS}$ capacitor (see Fig. 7), there is a drop in the do21 output when pre2 is deasserted. Furthermore, while do21 is holding logic ‘1’, it may be severely affected by two sources of noise: the $C_{GD}$ capacitance between do11 and do21 as well as the $C_{SG}$ capacitance of the input transistor of stage 3. If eva1 is asserted and do11 simultaneously discharges, a severe downward glitch may be experienced at the do21 node due to these capacitances. This implies that when stage 3 is evaluated, the driving voltage at the do21 node could be significantly below $V_{DD}$.

Two scenarios may then be considered: the voltage of do21 may be below or above $V_{TH}$. In the former case the signal integrity test fails at do21, since it is effectively at a logic ‘0’ voltage level. In the latter case, the circuit functionality depends on the characteristics of the device. A fast device may be able to effectively switch even with a low driving voltage, leading to a correct logic ‘0’ evaluation of node do31, whereas a slower device may not be able to effectively discharge do31, leading to an erroneous logic ‘1’ value on the node. As seen in Fig. 8, circuits with the slower NiSi gated devices fail in this scenario despite the input voltage being within the logic ‘1’ noise margin (i.e. $> V_{TH}$). However, the circuit with Omega 0.2 devices, which is the fastest of the 6 devices considered in terms of intrinsic delay, is able to effectively discharge the output node even with a significantly degraded input voltage. In other words, faster devices are more resilient to logic ‘1’ glitching effects. Of the 6 devices considered for these simulations, only the fastest Omega 0.2 device achieves expected behavior; the 5 slower devices do not work.

Fig. 10 shows output waveforms for the NiSi 0.2 (left) and Omega 0.2 (right) devices for the 3-phase control scheme described in Fig. 6. In this control scheme, logic ‘1’ glitching effects are not as severe as in the previous scheme. This is because both neighboring stages are not simultaneously discharging during the stage 2 hold phase. While there can be some downward glitching due to $C_{SG}$ between do21 and do32, in this scheme the parasitic capacitance $C_{GD}$ to do11 does not hurt logic ‘1’ integrity, since do11 is actually precharging during the stage 2 hold phase. Therefore the NiSi 0.2 device (Fig. 10 - left) is able to effectively discharge the do31 output node, leading to correct functionality. As expected, the Omega 0.2 device works correctly in the presence of logic ‘1’ glitches.

However, in this sequencing scheme, logic ‘0’ glitching is an important consideration. Due to precharging of node do11, the output node do21 might have an upward glitch from logic ‘0’ during its hold phase. For the Omega 0.2 device this upward glitch might cause a logic ‘0’ value to reach above the threshold voltage of the device. Given that this device has the lowest intrinsic delay of all devices considered, the glitch may be sufficient to cause the stage 3 input xnwFET to operate in the linear region, leading to loss of signal integrity (Fig. 10 – right). In other words, faster devices are less resilient to logic ‘0’ glitching effects. Of the 6 devices considered, the slowest NiSi 0.3 and Si 0.3 devices fail due to logic ‘1’ glitching effects, whereas the Omega 0.2 fails due to the logic ‘0’ glitching. NiSi 0.2, Si 0.2 and Omega 0.3, which are middle-of-the-road devices in terms of intrinsic delay, pass all signal integrity tests and are correctly evaluated.

As seen from these results, both sequencing schemes and device properties have strong implications on noise. Glitching occurs due to switching events in the neighborhood, which are influenced by the external control sequence. Therefore, while device parameters such as $V_{TH}$ and intrinsic delay need to be adjusted for noise resilience, additional noise optimizations could be done at the fabric level by altering the sequencing schemes and eliminating or isolating glitching events. For example, the 3-phase scheme is resilient to logic ‘1’ glitching for 4 out of 6 devices owing to the higher driving voltage at the input nodes, whereas the other baseline scheme works only for 1 of 6 devices. We could then potentially design a new noise resilient timing scheme that preserves the logic ‘1’ advantages of the 3-phase timing scheme while providing tolerance against logic ‘0’ glitching such that the fastest devices may be leveraged in NASIC designs.
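The per-device outcomes reported in this section can be tabulated and cross-checked against the counts quoted above. The data structure and labels below are illustrative conventions only; the outcomes themselves are transcribed from the preceding discussion.

```python
# None means the device passed; otherwise the value names the failing noise type.
THREE_PHASE = {
    "Si 0.2": None,
    "Si 0.3": "logic1",
    "NiSi 0.2": None,
    "NiSi 0.3": "logic1",
    "Omega 0.2": "logic0",
    "Omega 0.3": None,
}

# Under the basic scheme only the fastest device works.
BASIC_PASS = {"Omega 0.2"}


def passing(outcomes):
    # Devices that pass every signal integrity test.
    return sorted(d for d, cause in outcomes.items() if cause is None)


def resilient_to(outcomes, noise):
    # Devices that do not fail because of the given noise type.
    return sorted(d for d, cause in outcomes.items() if cause != noise)


three_phase_pass = passing(THREE_PHASE)
logic1_resilient = resilient_to(THREE_PHASE, "logic1")
```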

V. NOISE RESILIENT SEQUENCING SCHEME FOR THE NASIC FABRIC

In this section, we present and evaluate a new noise-resilient dynamic control scheme that provides resilience against both logic ‘1’ and logic ‘0’ glitches across a variety of devices. The scheme is described and all devices are evaluated against it for the test circuit (Fig. 7).
Fig. 8. Cascading evaluations for NiSi 0.2 Device. Due to poor driving voltage at the input transistor and slow device, output node do31 does not properly discharge leading to loss of signal integrity.

Fig. 9. Cascading evaluations for Omega 0.2 Device. Despite poor driving voltage, signal integrity is preserved owing to faster device.
Fig. 10. Cascading evaluations for NiSi 0.2 and Omega 0.2 devices using 3-phase sequencing scheme. Logic ‘1’ glitching effects are reduced in this scheme, and NiSi 0.2 device shows expected behavior. However, logic ‘0’ glitching is critical for faster devices. Upward glitch on do21 during eva3 causes loss of signal integrity at do31 node.

Fig. 11. Noise resilient 4-phase sequencing scheme for the NASIC fabric. Additional hold phase (H2) inserted to separate evaluation from noise event. Green arrow shows do21 glitches only after eva3 has completed. Signals repeat every four stages.

Fig. 12 shows the output waveforms for the Omega 0.2 device with the new noise resilient scheme. As expected, the logic ‘0’ at do21 is already consumed before the glitching event occurs and does not affect do31. During eva3, stage 1 is in the new H2 phase, which essentially isolates the noise event from the propagation event preserving signal integrity. Thus, using the new noise resilient timing schemes, devices with lower intrinsic delays may be made functional in the NASIC fabric.

VI. DISCUSSION

This section discusses implications of the 4-phase noise resilient timing scheme on fabric performance, the effect of external noise sources (e.g., power supply droops) and manufacturing implications.

A. Performance Optimization and Evaluation

In general, it may be expected that the noise resilient 4-phase sequencing scheme would run at slower frequencies than the 3-phase and basic schemes since additional hold phases are inserted for noise resilience. However, since the 4-phase scheme provides better logic ‘1’ values and isolates logic ‘0’ glitches, faster devices could be leveraged with this scheme leading to significant performance improvements at the system level.

However, even with faster devices, NASIC dynamic circuits need to be optimized for performance. Specifically, due to noise cascading effects and high output impedance, charge at driving nodes and the associated gate-drive voltages are typically expected to be lower than $V_{DD}$. Since $I_{ON}$ is strongly dependent on $V_{GS}$, this implies that even devices with low intrinsic delays (e.g., Omega 0.2) may be operating at sub-optimal points, leading to large evaluation delays and poor circuit performance. Therefore, circuits need to be optimized in-fabric to improve $V_{GS}$ and performance.

Fig. 12. Cascading evaluations for NiSi (solid) and Omega (dotted) devices using the noise resilient 4-phase control scheme. Results show signal integrity and sufficient noise margins for logic ‘1’ glitches for both devices. Logic ‘0’ glitches have been isolated from evaluation events and are therefore not propagated. The new sequencing scheme achieves noise resilience and correct functionality for 4 out of 6 devices.

CMOS dynamic circuits typically use keeper devices or domino logic [19] for achieving low output impedance. A keeper device is part of a feedback network, which is turned ON when the output node is ‘1’, and OFF when it is ‘0’. Keeper configurations are typically achieved with an inverter and a PMOSFET. However, this may be hard to achieve on a regular NW based fabric without a large density impact, since it requires nanoscale customization and feedback, in addition to p-type FETs and static inverters for every NASIC dynamic gate. Similarly, domino logic would need insertion of static CMOS stages between tiles. These approaches cannot be directly integrated into the NASIC fabric.

One promising technique for increasing charge at the driving nodes is capacitance engineering. The key idea is to increase the overall capacitance (and consequently the charge stored) at input nodes, which reduces the magnitude of noise glitching and leads to higher gate voltages. While increased load capacitance at a node will have a linear impact on performance, the expectation is that a net benefit will be achieved due to the better-than-linear relationship between $I_{ON}$ and $V_{GS}$. Importantly, this technique does not impose new manufacturing challenges. A capacitance trench may be created at an input stage, increasing the net capacitance of all input nodes in that stage (Fig. 13). This would be done at the granularity of a NASIC stage (typically 10s – 100s of nm) using conventional photolithography steps and would be easier to achieve than in a conventional DRAM process, which requires isolated capacitors for every memory bit.
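The trade-off described here can be illustrated with a toy model. It assumes a capacitive-divider estimate for the degraded logic ‘1’ level and a power-law dependence of on-current on gate overdrive; both are simplifying assumptions rather than the behavioral device models used in the circuit simulations, and every name and number below is hypothetical.

```python
def drive_voltage(v_dd, c_couple, c_node, c_extra, dv_aggressor):
    # Degraded logic '1' level after a downward glitch (capacitive-divider estimate).
    return v_dd - abs(dv_aggressor) * c_couple / (c_couple + c_node + c_extra)


def relative_delay(v_gs, v_th, c_load, alpha=2.0):
    # Delay ~ C_load / I_ON, with I_ON assumed proportional to (V_GS - V_TH)**alpha.
    if v_gs <= v_th:
        return float("inf")
    return c_load / (v_gs - v_th) ** alpha


def stage_delay(c_extra, v_dd=0.8, v_th=0.2, c_couple=4.0, c_node=4.0,
                dv_aggressor=0.8, c_load=10.0):
    # The added capacitance both improves the input drive and loads the output.
    v_gs = drive_voltage(v_dd, c_couple, c_node, c_extra, dv_aggressor)
    return relative_delay(v_gs, v_th, c_load + c_extra)


# Sweep the added capacitance (arbitrary units) and record the relative delay.
sweep = {c: stage_delay(c) for c in (0.0, 8.0, 100.0)}
```

With these made-up parameters the delay first falls and then rises again as capacitance is added, which mirrors the qualitative argument: the gain from a higher drive voltage dominates at low drive voltages, and the linear loading penalty dominates once the drive voltage is already high.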

The test circuit used for performance evaluation with capacitance engineering is shown in Fig. 14. Stage 1 generates imperfect outputs and is subject to noise effects previously discussed. The time taken to fully discharge the output node of stage 2 is measured as a function of fan-in. Stage 3 loads stage 2. Capacitors shown in green are inserted at output nodes and improve drive voltages. It must be noted that these capacitances improve logic ‘1’ noise margins, since more charge is stored on the nodes and magnitude of downward glitching is reduced.

Experiments were done to characterize the evaluation delay of NASIC dynamic circuits as a function of fan-in. Maximum operating frequency is defined as $1/(N \times \text{delay})$, where $N$ is the number of distinct evaluate phases in the control scheme (explicitly, $N$ is 4 for 4-phase). The reasoning is that the minimum duration of any single evaluate phase has to be at least equal to the delay for completely discharging the output node through the pull-down network.
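The frequency definition above can be expressed as a short calculation. The following is a minimal sketch; the function names and the 100 ps example delay are hypothetical and serve only to illustrate the relationship between evaluation delay, number of phases, and maximum operating frequency.

```python
def max_operating_frequency(delay_s, n_phases=4):
    """Maximum operating frequency (Hz) = 1 / (N * delay).

    delay_s  : time to fully discharge the output node through the pull-down network (s)
    n_phases : number of distinct evaluate phases in the control scheme (4 for 4-phase)
    """
    if delay_s <= 0 or n_phases <= 0:
        raise ValueError("delay_s and n_phases must be positive")
    return 1.0 / (n_phases * delay_s)


def delay_from_frequency(freq_hz, n_phases=4):
    """Inverse relation: evaluation delay (s) implied by a given maximum frequency."""
    if freq_hz <= 0 or n_phases <= 0:
        raise ValueError("freq_hz and n_phases must be positive")
    return 1.0 / (n_phases * freq_hz)


if __name__ == "__main__":
    # Hypothetical 100 ps evaluation delay under a 4-phase scheme.
    f_max = max_operating_frequency(100e-12, n_phases=4)
    print(f"f_max = {f_max / 1e9:.2f} GHz")
```

Because every evaluate phase must accommodate the worst-case discharge, the frequency scales inversely with both the delay and the number of phases in the sequencing scheme.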

Fig. 15 shows drive voltage and maximum operating frequency vs. capacitance for fan-in 4 NASIC dynamic gates. Without any capacitive loading, a maximum frequency of 1.68 GHz is obtained. However, increasing the capacitance leads to a 5X improvement in performance. A key observation is that for smaller drive voltages, significant improvements in performance are seen. However, at higher drive voltages, the $I_{ON}$ vs. $V_{GS}$ relationship becomes more linear, and the effect of better driving voltages due to capacitance at the input node is negated by the linear impact of the output load capacitance.
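This trade-off can be summarized with a first-order expression. The relation below is an illustrative approximation and not a model extracted from the simulations; the symbols $t_{eval}$, $C_{out}$ and $\Delta V$ are introduced here only for this purpose.

$$ t_{eval} \approx \frac{C_{out}\,\Delta V}{I_{ON}(V_{GS})} $$

Here $C_{out}$ is the total capacitance at the output node being discharged, $\Delta V$ is the voltage swing, and $V_{GS}$ is the drive voltage at the input node. Added capacitance increases $C_{out}$ linearly but also raises $V_{GS}$. As long as $I_{ON}$ grows faster than linearly with $V_{GS}$, the ratio decreases and performance improves; once $I_{ON}$ becomes approximately linear in $V_{GS}$, the two effects offset each other and the gain saturates.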

For capacitance loading between 9 aF and 30 aF, only a 5% standard deviation is observed, implying that performance is not very sensitive to variations in the capacitance values. Also, new techniques to mitigate the impact of variability in nanoscale fabrics [20] may be leveraged to improve the performance further. Similar trends are seen at other fan-ins.

Fig. 16 shows the maximum operating frequency vs. maximum fan-in for the Omega 0.2 device with and without capacitance engineering. A consistent 4.5-6X performance improvement is seen for all fan-ins with capacitance engineering (e.g. for fan-in 10, maximum operating frequency increases from 798 MHz to 3.34 GHz). These results attest to the importance of achieving high drive voltages at input nodes.

Other techniques to further improve the performance of NASIC circuits are currently under investigation. One promising approach is based on depletion mode nanowire FET devices similar to those shown in [18] that could potentially be faster than inversion-mode devices. Other industry-standard optimizations such as using strained Silicon (also stated in [18]) would significantly improve the mobility of carriers in xnwFET devices.

B. Impact of Power Supply Droop on NASIC Fabric Functionality

The previous sections dealt exclusively with internal noise sources such as those arising from parasitic capacitances. Fundamentally, fabric design and optimizations have to be validated for functionality by mitigating internal noise. However, external effects such as power supply variation, clock skew, thermal vibrations and soft errors can also be detrimental to nanoscale fabric functionality. The latter two effects may partially be dealt with through built-in fault tolerance techniques incorporated in the NASIC fabric [7], [10]. With regard to clock skew, NASIC designs employ local interconnections between neighboring dynamic stages. The control signals that ‘clock’ NASIC stages are expected to be propagated on common rails from a Phase-Locked-Loop with local phase shifters generating the four-phase clock. Given the local interactions and the prescribed clocking structure, appreciable skew is not expected on control signals. However, systematic effects such as fluctuations in \( V_{DD} \) could still disrupt functionality, especially when considered in conjunction with internal noise sources.

In this section, we examine how \( V_{DD} \) changes may affect fabric functionality. The test circuit in Fig. 7 was used and the four devices examined were: Si 0.2, NiSi 0.2, Omega 0.3 and Omega 0.2. These devices were found to work correctly under nominal \( V_{DD} \) with the 4-phase noise resilient control scheme. \( V_{DD} \) was varied systematically for all the stages in the test design, because while across-chip variation in \( V_{DD} \) could be large, little local variation is expected for smaller circuits using the same supply rails. Up to 20% variation on either side of nominal (0.8 V) was considered.

Supply voltage spiking can be detrimental to logic ‘0’ outputs. However, these upward glitches can be isolated using the 4-phase noise resilient scheme and our simulations showed circuits with all four devices working correctly for up to a 20% spike in \( V_{DD} \). Droops in supply voltage on the other hand affect logic ‘1’s. The following results highlight the impact of power supply drooping.

The results are shown in Fig. 17 for the NiSi 0.2 (left) and Omega 0.2 (right) devices. The trends for Si 0.2 and Omega 0.3 are very similar to NiSi 0.2. Omega 0.2 is extremely resilient to \( V_{DD} \) noise (Fig. 17, right) due to its smaller intrinsic delay. Even when \( V_{DD} \) drops to 0.65 V (20% droop), the logic ‘1’ values are evaluated correctly and a strong ‘0’ is obtained at the do31 node. For NiSi 0.2, we see that for \( V_{DD} = 0.65 \) V, the stage 2 input devices are not fully turned on and do21 is not fully discharged. An ambiguous signal \( \approx V_{TH} \) is obtained and loss of signal integrity occurs at do31. While the voltage at do21 for \( V_{DD} = 0.65 \) V is only slightly higher than for \( V_{DD} = 0.7 \) V, the stage 3 nxnFET is much more strongly turned on, leading to incorrect discharge at the do31 node.

These results highlight that devices with smaller intrinsic delays are resilient to logic ‘1’ glitching caused by both internal and external noise sources. In conjunction with fabric level noise resilient sequencing schemes and capacitance engineering, faster devices may be leveraged for noise tolerant, high performance computational fabrics and systems.

C. Manufacturing Considerations

A scalable manufacturing pathway for the NASIC fabric was described in [9], [11]. Challenges with regard to nanowire growth and alignment, as well as logic functionalization and various techniques for nanowire grid formation, based on in-situ and ex-situ growth and alignment as well as direct patterning of substrates have been discussed. Defect tolerance aspects and parameter variability mitigation techniques have been presented in [10], [7] and [21], [20] respectively. Similarly, overlay and registration considerations for the NASIC fabric have been addressed in [20]. In this section, we focus on manufacturability aspects related to device design and optimization for noise mitigation. Other aspects are beyond the scope of this paper.

Reliable and scalable assembly of nanostructures and manufacturing pathways towards integrated systems continue to pose significant challenges. Therefore, two objectives must be concurrently achieved: i) device design and optimizations at device/circuit levels must target circuit functionality and fabric noise mitigation, and ii) in keeping with the fabric-centric mindset, physical layer assumptions targeting device structures must not pose insurmountable challenges to the manufacturing sequence.

A NASIC manufacturing sequence incorporating heavily doped silicon nanowire gates for n xnFETs has been previously proposed in [9]. In that sequence, the key challenges are the assembly of the nanowire grid as well as functionalization of selected crosspoints to determine the locations of the n xnFETs. No additional customization of individual FETs (e.g. arbitrary sizing, placement or doping) is required.

Silicidation of VLS grown nanowires with nickel for improved conductivity has been shown in [22]. A similar silicidation process may be appended to the NASIC manufacturing sequence [9] to achieve NiSi gate material as well as interconnect regions between n xnFETs. Since a final nickel silicidation step can be carried out after all ion implantation steps, thermal stability issues for NiSi material do not arise.

Omega-gated structures could be achieved by nanolithography based pattern and etch techniques. For example, Super-lattice Nanowire Pattern Transfer [23], [24] has shown metal nanowires at sub-15nm pitches. Snider et al. [25] have shown nanoimprint lithography based copper nanowires.

Two device engineering techniques discussed in the paper include the back-gate bias and the underlap. The substrate bias is applied to all devices in the fabric and therefore does not impose new manufacturing constraints. The underlap is envisioned to be created using a self-aligned process without any masking and is described below.

Self-aligned Underlap Formation: Source and drain junction underlap regions self-aligned to the gate nanowire are formed using spacer technology (Fig. 18). This process is similar to what is used to form highly doped drain and source (HDD) in CMOS devices and does not need any extra lithographic masking or overlay. During the anisotropic etch step (Fig. 18c), deposited material on nanowire sidewalls is not completely etched owing to higher thickness (Fig. 18b).

We believe that these physical layer choices, which carefully address manufacturing considerations, in conjunction with manufacturing-friendly device and fabric optimizations for noise and functionality, may pave the way for future nanowire-based integrated nano-fabrics.

VII. CONCLUSION

An integrated device-fabric exploration methodology encompassing physical layer assumptions, accurate 3-D physics based simulations of device structures and detailed circuit level functionality and noise evaluation was presented. Crossed nanowire FETs composed of different materials and structures were extensively simulated and current-voltage characteristics, parasitic capacitances as well as key device parameters such as on-current and intrinsic delay were characterized. Enhancement mode xnwFET devices were designed for two threshold voltage levels, 0.2 V and 0.3 V using heavily doped silicon, nickel silicide and Omega gates. An Omega-gated device with a metal workfunction of 4.5 eV was found to have the best on-current (18.5 µA) and intrinsic delay (0.59 ps). Behavioral models of the devices were created for a circuit simulator using regression analysis and curve fitting. Test circuits incorporating the devices were evaluated for noise effects and signal integrity for two baseline sequencing schemes. Only the fastest Omega 0.2 device can be made functional with the basic scheme owing to severe (0.6 V = 100% of noise margin) downward glitches on logic ‘1’ values that prevent accurate evaluation of cascaded circuits using slower devices. A 3-phase scheme had better resilience to logic ‘1’ glitching (30-40% of noise margin). However, circuits based on the fastest Omega 0.2 devices could not be made functional owing to logic ‘0’ glitches above 0.21 V ($V_{TH}$ for Omega 0.2) that caused incorrect discharge of the next stage leading to loss of signal integrity. These experiments showed that devices with smaller intrinsic delays are more resilient to logic ‘1’ noise and less resilient to logic ‘0’ noise. A new 4-phase noise resilient timing scheme was developed to handle both logic ‘1’ and logic ‘0’ glitches. Logic ‘0’ noise events were separated from evaluation events by modifying the control and introducing additional hold phases, thus enabling the use of faster devices in-fabric. Of the six devices considered, only the slowest NiSi 0.3 and Si 0.3 devices failed signal integrity tests owing to logic ‘1’ glitching. A capacitance engineering approach to improve drive voltages and performance of NASIC designs was introduced. This technique boosts circuit performance by 4.5-6X for Omega 0.2 devices with the 4-phase noise resilient control. The Omega 0.2 devices were also found to be more resilient to supply voltage droops, with no loss of signal integrity for voltage values up to 20% below nominal. Manufacturing considerations for device and fabric optimization were also discussed.

REFERENCES