DESIGN OF SPIN WAVE FUNCTIONS-BASED LOGIC CIRCUITS

PRASAD SHABADI*, SANKARA NARAYANAN RAJAPANDIAN†,
SANTOSH KHASANVIS‡ and CSABA ANDRAS MORITZ§

Electrical and Computer Engineering, University of Massachusetts
Knowles Engineering Building, 309H, Amherst, Massachusetts 01003 USA

*shabadi@ecs.umass.edu
†srajarapandian@ecs.umass.edu
‡khasanvis@ecs.umass.edu
§andras@ecs.umass.edu

Received 11 July 2012
Accepted 13 September 2012
Published

Over the past few years, several novel nanoscale computing concepts have been proposed as potential post-complementary metal oxide semiconductor (CMOS) computing fabrics. In these, the key focus is on inventing a faster and lower power alternative to conventional metal oxide semiconductor field effect transistors. Instead, we propose a fundamental shift in mindset towards more functional building blocks, replacing simple switches with more sophisticated information encoding and computing based on alternate state variables to achieve significantly more efficient and compact logic. Specifically, we propose wave computation enabled by magnetic spin wave interactions, called spin wave functions (SPWFs). In SPWFs, computation is based on wave interference and information can be encoded in a wave’s phase, amplitude and frequency. In this paper, we provide an update on key fabric concepts and design aspects. Our analysis shows that circuit design choices can have a significant impact on the device capabilities required of the fabric, and vice versa. We therefore adopt an integrated fabric-circuit exploration methodology. Control schemes for wave streaming and synchronization are also discussed with several SPWF circuit topologies. Our estimations show that significant area and power benefits can be expected for SPWF-based designs versus CMOS. In particular, for a 1-bit adder, up to 40X area benefit and up to 304X power consumption reduction may be possible with an SPWF-based implementation versus 45 nm CMOS.

Keywords: Wave computation; spin wave functions; compressed data representation; parallel counters; magneto electric effect; topology.

1. Introduction

CMOS technology has dominated the integrated circuit (IC) industry over the past few decades. Growth within the IC industry has been primarily driven by scaling of transistor geometries leading to better performance and lower power consumption.

*†‡§Corresponding authors.
Fig. 1. Devices for nanofabrics: (left) conventional switch; (right) envisioned device with alternate state variables.

However, such scaling is fast approaching its fundamental limits, leading to exciting research opportunities on post-CMOS computing fabrics. Several alternate devices based on new physical phenomena such as electron spin, molecular state, material phase and others have been actively investigated as possible alternatives to the metal oxide semiconductor field effect transistors (MOSFETs).1–3 The focus often is to replace the conventional MOSFET-based switching element with a better (faster, lower power) alternative, while preserving the rest of the fabric paradigm.

While it is important to explore alternate devices and new physical phenomena, we propose a fundamental shift in mindset toward sophisticated building blocks versus simple switches (see Fig. 1), novel information encoding schemes, efficient interconnects and new nanoscale architectures for building efficient nanoscale systems. We explore one possible approach with wave computation enabled by the spin wave physical phenomenon. Spin wave-based computation does not involve physical movement of charged particles, thus potentially leading to ultra-low power operation. In addition, the use of ferromagnetic materials for realizing spin wave functions (SPWFs) enables nonvolatile computation. Thereby, orders of magnitude area and power benefits versus state-of-the-art CMOS may be possible.

In this paper, we provide an update on the core concepts and design aspects of the SPWF nanofabric. The benefits of non-Boolean multidimensional logic are illustrated with adders and parallel counters. The rest of the paper is organized as follows. Section 2 introduces the concept of SPWFs with a detailed description of features and benefits. SPWF logic design with parallel counters as examples is presented in Sec. 3. SPWF topology and layout trade-offs are discussed in Sec. 4. Comparisons and benefits versus CMOS are shown in Sec. 5. Section 6 concludes the paper.

2. Spin Wave Functions

A spin wave is the collective oscillation of spins in an ordered spin lattice around the direction of magnetization in ferromagnetic materials.9–12 In the proposed nanofabric, logic functions themselves are used as elementary building blocks for computation enabled by spin wave interactions, called spin wave functions.13 Computation is through wave superposition and is based on a non-Boolean multidimensional function fusion paradigm where information is encoded in spin wave amplitude, phase and frequency.

The key fabric components are: (i) ferromagnetic waveguides for spin wave propagation and (ii) magneto electric (ME) cells which behave as transducers converting information from electrical to spin domain and vice versa (see Fig. 2). In addition to providing the I/O mechanism, the ME cells also enable nonvolatile information storage. This eliminates the need for separate latches/flip-flops in the designs. ME cells are also capable of amplifying spin waves and thus provide a mechanism for encoding information in the spin wave amplitude. This feature would also be necessary for restoring wave amplitudes in the interconnect spin wave bus (SWB) to preserve signal integrity.

Fig. 2. Physical structure of the spin wave nanofabric showing mainly the ME cells and the spin wave bus (SWB).

2.1. Intuition for SPWF benefits

SPWFs enable alternate ways to encode information and achieve arbitrary logic functions in an efficient manner. Conventional CMOS Boolean logic is inefficient because circuits are implemented using Boolean gates which are limited in fan-in (generally two or three inputs due to an exponential degradation in performance for higher fan-in). This leads to multiple logic levels with a large number of gates and complex interconnects.

Majority/threshold logic style has been proposed as a potential alternative to overcome CMOS Boolean logic limitations. Figures 3(a) and 3(b) show the conceptual comparison of the Boolean versus majority logic styles. As shown in Fig. 3(b), majority logic would lead to fewer logic levels and gates for implementing a given logic function.14,15 However, current majority gate implementations are again inefficient and complex, since conventional MOSFETs are used to build circuits that emulate majority behavior.16–18 Thus the benefits due to reduction in logic levels are often quickly negated by the implementation complexity of individual majority gates, and by the fact that the computation result is entirely in the Boolean domain. Hence, this logic style has had very little impact on current very large scale integration (VLSI) circuit designs.

In contrast, SPWF uses wave superposition which naturally enables efficient majority functions in a single step without the need for special circuits/gates [see Fig. 3(c)]. In addition, spin waves are capable of compressed information encoding in multiple wave attributes simultaneously such as phase, amplitude and frequency. This eliminates the need for replicating signals. SPWFs are also capable of supporting higher bit-widths since there is no restriction on the number of waves interfering at a given point. These factors lead to significant reduction in logic complexity for SPWFs when compared to conventional approaches, potentially resulting in area and performance benefits. Also, the fact that SPWFs do not involve charge transport in conjunction with nonvolatility results in energy-efficient systems.

3. SPWF Logic Design

SPWFs form the basic building blocks for logic design. While wave frequency could also be used for information encoding, the designs shown here use only wave phase and amplitude. Using parallel counters as an example, we illustrate the approach and benefits of SPWF logic design. Parallel counters are digital circuits with “n” inputs and “log2(n+1)” output bits representing the number of 1’s in the set of “n” input bits.19,20 Generally, parallel counters are used in the realization of fast parallel multipliers.

Fig. 3. Generic diagram showing conceptual difference between conventional Boolean, Majority (MAJ) and SPWF logic styles.

Fig. 4. SPWF layout of (3,2) parallel counter. After the input waves interfere, the output waves (W3, W4, W5) encode information in both phase and amplitude, leading to compressed information representation.

3.1. SPWF (3,2) parallel counter

Figure 4 shows the SPWF implementation of a (3,2) parallel counter. A (3,2) parallel counter has three inputs (A2, A1, A0) and two outputs (O1, O0) which together represent the number of input waves with phase “π”. It can be observed that a (3,2) parallel counter is functionally equivalent to a 1-bit full adder.

The output logic functions are given by the following equations (here MAJ indicates a Majority function):

\[ O_1 = \text{MAJ}(A_2, A_1, A_0), \tag{1} \]
\[ O_0 = \text{MAJ}(A_2, A_1, A_0, -2\,\text{MAJ}(A_2, A_1, A_0)). \tag{2} \]

Output “O1” is generated by a simple superposition of the three incoming waves. Compressed information encoding in wave amplitude and phase is illustrated here by the information content on wave “W4”. While the phase of the output wave “W4” represents the majority decision, its amplitude indicates the number of input waves which resulted in that particular majority decision. Thus the first interference also acts as a “pre-computation” step to generate output “O0”, which requires this information from primary inputs (A2, A1, A0) as indicated in Eq. (2). This eliminates the need to replicate the primary inputs for computing “O0” and results in a compact function implementation.
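The two interference steps can be checked numerically with a small phasor model. The following is a minimal sketch, not the authors' simulation: it assumes ideal, lossless unit-amplitude waves, maps logic 1 to phase π and logic 0 to phase 0, and reads a logic value from the sign of the real part of the resultant wave. The function names are hypothetical.

```python
import cmath
import math


def wave(bit, amplitude=1.0):
    """Phasor of one spin wave (assumed mapping: logic 1 -> phase pi, logic 0 -> phase 0)."""
    return amplitude * cmath.exp(1j * math.pi * bit)


def read_phase(w):
    """Logic value carried by the phase of a resultant wave (1 if the phase is pi)."""
    return 1 if w.real < 0 else 0


def counter_3_2(a2, a1, a0):
    """Return (O1, O0, amplitude of the first interference) for a (3,2) counter."""
    # First interference, Eq. (1): plain superposition of the three inputs.
    w = wave(a2) + wave(a1) + wave(a0)
    o1 = read_phase(w)
    # Second interference, Eq. (2): add the inverted majority wave with twice
    # the unit amplitude to the already-compressed wave.
    w_sum = w - 2.0 * wave(o1)
    o0 = read_phase(w_sum)
    return o1, o0, round(abs(w))


for a2 in (0, 1):
    for a1 in (0, 1):
        for a0 in (0, 1):
            o1, o0, amp = counter_3_2(a2, a1, a0)
            print(a2, a1, a0, "->", o1, o0, "amplitude", amp)
```

In this model the amplitude of the first resultant wave is 3 when all inputs agree and 1 when they split two to one, which is the extra information the second step reuses instead of replicating the primary inputs.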

Generally, in CMOS technology, parallel counter sizes are limited to either (3,2) or (7,3). However, the high fan-in capability of SPWFs enables efficient realization of higher bit-width parallel counters. Similar to the (3,2) counter design, we show the designs of (7,3) and (15,4) parallel counters in Figs. 5 and 6, respectively. As mentioned before, the pre-computation step with compressed information encoding significantly reduces the layout complexity when compared to conventional approaches. Reference 21 shows detailed benchmarking and comparisons of SPWF parallel counters versus conventional CMOS Boolean implementations. For the SPWF-based (3,2) parallel counter, up to 40X power reduction and 53X area reduction can be expected with comparable performance. Higher benefits were observed for the (15,4) parallel counter, with up to 90X power and about 103X area reduction with approximately 1.36X performance benefit.

Fig. 5. SPWF layout of (7,3) parallel counter. All the inputs are first compressed into a single wave which is later used to generate the output bits.
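One way to picture the "compress first, then extract bits" idea for larger counters is the arithmetic sketch below. It is an assumption-laden generalization of Eq. (2), not the layout of Figs. 5 and 6: inputs are modeled as signed unit amplitudes (logic 1 as -1 for phase π, logic 0 as +1), the number of inputs is assumed to be 2^m - 1, and each output bit is read from the sign of the running wave before a weighted wave opposing that decision is superposed. The function name `parallel_counter` is hypothetical.

```python
from itertools import product


def parallel_counter(bits):
    """Count the 1's in `bits` by repeated majority decisions on one compressed wave.

    Sketch only: assumes len(bits) == 2**m - 1 and ideal signed amplitudes.
    Returns the output bits, most significant first.
    """
    n = len(bits)
    m = (n + 1).bit_length() - 1
    if (1 << m) - 1 != n:
        raise ValueError("this sketch assumes n = 2**m - 1 inputs")

    # Compression step: a single wave whose signed amplitude is the sum of inputs.
    s = sum(-1 if b else 1 for b in bits)

    outputs = []
    for level in range(m - 1, -1, -1):
        bit = 1 if s < 0 else 0          # majority decision = sign of the wave
        outputs.append(bit)
        if level:                        # no correction needed after the last bit
            weight = 1 << level          # 2**level; equals 2 in Eq. (2) for the (3,2) case
            s += weight if bit else -weight   # superpose the inverted decision
    return outputs


# Exhaustive check of the (7,3) case against a direct population count.
for inputs in product((0, 1), repeat=7):
    o2, o1, o0 = parallel_counter(inputs)
    assert 4 * o2 + 2 * o1 + o0 == sum(inputs)
```

For three inputs the loop reduces to exactly the two steps of Eqs. (1) and (2).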

4. SPWF Design Aspects: Fabric Capabilities, Topology, Synchronization and Clocking

Our approach to designing circuits with the SPWF fabric places strong emphasis on an integrated exploration methodology that encompasses the physical layer, circuit style, layout topology, synchronization and clocking. Emerging physical phenomena like spin waves impose new constraints at various design levels, thus necessitating such an integrated approach.

4.1. SPWF topologies and ME cell capabilities

In contrast to CMOS layouts which mainly impact circuit performance/area/power, SPWF layouts directly control the functionality of the circuit. This is due to the fact that wave propagation length controls the phase of the wave at the point of interference. Since waves encode information both in amplitude and phase, careful consideration is needed on layout to ensure correct output functionality.
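The dependence of function on path length can be illustrated with a simple single-frequency model. This is a sketch under stated assumptions rather than a device simulation: waves are lossless, the accumulated phase is taken as 2πL/λ, and λ = 100 nm is borrowed from the note to Table 1. The function names are hypothetical.

```python
import cmath
import math

WAVELENGTH_NM = 100.0  # lambda as listed in the note to Table 1


def accumulated_phase(length_nm, wavelength_nm=WAVELENGTH_NM):
    """Phase picked up over a waveguide of the given length (assumed 2*pi*L/lambda)."""
    return 2.0 * math.pi * length_nm / wavelength_nm


def interfere(lengths_nm, wavelength_nm=WAVELENGTH_NM):
    """Resultant phasor of unit waves launched in phase over the given path lengths."""
    return sum(cmath.exp(1j * accumulated_phase(l, wavelength_nm)) for l in lengths_nm)


# Equal paths add constructively; a half-wavelength mismatch cancels the waves.
print(abs(interfere([300.0, 300.0])))
print(abs(interfere([300.0, 250.0])))
```

In this idealized picture, an extra half wavelength of waveguide turns a wave into its logical inverse at the point of interference, which is why the layout itself determines the computed function.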

Several design alternatives are presented in this section with different waveguide topologies. We use the SPWF 1-bit full adder layout to show the interdependence between layout, circuit style and ME cell capabilities. We have identified that some designs require ME cells with Amplitude Tracing capability. Amplitude Tracing refers to the ability of the ME cells to regenerate new spin waves with variable/dynamic amplitudes depending on the amplitude of the incoming spin waves (tracing).

Figure 7 shows the three different topologies with different assumptions on ME cell capabilities.

Figure 7(a) shows a highly compact design; however, the amplitude of the wave from the “C_{out}” ME cell to the Intermediate (I-2) ME cell and “Sum” ME cell will be dynamic, depending on the interference of the three input waves. This would require sophisticated ME cells (possibly with some feedback mechanism) with Amplitude Tracing capability.

By redesigning the layout such that the “C_{out}” ME cell is away from the point of interference, the amplitude information of the resultant waves is preserved. Thus a 1-bit SPWF adder can be realized without the need for Amplitude Tracing ME cells. This design is shown in Fig. 7(b).

A special layout requirement in both designs, Figs. 7(a) and 7(b), is that the waveguides between the Intermediate ME cells and the output ME cells need to be carefully patterned to implement inversion. This requirement can be eliminated by using dual-rail logic, which does not require any internal inversions due to the explicit use of both true and complementary inputs. An inversion-free layout is shown in Fig. 7(c). In addition to simplifying the ME cell capabilities, this design also simplifies the control schemes required for synchronization, which are discussed in the next section.
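The reason dual-rail logic needs no internal inversion can be seen in a small behavioral model. The sketch below is illustrative only and does not reproduce the layout of Fig. 7(c): each signal is assumed to be carried as a (true, complement) pair of signed unit amplitudes, with logic 1 as -1 (phase π) on the true rail, and the "Sum" interference simply taps the complementary carry rail instead of inverting the true one. All names are hypothetical.

```python
def sign(x):
    """Phase decision of a resultant wave, as a signed unit amplitude."""
    return 1 if x > 0 else -1


def dual_rail(bit):
    """(true rail, complement rail); assumed mapping: logic 1 -> -1 on the true rail."""
    return (-1, 1) if bit else (1, -1)


def decode(rails):
    """Logic value of a dual-rail pair, read from the true rail."""
    return 1 if rails[0] < 0 else 0


def dual_rail_adder(a, b, c):
    """1-bit full adder in which inversion is only a choice of rail."""
    (at, af), (bt, bf), (ct, cf) = dual_rail(a), dual_rail(b), dual_rail(c)
    s_true = at + bt + ct              # superposition on the true rails
    s_comp = af + bf + cf              # superposition on the complement rails
    cout = (sign(s_true), sign(s_comp))
    # Sum: weight-2 wave taken from the opposite carry rail, so no inverter
    # and no path-length trick is needed.
    total = (sign(s_true + 2 * cout[1]), sign(s_comp + 2 * cout[0]))
    return decode(cout), decode(total)


for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            carry, total = dual_rail_adder(a, b, c)
            assert 2 * carry + total == a + b + c
```

The cost of this choice in the model is a second set of waves for every signal, which is consistent with the larger area reported for the inversion-free design.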

4.2. Synchronization and clocking

Fig. 7. 1-bit SPWF adder designs with different assumptions on ME cell capabilities. (a) Custom SPWF layout with Amplitude Tracing ME cells. (b) SPWF layout without any Amplitude Tracing ME cells. Here, all ME cells generate waves of fixed amplitude. (c) Inversion-free 1-bit SPWF adder based on dual-rail logic. In addition to relaxing the constraints on ME cell capabilities, this design eliminates layout pattern-based inversion.

In addition to patterning functionally correct layouts, careful consideration is also needed to ensure that spin waves are excited and captured at specific time instants to ensure correct functionality. Synchronization aspects of SPWF designs are closely related to how ME cells operate to (i) generate new waves and (ii) capture information from incoming waves. The current ME cell is a bi-stable device with an energy separation between the two stable states. An additional meta-stable state can be used to reduce the amount of energy necessary to switch the ME cell from one stable state to another.

Thereby, a combination of layout patterning techniques and external electrical control signals is used to ensure that waves are generated and captured correctly. In this section, we discuss this aspect of SPWF designs with several layout topologies for an SPWF 1-bit adder. Figure 8 shows three different variants of the SPWF 1-bit adder with different fabric assumptions and control schemes.

The design shown in Fig. 8(a) uses layout to achieve logic inversion for generating the “Sum” output. This is a custom layout technique which leads to a highly compact design. However, from a synchronization perspective, we need separate control signals to force the Intermediate (I-2) ME cells and the “Sum” output ME cells into the meta-stable state, primarily due to the layout imbalance between the paths of the waves that interfere to produce the “Sum” output. The corresponding timing diagrams in Fig. 8(a) show that two separate external control signals are needed for correct operation of this design.

The primary motivation for the designs shown in Figs. 8(b) and 8(c) was to make the layouts balanced. This would ensure that waves travel equal distance before superposition leading to simplified control schemes. In Fig. 8(b), balancing is achieved by using special waveguides with a pinned magnetic layer. In comparison with the regular single layer ferromagnetic waveguides, the pinned layer provides additional phase shift of “π” for the same propagation length. Thereby, a single external clock signal is sufficient to control both output and Intermediate ME cells.

While the design shown in Fig. 8(b) achieves balanced layouts, it needs special waveguides with a pinned magnetic layer. A possible third alternative is to eliminate intermediate inversions in the design by using dual-rail logic [see Fig. 8(c)]. This example further shows the importance of an integrated fabric-circuit-layout exploration methodology in such unconventional computing fabrics. Detailed benchmarking, methodology and comparison results for these SPWF 1-bit adder designs are presented in the next section.

5. Benchmarking versus CMOS

In this section, we show comparisons of various SPWF-based 1-bit adder designs versus 45 nm custom CMOS design. We show area, power and performance evaluations of the Amplitude Tracing, Amplitude Tracing Free, Inversion Free and the balanced-pinned layer-based designs. Fabric assumptions and evaluation methodology are also presented.

5.1. Fabric assumptions

Fig. 8. Different 1-bit SPWF adder designs based on synchronization and external control requirements. (a) Custom layout with two separate external control signals. (b) Balanced layout with only one external control signal. This design uses a special waveguide with a pinned magnetic layer to enable inversion while still using a balanced layout. (c) Balanced layout based on dual-rail logic without using a pinned magnetic layer.

The CMOS full adder design was implemented using the North Carolina State University (NCSU) 45 nm Process Design Kit (PDK). SPWF fabric assumptions are based on theoretical simulations and experimental work at the Device Research Laboratory at UCLA.21,22 For the evaluations shown in this paper, an ME cell dimension of 100 nm × 100 nm with a switching delay of 100 ps is used. Based on a simple capacitive approximation, the ME cell switching energy is calculated to be around 10 aJ per switching event.22 For delay calculations in the waveguides, the spin wave group velocity is assumed to be 10^4 m/s, consistent with the note to Table 1.10
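As a quick consistency check on these assumptions, the per-cell switching power and footprint can be derived directly from the stated energy, delay and dimensions. The snippet below is only back-of-the-envelope arithmetic; the variable names are illustrative.

```python
# Fabric assumptions quoted in the text above.
ME_SWITCHING_ENERGY_J = 10e-18   # about 10 aJ per switching event
ME_SWITCHING_DELAY_S = 100e-12   # 100 ps
ME_SIDE_M = 100e-9               # 100 nm x 100 nm cell

# Power while switching = energy per event / duration of the event.
me_switching_power_w = ME_SWITCHING_ENERGY_J / ME_SWITCHING_DELAY_S

# Cell footprint in square micrometres.
me_area_um2 = (ME_SIDE_M * 1e6) ** 2

print(f"ME switching power: {me_switching_power_w * 1e9:.0f} nW")
print(f"ME cell area: {me_area_um2:.2f} um^2")
```

The resulting 100 nW per switching cell agrees with the ME switching power listed in the note to Table 1.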

5.2. Methodology

For a 1-bit full adder, the path from the inputs to the “Sum” output forms the critical path. Thereby, delay calculations for the proposed adders are based on evaluating the “Sum” logic for both SPWF and CMOS designs. For the SPWF version, delay is determined by the total number of ME cells along the critical path and the wave propagation distance. CMOS delay calculations are obtained using HSPICE simulation for the worst-case input pattern.
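The SPWF delay estimate described above amounts to a two-term sum. A minimal sketch is given below, assuming the delay is simply additive in ME cell switching time and propagation time, and using the 100 ps ME delay and the 10^4 m/s wave velocity listed in the note to Table 1. The path in the example (three ME cells and 600 nm of waveguide) is hypothetical and is not taken from the layouts in Figs. 7 and 8.

```python
ME_DELAY_S = 100e-12        # ME cell switching delay
WAVE_VELOCITY_M_S = 1e4     # spin wave velocity from the note to Table 1


def spwf_delay_s(me_cells_on_path, propagation_m,
                 me_delay_s=ME_DELAY_S, velocity_m_s=WAVE_VELOCITY_M_S):
    """Critical-path delay: ME cell switching time plus wave propagation time."""
    return me_cells_on_path * me_delay_s + propagation_m / velocity_m_s


# Hypothetical critical path, for illustration only.
delay = spwf_delay_s(me_cells_on_path=3, propagation_m=600e-9)
print(f"{delay * 1e12:.0f} ps")
```

With these numbers each ME cell contributes as much delay as 1 µm of waveguide, so in this model the ME cell count on the critical path dominates for compact layouts.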

As mentioned earlier, spin wave propagation does not involve physical movement of charged particles. Thereby, only ME cell switching activity is considered for evaluating the SPWF adder power consumption. For the CMOS designs, power evaluations are based on calculating the average power consumption for 40 trials, each with 1000 random input patterns.
Table 1. 1-bit Adder Comparison versus 45 nm NCSU PDK-Based Custom CMOS Layout.

<table>
<thead>
<tr>
<th>Fabric</th>
<th>Design</th>
<th>Delay (ps)</th>
<th>Power (µW)</th>
<th>Complexity</th>
</tr>
</thead>
<tbody>
<tr>
<td>CMOS</td>
<td>Custom</td>
<td>250</td>
<td>36.5</td>
<td>Area = 20 µm²</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Transistor Count = 32</td>
</tr>
<tr>
<td>SPWF with I/O ME</td>
<td>With AT</td>
<td>475</td>
<td>0.12</td>
<td>Area = 0.5 µm²</td>
</tr>
<tr>
<td></td>
<td>W/O AT</td>
<td>375</td>
<td>0.16</td>
<td>ME Count = 6</td>
</tr>
<tr>
<td></td>
<td>Balanced Pinned</td>
<td>390</td>
<td>0.18</td>
<td>ME Count = 7</td>
</tr>
<tr>
<td></td>
<td>Inversion Free</td>
<td>390</td>
<td>0.25</td>
<td>Area = 1.8 µm²</td>
</tr>
</tbody>
</table>

Note: AT = Amplitude Tracing, λ = 100 nm, ME cell = λ², ME delay = 100 ps, Wave velocity = 10^4 m/s, ME switching power = 100 nW.

For CMOS area calculation, a custom 45 nm layout was designed. SPWF area is mainly determined by the total number of ME cells in the design and the area needed to pattern a specific layout. These calculations are made directly from the layouts shown in Figs. 7 and 8.

5.3. Comparison results

Table 1 shows the comparison results for the various SPWF 1-bit adders versus the 45 nm CMOS design. These results show that up to 40X area reduction and up to 304X power reduction can be expected with the SPWF versions, at the cost of a longer delay. With a slight penalty in area and power consumption, the balanced-pinned and the inversion-free designs significantly relax fabric requirements and also simplify the control schemes.
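The headline ratios follow directly from the entries of Table 1. The short calculation below reproduces them from the CMOS row and the SPWF "With AT" row; the dictionary layout and key names are illustrative only.

```python
# Values copied from Table 1.
cmos = {"delay_ps": 250.0, "power_uw": 36.5, "area_um2": 20.0}
spwf_with_at = {"delay_ps": 475.0, "power_uw": 0.12, "area_um2": 0.5}

power_ratio = cmos["power_uw"] / spwf_with_at["power_uw"]
area_ratio = cmos["area_um2"] / spwf_with_at["area_um2"]
delay_ratio = spwf_with_at["delay_ps"] / cmos["delay_ps"]

print(f"power reduction: {power_ratio:.0f}X")
print(f"area reduction:  {area_ratio:.0f}X")
print(f"delay increase:  {delay_ratio:.1f}X")
```

The same arithmetic shows the trade-off on performance: the Amplitude Tracing design is about 1.9X slower than the custom CMOS adder.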

6. Conclusion

We have presented a novel multidimensional spin wave-based computational paradigm. The idea of using sophisticated functions as building blocks was presented with the concept of SPWFs. These functions are based on the superposition of spin waves that are created by collective excitation of individual electron spins in ferromagnetic materials. Our explorations have shown that SPWFs provide many fabric tuning knobs to enable novel computing models with unconventional information encoding schemes. In this paper, we have demonstrated the data compression and pre-computation capabilities of SPWFs with parallel counters and adders as examples. Various design aspects were explored and trade-offs were presented based on an integrated fabric-circuit-layout exploration methodology. The impact of layout on the necessary fabric capabilities and control mechanisms was shown with detailed descriptions. Benchmarking comparisons show that up to 40X area benefit and up to 304X power consumption reduction can be expected with the proposed SPWF adder designs versus 45 nm CMOS. Thereby, the proposed SPWF fabric is a promising option for post-CMOS electronics. This can potentially be game-changing for high fan-in arithmetic applications, especially in the areas of cryptography, graphics processing and energy-autonomous processing.

References


