Zhiting Lin, Zhongzhen Tong, Jin Zhang, Fangming Wang, Tian Xu, Yue Zhao, Xiulong Wu, Chunyu Peng, Wenjuan Lu, Qiang Zhao, and Junning Chen
Fig. 1. (Color online) Overall framework of static random-access memory (SRAM)-based computing in-memory (CIM) for the review: (a) various functions implemented in CIM, (b) operation functions realizable with CIM, and (c) application scenarios of CIM.
Fig. 2. (Color online) (a) Standard 6T SRAM cell, (b) dual-split 6T SRAM cell, and (c) 4+2T SRAM cell.
Fig. 3. (Color online) SRAM cells with separated read and write: (a) standard 8T SRAM bit-cell, (b) 7T SRAM cell, (c) 9T SRAM cell, and (d) 10T SRAM cell.
Fig. 4. (Color online) SRAM cells based on capacitive coupling: (a) C3SRAM bitcell and (b) M-BC bitcell.
Fig. 5. (Color online) (a) Transposable bitcell containing two pairs of access transistors, (b) separated read–write transposable bitcell, and (c) schematic of the transposable 10T bitcell.
Fig. 6. (Color online) Compact coupling structure: (a) 12T cell and (b) two-way transpose multibitcell.
Fig. 7. (Color online) (a) Asymmetric differential sense amplifier (SA), (b) flash ADC, and (c) successive approximation ADC.
Fig. 8. (Color online) (a) Weighted array with different capacitor sizes and (b) multi-period weighting technique using capacitors of the same size.
Fig. 9. (Color online) Schematic of the column-wise GBL_DAC circuit: (a) circuit of the constant current source, (b) two-stage MUX, and (c) waveform of the column-wise GBL_DAC circuit. (d) Schematic and waveform of the pulse height modulation circuit.
Fig. 10. (Color online) Redundant reference column technology.
Fig. 11. (Color online) (a) In-/near-memory computing peripherals and (b) a bit-tree adder.
Fig. 12. (Color online) Signed 4-b × 8-b least significant bit (LSB) multiplier: (a) timing diagram and (b) circuit schematic.
Fig. 13. (Color online) Boolean operation: (a) Boolean logical operations using an SRAM array, (b) histogram of AND and NOR operation voltages, and (c) schematic of the 8T-SRAM for implementing the IMP and XOR operations.
Fig. 14. (Color online) Column-wise BCAM: (a) search example in 3D-CAM and (b) 4+2T. Row-wise TCAM: (c) organization based on 10T and (d) organization based on 6T.
Fig. 15. (Color online) Schematic and truth table of the binary dot product: (a) 6T-SRAM binary dot product and (b) 8T-SRAM binary dot product. (c) Ternary dot product: operation of ternary multiplication and XNOR value mapping table.
Fig. 16. (Color online) Row of 9T SRAM cells for calculating the Hamming distance.
Fig. 17. (Color online) (a) Precharge weighting technology, (b) pulse width weighting, (c) pulse height weighting, and (d) pulse number weighting.
Fig. 18. (Color online) (a) 8T-SRAM memory array for computing dot products with 4-bit weight precision and (b) Twin-8T cell.
Fig. 19. (Color online) (a) Schematic of SAD circuit and (b) sequence diagram.
Fig. 20. (Color online) Implementation of (a) CNN and (b) AES on multiple SRAM arrays.
Fig. 21. (Color online) Application in the k-NN algorithm.
Fig. 22. (Color online) Application in classifier algorithms.
Fig. 23. (Color online) Read disturb issue.
Fig. 24. (Color online) (a) Single row activation during normal SRAM read operation, (b) multirow read and nonlinearity during CIM, and (c) inconsistent CIM calculation.
Fig. 25. (Color online) Approach of mapping from the common operator set to the actual circuits.
Fig. 26. (Color online) Architecture of the bidirectional CIM system, including a reusable and reconfigurable module.
Fig. 27. (Color online) Multithreaded CIM macro based on a pipeline processor.
| Parameter | Standard 6T | Dual-split 6T | 4+2T | Read and write separating | Read and write separating | Read and write separating | Capacitive coupling | Capacitive coupling | Transposable | Transposable | Transposable | Compact coupling | Compact coupling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Category | Structure of the 6T cell | Structure of the 6T cell | Structure of the 6T cell | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices |
| Reference | Ref. [19] | Refs. [1, 2] | Ref. [20] | Ref. [18] | Ref. [67] | Ref. [57] | Refs. [31, 58] | Refs. [59, 60, 61] | Refs. [12, 13] | Refs. [14, 55] | Ref. [11] | Ref. [17] | Ref. [43] |
| Cell type | 6T | 6T | 6T | 8T | 9T | 10T | 8T1C | 10T1C | 8T | 8T | 10T | 12T | TWT-MC |
| Process technology | 28-nm FDSOI | 65 nm | 55-nm DDC | 45 nm | 65 nm | 28 nm | 65 nm | 65 nm | 28 nm | 7 nm | 28 nm | 65 nm | 28 nm |
| Added circuit | No | No | Two read ports | One read port | One read port | Two read ports | Two transistors, one cap | Four transistors, one cap | Two read ports | One read port | Two read ports | Pull-up/down circuits | Multiply cell |
| Read/write disturb | Yes | Yes | No | No | No | No | No | No | No | No | No | No | Yes |
| Area efficiency | High | High | High | Med. | Med. | Med. | Low | Low | Med. | High | Low | Low | Low |
| TOPS/mm² | NA | 33.13 | NA | NA | NA | 170 | 20.2 | 0.6 | 27.3 | NA | NA | 5.461 | NA |
| TOPS/W | NA | 30.49–55.8 (1/1/5b¹) | 41.4 (1/1/1b¹) | NA | NA | 1002 (1/1/1b¹) | 671.5 (1/1/5b¹) | 192–400 (1–8b) | 0.56/5.27 (arbitrary) | 6.02 (8/1/11b¹) | 66.7 (1/1/1b¹) | 403 (1/1/3.46b¹) | 7.2–61.1 (2, 4, 8/4, 8/10, 12, 16, 20b¹) |

¹Input precision/weight precision/output precision. TWT-MC: two-way transpose multibit accumulation; DDC: deeply depleted channel; FDSOI: fully depleted silicon on insulator.
Table 1. Static random-access memory (SRAM) bitcells in CIM.
| Parameter | Ref. [19] | Ref. [28] | Ref. [20] | Ref. [29] | Ref. [5] | Ref. [68] | Ref. [11] |
|---|---|---|---|---|---|---|---|
| Technology | 28-nm FDSOI | 180 nm | 55-nm DDC | 28-nm FDSOI | 65 nm | 28 nm | 28 nm |
| Cell type | 6T | 8T | 4+2T | 6T | 8T | 14T | 10T |
| Array size | 64×64 | 8×8 | 128×128 | 128×64 | 128×128 | 1024×320 | 64×64 |
| Supply voltage (V) | 1 | 1.2 | 0.8 | 0.9 | 1.2 | 0.9 | 0.9 |
| CAM freq. (MHz) | 370 (1 V) | NA | 270 (0.8 V) | 1560 (0.9 V); 8.90 (0.38 V) | 813 (1.2 V) | 1330 (0.9 V) | 262 (0.9 V)¹; 256 (0.9 V)² |
| CAM energy (fJ/bit) | 0.6 (1 V) | NA | 0.45 (0.8 V) | 0.13 (0.9 V) | 0.85 (1.2 V) | 0.422 (0.9 V) | 1.025 (0.9 V)¹, 1.02 (0.9 V)²; 0.635 (0.7 V)¹, 0.632 (0.7 V)² |
| Logic freq. (MHz) | NA | NA | 230 (0.8 V) | NA | 793 (1.2 V) | NA | ~300 (0.9 V) |
| Logic energy (fJ/bit) | NA | NA | 24.1 (0.8 V) | NA | ~31 (1.2 V); ~22.5 (1 V); 16.6 (0.8 V) | NA | ~15 (0.9 V); ~12.5 (0.8 V); ~10.5 (0.7 V) |
| Search mode | 1 | 1 | 2 | 2 | 1 | 2 | 1, 2 |
| Function | SRAM/CAM/Logic | SRAM/TCAM/Left shift/Right shift | SRAM/CAM/Logic | BCAM/SRAM/Pseudo-TCAM | SRAM/CAM/Logic | SRAM/TCAM | SRAM/CAM/Logic/Matrix transpose |

¹Row-wise search. ²Column-wise search. DDC: deeply depleted channel; FDSOI: fully depleted silicon on insulator.
Table 2. Summary of chip parameters and performance of in-memory Boolean logic and CAM.
| Parameter | Refs. [1, 2] | Ref. [10] | Refs. [15, 16] | Ref. [32] | Ref. [31] | Ref. [44] | Ref. [33] | Refs. [24, 37] | Ref. [34] |
|---|---|---|---|---|---|---|---|---|---|
| Technology | 65-nm CMOS | 65-nm CMOS | 130-nm CMOS | 55-nm CMOS | 65-nm CMOS | 65-nm TSMC | 55-nm CMOS | 7-nm FinFET | 65 nm |
| Cell structure | DCS 6T | 10T | 6T | Twin-8T | 8T1C | 6T | 6T | 8T | 6T |
| Array size | 4 Kb | 16 Kb | 16 Kb | 64×60 b | 2 KB | 64 Kb | 4 Kb | 4 Kb | 64 Kb |
| Chip area (μm²) | NA | 6.3×10⁴ | 2.67×10⁵ | 4.69×10⁴ | 8.1×10⁴ | NA | 5.94×10⁶ | 3.2×10³ | 1.75×10⁵ |
| Input precision (bit) | 1 | 6 | 5 | 1, 2, 4 | 1 | 5 | 1, 2, 7, 8 | 4 | 4 |
| Weight precision (bit) | 1 | 1 | 1 | 2, 5 | 1 | 5 | 1, 2, 8 | 4 | 1, 2, 3, 4, 5, 8 |
| Output precision (bit) | 1 | 6 | NA | 3, 5, 7 | 5 | NA | 3, 7, 10, 19 | 4 | NA |
| Computing mechanism | Analog | Digital + analog | Digital + analog | Analog | Analog | Analog | Digital + analog | Analog | Analog |
| Model | XNORNN/MBNN | CNN | Classify | CNN | CNN | VGG, LeNet-5 | CNN | VGG-9 NN | CNN |
| Energy efficiency (TOPS/W) | 30.49–55.8 | 40.3 (1 V); 51.3 (0.8 V) | NA | 18.37–72.03 | 671.5 | NA | 0.6–40.2 | 351 (0.8 V) | 49.4 (input: 4b, weight: 1b) |
| Throughput (GOPS) | 278.2 | 8 (1 V); 1 (0.4 V) | NA | 21.2–67.5 | 1638 | NA | 5.14–329.14 | 372.4 (0.8 V) | 573.4 (input: 4b, weight: 2b) |
| Accuracy, MNIST | 96.5% (XNORNN); 95.1% (MBNN) | 98% (0.8 V); 98.3% (1 V) | 90% | 90.02%–99.52% | 98.30% | 99% | 98.56%–99.59% | 98.51%–99.99% | 98.80% |
| Accuracy, CIFAR-10 | NA | NA | NA | 85.56%–90.42% | 85.50% | 88.83% | 85.97%–91.93% | 22.89%–96.76% | 89.00% |
Table 3. Summary of chip parameters and performance of single- and multibit operations.