• Journal of Semiconductors
  • Vol. 43, Issue 3, 031401 (2022)
Zhiting Lin, Zhongzhen Tong, Jin Zhang, Fangming Wang, Tian Xu, Yue Zhao, Xiulong Wu, Chunyu Peng, Wenjuan Lu, Qiang Zhao, and Junning Chen
Author Affiliations
  • School of Integrated Circuits, Anhui University, Hefei 230601, China
    DOI: 10.1088/1674-4926/43/3/031401
    Zhiting Lin, Zhongzhen Tong, Jin Zhang, Fangming Wang, Tian Xu, Yue Zhao, Xiulong Wu, Chunyu Peng, Wenjuan Lu, Qiang Zhao, Junning Chen. A review on SRAM-based computing in-memory: Circuits, functions, and applications[J]. Journal of Semiconductors, 2022, 43(3): 031401
    Fig. 1. (Color online) Overall framework of static random-access memory (SRAM)-based computing in-memory (CIM) for the review: (a) various functions implemented in CIM, (b) operation functions realizable with CIM, and (c) application scenarios of CIM.
    Fig. 2. (Color online) (a) Standard 6T SRAM cell, (b) dual-split 6T SRAM cell, and (c) 4+2T SRAM cell.
    Fig. 3. (Color online) SRAM cells with separated read and write: (a) standard 8T SRAM bit-cell, (b) 7T SRAM cell, (c) 9T SRAM cell, and (d) 10T SRAM cell.
    Fig. 4. (Color online) SRAM cells based on capacitive coupling: (a) C3SRAM bitcell and (b) M-BC bitcell.
    Fig. 5. (Color online) (a) Transposable bitcell containing two pairs of access transistors, (b) separated read–write transposable bitcell, and (c) schematic of the transposable 10T bitcell.
    Fig. 6. (Color online) Compact coupling structure: (a) 12T cell and (b) two-way transpose multibit cell.
    Fig. 7. (Color online) (a) Asymmetric differential sense amplifier (SA), (b) flash ADC, and (c) successive approximation ADC.
    Fig. 8. (Color online) (a) Weighted array with different capacitor sizes and (b) multi-period weighting technique using capacitors of the same size.
    Fig. 9. (Color online) Schematic of the column-wise GBL_DAC circuit: (a) Circuit of the constant current source, (b) two-stage MUX, and (c) waveform of the column-wise GBL_DAC circuit. (d) Schematic and waveform of the pulse height modulation circuit.
    Fig. 10. (Color online) Redundant reference column technology.
    Fig. 11. (Color online) (a) In-/near-memory computing peripherals and (b) a bit-tree adder.
    Fig. 12. (Color online) Signed 4-b × 8-b least significant bit (LSB) multiplier: (a) timing diagram and (b) circuit schematic.
    Fig. 13. (Color online) Boolean operation: (a) Boolean logical operations using an SRAM array, (b) histogram of AND and NOR operation voltages, and (c) schematic of the 8T-SRAM for implementing the IMP and XOR operations.
    Fig. 14. (Color online) Column-wise BCAM: (a) search example in 3D-CAM and (b) 4+2T. Row-wise TCAM: (c) organization based on 10T and (d) organization based on 6T.
    Fig. 15. (Color online) Schematic and truth table of the binary dot product: (a) 6T-SRAM binary dot product and (b) 8T-SRAM binary dot product. (c) Ternary dot product: operation of ternary multiplication and XNOR value mapping table.
    Fig. 16. (Color online) Row of 9T SRAM cells for calculating the Hamming distance.
    Fig. 17. (Color online) (a) Precharge weighting technology, (b) pulse width weighting, (c) pulse height weighting, and (d) pulse number weighting.
    Fig. 18. (Color online) (a) 8T-SRAM memory array for computing dot products with 4-bit weight precision and (b) Twin-8T cell.
    Fig. 19. (Color online) (a) Schematic of SAD circuit and (b) sequence diagram.
    Fig. 20. (Color online) Implementation of (a) CNN and (b) AES on multiple SRAM arrays.
    Fig. 21. (Color online) Application in the k-NN algorithm.
    Fig. 22. (Color online) Application in classifier algorithms.
    Fig. 23. (Color online) Read disturb issue.
    Fig. 24. (Color online) (a) Single row activation during normal SRAM read operation, (b) multirow read and nonlinearity during CIM, and (c) inconsistent CIM calculation.
    Fig. 25. (Color online) Approach of mapping from the common operator set to the actual circuits.
    Fig. 26. (Color online) Architecture of the bidirectional CIM system, including a reusable and reconfigurable module.
    Fig. 27. (Color online) Multithreaded CIM macro based on a pipeline processor.
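    The binary dot product (Fig. 15) and Hamming distance (Fig. 16) named in the captions above can be modelled in software as below. This is only a behavioural sketch, not the circuits reviewed in the paper. It assumes the common binary-network encoding in which a stored bit 1 stands for +1 and a stored bit 0 stands for -1; the function names `xnor_popcount_dot` and `hamming_distance` are illustrative and do not come from the article.

```python
def xnor_popcount_dot(x_bits, w_bits):
    """Dot product of two +/-1 vectors stored as bits (assumed: 1 -> +1, 0 -> -1)."""
    if len(x_bits) != len(w_bits):
        raise ValueError("vectors must have equal length")
    n = len(x_bits)
    # XNOR is 1 exactly where the bits agree, i.e. where the +/-1 product is +1.
    matches = sum(1 for x, w in zip(x_bits, w_bits) if x == w)
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


def hamming_distance(a_bits, b_bits):
    """Number of positions at which two equal-length bit vectors differ."""
    if len(a_bits) != len(b_bits):
        raise ValueError("vectors must have equal length")
    # XOR is 1 exactly where the bits differ.
    return sum(1 for a, b in zip(a_bits, b_bits) if a != b)


x = [1, 0, 1, 1]
w = [1, 1, 0, 1]
dot = xnor_popcount_dot(x, w)
dist = hamming_distance(x, w)
# The two quantities are linked: dot = n - 2 * (Hamming distance).
assert dot == len(x) - 2 * dist
```

    Under this encoding, a single match count (popcount of the XNOR result) yields both quantities, which is one reason the two operations are often discussed together.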
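| Parameter | Ref. [19] | Refs. [1, 2] | Ref. [20] | Ref. [18] | Ref. [67] | Ref. [57] | Refs. [31, 58] | Refs. [59, 60, 61] | Refs. [12, 13] | Refs. [14, 55] | Ref. [11] | Ref. [17] | Ref. [43] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Group | Structure of the 6T cell | Structure of the 6T cell | Structure of the 6T cell | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices | Cell structure with additional devices |
| Structure | Standard 6T | Dual-split 6T | 4+2T | Read and write separating | Read and write separating | Read and write separating | Capacitive coupling | Capacitive coupling | Transposable | Transposable | Transposable | Compact coupling | Compact coupling |
| Cell type | 6T | 6T | 6T | 8T | 9T | 10T | 8T1C | 10T1C | 8T | 8T | 10T | 12T | TWT-MC |
| Process technology | 28-nm FDSOI | 65 nm | 55-nm DDC | 45 nm | 65 nm | 28 nm | 65 nm | 65 nm | 28 nm | 7 nm | 28 nm | 65 nm | 28 nm |
| Added circuit | No | No | Two read ports | One read port | One read port | Two read ports | Two transistors, one cap | Four transistors, one cap | Two read ports | One read port | Two read ports | Pull-up/down circuits | Multiply cell |
| Read/write disturb | Yes | Yes | No | No | No | No | No | No | No | No | No | No | Yes |
| Area efficiency | High | High | High | Med. | Med. | Med. | Low | Low | Med. | High | Low | Low | Low |
| TOPS/mm² | NA | 33.13 | NA | NA | NA | 170 | 20.2 | 0.6 | 27.3 | NA | NA | 5.461 | NA |
| TOPS/W | NA | 30.49–55.8 (1/1/5b¹) | 41.4 (1/1/1b¹) | NA | NA | 1002 (1/1/1b¹) | 671.5 (1/1/5b¹) | 192–400 (1–8b) | 0.56/5.27 (arbitrary) | 6.02 (8/1/11b¹) | 66.7 (1/1/1b¹) | 403 (1/1/3.46b¹) | 7.2–61.1 (2, 4, 8/4, 8/10, 12, 16, 20b¹) |

    ¹Input precision/Weight precision/Output precision. TWT-MC: two-way transpose multibit accumulation; DDC: deeply depleted channel; FDSOI: fully depleted silicon on insulator.
    Table 1. Static random-access memory (SRAM) bitcells in CIM.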
| Parameter | Ref. [19] | Ref. [28] | Ref. [20] | Ref. [29] | Ref. [5] | Ref. [68] | Ref. [11] |
|---|---|---|---|---|---|---|---|
| Technology | 28-nm FDSOI | 180 nm | 55-nm DDC | 28-nm FDSOI | 65 nm | 28 nm | 28 nm |
| Cell type | 6T | 8T | 4+2T | 6T | 8T | 14T | 10T |
| Array size | 64×64 | 8×8 | 128×128 | 128×64 | 128×128 | 1024×320 | 64×64 |
| Supply voltage (V) | 1 | 1.2 | 0.8 | 0.9 | 1.2 | 0.9 | 0.9 |
| CAM freq. (MHz) | 370 (1 V) | NA | 270 (0.8 V) | 1560 (0.9 V) | 8.90 (0.38 V); 813 (1.2 V) | 1330 (0.9 V) | 262 (0.9 V); 256 (0.9 V) |
| CAM energy (fJ/bit) | 0.6 (1 V) | NA | 0.45 (0.8 V) | 0.13 (0.9 V) | 0.85 (1.2 V) | 0.422 (0.9 V) | 1.025 (0.9 V); 1.02 (0.9 V); 0.635 (0.7 V); 0.632 (0.7 V) |
| Logic freq. (MHz) | NA | NA | 230 (0.8 V) | NA | 793 (1.2 V) | NA | ~300 (0.9 V) |
| Logic energy (fJ/bit) | NA | NA | 24.1 (0.8 V) | NA | ~31 (1.2 V); ~22.5 (1 V); 16.6 (0.8 V) | NA | ~15 (0.9 V); ~12.5 (0.8 V); ~10.5 (0.7 V) |
| Search mode | 1 | 1 | 2 | 2 | 1 | 2 | 1, 2 |
| Function | SRAM/CAM/Logic | SRAM/TCAM/Left shift/Right shift | SRAM/CAM/Logic | BCAM/SRAM/Pseudo-TCAM | SRAM/CAM/Logic | SRAM/TCAM | SRAM/CAM/Logic/Matrix transpose |

    Search mode: 1, row-wise search; 2, column-wise search. DDC, deeply depleted channel; FDSOI, fully depleted silicon on insulator.
    Table 2. Summary of chip parameters and performance of in-memory Boolean logic and CAM.
| Parameter | Refs. [1, 2] | Ref. [10] | Refs. [15, 16] | Ref. [32] | Ref. [31] | Ref. [44] | Ref. [33] | Refs. [24, 37] | Ref. [34] |
|---|---|---|---|---|---|---|---|---|---|
| Technology | 65-nm CMOS | 65-nm CMOS | 130-nm CMOS | 55-nm CMOS | 65-nm CMOS | 65-nm TSMC | 55-nm CMOS | 7-nm FinFET | 65 nm |
| Cell structure | DCS 6T | 10T | 6T | Twin-8T | 8T1C | 6T | 6T | 8T | 6T |
| Array size | 4 Kb | 16 Kb | 16 Kb | 64×60 b | 2 KB | 64 Kb | 4 Kb | 4 Kb | 64 Kb |
| Chip area (μm²) | NA | 6.3×10⁴ | 2.67×10⁵ | 4.69×10⁴ | 8.1×10⁴ | NA | 5.94×10⁶ | 3.2×10³ | 1.75×10⁵ |
| Input precision (bit) | 1 | 6 | 5 | 1, 2, 4 | 1 | 5 | 1, 2, 7, 8 | 4 | 4 |
| Weight precision (bit) | 1 | 1 | 1 | 2, 5 | 1 | 5 | 1, 2, 8 | 4 | 1, 2, 3, 4, 5, 8 |
| Output precision (bit) | 1 | 6 | NA | 3, 5, 7 | 5 | NA | 3, 7, 10, 19 | 4 | NA |
| Computing mechanism | Analog | Digital + analog | Digital + analog | Analog | Analog | Analog | Digital + analog | Analog | Analog |
| Model | XNORNN/MBNN | CNN | Classify | CNN | CNN | VGG LeNet-5 | CNN | VGG-9 NN | CNN |
| Energy efficiency (TOPS/W) | 30.49–55.8 | 40.3 (1 V); 51.3 (0.8 V) | NA | 18.37–72.03 | 671.5 | NA | 0.6–40.2 | 351 (0.8 V) | 49.4 (input: 4b, weight: 1b) |
| Throughput (GOPS) | 278.2 | 8 (1 V); 1 (0.4 V) | NA | 21.2–67.5 | 1638 | NA | 5.14–329.14 | 372.4 (0.8 V) | 573.4 (input: 4b, weight: 2b) |
| Accuracy, MNIST | 96.5% (XNORNN); 95.1% (MBNN) | 98% (0.8 V); 98.3% (1 V) | 90% | 90.02%–99.52% | 98.30% | 99% | 98.56%–99.59% | 98.51%–99.99% | 98.80% |
| Accuracy, CIFAR-10 | NA | NA | NA | 85.56%–90.42% | 85.50% | 88.83% | 85.97%–91.93% | 22.89%–96.76% | 89.00% |

    Table 3. Summary of chip parameters and performance of single- and multibit operations.