A 512×512-Cell Associative CAM/Willshaw Memory with Vector Arithmetic
Mika Laiho∗, Jonne K. Poikonen∗, Eero Lehtonen∗, Mikko Pänkäälä∗, Jussi H. Poikonen∗, Pentti Kanerva∗∗

∗ Technology Research Center, University of Turku, Finland. E-mail: mlaiho@utu.fi
∗∗ Redwood Center for Theoretical Neuroscience, University of California, Berkeley, U.S.A.

Abstract—In this paper we present a 512×512-cell Associative Content-Addressable Memory (ACAM) implemented in 180 nm CMOS. The memory can be operated as an associative CAM, or it can be configured as a Willshaw memory for operating on sparse data. The vector matching operation can use a tunable hit threshold, or the strongest hit can be selected with a winner-take-all (WTA) network. Built-in row and column circuitry can perform logic operations on the contents of the row and column memories. The operation of the circuit is verified experimentally with an example of computing with random vectors.


Index Terms — analog signal processing, content addressable memory, associative memory, Willshaw memory

I. INTRODUCTION

The work at UT was funded by the Academy of Finland (140108, 258831, 253596, 264914, 277383).

A Content-Addressable Memory (CAM) is a means to locate data of interest within a memory array [1]. When a CAM is provided with an input data vector, it compares the vector to the whole memory contents, preferably in parallel. If a matching vector is found, the CAM outputs the address where the data matching the input vector is stored. For the search operation to produce a hit, a complete match of the input vector may be required, or, alternatively, the search can be based on a limited set of bits of the input vector by assigning a don't-care condition to the remaining bits. The CAM concept can be broadened to cover partial hits (Associative CAM, ACAM), so that the strength of a hit is proportional to the number of matching bits in the input vector. An associative search enables the retrieval of input vectors from noisy data, because a partial hit is sufficient to be interpreted as a match. The amount of tolerated mismatch in the data vectors can be controlled through a programmable threshold. A problem with a simple threshold operation is that it can produce ambiguous results, yielding multiple hits, or no hits at all, if the Hamming distance changes considerably from search to search. Better discrimination in such cases can be achieved with a Winner-Take-All (WTA) circuit [2], which finds the single strongest match.

The type of data vectors to be stored affects the CAM configuration. The preferable way to store dense data is to allocate one vector per row, which is typical for ACAM operation. On the other hand, if the data are sparse, the number of vectors that can be stored in the memory array (the vector capacity) can be much larger than the number of physical rows, provided that the data can be distributed over multiple rows. An example of a CAM that stores sparse data by distributing it over multiple rows is the Willshaw memory [3].

In this paper we present an integrated circuit implementation of a content-addressable memory architecture that realizes both ACAM and Willshaw operation modes. The chip was implemented in 180 nm CMOS and has a die area of 5×5 mm². The size of the memory array is 512×512 cells, and the chip has a 64-bit hit memory for each array row and 8 bits of input memory per array column. The built-in row and column circuitry can perform logic operations on the contents of the row and column memories. Hit vectors are extracted either with a tunable threshold circuit or with a row-parallel WTA.

Fig. 1. Top level illustration of the ACAM/Willshaw memory realisation.

II. ACAM/WILLSHAW MEMORY FUNCTIONALITY

Fig. 1 shows a top level illustration of the ACAM/Willshaw memory realization. Data is written to the chip via a 16-bit bus that connects to both the Column Logic/Memory and the Row I/O circuitry. The search result is processed and stored in the Hit Processing block, which also contains an address encoder.

A. Write Operation

Either dense or sparse data can be stored in the memory array. In short, a sparse vector contains only a small fraction of ones, while most of its bits equal zero; in dense binary vectors, the fractions of zeros and ones are close to one half. When the circuit is used to store dense data vectors for regular ACAM operation, the vector elements xi, provided by the column memory, are stored to ACAM cells ci,j one at a time by activating the row addresses yj, where i, j ∈ [1, 512].
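As a software illustration of the search modes described above (a hypothetical NumPy sketch with toy contents, not the chip's analog circuitry), the following scores each stored row by its number of matching bits and then applies either a hit threshold, a winner-take-all selection, or a don't-care mask:

```python
import numpy as np

# Four stored 16-bit rows (hypothetical contents, not from the paper)
memory = np.array([
    [0,0,0,0, 1,1,1,1, 0,0,0,0, 1,1,1,1],
    [1,0,1,0, 1,0,1,0, 1,0,1,0, 1,0,1,0],
    [1,1,1,1, 0,0,0,0, 1,1,1,1, 0,0,0,0],
    [0,1,1,0, 0,1,1,0, 0,1,1,0, 0,1,1,0],
])

# Noisy query: row 2 with bits 0 and 8 flipped
query = memory[2].copy()
query[[0, 8]] ^= 1

# Associative match: score = number of matching bits per row
scores = np.sum(memory == query, axis=1)

# Thresholded search: every row with at least `threshold` matching bits
threshold = 12
hits = np.flatnonzero(scores >= threshold)

# Winner-take-all: the single strongest match
winner = int(np.argmax(scores))

# Ternary-style search: mark bits 0 and 8 as don't-care and ignore them
care = np.ones(16, dtype=bool)
care[[0, 8]] = False
masked_scores = np.sum((memory == query)[:, care], axis=1)

print(scores, hits, winner, masked_scores)
```

With these toy contents, only row 2 clears the threshold and it is also the WTA winner; the don't-care mask makes the query an exact match to row 2 on the remaining 14 bits.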

When sparse data are processed, the array is used as a Willshaw memory. In this case the input vector x and the address vector y are stored jointly by superimposing their outer product on the array contents:

ci,j = ci,j ∨ (xi ∧ yj).  (1)

If the Willshaw memory is used in autoassociative mode, x = y; in other words, the result of the search is the (error-corrected) vector itself. On the other hand, if heteroassociative operation is preferred, x is the key and y is the value that is stored with the key. The vector capacity of a Willshaw memory for random vectors is

M ≈ 0.69 (N/K)²,  (2)

where N is the length of the vector and K is the average number of ones per vector [4], and we assume K
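The storage rule (1) and thresholded recall can be sketched in a few lines of NumPy (a software illustration with toy parameters; `sparse_vector` and `recall` are hypothetical helpers, not the paper's circuitry):

```python
import numpy as np

N, K = 16, 3  # vector length and number of ones per sparse vector (toy values)

def sparse_vector(rng, n=N, k=K):
    """Random binary vector with exactly k ones (illustrative helper)."""
    v = np.zeros(n, dtype=np.uint8)
    v[rng.choice(n, size=k, replace=False)] = 1
    return v

rng = np.random.default_rng(7)
C = np.zeros((N, N), dtype=np.uint8)  # cell array c[i, j], initially all zeros

# Store, following eq. (1): OR the outer product of x and y into the array.
# Autoassociative mode sets y = x.
patterns = [sparse_vector(rng) for _ in range(3)]
for x in patterns:
    C |= np.outer(x, x)

def recall(C, key):
    """Activate the rows selected by the key; threshold the column sums."""
    sums = key.astype(int) @ C        # column sums over the active rows
    return (sums >= key.sum()).astype(np.uint8)

# Every stored one-bit is always recalled; spurious ones can appear
# as the array fills up with overlapping patterns.
r = recall(C, patterns[0])

# Capacity estimate, eq. (2): M ≈ 0.69 (N/K)^2
M = 0.69 * (N / K) ** 2
print(r, M)
```

With N = 16 and K = 3 the capacity estimate (2) gives roughly 20 vectors, far more than the 16 physical rows, which is the point of distributing sparse data over multiple rows.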