Manuscript ID TNN-2011-P-2933


A Neural CMOS Integrated Circuit and its Application to Data Classification

İzzet Cem Göknar, Fellow IEEE, Merih Yıldız, Member IEEE, Shahram Minaei, Senior Member IEEE, and Engin Deniz, Member IEEE

Abstract—Implementation and new applications of a tunable Complementary Metal-Oxide-Semiconductor Integrated Circuit (CMOS-IC) of a recently proposed classifier Core-Cell (CC) are presented and tested with two different datasets. With two algorithms, one based on Fisher's linear discriminant analysis and the other on perceptron learning, used to obtain the CC's tunable parameters, the Haberman and Iris datasets are classified. The parameters so obtained are used for hard-classification of the datasets with a neural-network structured circuit. Classification performance as well as coefficient calculation times for both algorithms are given. The CC has a 6 ns response time and 1.8 mW power consumption. The fabrication parameters used for the IC are taken from the CMOS AMS 0.35 µm technology.

Index Terms—CMOS, Classifier, Fisher, Iris, Haberman

I. INTRODUCTION

CLASSIFICATION is an important subject in many applications, ranging from pattern recognition and neural networks to artificial intelligence, from statistics to template matching [1], [2]. In general, data classification can be realized either by software or by hardware systems. Many algorithms have been developed for classification [1]; however, for faster online operations on hard data it is desirable to realize these classifiers in hardware, which can be achieved with many different approaches in either the analog or the digital domain. Analog implementations of classification have many advantages over digital ones. For one, the complexity of analog circuits is lower compared to digital circuits; for another, they can be built in voltage-mode or current-mode (where the input and output signals are currents). In voltage-mode implementations the supply voltage level has an important impact on the dynamic range of the circuit, whereas the current-mode approach provides a larger dynamic range for processing the variables. It is well known that shrinking bias voltages make it difficult to process data in voltage-mode. A simple summing circuit in voltage-mode needs additional active blocks (e.g. operational amplifiers and additional circuitry); current-mode processing, on the other hand, is preferred as currents can be added by connecting the output terminals of the blocks, without requiring extra active blocks. As a handicap, current-mode circuits used to suffer from less accuracy in comparison to voltage-mode ones; but in newer technologies (180 nm and below), where low supply voltages are used, accuracy in voltage-mode circuits is also critical, whereas current signals can maintain high ratio accuracy [3]. Some classifier circuits using the advantages of the current-mode approach are listed in the next paragraph.

For template matching applications, current-mode circuits are proposed in [4], based on a Euclidean distance calculator, and in [5], based on threshold circuits. Another current-mode circuit, which covers both Euclidean distance calculation and Gaussian neighborhood tapering, is given in [6]. A current-mode sorting circuit for pattern recognition, designed to build the transformation between features and classes, is given in [7] and [8]. For pattern recognition applications, a current-mode fuzzy IC is presented in [9]. Different classification solutions regarding the implementation of neural networks on programmable digital circuits and devices can be found in [10], [11], but these circuits are relatively costly and dissipate high power. A compact analog programmable multidimensional radial basis function based classifier is proposed in [12], and a CMOS implementation of a Neural Network (NN) classifier with several output levels and a different architecture is given in [13]. A CMOS realization of a conscience mechanism used to improve the effectiveness of learning in winner-take-all artificial neural networks, which also eliminates dead neurons, is presented in [14]. Except for the one in [13], all these circuits suffer from the shortcoming of not being tunable.

In this paper, the new DU-TCC 1209 IC, containing 3 CCs, 9 Second-Generation Current Conveyors (CCII) and 3 current buffers, is introduced. The newer CC architecture published in [5], which improves the response time and the Relative Tracking Error (RTE) compared to the CC given in [13], is exploited in the IC design/layout/fabrication.

Manuscript received January 25, 2011; revised June 15, 2011, October 20, 2011 and February 1, 2012. This work is part of project 106E139 supported by the Scientific & Technological Research Council of Turkey (TÜBİTAK). The authors are with the Department of Electronics and Communications Engineering, Dogus University, Acibadem, Kadikoy 34722, Istanbul, Turkey (e-mail: [email protected], [email protected], [email protected]).
Connected properly, these CCs can be exploited to realize n-D classifiers, which can only classify data defined over mesh-grid (rectangular partitioning) domains. To overcome this deficiency, Linearly Weighting Circuits (LWC), which take linear combinations of the data and feed these combinations to the CC, are introduced as preprocessing units. With two algorithms, modified/adapted versions of Fisher's Linear Discriminant Analysis (LDA) and the Perceptron Learning Algorithm (PLA), the weighting coefficients are calculated and soft- as well as hard-tested on the Iris and Haberman datasets. The Iris dataset consists of 50 samples from each of three species of Iris flowers: virginica, versicolor and setosa (3-class data). The flowers have 4 features, the lengths and widths of the sepal and petal in centimeters (4-D data). This dataset was classified with LDA to distinguish the flowers from each other [15]; the Iris dataset is not linearly separable and is frequently used to test many other classification techniques. The Haberman dataset contains cases from a study conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer. It consists of 306 samples from two classes: the patients who survived 5 years or longer (225 samples) and the patients who died within 5 years (81 samples). The dataset is 3-D, with the age of the patient at the time of operation, the patient's year of operation and the number of positive axillary nodes detected.

The paper is organized as follows: in Section II, the block diagram, the transfer characteristic and the schematics of the current-mode CC are given. In Section III, the LWC needed for datasets separated by hyperplanes with arbitrary slopes is introduced. The derivation of parameter values for classifying datasets using the modified Fisher's LDA based algorithm is given in Section IV. The derivation of the same parameters with the PLA is given in Section V. In Section VI, these weight parameters are applied to the Iris and Haberman dataset classifier circuits, which are constructed with LWCs and CCs; the DU-TCC 1209 classification test results are also compared with the simulation results. Finally, Section VII concludes the paper.

II. CMOS CORE CELL (CC)

The block diagram of the CC and its transfer characteristics are shown in Figs. 1(a) and (b), respectively. The horizontal position, the width and the height of the transfer characteristic can be adjusted independently by means of the external currents I1, I2 and IH. The proposed CC schematic is shown in Fig. 2, where the transistors M1-M5 and M8-M12 constitute the two threshold circuits. The basic current mirror constructed with transistors M6 and M7 performs the desired subtraction. The transistors M13, M14 and M15 are used to provide currents equal to IH (adjusting the output level)


for the threshold circuits. Similarly, the same approach is used with transistors M16, M17 and M18 to apply the input current Iin to both of the threshold circuits. The current Iin is the 1-D data for each CC and Iout is the output of the classifier. It has been shown in [5] that, by properly grouping CCs and adding the outputs in each group, Multi-Input-Multi-Output (MIMO) classifiers can be obtained. A detailed Monte Carlo analysis of the CC for the VTH and β parameters of the MOS transistors is reported in [5]; it shows that parameter mismatch has little effect on the CC characteristics.

III. CLASSIFICATION OF DATA PARTITIONED WITH ARBITRARY HYPERPLANES

The block diagram of the CC presented in Section II, and the MIMO classifiers realized with it, partition data domains into rectangular mesh-grids, whereas there is a strong need to treat linearly non-separable data as shown in Fig. 3(a). Let the data to be classified be 1-D and belong to two different classes (A and B) as in Fig. 3(a). If the inputs x1 and x2 are multiplied by the coefficients w1 and w2 and applied to the CC with control currents I1 and I2 as shown in Fig. 3(b), called a linearly non-separable 1-D data classifier, then the data domain will be partitioned with arbitrary hyperplanes as shown in Fig. 3(a). The block diagram realization of Fig. 3(b) can easily be generalized to an n-D data classifier; a detailed explanation is given in [16].
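As a purely behavioral sketch (the actual CC is an analog current-mode circuit, not software), the double-threshold transfer characteristic of Fig. 1(b) and the LWC preprocessing of Fig. 3(b) can be modeled as:

```python
def core_cell(i_in, i1, i2, ih):
    """Behavioral model of the CC: the output equals IH when the input
    current lies between the thresholds I1 and I2, and zero otherwise."""
    return ih if i1 <= i_in <= i2 else 0.0

def weighted_classifier(x, w, i1, i2, ih):
    """1-D classifier of Fig. 3(b): the LWC forms the weighted sum of
    the inputs, which the CC then thresholds."""
    i_in = sum(wi * xi for wi, xi in zip(w, x))
    return core_cell(i_in, i1, i2, ih)
```

Summing several such cells with different IH levels, as described for the MIMO case, yields the multilevel output used later to encode class labels.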

Fig. 1. (a) CC block diagram. (b) Transfer characteristic of CC.

Fig. 2. CMOS implementation of CC [5].


Fig. 3. Linearly non-separable (a) data domains, (b) 1-D data classifier.

Fig. 5. View of DTH on the projection line.

The classification method and the IC developed in the sequel will be able to classify linearly non-separable data (e.g. the one in Fig. 4(a)). The dashed lines in Fig. 4(b)-(d), called Double Threshold Hyperplanes (DTH), correspond to:

w_1 x_1 + w_2 x_2 = a,   w_1 x_1 + w_2 x_2 = b    (1)

and are found so as to provide the best separation of the data, as shown in Fig. 4(b) (the data outside these DTH being already classified). The classified data is then deleted and, for the rest, the DTH are found again as shown in Fig. 4(c). The outcome of the classification is shown in Fig. 4(d). So the classification of the data is achieved by finding the appropriate coefficients w1, w2 and the corresponding CC currents. The output of the classifier will be in conformity with Fig. 4(d). Algorithms for obtaining these DTHs are presented in Sections IV and V.

Fig. 4. Classification methodology of LDA dataset [16].

IV. CLASSIFICATION OF DATA WITH FISHER'S LDA BASED ALGORITHM

Fisher's LDA, a successful linear feature extraction method which maximizes between-class separability and minimizes within-class variability, is used to find the classifier circuits' parameters; it is presented here in 2-D and applied in 4-D; exactly the same procedure is valid in n-D [17]. Fisher's LDA finds the projection of the data onto a "best" line, which has direction vector v and passes through the origin. Let two-dimensional data with two classes c1 and c2 be given as shown in Fig. 4(a). After the projection of the data onto this "best" line, the histogram of the data is obtained as shown in Fig. 5, where µ_p1 and µ_p2 are the average distances of the data to the origin, and σ1, σ2 are the standard deviations. The histogram characteristics help to find the "best" projection line and hence the DTH. This is achieved by finding the projection line satisfying the following two criteria: a) µ_p1 and µ_p2 are to be at the maximum distance from each other; b) σ1 and σ2 should have minimal values.

Let the 2-D vector x belong to a two-class dataset,

x = [x_1  x_2]^T    (2)

The projection of the x samples onto the projection line can be found with the scalar product v^T x; µ_p1 and µ_p2 are the average distances of the first and second class datasets, respectively, whereas µ_1, µ_2 are the average vectors calculated component-wise for each class. It follows that:

µ_p1 = v^T µ_1,   µ_p2 = v^T µ_2    (3)

as y = v^T x gives the distance of the data to the origin. So the scatters of the first and the second class are given by:

s_p1^2 = Σ_{x_i ∈ C1} (y_i − µ_p1)^2,   s_p2^2 = Σ_{x_i ∈ C2} (y_i − µ_p2)^2    (4)

According to these equations, to find the best projection line,

J(v) = (µ_p1 − µ_p2)^2 / (s_p1^2 + s_p2^2) = (v^T S_B v) / (v^T S_w v)    (5)

should be maximized [17], where

S_w = Σ_{i=1}^{c} Σ_{x ∈ C_i} (x − µ_i)(x − µ_i)^T,   S_B = Σ_{i=1}^{c} n_i (µ_i − µ)(µ_i − µ)^T

are the within-class and between-class scatter matrices; here n_i is the number of training samples in the i-th class and c is the number of classes. Maximizing J(v) is equivalent to solving the generalized eigenvalue problem [17]:

S_B v = λ S_w v    (6)

The eigenvector corresponding to the eigenvalue with maximum value (they are all positive, the matrices being positive semi-definite) obtained from (6) gives the vector v, which determines the slope of the projection line [18], [19]. The projection of the data onto this line helps to find the DTH, as shown in Fig. 5 [20]. On the projection line in Fig. 5, point "a" marks the minimum distance to the origin of the data in the second class; its coordinates are (x_a1, x_a2). Similarly, point "b" marks the maximum distance to the origin of the data in the first class, with coordinates (x_b1, x_b2). The hyperplane equations that classify the datasets are given in (7).

v^T [x_1  x_2]^T − v^T [x_a1  x_a2]^T = 0,   v^T [x_1  x_2]^T − v^T [x_b1  x_b2]^T = 0    (7)

The process is repeated for the data between the hyperplanes, the data outside being already classified. The components of the chosen eigenvector determine the weight coefficients (e.g. (14) in Section VI.B) to be used in the LWC that implements the separating hyperplanes. In the n-D case, the only difference is the dimension of the matrices involved in (6), which is now n×n; again one has to find the eigenvector corresponding to the eigenvalue with maximum value to determine the DTH [18]-[20].

V. CLASSIFICATION OF DATA WITH THE PERCEPTRON LEARNING ALGORITHM

The classical PLA, widely used in neural networks, is based on a single-threshold activation function [21], whereas the one used here is a double-threshold function as shown in Fig. 1(b). Regions separated by DTH are characterized by:

y_i = 1 if x^T v_i − a_i ≥ 0, y_i = 0 otherwise    (8)
y_i = 1 if x^T v_i − b_i ≤ 0, y_i = 0 otherwise    (9)

With the algorithm developed next for the DTH, the vectors v_i and the DTH coefficients a_i and b_i in (8) and (9), relevant to the i-th class, will be determined. Classification of the data will then be achieved by separating the classes with an appropriate number of hyperplanes.

Perceptron Learning based Classification Algorithm

Selecting one of the data classes c_i (i = 1, 2, ..., m):
1. Check whether there is an appropriate DTH that separates all data of class c_i from all the others.
   a. If there is, save these coefficients and delete this data class from the list; move to step 2.
   b. If there is not, move to step 3.
2. Continue with the remaining classes.
   a. If the remaining class is the m-th, stop the classification.
   b. If not, move to step 1.
3. Increase the number of DTH by 1 and check whether all data from class c_i can be separated from the other classes.
   a. If yes, save the coefficients of the DTHs and remove this data class from the list; move to step 2.
   b. If the class cannot be separated, move to step 3.

As the activation function is a hard limiter, the coefficients can be calculated with the update rules of the PLA [22], given in (10)-(12). In (10)-(12), y_d and y_o are the desired and the obtained output at that step, respectively; η is the learning coefficient, which has to be chosen between 0 and 1. While updating, when y_d = y_o the weight coefficients do not change. The learning algorithm stops when all weight coefficients stay constant [23]. When learning is finished using the training set, the DTHs a_i ≤ x^T v_i ≤ b_i partition the data domain into regions, each containing a single class of data. The components of v obtained from the learning algorithm give the weight coefficients of the LWC, and a_i and b_i determine the CC's control currents I1 and I2. The remaining dataset is used to verify the correct operation of the classifier so obtained. The CC current IH helps to identify the class of the data. Thus, classification is provided with an appropriate number of LWC and CC blocks. In the sequel, both algorithms will be used to hard- and soft-classify the Iris and Haberman datasets.

Fig. 6. Die photo of the classifier integrated circuit.

TABLE I
DIMENSIONS OF MOS TRANSISTORS IN FIG. 2

MOSFET                                                       W [µm]   L [µm]
M1, M2, M3, M4, M5, M8, M9, M10, M11, M12                    21       1.05
M6, M7, M13, M14, M15, M16, M17, M18, M19, M20, M21, M22     67.9     1.05

Fig. 7. LWC configuration with DO-CCII.

Fig. 8. Projection of Iris data to the origin.
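The single-step update referenced in (10)-(12) can be sketched as below; the rule shown is the standard PLA form v ← v + η(y_d − y_o)x [22], taken here as an assumption for the weight vector (the threshold coefficients a_i, b_i would be adapted analogously):

```python
def pla_update(v, x, y_desired, y_out, eta=0.5):
    """Classical PLA weight update: v <- v + eta * (y_d - y_o) * x.
    When the obtained output equals the desired one, v is unchanged."""
    delta = eta * (y_desired - y_out)
    return [vi + delta * xi for vi, xi in zip(v, x)]
```

For example, a misclassified sample pulls v toward (or away from) x, while a correctly classified sample leaves v untouched, which is why the algorithm terminates once all coefficients stay constant.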

VI. REALIZATION OF THE CORE CIRCUIT, LINEARLY WEIGHTING COEFFICIENTS AND TEST RESULTS

A. CMOS Realizations

The layouts of the CC and of the IC, including 3 CCs, 9 current conveyors and 3 current buffers, have been designed using MENTOR software with fabrication parameters for the CMOS AMS 0.35 µm process [24]. The die photo of the manufactured IC, called DU-TCC 1209, is shown in Fig. 6. In order to provide user tunability, all 52 pins had to be used for I/O access, causing the pads to dominate the IC area and resulting in a pad-limited design. DU-TCC 1209 has a 2.62×2.62 mm² die area; the dimensions of the CC's CMOS transistors are given in Table I. The block diagram of the LWC using a Dual-Output Second-Generation Current Conveyor (DO-CCII) [16] is shown in Fig. 7. The voltage Vy is the input and the currents Iz+ and Iz− are the outputs of the circuit in Fig. 7; these output currents can be expressed as in (13).

B. Experimental Setting and Applications

The 4-D Iris dataset has 150 samples, with an equal number from each of the three classes (c1, c2, c3). Taking 40 data from each class and using Fisher's LDA, the coefficients of the projection (eigen)vector are obtained as:

v = [0.57  −0.80  0.10  0.14]    (14)

The projection of the Iris data using v (the scalar product v^T x) is shown in Fig. 8. It can be seen from Fig. 8 that the data belonging to the three different classes can be separated with appropriate boundaries (DTH); these boundaries determine the CC control currents. To test DU-TCC 1209 with the Iris data, the classifier block diagram given in Fig. 9 was constructed on a specially designed Printed Circuit Board (PCB), shown in Fig. 10, where potentiometers provide tunability; in building the classifier, the product of the Iris input data with the vector v (the wi coefficients) was provided by the LWC blocks, as outlined next. In order to obtain the weighting coefficient w1 = 0.57 in (14), the resistor R1 at the X terminal of the 1st DO-CCII is tuned to the value R1 = 10/0.57 kΩ, providing an output current IZ+ = 0.57(V/10 kΩ). For the 2nd coefficient, w2 = −0.80 in (14), the resistor R2 at the X terminal of the 2nd DO-CCII is tuned to R2 = 10/0.80 kΩ, providing an output current IZ− = −0.80(V/10 kΩ) to secure the minus sign, and so on. To test the weight values provided by the algorithm, the 30 unused data (10 from each class) were soft-applied to the classifier of Fig. 9 with a 4-channel programmable-output function generator. Each data was applied for 1 ms and was classified correctly; the outcome is shown in Table II (only 20 shown because of space limitations).
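The resistor tuning just described can be sketched as follows; the 10 kΩ base value mirrors the R1 = 10/0.57 kΩ example above, and the Z+/Z− choice encodes the weight's sign:

```python
def lwc_resistor(weight, r_base=10e3):
    """Resistor realizing |w| as r_base/R at the X terminal of the
    DO-CCII; a negative weight is taken from the Z- output instead."""
    r = r_base / abs(weight)
    terminal = "Z+" if weight >= 0 else "Z-"
    return r, terminal
```

With w1 = 0.57 this gives roughly 17.5 kΩ on Z+, and with w2 = −0.80 it gives 12.5 kΩ on Z−, matching the tuning procedure described for the PCB potentiometers.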

Fig. 10. Test PCB for DU-TCC 1209.

TABLE II
IRIS AND HABERMAN TEST OUTCOME OF THE CLASSIFIER

I_z+ = V_y / R,   I_z− = −V_y / R    (13)

The resistance R is used to convert the voltage input data Vy to current; moreover, the ratio 1/R provides the appropriate weight value for the realization. It is worth mentioning that the DO-CCII can also provide negative weight values, using the Z− terminal, in case the need arises.

Fig. 9. Iris classifier block diagram (Fisher's LDA is used).

Time interval     Iris data                          Haberman data
of data (ms)      x1    x2    x3    x4    Class      x1    x2    x3    Class
0 - 1             4.3   2.3   1.4   0.2   c1         34    60    1     c1
1 - 2             5.7   2.7   3.9   1.1   c2         61    68    1     c2
2 - 3             5.7   2.7   4.8   1.8   c2         51    59    3     c2
3 - 4             4.9   2.2   6     2.5   c3         37    59    6     c1
4 - 5             5.6   2.5   5.1   1.9   c3         54    58    1     c1
5 - 6             4.6   3     1.7   0.4   c1         61    62    5     c2
6 - 7             4.7   3.1   1.5   0.1   c1         42    63    1     c1
7 - 8             6.1   2.9   3.8   1.1   c2         53    61    1     c1
8 - 9             6.4   3     4     1.3   c2         48    67    7     c2
9 - 10            4.9   2     4.7   1.4   c2         65    66    15    c2
10 - 11           4.8   3.2   1.2   0.2   c1         60    59    17    c2
11 - 12           5.4   3.8   1.6   0.6   c1         42    59    2     c1
12 - 13           6.3   2.8   6.5   1.8   c3         30    62    3     c1
13 - 14           6.7   3     6.4   2     c3         65    62    22    c2
14 - 15           7.2   3.2   5.4   2.1   c3         41    60    23    c2
15 - 16           5.4   3.9   1.9   0.4   c1         46    58    3     c1
16 - 17           5.3   3.7   1.5   0.2   c1         42    61    4     c1
17 - 18           5.4   3.8   1.3   0.3   c1         72    67    3     c1
18 - 19           6.3   3.0   4.4   1.3   c2         47    63    23    c2
19 - 20           6.3   3.0   4.1   1.3   c2         43    58    52    c2

Fig. 11. Iris data classification oscilloscope outcome (LDA is used).

Fig. 12. Haberman classifier block diagram (PLA is used).


Fig. 13. Haberman data classification oscilloscope outcome (PLA is used).

To verify the performance of DU-TCC 1209, the control currents of the CCs given in Fig. 9 are hard-set as given in Table III; the CC currents I1, I2 are selected according to the classification boundaries obtained from the projection of the Iris data shown in Fig. 8 (for instance, class c1 lies in the distance range 0.1 to 0.8, so I1 = 1 µA and I2 = 8 µA). On the other hand, if the output current Iout is 10 µA then the data is from class c1, for 20 µA from c2, and for 30 µA from c3. Fig. 11 shows the output voltage taken across a 5 kΩ resistor connected to the output of the Iris dataset classifier circuit to measure its output current. Test results are in perfect agreement with the classes given in Table II.

The 3-D Haberman dataset contains 306 samples belonging to two classes (c1, c2); for the training stage, taking randomly 42 data from each class and using the PLA developed in Section V, the DTHs are obtained as:

v1 = (1.4  2.2  16),   a1 = 11 and b1 = 69    (15)
v2 = (1.3  2.2  10),   a2 = 20 and b2 = 24    (16)

The data remaining within the region enclosed by these DTHs belongs to class c1; otherwise the data is from class c2. These regions are constructed with the configuration in Fig. 12 (LWC-1, LWC-2, LWC-3 and CC-1 constitute the first region, and the remaining blocks constitute the second region). To verify the performance, 20 unused data (10 from each class) were first soft-applied to the Haberman dataset classifier of Fig. 12; all data were classified correctly and the outcome is shown in Table II. The control currents of the CCs given in Fig. 12 were hard-set as given in Table III. These control currents were chosen according to the coefficients a and b obtained from the PLA. If the output current Iout is 30 µA the data is from c1; if 0 µA, the data is from c2. Fig. 13 shows the output voltage taken across a 5 kΩ resistor connected to the output of the classifier circuit to measure its output current. Test results are in perfect agreement with the classes given in Table II.

The Iris dataset is also used to test the performance of the classifier circuit with the PLA. Fig. 14 shows the result of the classification of the Iris dataset with the PLA. If the output current Iout is 15 µA the data is from c1, if 30 µA from c2, and if 0 µA from c3. Test results are in perfect agreement with the classes given in Table II. The Haberman dataset has also been classified with the Fisher's LDA based algorithm and the outcome is exhibited in Fig. 15. If the output current Iout is 20 µA the data is from c1, and if 60 µA from c2. Test results are in perfect agreement with the classes given in Table II.
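The region test defined by the DTHs (15) and (16) can be sketched as below. Treating class c1 as requiring membership in both bands is an assumption about the Fig. 12 configuration, and the sample vectors are illustrative scaled inputs (circuit-level currents), not raw dataset entries:

```python
def in_region(x, v, a, b):
    """True when a <= v . x <= b, i.e. x lies between the two DTHs."""
    s = sum(vi * xi for vi, xi in zip(v, x))
    return a <= s <= b

def classify_haberman(x):
    """Sketch of the Fig. 12 classifier using DTHs (15) and (16);
    c1 is assumed to require falling inside both regions."""
    region1 = in_region(x, (1.4, 2.2, 16.0), 11.0, 69.0)
    region2 = in_region(x, (1.3, 2.2, 10.0), 20.0, 24.0)
    return "c1" if region1 and region2 else "c2"
```

In hardware, each `in_region` test corresponds to one LWC group feeding a CC, and the class label is read off the summed multilevel output current.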


Fig. 14. Iris data classification oscilloscope outcome (PLA is used).

Fig. 15. Haberman data classification oscilloscope outcome (LDA is used).

TABLE III
CONTROL CURRENTS OF THE CCS IN FIG. 9 AND FIG. 12

                      CCs in Fig. 9                CCs in Fig. 12
Control Currents   CC-1     CC-2     CC-3        CC-1     CC-2
I1                 1 µA     16 µA    21 µA       11 µA    20 µA
I2                 8 µA     19 µA    24 µA       69 µA    24 µA
IH                 10 µA    20 µA    30 µA       30 µA    30 µA

TABLE IV
PERFORMANCE COMPARISON OF THE ALGORITHMS

                      Classification           Coefficient calculation
                      Performance [%]          time [second]
Algorithm             Iris      Haberman       Iris      Haberman
Fisher LDA Based      100       100            0.4       0.3
Perceptron learning   100       100            220       190

TABLE V
COMPARISON OF 1-D CLASSIFIER CIRCUITS

Ref.   Exp/Sim(a)    Technology   Supply Voltage   Power Dissipation   Response Time   RTE
[4]    Sim           0.6 µm       3.3 V            14.95 mW            -               -
[8]    Sim           0.35 µm      5 V              -                   -               -
[9]    Sim           2 µm         5 V              80 mW               -               -
[12]   Exp           0.5 µm       3.3 V            90-160 µW           20-40 µs        -
[13]   Sim           0.35 µm      ±1.65 V          1.2 mW              -               2%
CC     Sim (Soft)    0.35 µm      ±1.65 V          1.8 mW              6 ns            0.5%
CC     Exp (Hard)    0.35 µm      ±1.65 V          1.4 mW              7 ns            -

(a) Sim: Simulated, Exp: Experimental

The Haberman and Iris datasets were classified with the Fisher's LDA and perceptron learning based algorithms. The algorithms were executed on a Personal Computer (PC) running at 3 GHz with 1 GB of Random Access Memory (RAM). The classification performance and the coefficient calculation times for both algorithms are given in Table IV, showing that there is no misclassification. The core cells given in [4], [8], [9], [12], [13] and the one used in this paper are compared in terms of technology, power consumption, supply voltage, response time and RTE in Table V. A small response time and a small RTE are important for fast and correct decisions on the data. The power consumptions of the circuit in [13] and of the CC, given in Table V, are obtained for I1 = 50 µA, I2 = 100 µA, IH = 100 µA; smaller choices of these currents yield much less consumption.
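The Fisher LDA computation whose timings are compared above can be sketched, for the two-class 2-D case, in pure Python; for two classes S_B has rank one, so the solution of (6) reduces to v ∝ Sw⁻¹(µ1 − µ2), which is what this sketch computes:

```python
def fisher_direction_2d(class1, class2):
    """Two-class Fisher LDA direction in 2-D: v proportional to
    Sw^{-1} (mu1 - mu2), the maximizer of J(v) in (5)."""
    def mean(pts):
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

    def add_scatter(pts, mu, s):
        # Accumulate (x - mu)(x - mu)^T into the 2x2 matrix s
        for x0, x1 in pts:
            d0, d1 = x0 - mu[0], x1 - mu[1]
            s[0][0] += d0 * d0
            s[0][1] += d0 * d1
            s[1][0] += d1 * d0
            s[1][1] += d1 * d1

    mu1, mu2 = mean(class1), mean(class2)
    sw = [[0.0, 0.0], [0.0, 0.0]]
    add_scatter(class1, mu1, sw)
    add_scatter(class2, mu2, sw)
    # Solve the 2x2 system Sw v = mu1 - mu2 by Cramer's rule
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    b0, b1 = mu1[0] - mu2[0], mu1[1] - mu2[1]
    v0 = (b0 * sw[1][1] - b1 * sw[0][1]) / det
    v1 = (b1 * sw[0][0] - b0 * sw[1][0]) / det
    norm = (v0 * v0 + v1 * v1) ** 0.5
    return (v0 / norm, v1 / norm)
```

Projecting each sample onto the returned direction and reading off the class boundaries on that line is then exactly the DTH-selection step of Section IV.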

VII. CONCLUSION

This paper is about an NN classifier based on a 1-D classifier called CC, implemented in DU-TCC 1209, an IC containing 3 CCs, 9 current conveyors and 3 current buffers with 52 I/O pins, which has been designed and manufactured with CMOS AMS 0.35 µm technology parameters. After presenting the improved CC, the newly designed IC, DU-TCC 1209, was introduced. Hard-test results concerning the classification of the Haberman and Iris datasets are reported; to that purpose, two algorithms, one based on perceptron learning, the other on Fisher's LDA, were developed. The weight values (control currents of the CCs) are determined with these two algorithms and hard/soft-applied to the resulting NN classifier, showing perfect agreement between simulations and measurements. Other applications, such as quantization and template matching with error correction, were previously described at simulation level in [5], [16], [25], [26]. The running times of the algorithms are also included to compare performances. Moreover, DU-TCC 1209 is versatile, with tunable weight parameters, provided a library of weights (control currents) is available; it can thus be used in many applications, whereas the other circuits given in Table V are single-task oriented. All applications so far being based on static properties, the dynamical behavior of DU-TCC 1209 has to be analyzed and then tested. New applications, developed by allowing the control currents (parameters) to vary with time and thus taking full advantage of the DTH, will be explored in future works. In another direction, digital circuitry can be embedded into the IC, enabling online programming as well as tuning of the weighting parameters in the field.

REFERENCES

[1] E. Hunt, Artificial Intelligence. New York: Academic, 1975.
[2] H. S. Abdel-Aty-Zohdy and M. Al-Nsour, "Reinforcement learning neural network circuits for electronic nose," in IEEE International Symposium on Circuits and Systems, May 30-June 2, 1999, pp. 379-382.
[3] C. Toumazou, G. S. Moschytz and B. Gilbert, Trade-offs in Analog Circuit Design: The Designer's Companion. Kluwer Academic Publishers, 2002.
[4] B. Liu, C. Chen, and J. Tsao, "A modular current-mode classifier circuit for template matching application," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 47, no. 2, pp. 145-151, 2000.
[5] M. Yıldız, S. Minaei, and İ. C. Göknar, "A flexible current-mode classifier circuit and its applications," International Journal of Circuit Theory and Applications, vol. 39, pp. 933-945, 2010.
[6] F. Li, C.-H. Chang, and L. Siek, "A compact current mode neuron circuit with Gaussian taper learning capability," in IEEE International Symposium on Circuits and Systems, May 24-27, 2009, pp. 2129-2132.
[7] G. Lin and B. Shi, "A current-mode sorting circuit for pattern recognition," in Intelligent Processing and Manufacturing of Materials, Honolulu, Hawaii, July 10-15, 1999, pp. 1003-1007.
[8] D. Y. Aksın and S. Aras, "A compact distance cell for analog classifiers," in Proc. IEEE International Symposium on Circuits and Systems, Kobe, Japan, May 23-26, 2005, pp. 3627-3630.
[9] G. Lin and B. Shi, "A multi-input current-mode fuzzy integrated circuit for pattern recognition," in Intelligent Processing and Manufacturing of Materials, Honolulu, Hawaii, July 10-15, 1999, pp. 687-693.
[10] J. L. Ayala, A. G. Lomena, M. Lopez-Vallejo, and A. Fernandez, "Design of a pipelined hardware architecture for real-time neural network computations," in 45th Midwest Symposium on Circuits and Systems, Oklahoma, USA, August 4-7, 2002, pp. 419-422.
[11] J. Zhu, "Towards an FPGA based reconfigurable computing environment for neural network implementations," in International Conference on Artificial Neural Networks, Edinburgh, UK, September 6-10, 1999, pp. 661-666.
[12] S. Y. Peng, P. E. Hasler, and D. Anderson, "An analog programmable multi-dimensional radial basis function based classifier," in International Conference on Very Large Scale Integration, Atlanta, USA, October 15-17, 2007, pp. 13-18.
[13] M. Yıldız, S. Minaei, and İ. C. Göknar, "A CMOS classifier circuit using neural networks with novel architecture," IEEE Transactions on Neural Networks, vol. 18, pp. 1845-1849, 2007.
[14] R. Dlugosz, T. Talaska, W. Pedrycz, and R. Wojtyna, "Realization of the conscience mechanism in CMOS implementation of winner-takes-all self-organizing neural networks," IEEE Transactions on Neural Networks, vol. 21, no. 6, pp. 961-971, June 2010.
[15] R. A. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, vol. 7, pp. 179-188, 1936.
[16] M. Yıldız, S. Minaei, and S. Özoğuz, "Linearly weighted classifier circuit," in Northeast Workshop on Circuits and Systems, Toulouse, France, June 28-July 1, 2009, pp. 99-102.
[17] L. Qi and W. T. Donald, "Principal feature classification," IEEE Transactions on Neural Networks, vol. 8, pp. 155-160, 1997.
[18] D. Qian, "Modified Fisher's linear discriminant analysis for hyperspectral imagery," IEEE Geoscience and Remote Sensing Letters, vol. 4, pp. 503-507, 2007.
[19] H. Çevikalp, "Theoretical analysis of linear discriminant analysis criteria," in IEEE 14th Signal Processing and Communications Applications, Antalya, Turkey, April 17-19, 2006, pp. 1-4.
[20] Q. Li, "Classification using principal features with application to speaker verification," Ph.D. dissertation, Univ. of Rhode Island, Kingston, Oct. 1995.
[21] D. Y. Aksın, S. Aras, and İ. C. Göknar, "CMOS realization of user programmable, single-level, double-threshold generalized perceptron," in Proc. Turkish Artificial Intelligence and Neural Networks Conference, İzmir, Turkey, July 21-23, 2000, pp. 117-125.
[22] Y. Zhao, B. Deng, and Z. Wang, "Analysis and study of perceptron to solve XOR problem," in Proc. 2nd International Workshop on Autonomous Decentralized System, China, Nov. 6-7, 2002, pp. 168-173.
[23] İ. Genç and C. Güzeliş, "Threshold class CNNs with input-dependent initial state," in IEEE International Workshop on Cellular Neural Networks and their Applications, London, England, April 14-17, 1998, pp. 130-135.
[24] Parameter Ruler Design CMOS AMS 0.35 µm, Mentor Graphics Corporation, 2008.
[25] M. Yıldız, S. Minaei, and İ. C. Göknar, "A low-power multilevel-output classifier circuit," in European Conference on Circuit Theory and Design, Seville, Spain, Aug. 26-30, 2007, pp. 747-750.
[26] M. Yıldız, S. Minaei, and İ. C. Göknar, "Realization and template matching application of a CMOS classifier circuit," in Proc. Applied Electronics Conference, Pilsen, Czech Republic, Sep. 10-11, 2008, pp. 231-234.
