A CMOS Ultra-Low Power Vision Sensor With Image Compression and Embedded Event-Driven Energy-Management

IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 3, SEPTEMBER 2011

Nicola Cottini, Leonardo Gasparini, Student Member, IEEE, Marco De Nicola, Nicola Massari, Member, IEEE, and Massimo Gottardi, Member, IEEE

Abstract—This paper presents an energy-aware CMOS vision sensor targeted to surveillance and monitoring applications. The 64 × 64-pixel sensor directly estimates and binarizes the spatio-temporal contrast with a power consumption of 30 µW at 30 fps and a 3.3 V supply voltage. The sensor embeds event-detection and energy-management capabilities, which meet the requirements of an energy-autonomous system. Data compression and fast data readout translate into low chip power consumption and an optimized duty-cycle of the external processor devoted to visual processing and communication. A preliminary vision-system prototype has been developed, estimating motion and looking for events occurring in the scene. It consists of the vision sensor interfaced with a small-density CPLD and a tiny microcontroller for data readout and communication. The vision system achieves an ultra-low power consumption of about 1 mW. The presented system exhibits an operating life of more than four months when powered with a 950 mAh battery at a duty-cycle of 5%.

Index Terms—CMOS vision sensor, early image processing, energy-autonomous sensors, energy-aware sensors, image compression.

I. INTRODUCTION

THERE is a real need for energy-autonomous sensory systems organized in wireless sensor networks, aimed at monitoring many different physical quantities in our everyday environment. Although great advances have been made in all those applications requiring relatively simple sensors (temperature, pressure, humidity, light, sound, etc.), no significant steps forward have been made toward more complex sensors, such as vision sensors, where the information is much denser and the typical activity, as well as the output bandwidth, is much larger. In fact, most of those systems are big and consume significant power, ranging from 100 mW up to a few watts¹ [1]–[4], so that, when powered with batteries, their operating lifetime is very poor, ranging from one up to 3–4 days. This means that these systems need to be supplied from the mains, which translates into significant installation costs.

The main constraint of the above-mentioned approach is that commercial imagers are designed for multimedia applications, typically mobile phones, cameras, and toys, where image quality and resolution are the most important figures of merit. Here, the sensor power consumption is not considered, to some extent, the most critical issue, especially when compared with the power consumption of the overall system: DSP, memory, and communication unit. For example, a 70–80 mW commercial VGA CMOS imager is claimed to be an ultra-low power sensor² [5]. Other examples of available prototypes refer to 35 mW CIF-format image sensors [6]. Powered with a small 950 mAh Li-ion battery, the imager alone can work for 38 h [5]. This value drastically decreases if we also take into account the power consumption of the other components of the system. Another custom image sensor, specifically designed for battery-operated applications, consumes 5 mW at 100 kpixel resolution.³ Here the performance is much better, corresponding to about one month of operating lifetime with the same battery, for the sensor alone. However, these numbers give a rough estimate of the lifetime of a system built with commercial components. Despite their advanced performance, these imagers cannot meet the main requirements of an energy-autonomous vision system. In fact, most commercial imagers already integrate automatic exposure control, on-chip analog-to-digital (A/D) conversion, and a standard video output; only a few parameters can be changed by the user. Moreover, they deliver video data at frame rate, forcing the processor to run all the time, processing images in real time and extracting information about possible alert situations. This operation has to be executed continuously, even if there is nothing in the scene to look at, with a consequent waste of power. Currently, this is the only way for a vision algorithm to be embedded in COTS-based (Components Off The Shelf) hardware.

In the literature, several recent examples of low-power image sensors have been proposed (Table I). It is worthwhile to carefully analyze the reported architectures. This is not always a simple task. In fact, the presented implementations are very different from each other (pixel topology, ADC, chip interface, subsequent processing).

Manuscript received December 31, 2010; revised May 31, 2011; accepted August 09, 2011. Date of publication October 13, 2011; date of current version November 09, 2011. This work was supported in part by the Project BOViS ("A Battery Operated Vision System for Wireless Sensor Network Applications"), within the Italy-Israel Cooperation Program 2009. This paper was recommended by Guest Editor E. Macii. N. Massari and M. Gottardi are with the Center for Scientific and Technological Research (ITC-irst), Fondazione Bruno Kessler, Trento, I-38123, Italy (e-mail: [email protected]). N. Cottini, L. Gasparini, and M. De Nicola are with the Integrated Optical Sensors and Interfaces Group (SOI), Fondazione Bruno Kessler, Trento, I-38123, Italy. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JETCAS.2011.2167072

¹Wireless Sensor Network Lab., Stanford Univ., Palo Alto, CA. Available: http://wsnl.stanford.edu/smartcam.html
²Aptina Imaging. Available: http://www.aptina.com
³EM Microelectronic—Marin SA. Available: http://www.emmicroelectronic.com
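For reference, the following Python sketch reproduces the battery-lifetime arithmetic quoted in this section. The ~3.0 V effective battery voltage is an assumption, chosen because it reproduces the 38 h figure of [5]; both it and the representative power values are illustrative, not taken from the cited datasheets.

```python
# Worked version of the lifetime figures quoted above, assuming a 950 mAh
# battery at an effective ~3.0 V; all values are illustrative assumptions.

BATTERY_MAH, V_BAT = 950, 3.0
energy_mwh = BATTERY_MAH * V_BAT  # ~2850 mWh of stored energy

for label, p_mw in [("70-80 mW VGA imager [5]", 75.0),
                    ("5 mW custom 100 kpixel sensor", 5.0)]:
    hours = energy_mwh / p_mw
    print(f"{label}: {hours:.0f} h (~{hours / 24:.0f} days)")
# 70-80 mW VGA imager [5]: 38 h (~2 days)
# 5 mW custom 100 kpixel sensor: 570 h (~24 days), i.e. about one month
```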


TABLE I
COMPARISON OF CHIP CHARACTERISTICS

Moreover, the data presented on power consumption are not homogeneous: total power, power/pixel, power/array, power/ADC conversion, etc. Although no standard has been defined yet to enable a rigorous comparison between implementations, a good figure of merit is the power consumption per frame per pixel (W/frame·pixel). This value includes the actual consumption of the pixel together with a share of the consumption of the sensing circuitry, A/D conversion, and chip interface. Comparing the values listed in the last row of the table (Total Power Consumption), we note that they range from 4.9 nW/frame·pixel [7] down to 84 pW/frame·pixel [8]. Moreover, it is important to point out that all the sensors listed in Table I are imagers, with the only exceptions of [10] and [7]. Their main function is to extract and deliver images at a minimum energy budget. An interesting solution is reported in [13], describing an imager with energy-harvesting capabilities, implementing a reconfigurable resolution aimed at managing a limited energy budget. However, this approach comes at the cost of reduced image resolution. In fact, for each sensor readout, only half of the array is delivered, allowing the remaining pixels to be used as energy harvesters. For an imager, it is not possible to decide a priori which pixels are to be delivered and which are not without losing information. This is somewhat different for a vision sensor, whose main task is to extract one or a few specific features from an image. In this case, it is possible to define a strategy aimed at detecting the pixels of the array which can be excluded from the readout without affecting the sensor performance. For example, for a sensor extracting the visual contrast, it makes sense to deliver only those pixels detecting a contrast larger than a certain value. This readout policy strictly depends on the status of each single pixel of the array, and can therefore change from frame to frame. Another approach is multi-resolution [14], where, for example, 8-bit accuracy is used for high-precision imaging, while 4-bit accuracy is used for a higher frame rate or, as in our case, for reducing the power consumption. In our view, a drastic power reduction is feasible especially for all those applications based on events, such as monitoring and surveillance, where the system is required to execute a process upon a significant change in the scene. This can be efficiently implemented with a custom vision sensor adopting a hierarchical approach, as described in Section II. In fact, a custom vision sensor can bring big improvements to the system, allowing it to operate with a much smaller energy budget, making it suitable to be an energy-autonomous system, powered with small batteries, with months-long lifetime.
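As a worked example of the figure of merit introduced above, the sketch below computes it from this sensor's headline numbers in the abstract (30 µW, 30 fps, 64 × 64 pixels); the function name is illustrative.

```python
# Power per frame per pixel: total power divided by frame rate and pixel
# count. Example values are this sensor's figures from the abstract.

def fom(power_w: float, fps: float, n_pixels: int) -> float:
    """Figure of merit in W/frame-pixel."""
    return power_w / (fps * n_pixels)

print(f"{fom(30e-6, 30, 64 * 64) * 1e12:.0f} pW/frame-pixel")  # ~244
```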

Fig. 1. Example of contrast-based binary image directly extracted by the sensor.

The paper is organized as follows. In Section II, the hierarchical system energy-management strategy is introduced. The overall vision sensor architecture is described in Section III, together with its main functional blocks. Some aspects of the vision-system realization are discussed in Section IV, while Section V deals with the experimental results, which are mainly focused on energy management.

II. HIERARCHICAL ENERGY MANAGEMENT

The main peculiarity of an event-based application is that the information of interest rarely occurs. Therefore, an energy-efficient system must first be able to detect when an event occurs, before doing anything else. This is the case in people-monitoring applications, where the time required for a person to cross the scene is very short compared with the time that statistically elapses between two events. Fig. 1 shows a simplified example of the moving edges of a person walking through the scene. It takes a relatively small number of frames, and the pixels involved are few compared with the image resolution (15%). In this case, the sensor should deliver data only when the person occupies the scene and stop communicating when nobody/nothing is moving inside it. Therefore, two main sensor functionalities can be conceived:

• no motion detected—no data delivered outside the sensor;
• motion detected—sensor activity, data delivery, image processing, data communication.

An efficient energy management can be implemented using a hierarchical system architecture, as depicted in Fig. 2, where three layers can be defined for the presented vision system:

1) 1st Layer—Sensing: The vision sensor acquires images and executes early visual processing in order to detect events and take low-level decisions;


Fig. 2. Hierarchical approach for system energy-management.

2) 2nd Layer—Processing: The processing layer is usually turned off during the event-detection process. It is woken up by the sensing layer only if the latter detects an alert situation. In this case, more complex image processing is required, calling for specialized hardware (DSP, FPGA). The power consumption is much larger than that of the 1st Layer. This is the reason why the 2nd Layer is forced into the off state as soon as possible; typical off-time values range between 90%–95%, or even more, in surveillance applications.

3) 3rd Layer—Communication: Wireless communication is usually the most power-hungry activity of the system. It requires tens of milliwatts, which is similar to the power consumption of the previous layer. Therefore, it has to be used carefully, minimizing both the duty-cycle and the amount of data to be transmitted. One strategy is to send only symbolic information instead of raw images. This information is broadcast only after the event has been properly classified and recognized. Depending on the application, the information can be stored locally and sent after a certain time. For example, in people counting, the system can work as a data logger, broadcasting the statistical information once a day, thus drastically cutting down the communication duty-cycle.

III. VISION SENSOR ARCHITECTURE

The vision sensor directly extracts and binarizes the spatial contrast at pixel level, optimizing the power consumption and simplifying the electronic implementation [15]. Binary data are delivered to the output according to an address representation, which guarantees the minimum I/O bandwidth. The sensor architecture is targeted to ultra-low power consumption. Contrast is an image feature with high-pass characteristics, i.e., it detects spatial changes in light intensity, allowing a target to be distinguished from the surrounding background. The information is therefore concentrated around object profiles, which translates into strong data compression. Events in the scene can thus be detected by monitoring the number of pixels associated with the contrast. The next subsections describe the main building blocks of the sensor architecture.

Fig. 3. Block diagram of the vision sensor architecture. On the right side, a few examples of contrast-based images are depicted.

Fig. 4. Contrast estimation. Principle of operation.

The sensor is organized in a 64 × 64 pixel array; a ROW DECODER, which progressively pre-charges the bit-lines and selects the rows during the readout phase; and a COLUMN DECODER, which looks for active pixels inside the selected row and delivers the position (row address) of those pixels (Fig. 3). The sensor readout is an asynchronous process; it only requires one external trigger to start (START) and ends with an EOF (End Of Frame). A 65th column of pixels (on the rightmost side of the array) is used to synchronize the row selection (ROW DECODER) with the disparity-check (COLUMN DECODER) process. The presented vision sensor is an improved evolution of a previously published low-power chip [10]. Its main improvements are the simplification of the pixel topology and, most importantly, an optimized technique for data delivery. The latter is aimed at reducing the internal power consumption and at efficiently implementing the interface between the 1st and 2nd Layers (Sensing and Processing Layers) through a hierarchical approach.

A. Smart Pixel

Low-power spatio-temporal image processing is mostly accomplished inside the pixel. The basic operation is the estimation and binarization of the spatial contrast, computed between the pixel itself and its two neighbors (PE, PN). This operation is accomplished during the integration time, as shown in Fig. 4.


Fig. 5. Schematic of the pixel embedding the contrast block. V_C is the analog contrast estimated among a three-pixel kernel (blow-up).

Here, the main difference with respect to [10] is in the contrast estimation. In the presented sensor, the neighboring pixels do not compete with PO in building the contrast voltage (VC); they only define the estimation window in which the contrast is estimated. The contrast voltage $V_C$, as depicted in Fig. 4, is defined as

$$V_C = \frac{I_{PO}}{C_{PD}}\,\Delta T \qquad (1)$$

where $I_{PO}$ is the photo-generated current related to the pixel PO, and $C_{PD}$ is the capacitance of the photodiode. Moreover, the estimation window $\Delta T$ can be expressed as

$$\Delta T = C_{PD}\,\Delta V \left(\frac{1}{I_{PE}} - \frac{1}{I_{PN}}\right) \qquad (2)$$

Replacing (2) in (1),

$$V_C = \Delta V\, I_{PO}\,\frac{I_{PN} - I_{PE}}{I_{PN}\,I_{PE}} \qquad (3)$$

with $I_{PN}$ and $I_{PE}$ the photo-generated currents of the most and the least illuminated pixels of the kernel, respectively, and $\Delta V$ the photodiode voltage swing from reset down to the threshold VTH. Although (3) is not exactly the expression of the Weber contrast, but an approximation of it, it simplifies the hardware implementation, reducing the connectivity among the three pixels of the kernel (PO, PN, PE), which is very expensive in terms of silicon area, and thus yielding a larger fill factor. Moreover, the photodiode now drives only one source-follower transistor (M5), improving the photodiode sensitivity. The pixel is based on a photodiode working in storage mode (Fig. 5), discharging at a rate proportional to the light intensity impinging on the active area and driving a source follower (M5), which forces a charge transfer between the charge-reservoir capacitor (C1) and the signal capacitor (C2). This process is regulated in time by the most and the least illuminated pixels of the kernel, PN and PE respectively (Fig. 4). The charge transfer from C1 to C2 only occurs inside the estimation window:

$$V_C = \frac{C_1}{C_2}\,\Delta V\, I_{PO}\,\frac{I_{PN} - I_{PE}}{I_{PN}\,I_{PE}} \qquad (4)$$

where (4) includes the gain factor $C_1/C_2$ due to the charge-transfer process.
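As an illustration of (1)–(4), the following Python sketch evaluates the contrast voltage for a three-pixel kernel and binarizes it against VTED; every numeric value in it is an assumed, illustrative quantity, not a design value of the chip.

```python
# Numerical sketch of the contrast estimation (1)-(4) for a three-pixel
# kernel. All component values, voltages, and photocurrents below are
# illustrative assumptions, not the actual design values of the chip.

C_PD = 10e-15            # photodiode capacitance C_PD [F] (assumed)
C1, C2 = 20e-15, 10e-15  # reservoir / signal capacitors [F] (assumed)
DELTA_V = 0.3            # photodiode swing from reset to VTH [V] (assumed)
V_TED = 0.3              # binarization threshold on V_C [V] (assumed)

def contrast_voltage(i_po, i_pn, i_pe):
    """Eq. (4): V_C = (C1/C2) * dV * I_PO * (I_PN - I_PE) / (I_PN * I_PE).

    i_pn and i_pe are the photocurrents of the most and least illuminated
    pixels of the kernel; their comparator switching instants bound the
    estimation window dT of eq. (2)."""
    dt = C_PD * DELTA_V * (1.0 / i_pe - 1.0 / i_pn)  # eq. (2)
    return (C1 / C2) * (i_po / C_PD) * dt            # eqs. (1) and (4)

# Example kernel: a 2x illumination ratio across the kernel (currents in A)
vc = contrast_voltage(i_po=2e-12, i_pn=3e-12, i_pe=1.5e-12)
print(f"V_C = {vc:.2f} V -> active pixel: {vc > V_TED}")  # V_C = 0.40 V
```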

Outside the estimation window, before reaching $t_1$, the charge is shunted from C1 to the reference voltage VP, since the series transistors M6, M7, and M8 are ON. After $t_3$, M2, M3, and M4 are ON, shunting node A to GND. To define the time window, each pixel is equipped with a voltage comparator, switching when the photodiode voltage reaches the common threshold VTH. According to the blow-up of Fig. 5, the reference pixel (PO) receives two binary inputs from its neighbors (QE, QN) and delivers QO to the two pixels of the kernel. This results in a connectivity of 4, which is much smaller than that of [10], equal to 8. The entire process is dynamic and is accomplished along the integration time, implementing a true pixel-level automatic exposure control. The estimated analog signal (VC) is then binarized with respect to the threshold VTED before being stored either into a local 1-bit latch or held at the output of the comparator. The pixel communicates with the outside world by means of two binary bit-lines (BA, BB). In general, a disparity between the two bit-lines means a change between two contrast values estimated by the pixel at two different times. If we store a dark signal into the latch and compare it with the current one, we extract the contrast of the image. If we store the signal of the previous frame into the latch and compare it with the current one, we detect motion by frame differencing. This information is directly used in the presented sensor for event-detection applications, such as surveillance and monitoring.

B. Data Readout

The readout process is asynchronous and is triggered from outside by the signal START. Here, we have to distinguish two operating modes:

• IDLE MODE: the sensor works at minimum energy, looking for events, with no communication with the outside world;
• ACTIVE MODE: after detecting an event, the sensor wakes up and starts acquiring images, extracting contrast and delivering data at frame rate.

C. Idle Mode

After the bit-lines are pre-charged, the first row is selected by the ROW DECODER. As soon as the bit-lines are settled, the COLUMN DECODER, consisting of 64 cascaded logic blocks, looks for bit-line disparity, starting from the first column of the array down to the last one.


For each pixel with disparity, a pulse is provided, which is counted by a binary counter. At the end of the row, an End Of Row (EOR) is issued, and the next row is selected. This process is repeated 64 times, once per row. At the end of the frame (EOF), the total number of active pixels, i.e., those with disparity, is available and is compared with a user-programmed threshold. The sensor asserts a FLAG when this threshold is reached. This FLAG is used as an interrupt for the Processing Layer.

D. Active Mode

The process is the same as in IDLE MODE, with the difference that here the COLUMN DECODER generates a pulse for each pixel, regardless of its status. These pulses are counted by a binary counter, keeping track of the position of the currently selected cell. In case one pixel of the row shows a disparity, the 7-bit address (6 bits plus sign) of that pixel is delivered to the output through D0–D6, synchronized with the Write Enable (WRN). At the end of the row, the counter is reset, ready for the next row, and an EOR is generated.
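The two modes can be summarized by the following behavioral sketch; the frame representation, function names, and threshold handling are illustrative abstractions (on the chip, the comparison is done in hardware and FLAG is a physical interrupt line).

```python
# Behavioral model of the two readout modes described above. A frame is a
# 64x64 list of 0/1 disparity flags; names and data types are illustrative.

def idle_readout(frame, threshold):
    """IDLE MODE: count active pixels; assert FLAG when the event
    threshold is reached. No pixel addresses leave the sensor."""
    active = sum(sum(row) for row in frame)
    return active >= threshold          # FLAG -> interrupt to 2nd Layer

def active_readout(frame):
    """ACTIVE MODE: deliver the address of each active pixel (D0-D6,
    strobed by WRN); an EOR closes every row, an EOF closes the frame."""
    for r, row in enumerate(frame):
        for c, bit in enumerate(row):
            if bit:
                yield (r, c)            # address-event output
        # EOR generated here; column counter reset for the next row
    # EOF generated here
```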

Fig. 7. Standard raster scan readout process.

IV. SENSOR ENERGY MANAGEMENT

IDLE MODE and ACTIVE MODE have different power consumptions. Although the internal operations are more or less the same, i.e., image raster-scan and disparity detection, the main contribution to the power consumption comes from the I/O activity [10], which is minimized in IDLE MODE. However, if we consider the power consumption of the system rather than that of the sensor alone, the former is far more important, and it is minimized by reducing the duty-cycle of the Processing Layer. In fact, the latter has a power consumption two orders of magnitude greater than that of the sensor, or even more. If the vision sensor did not have this capability, the entire system would have a low energy efficiency.

A. Normal Data Readout

The sensor delivers data bursts at an equivalent frequency of 50 MHz. This requires a fast, power-hungry device to interface the chip. It is therefore important to compress the readout time, in order to minimize the processor duty-cycle, even within a single frame, when the sensor works in ACTIVE MODE. If we look at the standard data readout of the sensor, shown in Fig. 7, we can see that the time allocated to each row, between two successive EORs, is constant (2.3 µs), regardless of the active-pixel density. If the row does not contain any active pixel, no information is delivered to the output, but the processor has to stay awake until the EOF, i.e., for 140 µs. For the sake of simplicity, assuming an integration time of 14 ms and a readout time of 140 µs, a duty-cycle of 1% of the entire frame is devoted to data delivery. Due to the sequential nature of the raster-scan process, the time required to dispatch the information within one frame is constant and does not depend on the active-pixel density.


Fig. 8. Raster scan readout process with row-skip mode.

Fig. 9. Row-skip mode data readout for a dark image.

This turns into a waste of power, because the digital processor has to stay awake longer than is really needed. If we take a look at the waveforms, we notice that the EORs are uniformly distributed at a time step of 2.3 µs, while WRN is only active in the middle of the time window, labeled Data. In the remaining two slots (Empty), no data are delivered. In order to make the readout time proportional to the number of active pixels rather than to the total number of pixels, we need to compress the Empty zones without losing information, i.e., preserving image synchronization.

B. Row Skipping Mode

A simple but efficient solution has been implemented at circuit level, working during the array raster-scan, looking for empty rows and skipping them, thus avoiding their disparity-check. Although the implemented technique strongly depends on the distribution of active pixels inside the array, in our case it is very efficient, thanks to the type of processing performed by the sensor (motion detection of spatial contrast), which results in a very sparse array of active pixels. Fig. 6 shows the circuit details of the implemented technique.

Fig. 6. Schematic of the sensor readout with row-skip capabilities.

The row-skipping process can be better described through the following list of operations, starting from the row selection (a behavioral sketch is given after the list):

1) the bit-lines are pre-charged to ground;
2) the first row of the array is selected by the ROW DECODER;
3) the bit-lines settle;
4) the 65th column of pixels, with built-in disparity (called Dummy-Column), defines the time at which all the bit-lines of the array are certainly settled. Here, FIND triggers the disparity-check process, accomplished by the COLUMN DECODER;


5) DL is a bit-line running along the COLUMN DECODER, pre-charged low and pulled up by any active pixel of the selected row; in practice, it is a wired-OR line. In case of an empty row, DL never toggles (it stays stuck to ground). TL is a bit-line similar to DL, but it can only be pulled up by the Dummy-Column. By design, TL is always pulled up at row selection. Moreover, TL toggles later than DL, even in the worst case, i.e., when the selected row has only one active pixel. TL is used as a clock to sample DL;
6) row with active pixels: TL samples DL high, which triggers the COLUMN DECODER, starting a disparity-check;
7) empty row: TL samples DL low, blocking the COLUMN DECODER. The resulting signal generates an EOR by shunting the last stage of the COLUMN DECODER, thus bypassing the entire disparity-check process;
8) the next row is selected.
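The behavioral sketch below models this row-skip readout; the TL/DL signal mechanics are abstracted into a per-row emptiness test, and all names are illustrative.

```python
# Behavioral sketch of the row-skip readout (steps 1-8 above). The TL/DL
# sampling is abstracted into a per-row emptiness test; names and the
# callback interface are illustrative.

def read_frame_row_skip(disparity, on_address):
    """Raster-scan readout that skips empty rows.

    disparity: 64x64 iterable of 0/1 pixel disparity flags.
    on_address: callback receiving the (row, col) of each active pixel."""
    for row, bits in enumerate(disparity):
        if not any(bits):      # empty row: DL stays low, EOR forced at once
            continue           # the whole disparity-check is bypassed
        for col, bit in enumerate(bits):
            if bit:            # disparity found: deliver the pixel address
                on_address((row, col))
        # EOR generated here; the next row is selected

# Example: a frame with a single active pixel at (10, 20)
frame = [[0] * 64 for _ in range(64)]
frame[10][20] = 1
read_frame_row_skip(frame, print)  # -> (10, 20)
```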

Fig. 10. Chip microphotograph.

TABLE II
CHIP POWER CONSUMPTION IN THE TWO READOUT MODES

TABLE III
SYSTEM POWER CONSUMPTION IN IDLE MODE AND ACTIVE MODE. IN CASE ACTIVE MODE TAKES ONLY 5% OF THE TIME, THE AVERAGE POWER CONSUMPTION OF THE SYSTEM WILL BE 1 mW

TABLE IV
MAIN SENSOR CHARACTERISTICS

Fig. 11. A simple system prototype based on the presented vision sensor.

The row-selection process is similar to the one used in [10]. The main difference consists of the introduction of two bit-lines, TL (Toggle Line) and DL (Data Line), both crossing all the 64 cascaded cells of the COLUMN DECODER. Fig. 8 shows the improvement with respect to Fig. 7. Under normal readout, the time step between two EORs is 2.3 µs. Using Row Skip Mode, this time is compressed to 125 ns, about 20 times less. Figs. 7–9 show a comparison among the normal readout, the row-skip mode with a test image, and, lastly, the row-skip mode with a dark image; in this last case, the readout takes 8 µs to complete.

V. EXPERIMENTAL RESULTS

The power consumption of the vision sensor has been measured both in normal and in row-skip readout modes, using the input test image of Fig. 7, with 60% empty rows. Measurements have been carried out separately on the chip core and on the chip pads (Table II). The row-skip readout mode only affects the internal computation, which is not visible from outside, without touching the activity at the I/O. As expected, the I/O power consumption is almost constant in the two operating modes, depending on the pad activity, which is the same in both cases. This is reasonable, because the information delivered by the sensor is preserved. For this reason, in the example of Fig. 8, the total power reduction is about 7 µW (i.e., 18%). Moreover, the I/O contribution to the power consumption is dominant with respect to the internal activity. The presented row-skip technique is simple, although it is not very reliable in the presence of noise in the scene or in the sensor itself. However, it can be efficiently used after providing one layer of binary spatial noise filtering, aimed at removing isolated active pixels from the selected row. This could be implemented quite easily right after the selected row is available on the bit-lines. Another advantage of the Row Skip Mode over the Normal Mode is that the effective duty-cycle of an external device interfacing the sensor can be reduced.
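A simple timing model, using the figures quoted above (2.3 µs per row slot, 125 ns per skipped row), reproduces the measured readout times; the function and constant names are illustrative.

```python
# Frame readout time under the two modes, using the timing figures quoted
# in the text: 2.3 us per row in normal mode, 125 ns per skipped empty row.

ROW_SLOT = 2.3e-6      # time between two EORs, normal readout [s]
SKIP_SLOT = 125e-9     # time charged to a skipped empty row [s]
N_ROWS = 64

def readout_time(empty_rows, row_skip=True):
    if not row_skip:
        return N_ROWS * ROW_SLOT
    return (N_ROWS - empty_rows) * ROW_SLOT + empty_rows * SKIP_SLOT

print(f"normal: {readout_time(0, row_skip=False) * 1e6:.0f} us")   # ~147 us
print(f"60% empty, row-skip: {readout_time(38) * 1e6:.0f} us")     # ~65 us
print(f"dark image, row-skip: {readout_time(64) * 1e6:.0f} us")    # 8 us
```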

A preliminary vision system has been built, interfacing the vision sensor with a low-density CPLD⁴ aimed at generating the timing for the sensor and reading the high-speed data. A tiny microcontroller manages the data communication with a PC. Assuming an operating duty-cycle of 5%, we obtain a total power consumption of about 1 mW, as shown in Table III. This is not a big improvement, due to the power consumption of the device generating the voltage references for the chip (VTED, VTH); much work remains to be done on this side. However, if we consider the power contributions of the other components of the system, we can demonstrate the advantages of the presented approach. Powered with a 950 mAh Li-ion battery, the system will have an operating lifetime of about four months. The next step is to use a high-density ultra-low power FPGA⁵ implementing the Processing Layer, in order to execute complex visual algorithms as in [16]. In Table III, the components working 100% of the time are the sensor, the CPLD, and the references. These are supposed to belong to the 1st Layer of Fig. 2. Currently, its duty-cycle only depends on the vision sensor's alert-detection capability, which is actually fairly crude. Using a denser and more advanced low-power FPGA, it is possible to implement simple but reliable filters, significantly reducing the false alarms right at the 1st level of the hierarchy. This would mean cutting down the activity of the microcontroller (2nd Layer), whose power consumption is much larger than that of the previous components (9.9 mW). A duty-cycle of 5% is not so uncommon. Let us consider a people-flow application in a medium-size museum, counting up to 100 000 visitors/year with a peak value of 1000 persons/day. The estimated duty-cycle of the system, required to detect people crossing a gate, is about 4%. As always, it depends on the specific application scenario. Moreover, in some cases, such as people-flow monitoring for market analysis, a data logger is more effective than a real-time system. This means that the final output computed by the system is sent off-line to the server (e.g., once or twice a day), cutting down the power consumption of the wireless communication.
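A rough reconstruction of this lifetime estimate is given below; the 3.7 V nominal Li-ion voltage is an assumption, while the 1 mW average is the value quoted from Table III.

```python
# Average-power and lifetime estimate for the prototype. The 3.7 V nominal
# Li-ion voltage is an assumption; the 1 mW average system power at a 5%
# duty-cycle is the value quoted in Table III.

BATTERY_MAH, V_NOMINAL = 950, 3.7          # assumed nominal voltage
P_AVG_MW = 1.0                             # average system power (Table III)

energy_mwh = BATTERY_MAH * V_NOMINAL       # ~3515 mWh available
lifetime_h = energy_mwh / P_AVG_MW
print(f"~{lifetime_h:.0f} h = {lifetime_h / 24:.0f} days "
      f"(~{lifetime_h / (24 * 30.4):.1f} months)")   # about four months
```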

⁴Xilinx. Available: http://www.xilinx.com
⁵MicroSemi—SoC Products Group. Available: http://www.actel.com


The main chip characteristics are listed in Table IV.

VI. CONCLUSION

A vision sensor with early image processing and data compression has been presented, optimized for ultra-low power performance and targeted to event detection in surveillance and monitoring applications. The chip architecture embeds energy-management capabilities aimed at reducing the duty-cycle of the entire vision system rather than of the single sensor. This is a first step in the implementation of a hierarchical energy-management approach, emphasizing the system design methodology. A preliminary vision-system prototype has been developed, based on the presented sensor, powered with a tiny battery, and achieving a lifetime of about four months with a duty-cycle of 5%. The presented approach to sensor-level energy management is intended as an example, demonstrating the feasibility of an energy-autonomous vision system.

ACKNOWLEDGMENT

The authors would like to thank Gianmaria Pedretti for his support with the system interface and M. Chini for his help in building up the system.

REFERENCES

[1] W. Feng, B. Code, E. Kaiser, M. Shea, W. Feng, and L. Bavoil, "Panoptes: A scalable architecture for video sensor networking applications," in Proc. ACM Multimedia, 2003, pp. 151–167.
[2] J. Boice, X. Lu, C. Margi, G. Stanek, G. Zhang, R. Manduchi, and K. Obraczka, "Meerkats: A power-aware, self-managing wireless camera network for wide area monitoring," in Workshop on Distributed Smart Cameras (DSC 2006), Boulder, CO, 2006.
[3] M. Rahimi, R. Baer, O. Iroezi, J. Garcia, J. Warrior, D. Estrin, and M. Srivastava, "Cyclops: In situ image sensing and interpretation in wireless sensor networks," in Proc. 3rd Int. Conf. on Embedded Networked Sensor Syst., 2005, pp. 192–204.
[4] L. Ferrigno, S. Marano, V. Paciello, and A. Pietrosanto, "Balancing computational and transmission power consumption in wireless image sensor networks," in Proc. 2005 IEEE Int. Conf. on Virtual Environments, Human-Computer Interfaces and Meas. Syst. (VECIMS 2005), p. 6.
[5] Micron, "1/4-inch VGA, ultra low-power, CMOS digital image sensor camera system-on-a-chip," 2007 [Online]. Available: http://download.micron.com/pdf/flyers/mt9v131.pdf
[6] Fraunhofer IMS, "Low power CMOS image sensor," [Online]. Available: http://www.ims.fraunhofer.de/uploads/media/CMOS_Imager_for_Low_Power_Application_Fraunhofer_IMS.pdf
[7] Z. Fu and E. Culurciello, "A 1.2 mW CMOS temporal-difference image sensor for sensor networks," in Proc. IEEE Int. Symp. on Circuits Syst. (ISCAS 2008), pp. 1064–1067.
[8] F. Tang, Y. Cao, and A. Bermak, "An ultra-low power current-mode CMOS image sensor with energy harvesting capability," in Proc. ESSCIRC, 2010, pp. 126–129.
[9] K. Kagawa, S. Shishido, M. Nunoshita, and J. Ohta, "A 3.6 pW/frame·pixel 1.35 V PWM CMOS imager with dynamic pixel readout and no static bias current," in IEEE Solid-State Circuits Conf. Dig. Tech. Papers, 2008, pp. 54–55.
[10] M. Gottardi, N. Massari, and S. Arsalan Jawed, "A 100 µW 128 × 64 pixels contrast-based asynchronous binary vision sensor for sensor networks applications," IEEE J. Solid-State Circuits, vol. 44, no. 5, pp. 1582–1592, May 2009.

[11] S. Hanson and D. Sylvester, "A 0.45–0.7 V sub-microwatt CMOS image sensor for ultra-low power applications," in 2009 Symp. VLSI Circuits, 2009, pp. 176–177.
[12] K. Cho, D. Lee, J. Lee, and G. Han, "Sub-1-V CMOS image sensor using time-based readout circuit," IEEE Trans. Electron Devices, vol. 57, no. 1, pp. 222–227, 2010.
[13] M. Law, A. Bermak, and C. Shi, "A low-power energy-harvesting logarithmic CMOS image sensor with reconfigurable resolution using two-level quantization scheme," IEEE Trans. Circuits Syst. II, Express Briefs, no. 99, pp. 1–5, 2011.
[14] A. Bermak and Y.-F. Yung, "A DPS array with programmable resolution and reconfigurable conversion time," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 1, pp. 15–22, Jan. 2006.
[15] N. Massari, M. De Nicola, and M. Gottardi, "A 30 µW 100 dB contrast vision sensor with sync-async readout and data compression," in Proc. IEEE ESSCIRC, 2010, pp. 138–141.
[16] L. Gasparini, R. Manduchi, and M. Gottardi, "An ultra-low-power contrast-based integrated camera node and its application as a people counter," in 2010 IEEE 7th Int. Conf. on Adv. Video and Signal Based Surveillance (AVSS), 2010, pp. 547–554.
[17] C. Shi, M. Law, and A. Bermak, "A novel asynchronous pixel for an energy harvesting CMOS image sensor," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 1, pp. 118–129, Jan. 2011.
[18] A. Fish, S. Hamami, and O. Yadid-Pecht, "Self-powered active pixel sensors for ultra low-power applications," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS 2005), 2005, pp. 5310–5313.
[19] M. Law and A. Bermak, "High-voltage generation with stacked photodiodes in standard CMOS process," IEEE Electron Device Lett., vol. 31, no. 12, pp. 1425–1427, Dec. 2010.

Nicola Cottini received the B.Sc. degree and the M.Sc. degree in telecommunication engineering from the University of Trento, Trento, Italy, in 2005 and 2008, respectively, and is currently working toward the Ph.D. degree at Fondazione Bruno Kessler, Trento, Italy, on the design of low-power CMOS vision sensors and systems with event-driven capabilities. His research interests also include the development of hardware-oriented low-energy vision algorithms for people monitoring and tracking.

Leonardo Gasparini (S’10) received the B.Sc. degree and the M.Sc. degree in telecommunication engineering from the University of Trento, Trento, Italy, in 2004 and 2007, respectively, and the Ph.D. degree in information and communication technologies from the Department of Information Engineering and Computer Science, University of Trento, in 2011. For his doctoral studies, he worked on the development and the metrological characterization of an ultra-low power wireless system equipped with a camera. In 2010, he joined the Integrated Optical Sensors and Interfaces Group, Fondazione Bruno Kessler, Trento, Italy, where he was involved in the design of integrated optical sensors fabricated in deep-submicron CMOS technology. His research interests also include the design, development, and metrological characterization of embedded systems, with particular emphasis on low-power applications.

Marco De Nicola received the B.Sc. degree and the M.Sc. degree in telecommunication engineering from the University of Trento, Trento, Italy, in 2006 and 2009, respectively. In 2009, he joined the Integrated Optical Sensors and Interfaces Group at Fondazione Bruno Kessler, Trento, Italy, where he was involved in the design of electronic interfaces for low-power vision systems. His research interests also include the design of low-energy algorithms for embedded vision systems for surveillance and tracking applications.


Nicola Massari (M’08) was born in Venezia, Italy, in 1973. He received the laurea degree in electronics engineering from the University of Padova, Italy, in 1999. Since 2000, he has been involved with the Center for Scientific and Technological Research (ITC-irst), Fondazione Bruno Kessler, Trento, Italy, as a Research Associate in the Integrated Optical Sensors Group, Microsystems Division. His research interests are in the field of CMOS integrated optical sensors with on-chip processing.


Massimo Gottardi (M’97) received the laurea degree in electronics engineering from the University of Bologna, Italy, in 1987. In the same year, he joined the Integrated Optical Sensors group of the Center for Scientific and Technological Research (ITC-irst), Fondazione Bruno Kessler, Trento, Italy, where he was initially involved in the design and characterization of CCD and CCD/CMOS optical sensor arrays with on-chip analog processing, in collaboration with Harvard University, Cambridge, MA. Since 1993, he has been involved in the design of CMOS integrated optical sensors. His research interests are mainly in the design of vision sensors with embedded image processing and in low-power interfaces for MEMS.
