CIMA Based Remote Instrument and Data Access: An Extension into the Australian e-Science Environment

Ian M. Atkinson,1* Douglas du Boulay,2 Clinton Chee,2 Kenneth Chiu,3 Tristan King,1 Donald F. McMullen,4* Romain Quilici,2 Nigel G.D. Sim,1 Peter Turner,2* Mathew Wyatt1

1 School of Information Technology, James Cook University, Townsville, QLD 4811, Australia. 2 Department of Chemistry, University of Sydney, Sydney, NSW 2006, Australia. 3 Computer Science Department, SUNY Binghamton, Binghamton, NY, USA. 4 The Pervasive Technology Labs at Indiana University, Bloomington, IN 47404, USA.

[email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract

The Common Instrument Middleware Architecture (CIMA) is being used as a core component of a portal-based remote instrument access system being developed as an Australian e-Science project. The CIMA model is being enhanced to use federated Grid storage infrastructure (SRB) and the Kepler workflow system to automate, as far as possible, data management and the facile extraction and generation of instrument and experimental metadata. The Personal Grid Library is introduced as a user-friendly portlet interface to SRB data and metadata, which supports customisable metadata schemas. An Instrument Instruction Module has been introduced as a CIMA plug-in for instrument control. A virtual instrument portlet provides a simulation of the instrument during a data collection. The system is being further augmented with a tool for collaborative data visualisation and evaluation.

1. Introduction. Coast to coast, the breadth of the island continent of Australia spans some 5000 km, so it is easy to appreciate that the provision of remote access to resources such as scientific instruments and their data offers significant efficiency and cost benefits. Further gains may be provided by harnessing distributed resource technologies such as Grid storage and compute resources, workflow tools and Web services.

Proceedings of the Second IEEE International Conference on e-Science and Grid Computing (e-Science'06) 0-7695-2734-5/06 $20.00 © 2006

In order to explore this potential, we are developing a pilot network of X-ray diffraction instruments that are being equipped with Grid enabled services for remote access.

Figure 1. Nodes defining a test-bed instrument access network.

The test-bed network encompasses instruments at James Cook University, Monash University, the University of Queensland and the University of Sydney (Fig. 1). Geographically these sites span the populated east coast of Australia, a distance of over 3000 km. The X-ray diffraction instruments that define the network provide data for the crystallographic determination of chemical and biological molecular structures. X-ray crystallography provides an attractive test domain, with well defined workflows and data structures, and employs relatively common (if not standardised) instrument types. Federation through the use of the Grid Storage Resource Broker (SRB)[1] is intended to enable seamless data access and replication across the distributed partner institutions (Fig. 2). The security and access rights issues associated with this store are presently being considered.

2. Remote Instrument Monitoring and Data Management. A number of significant modifications and enhancements have been made to the CIMA model in developing a flexible, Grid-enabled portal system for remote instrument monitoring and data management. In summary, these changes include the following:

• NFS and MySQL data manager replaced with an SRB and MCAT backend
• C++ and Perl data manager replaced with Kepler[4] for facile data and metadata management
• Use of the Personal Grid Library (PGL)[5] for user-friendly SRB data manipulation
• Stored experimental data easily retrieved and annotated
• Additional visual diagnostics for sensor streams (e.g. graphs, visual indicators)

The remote access portal is also planned to include downstream processing and analysis of the collected data. The interplay between the components of the system is illustrated in Figure 3.

Figure 2. Australian CIMA deployments and proposed data federations.

We have adopted the Common Instrument Middleware Architecture (CIMA)[2] model as the key underpinning technology. The CIMA project is developing a Web services based prototype API that embodies a generic or abstract representation of an instrument in terms of its type, identity, and data and metadata output streams. The CIMA model promotes re-usability, and hence a reduction in coding effort in changing instrument environments or ecosystems. CIMA is designed to be scalable from multiple small sensors through to major facility instrumentation. A companion paper[3] presents CIMA in detail and highlights the linkage of instruments into the Grid. Here we present an adaptation and extension of the capabilities of CIMA, through its use as a component in developing a rich GridSphere portal environment for collaborative remote instrument and data access. In close collaboration with the CIMA project principals, this work is being undertaken as part of an Australian e-Research program.


Figure 3. Structure of the new CIMA based portal system for instrument monitoring and data management.

The architecture is extensible and the storage component is not restricted to SRB; however, as SRB offers many useful features, it is an attractive choice. Useful attributes of SRB include the ability to directly associate metadata with data, the integrated Grid Security Infrastructure (GSI), and the ability to federate repositories and simplify inter-institutional interactions. The GridSphere portlet interface to SRB is provided by Personal Grid Library (PGL) portlets (Fig. 4). PGL has been developed at James Cook University and provides a user-friendly Web interface to SRB data and metadata, and it supports customisable metadata templates (schemas). These templates strictly define the keys and values that are valid for each object type, ensuring metadata coherence and integrity. PGL also supports SRB annotations, which are utilised by CIMA for further tagging the data (annotations are on a per-file basis). Annotation or metadata based searches can be undertaken through the PGL interface. These portlets are intended to be reused within other SRB-enabled portlet projects. The metadata template system within PGL is intended to be a flexible method of ensuring metadata integrity and consistency. It is implemented on top of the generic metadata system provided by SRB, which allows key-value pairs to be stored against objects. The template system allows multiple document types to be defined. Each type has a schema associated with it, defining the list of valid keys, the data type for each key, and the valid values for the data. In addition, friendly names and descriptions are associated with each type, so front ends like PGL can render a self-explanatory interface. The templates are stored as an XML file in a directory in SRB. This directory is defined as the root of the library, and the templates are recursively applied to all sub-directories below it.
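As an illustration of the template idea, the following sketch checks a metadata record against declared keys and permitted values. The class and method names are hypothetical, invented for this example; the real PGL templates are XML documents stored in an SRB directory.

```java
import java.util.*;

// Hypothetical sketch of PGL-style metadata template checking.
// A template maps each declared key to its set of permitted values;
// an empty set means any value is acceptable for that key.
public class MetadataTemplate {
    private final Map<String, Set<String>> validValues = new HashMap<>();

    // Declare a key; pass allowed values, or none to accept any value.
    public void defineKey(String key, String... allowed) {
        validValues.put(key, new HashSet<>(Arrays.asList(allowed)));
    }

    // A record is coherent only if every key is declared and every
    // value is permitted for its key.
    public boolean validate(Map<String, String> record) {
        for (Map.Entry<String, String> e : record.entrySet()) {
            Set<String> allowed = validValues.get(e.getKey());
            if (allowed == null) return false;                // undeclared key
            if (!allowed.isEmpty() && !allowed.contains(e.getValue()))
                return false;                                 // value not permitted
        }
        return true;
    }
}
```

Recursive application to sub-directories would then amount to applying the template of the library root to every record stored below it.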

Figure 4. PGL interface to image files stored in SRB. Metadata and image file previews are also available.

As can be seen in Figure 3, Kepler plays a key mediating role in the workflow and data management. For use cases of this instrument monitoring system outside the domain of X-ray crystallography, the data management requirements will most likely be very different. Data management customisation for new applications would normally require a significant amount of new code; however, by exploiting Kepler workflows the development effort can be dramatically reduced. Using Kepler, a customised data manager workflow can be configured in days rather than weeks or months. Workflows in Kepler can be exported as XML, which can then be deployed to other instances of Kepler at different sites.


Kepler also facilitates the capture and storage of metadata in the MCAT. Currently Kepler does not support Web services, so an intermediate Web service container (Apache Axis) has been used. This intermediary hosts a deployed Web service that acts as a sink for data. As parcels arrive at the sink they are packaged into a Java bean and forwarded to the Kepler service.
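The sink's role can be sketched as follows. All names here are illustrative, and in the real system the service is deployed inside Apache Axis and invoked via SOAP rather than called directly.

```java
import java.util.*;

// Illustrative sketch of the intermediate data sink: parcels arriving
// from the CIMA source are packaged into a bean and queued for
// forwarding to the Kepler service.
public class ParcelSink {

    // Plain Java bean carrying a parcel's payload and its arrival time.
    public static class ParcelBean {
        private final String xmlPayload;
        private final long receivedAt;
        public ParcelBean(String xmlPayload, long receivedAt) {
            this.xmlPayload = xmlPayload;
            this.receivedAt = receivedAt;
        }
        public String getXmlPayload() { return xmlPayload; }
        public long getReceivedAt()   { return receivedAt; }
    }

    // Stand-in for the forwarding channel to Kepler.
    private final List<ParcelBean> forwarded = new ArrayList<>();

    // Entry point called (via Axis, in the real system) whenever the
    // CIMA source delivers a parcel.
    public void receive(String xmlParcel) {
        forwarded.add(new ParcelBean(xmlParcel, System.currentTimeMillis()));
    }

    public List<ParcelBean> forwardedBeans() { return forwarded; }
}
```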

3. Remote Instrument Control. Providing a remote user with the capability to operate expensive and possibly dangerous equipment presents significant risks and challenges. In this section we describe the utilisation of CIMA as a core component in a comprehensive portal system for remote control of an X-ray diffractometer. The implementation of CIMA in the system required further adaptations and extensions to the CIMA framework. The portal system includes the following features:

• Instrument Instruction Module plug-in for CIMA
• Portlet for user input of parameters defining a data collection
• Pushed instrument data for rapid updating of portlet content to support safe operation
• A Data Cache for browsing, examination and analysis of data
• A virtual instrument portlet to simulate the instrument undertaking a data collection and to provide a constantly updated representation of the current state of the instrument

These features are described in more detail below. A significant disincentive to developing a 'custom-built' software system to enable access to a remote scientific instrument is the high coding overhead, which may well reproduce functionality already provided by an instrument manufacturer. Accordingly, a commonly adopted means of providing access to remote resources is to employ remote desktop tools such as VNC[6] (and its many variants), CITRIX[7], Tarantella[8] and NX[9]. The remote desktop solution is easy to implement, and has the further advantage of providing the client user with a familiar working environment. However, while convenient, these approaches are not ideal and can afford remote instrument users excessive control over expensive and potentially dangerous instruments. A significant advantage of the custom-built approach is that the actions of the remote instrument user can be tightly controlled, while at the same time services outside the desktop environment can be provided to offer a richer operating environment.

Portal containers provide an attractive environment for developing and hosting feature-rich, configurable, scalable and extensible remote resource access systems. Portal technologies facilitate maintenance and portlet re-use, comfortably accommodate SOA methods and modules, and have recognised standards and an open source development community. Accordingly, we are developing a CIMA based GridSphere (JSR 168) portal/portlet system for remote instrument control, monitoring and data processing. In effect, CIMA service end-points underpin the portal based system. Thus far, CIMA software has been developed solely for remote instrument and sensor monitoring, with no provision for remote instrument control and experiment steering. Our adaptation and extension of CIMA to provide services supporting instrument control has been undertaken in accord with the CIMA channels and plug-in model.[2]

Figure 5. CIMA as a component in a modular system for remote instrument control.

A modular SOA model using Web services has been adopted for the portal system. The architecture employs two primary containers (A and B; see Fig. 5) to provide implementation and application flexibility. Container B is located at the instrument site to provide instrument services, while container A provides user access interface components (portlets) that need not be co-located with the instrument. As an illustrative use case scenario, several institutions under a common project may want to provide remote access to a shared instrument or set of instruments. Accordingly, a user interface container may be beneficially located at each of the user institutions. Alternatively, container A could be shared between multiple institutions to provide remote access to multiple shared instruments (not necessarily located at one site). The second model may be desirable for large facilities such as a synchrotron, for which a single container B could serve multiple instruments or beamlines. The architecture is flexible and other combinations are possible. As indicated in Figure 5, CIMA service endpoints are integrated into Web services containers (Axis) located in the primary containers (JSP; Tomcat). Container A hosts a portlet container providing the user interface components, a Data Cache, a Web services container to receive SOAP calls from container B, and a CIMA component used to send XML parcels via SOAP to container B. Complementing container A, container B contains elements that interact directly with the remote instrument or instruments. Within container B there is a Web services container that relays SOAP messages sent from the CIMA component in container A (sink) to a CIMA component within container B (source). The CIMA source serves as an interface both to the instrumentation and to the data manager. The CIMA source also sends XML parcels, within SOAP messages, to the CIMA sink of container A via the Web services container in A. There are two types of communication that can occur between the two primary containers. The first is a request-response process in which a request is sent from container A to container B, and for which a response is expected. The response may simply be a message acknowledgement, or it may reflect the result of a status-changing command sent to the instrument. The CIMA component in container B extracts data from the XML parcels in the SOAP message, and an Instrument Instruction Module plugin then builds instrument-specific commands to be sent to the instrument. Consequential responses from the instrument are encoded by the CIMA component into an XML parcel contained in a SOAP message, which is then sent to container A. The requests themselves fall into two categories: requests that affect the instrument state (SET requests, e.g. actuator commands) and requests sent to retrieve instrument information (GET requests).
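The SET/GET distinction can be sketched with a small translation step of the kind an Instrument Instruction Module performs. The class, target names, command strings and instrument dialect below are invented for illustration; they are not the actual CIMA API or any vendor's command set.

```java
import java.util.*;

// Hypothetical sketch of request translation: generic SET/GET requests
// are mapped onto commands for an imaginary diffractometer dialect.
public class InstructionModule {

    public enum RequestType { GET, SET }

    // Illustrative translation tables (invented command strings).
    private static final Map<String, String> SET_COMMANDS = Map.of(
        "generatorVoltage", "SETV",
        "detectorDistance", "MOVD");
    private static final Map<String, String> GET_COMMANDS = Map.of(
        "generatorVoltage", "RDV?",
        "ccdTemperature",   "RDT?");

    // Build the wire command for a request; SET requests carry a value,
    // GET requests do not.
    public String translate(RequestType type, String target, String value) {
        if (type == RequestType.SET) {
            String cmd = SET_COMMANDS.get(target);
            if (cmd == null) throw new IllegalArgumentException("no SET: " + target);
            return cmd + " " + value;
        }
        String cmd = GET_COMMANDS.get(target);
        if (cmd == null) throw new IllegalArgumentException("no GET: " + target);
        return cmd;
    }
}
```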
The Instrument Instruction Module has been introduced as a CIMA plugin to provide an instruction interface to proprietary instrument command and control software (or device drivers). The instructions may serve to get instrument status information or to change the state of the instrument (control the instrument). The module effectively translates CIMA parcels into instrument-specific instructions to be sent to the instrument interface. Work is currently underway to generalise this initiative.

The second type of inter-container interaction occurs when an instrument pushes information. The push may be the result of a previous asynchronous command, of state changes in the instrument, or because the instrument otherwise sends data on a regular basis. In this case the CIMA source again sends an XML parcel via SOAP to container A, where the CIMA sink updates the portlet content and transfers relevant data into a temporary store called the Data Cache. The Data Cache contains information about the instrument, and is populated when data pushed from the instrument arrives and when a request-initiated response containing instrument information is received. The Data Cache is used to avoid SOAP calls when a GET request is issued and the desired data is already in the cache. The cache has the same lifecycle as container A, and its use may be instrument-domain dependent.

In the present case of X-ray diffraction instrumentation, the raw measurement data is provided as CCD-derived images or 'frames' that are typically encoded in a proprietary compression format. The frames are sent (pushed) from the instrument into the CIMA source of container B, where they are parcelled for transfer out to a data manager and into a storage resource. Container B also provides a service to convert the proprietary CCD data image files into smaller JPEG images for rapid transfer to container A, for portlet display and storage in the Data Cache. Retaining the JPEG images in the Data Cache facilitates image browsing and inspection as the data collection proceeds. The client user may then assess whether a data collection should indeed proceed. A crucial requirement for the safe and efficient control of a remote instrument is the delivery of real-time status information to the client portal. Meeting this requirement is an essential prerequisite for the effective monitoring and control of operations and events that critically depend upon user response times. Ideally the portlets would be updated immediately upon data being pushed from the instrument.
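The Data Cache's role of short-circuiting GET requests can be sketched as follows. This is a minimal illustration with hypothetical class and method names; in the real system the cache lives inside container A and the fallback is a SOAP call to container B.

```java
import java.util.*;
import java.util.function.Supplier;

// Illustrative sketch of the Data Cache: GET requests are answered from
// the cache when possible, avoiding a SOAP round-trip to container B.
public class DataCache {
    private final Map<String, String> cache = new HashMap<>();
    private int remoteCalls = 0;

    // Pushed data and request-initiated responses both populate the cache.
    public void put(String key, String value) { cache.put(key, value); }

    // Answer from the cache if present; otherwise fall back to the
    // (stand-in) remote fetch and remember the result.
    public String get(String key, Supplier<String> remoteFetch) {
        String v = cache.get(key);
        if (v != null) return v;
        remoteCalls++;
        v = remoteFetch.get();
        cache.put(key, v);
        return v;
    }

    public int remoteCallCount() { return remoteCalls; }
}
```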
Accordingly, AJAX[10] and Pushlets[11] technologies are being employed to address the challenge of providing (pseudo) real-time updating of instrument status portlet content. The instrument access interface currently has an instrument control pane (Fig. 6) that is augmented with a pane for instrument status monitoring (e.g. generator voltage and current, cooling water temperature, CCD camera temperature, etc.). The control pane provides for user input of data collection definition parameters and a means of initiating and aborting a data collection. The current data frame is displayed in the portlet, together with web cam views of the instrument and the crystal sample. Additional panes are under development, including data examination, analysis and processing panes. We envisage that portlet panes will ultimately be fully customisable, so that users can arrange functionality according to their own preferences and requirements. This would also enable configuration according to the user's access privileges (role).
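The push-style updating behind this design can be reduced to a minimal observer sketch. This illustrates only the pattern; the actual portal uses AJAX and Pushlets, whose APIs are not reproduced here, and all names below are hypothetical.

```java
import java.util.*;
import java.util.function.Consumer;

// Minimal observer sketch of push-style status updates: the CIMA sink
// notifies every registered view as soon as a status parcel arrives,
// rather than waiting for views to poll.
public class StatusPusher {
    private final List<Consumer<Map<String, String>>> views = new ArrayList<>();

    // A portlet pane registers a callback to receive status updates.
    public void register(Consumer<Map<String, String>> view) { views.add(view); }

    // Called when a status parcel arrives from the instrument.
    public void push(Map<String, String> status) {
        for (Consumer<Map<String, String>> v : views) v.accept(status);
    }
}
```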

Figure 6. GridSphere remote control portlet.

A further portlet capability under investigation provides a virtual representation of the instrument (Fig. 7) that can be rapidly updated from a small number of parameters pushed from the instrument. The virtual instrument thus provides a (pseudo) real-time means of monitoring the instrument during a data collection. The virtual instrument representation augments streaming video and provides a solution to the ‘dark laboratory’ problem.

Figure 7. Portlet based diffractometer instrument simulator.

Importantly, the virtual instrument also allows the remote user to simulate a data collection strategy, and thereby visually assess its viability and suitability before the collection is actually undertaken. Similarly, the instrument simulator may serve as a tool for training novice users. The simulator provides a valuable portal capability for instrument monitoring, and for facilitating the safe operation of the remote instrument. We are also developing a stand-alone version of the virtual instrument. Our implementation of an instrument simulator builds upon and extends the DS diffractometer simulation software package developed by Zheng, M. Yao and I. Tanaka.[12] In adapting the DS package we have replaced the use of OpenGL with an exploration of the potential benefits of X3D and XJ3D.[13] The X3D virtual reality system generates XML files that are externalised from the source and are easily edited. Importantly, X3D can be externally scripted via JavaScript, which allows the direct inclusion of X3D based virtual reality models within XHTML pages. As part of the evolution of the X3D specification, a picking extension is being introduced through XJ3D. In the context of this work, the new feature has the attraction of offering a virtual instrument collision detection mechanism. Although X3D is in its infancy, X3D models can be examined by an increasing number of renderers and web-browser plugins.

4. Platforms for Collaboration. While multiple users may join a remote access session through a browser, only one user can control the instrument at a given time, though control may be transferred to another user. The system thus provides for collaborators to monitor an experiment and collectively determine the best approach to data collection and processing.
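The one-controller-at-a-time rule can be sketched as a simple control token. This is a hypothetical illustration of the policy, not the system's actual implementation.

```java
import java.util.*;

// Sketch of the one-controller-at-a-time rule: many users may observe
// a session, but only the current token holder may issue control
// commands, and the holder may hand the token to another participant.
public class ControlToken {
    private String holder;                       // null until first claim
    private final Set<String> participants = new HashSet<>();

    public void join(String user) { participants.add(user); }

    // Claim control; succeeds only if no one currently holds the token.
    public synchronized boolean claim(String user) {
        if (holder == null && participants.contains(user)) { holder = user; return true; }
        return false;
    }

    // Hand control to another participant; only the holder may do so.
    public synchronized boolean transfer(String from, String to) {
        if (from.equals(holder) && participants.contains(to)) { holder = to; return true; }
        return false;
    }

    public boolean mayControl(String user) { return user.equals(holder); }
}
```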

In order to extend our collaborative capabilities, we are currently developing a Java-based application that allows multiple users in different locations to collaboratively select and inspect images, such as X-ray diffraction images, in such a way that each user's image and view can be seen by the other participants (see Figure 8). The images may be viewed, contrasted and manipulated by the collaborators, either in a 'side by side' tiled multiple-image display or in separate tabbed panes within the viewing frame. Current functionality allows a user to select an image, with a small preview of the selected image appearing under the directory list, adjacent to a larger display of the selected image. A user may zoom in on features of particular interest in the larger display, while the smaller reference image remains unchanged. In Figure 8, two users are contrasting two images in a tiled display; each user has zoomed in on a particular region of interest. Currently this embryonic tool is intended for use as a stand-alone application over a VPN, and we are exploring the possibilities for incorporating its functionality into a portlet. The viewer has an image zoom function, and future work will incorporate annotation and analysis facilities.

Figure 8. A platform for collaboration in which geographically distributed users may independently select and manipulate images.

5. Conclusion. As part of an Australian e-Research program, the capability and flexibility of the CIMA model is being enhanced and extended through its incorporation into a comprehensive remote instrument control and monitoring system. Data flow is directed into user-friendly Grid storage infrastructure (PGL and SRB) via the Kepler workflow tool. The system provides for collaborative remote control and monitoring of instruments, and support for collaborative interactions is being enhanced. The introduction of a virtual instrument brings the advantages of data collection simulation, dark laboratory instrument monitoring and a safe user training device. Future work will introduce a Grid-enabled remote data processing and analysis capability, and will explore options for verbal communication and secure user access to remote resources. Our goal is to develop a comprehensive virtual research environment for molecular and materials structure determination and analysis.

6. Acknowledgements. The support of the Australian Department of Education, Science and Training (DEST), for the DART project (www.dart.edu.au), and the Australian Research Council, GrangeNet and the ARC Molecular and Materials Structure Network (mmsn.net.au) is gratefully acknowledged. CIMA is supported by National Science Foundation cooperative agreements and grants SCI 0330568 and MRI CDA-0116050.

7. References.
[1] Storage Resource Broker: www.sdsc.edu/srb
[2] T. Devadithya, K. Chiu, K. Huffman, D.F. McMullen, "The Common Instrument Middleware Architecture: Overview of Goals and Implementation", in Proceedings of the First IEEE International Conference on e-Science and Grid Computing (e-Science 2005), Melbourne, Australia, Dec. 5-8, 2005.
[3] Donald F. McMullen, Ian M. Atkinson, Kenneth Chiu, Romain Quilici, Peter Turner, Mathew Wyatt, "Toward Standards for Integration of Instruments into Grid Computing Environments", submitted to Proceedings of IEEE e-Science 2006.
[4] Kepler: www.keplerproject.org
[5] Personal Grid Library Project, VeRG Lab, JCU: https://plone.jcu.edu.au/hpc/staff/projects/hpcsoftware/personal-grid-library
[6] Virtual Network Computing (VNC): www.realvnc.com (also TightVNC, RealVNC, UltraVNC and TridiaVNC)
[7] CITRIX: www.citrix.com
[8] Tarantella, now renamed Sun Secure Global Desktop: www.sun.com/software/products/sgd
[9] NoMachine (NX): www.nomachine.com
[10] Asynchronous JavaScript Technology and XML (AJAX): java.sun.com/developer/technicalArticles/J2EE/AJAX/
[11] Pushlets: www.pushlets.com/site-main.html
[12] Zheng, M. Yao and I. Tanaka, "DS - a 3D graphics simulation program for single-crystal diffractometer", J. Appl. Cryst. (1995), 28, 225-227.
[13] X3D: www.web3d.org/about/overview/
