A sensory scientific approach to visual pattern recognition of complex biological systems


Food Quality and Preference 21 (2010) 977–986

Contents lists available at ScienceDirect

Food Quality and Preference journal homepage: www.elsevier.com/locate/foodqual

Magni Martens a,b,*, Siren R. Veflingstad d, Erik Plahte c, Dominique Bertrand e, Harald Martens a,b,c

a Nofima Mat, Osloveien 1, N-1430 Ås, Norway
b University of Copenhagen, Faculty of Life Sciences, Rolighedsvej 30, DK-1958 Frederiksberg C, Denmark
c Department of Mathematical Sciences and Technology, Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences, N-1432 Ås, Norway
d Systems Biology Centre, University of Warwick, Coventry CV4 7AL, United Kingdom
e Institut National de la Recherche Agronomique, Equipe Bioinformatique, Rue de la Géraudière – BP 71627, 44316 Nantes Cedex 3, France

Article info

Article history:
Received 6 October 2009
Received in revised form 28 April 2010
Accepted 29 April 2010
Available online 7 May 2010

Keywords:
Sensory descriptive analysis
Computer-derived image analysis
Mathematical modelling of biological systems
Partial Least Squares Regression

Abstract

A sensory scientific approach for exploring and interpreting image patterns is presented. It is used for analysis of the behaviour of a complex mathematical model, in this case representing two-dimensional pattern-generating protein signalling during cell differentiation. The approach consists of several consecutive research steps, each including statistical planning, image production, image profiling and multivariate data analysis. Initially, a large number of images were produced and profiled by automatic but non-selective computerised image analysis profiling. Then the most interesting images were analysed by descriptive sensory profiling, in two consecutive, increasingly focused experiments. Partial Least Squares Regression models were applied, on one hand, to predict the sensory profile from automatic image analysis, and, on the other hand, to relate the sensory profile to the mathematical model parameters. Previously unknown pattern types for this biological system were thus revealed. Finally, a preliminary sensory morphological wheel was proposed.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

1.1. The human visual system in action

Human visual perception and language capabilities provide an amazingly efficient measuring system for complex samples, as demonstrated by the extensive use of visual terms in descriptive sensory analysis based on trained assessor panels. Scientists in their daily work, both in academia and in industrial R&D, rely on their own eyes for qualitative and quantitative evaluations, although often informally, subjectively and without recognising it. This paper demonstrates how sensory science can turn human visual sensory perception into relevant and reliable profiling of scientific systems that are too complex for traditional theoretical analysis. It shows how visual information can be acquired, systemised and made operational as efficient tools in scientific projects to model and understand spatially complex biological systems.

Recent knowledge about visual attention from a physiological and psychological point of view is documented in Bundesen and Habekost (2008). What we select to perceive visually is an interaction between the environment and ourselves. The human senses are constantly in action, not just being passive receivers (Gibson, 1979; Harper, 1972; Martens & Tschudi, submitted for publication). A clear interest in sensory methods for visual pattern recognition was evident at the 8th Pangborn Sensory Science Symposium in Italy 2009 (www.pangborn2009.com).

The present study is based on descriptive sensory evaluation of spatial organisation (textures and patterns) of complex samples, from images printed on paper. Previous examples of this are known in food research, e.g., assessing electron microscopy images of whey protein gels (Langton & Hermansson, 1996), and studying structural heterogeneity of potatoes from fMRI images (Martens et al., 2002). In the present paper, the application comes from systems biology. The word "texture" is used here as in the domain of image analysis, and is also deeply inspired by Gibson's use of "texture" in a visual perceptual context (Gibson, 1950, 1979).

The purpose of this paper is to outline a sensory approach for revealing, systemising and interpreting image patterns from biological systems, one that is reliable and valid in communication across various scientific fields. It describes the sensory part of a study of a complex pattern-generating process (Martens et al., 2009). The challenge was to measure 'what is going on' in biological cells, in a way that can be translated into qualitative understanding and quantitative prediction.

* Corresponding author at: Nofima Mat, Osloveien 1, N-1430 Ås, Norway. Tel.: +47 48134856; fax: +47 64970333. E-mail address: magni.martens@nofima.no (M. Martens).
0950-3293/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.foodqual.2010.04.013


The choice of reference images, terminology and scale definition in visual assessment will invariably depend on the system to be analysed. But generic reference image collections do exist; for instance, Brodatz (1966) published a photographic album of textures for artists and designers. Here we outline, among other things, a preliminary version of a generic way to structure the sensory profile terminology for the analysis of texture and patterns in images of biological structures: a visual sensory vocabulary structured into a sensory ‘morphological wheel’. This addresses other biological research on shape, size and geometrical forms within the science of morphology and will be outlined in Section 3.3.

1.2. The system studied: a mathematical model from systems biology

The sample images of the biological system to be studied here are not from the system itself, but from a mathematical model intended to represent a simplified version of current knowledge about the system. This model consists of a large set of coupled non-linear dynamic differential equations describing a certain cell differentiation process in a two-dimensional lattice of cells. This mechanistic model has the capacity to generate spatial patterns of various kinds. Unexpected, complex large-scale patterns are very difficult to reveal and study by traditional mathematical model analysis if their character is a priori unknown. For example, in the present model, Collier, Monk, Maini, and Lewis (1996) primarily described local three-periodic regularities, not the larger-scale patterns emerging when running the two-dimensional multi-cellular model over time till steady state.

How can a butterfly develop its beautiful patterns? How can the organs and limbs of a body become so different, when they all started from the same shared DNA information in the fertilised egg cell? Cell differentiation is still only partly understood in biology.
The mathematical model studied here describes how thousands of cells in a two-dimensional lattice of cells develop and change over time. It represents a simple example of how different signalling proteins interact within and between cells, as a step in the fascinating process leading from a single cell in an early embryo to a fully developed adult organism. The model is oversimplified, involving only two signalling proteins and cells in a very regular two-dimensional lattice. But it still has sufficient dimensionality, non-linearity and positive feedback to make theoretical prediction of model behaviour from known modelling conditions very difficult. The sample images to be studied here are not of the biological system itself, but images of spatial patterns generated by the model. Our motivation was to develop sensory science as a generic tool for empirical studies of the behavioural repertoire of overwhelmingly complex mathematical pattern-generating models.

The model is a description of so-called lateral inhibition mediated by Delta–Notch signalling (Collier et al., 1996). This system was chosen because it has been explored by both biologists and mathematicians and has relevance to sensory psychophysics (Veflingstad, 2006). Delta (D) and Notch (N) are both trans-membrane proteins that interact only between cells in direct physical contact. D is a ligand that binds to and activates N in neighbouring cells, while N inhibits the activity of D within the same cell. To illustrate how interactions between neighbouring cells cause lateral inhibition, consider a two-cell system (Fig. 1A). When N is activated in cell 1, the production of D is suppressed in the same cell. Then N is suppressed in cell 2, which in turn relieves the inhibition of D, thereby increasing its activity, in this cell. Overall, this results in an increased activation of N in cell 1, which in turn strengthens the inhibition of D, and vice versa in cell 2.
In other words, there is a positive feedback loop between pairs of neighbouring cells, driving them towards opposite fates: a cell that produces more ligand forces its neighbours to produce less. In a two-dimensional configuration of square cells, this leads to the well-known, regular checker-board pattern of two cell states: high D/low N and low D/high N. However, in the more biologically relevant packing of hexagonal cells in, e.g., a 50 × 50 theoretical cell lattice, this leads to patterns that in most cases are highly irregular, with many different protein levels and intricate macroscopic patterns. High N in cell 1 leads to low N in the neighbouring cells 2 and 3, and this causes frustration if cell 2 and cell 3 are neighbours.

The mathematical differential equation model for how protein activities of D and N develop over time in each cell is controlled by five model parameters (hD, hN, pD, pN, and l), see Fig. 1B, through sigmoidal stimulus–response curves (S) as shown in Fig. 1C. The thresholds hD and hN define the activity levels at which the two stimulus–response functions SD and SN reach their half-maximum, i.e., the levels at which the response is most sensitive to changes in the stimulus. The steepnesses pD and pN determine how sensitive the response is near the threshold, i.e., how steep the response curves are. The final parameter l is simply the ratio between the decay rates for D and N.

However, the initial state of the lattice, i.e., the state from which the pattern evolves, will usually also affect the overall patterning process, and thus needs to be specified. The chosen initial conditions should mimic the initial properties of the biological lattice, in which all cells have almost equal levels, but are most likely not exactly identical because of small fluctuations within each cell. By a theoretical analysis it has been shown that the model in Fig. 1B has a homogeneous steady state N* and D*, i.e., a state in which all cells are equal and in which the activity levels are not changing (Fig. 1D). However, in many cases this state is unstable, implying that any small random changes in the protein activity of the cells will cause the system to move away from the homogeneous state, and eventually approach a patterned state, as visualised in Fig. 1A. Thus, a state in which all cells are slightly perturbed from the homogeneous steady state in a random fashion seems an appropriate initial state. Here these initial perturbations are represented by two perturbation parameters: general perturbation size (g, the percentage of the homogeneous steady-state level) and perturbation direction (s, up (+) or down (−) relative to the homogeneous steady-state level), giving a total of seven parameters (Fig. 1D).

In our analysis, the input is the chosen values for the model parameters and random initial conditions, while the output consists of images of the pattern of cell types evident after the simulated differentiation process has reached a steady state. One challenge is to discover, quantify and distinguish patterns that may arise in the output images under different conditions. Another challenge is to predict output patterns resulting from chosen input parameter values.
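The two-cell feedback loop and the role of the threshold and steepness parameters can be illustrated with a minimal sketch. The exact equations of Fig. 1B are not reproduced in the text, so the form used below is an assumption: a Collier-type system in which Notch is activated by the neighbour's Delta through an increasing Hill function, and Delta is repressed by the cell's own Notch through a decreasing Hill function. The function names, the parameter values (thresholds 0.1, steepness 2, decay ratio 1), the starting levels and the Euler step are all illustrative choices, not values from the study.

```python
def hill(x, h, p):
    """Increasing sigmoid: half-maximum at x = h, steepness set by p."""
    return x**p / (h**p + x**p)


def simulate_two_cells(h_n=0.1, h_d=0.1, p_n=2.0, p_d=2.0, decay_ratio=1.0,
                       eps=0.01, dt=0.01, steps=10000):
    """Euler integration of an ASSUMED two-cell lateral-inhibition model.

    dN_i/dt = hill(D_neighbour) - N_i                 (activation by neighbour's Delta)
    dD_i/dt = decay_ratio * ((1 - hill(N_i)) - D_i)   (repression by own Notch)
    """
    # Both cells start near the same level; cell 0 is nudged up and
    # cell 1 nudged down in Notch, mimicking a small initial fluctuation.
    n = [0.35 + eps, 0.35 - eps]
    d = [0.07, 0.07]
    for _ in range(steps):
        dn = [hill(d[1 - i], h_n, p_n) - n[i] for i in range(2)]
        dd = [decay_ratio * ((1.0 - hill(n[i], h_d, p_d)) - d[i])
              for i in range(2)]
        n = [n[i] + dt * dn[i] for i in range(2)]
        d = [d[i] + dt * dd[i] for i in range(2)]
    return n, d


# At the threshold the response is exactly half-maximal, whatever the steepness.
half_max_p2 = hill(0.3, 0.3, 2)
half_max_p5 = hill(0.3, 0.3, 5)

# The small initial difference is amplified into opposite cell fates.
notch, delta = simulate_two_cells()
```

With these illustrative settings the cell that starts with slightly more Notch ends with high N and low D, and its neighbour with low N and high D, which is the two-cell version of the checker-board outcome described above.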

Fig. 1. Introduction to the biological system of Delta–Notch regulation of cell differentiation and its mathematical description. (A) From initially equal Delta and Notch levels, neighbouring cells are driven towards different fates, due to a positive feedback loop between them (see text for further description). (B) The mathematical model consists of a pair of differential equations, one each for Notch and Delta, respectively, for each of the thousands of cells in a two-dimensional lattice. Delta and Notch affect each other's rates via sigmoidal stimulus–response curves. (C) Examples of sigmoidal response curves (represented by the Hill function defined in (B)). Parameters are h = 0.3, p = 2 and h = 0.3, p = 5. Neighbouring cells are connected through the local spatial averaging of Delta. (D) The system has a total of seven parameters.

2. Methods

2.1. Overview of methods

By theoretical analysis of the ‘hard’ mathematical model, we struggled to describe ‘what is going on’ during cell differentiation, due to high complexity and a very high number of mathematical equations. Instead, using our senses actively when seeing patterns, we developed an approach to achieve a more informative analysis, in successive steps (carried out in the period 2005–2008). In each step, the data were interpreted by multivariate analysis, using cross-validated PLS regression (Martens & Martens, 2001; Wold, Martens, & Wold, 1983).

First, computer simulation studies of the mathematical model in a theoretical two-dimensional lattice with 50 × 50 cells were carried out. Initially the parameter values were chosen by trial-and-error, with visual inspection of the resulting steady-state images, and then according to a full factorial experimental design for the five model parameters and the two perturbation parameters (s and g). The steps of image generation and image characterisation are now explained in more detail.

2.1.1. Image generation

For each of the samples in the experimental design, the Notch- and Delta-cell activities in a two-dimensional 50 × 50 or 51 × 51 cell lattice were obtained by integrating the differential equation system till steady state. By preliminary, informal inspection, the images of the Notch activity were deemed to be more informative than the Delta images, and were therefore used in the subsequent analyses. The Notch level was defined on a black/white scale to render a two-dimensional image of each solution. The shade of each cell was based on the final level of N: a white cell corresponds to a cell with a value of N close to zero while a black cell has a value of N close to 1 (see, e.g., Fig. 2). Each such steady-state output was converted to a .pdf image file for quantitative profiling by computer image analysis and sensory analysis.

2.1.2. Image characterisation

The steady-state images were characterised by automated computer-based image analysis, and the resulting profiles were related to the model parameters such that a subset of particularly informative images could be chosen, using a modified fractional factorial design that spanned the relevant space (Martens et al., 2009). The selected solution patterns/images were then described in

two ways: (1) using a more detailed automated computer-derived image analysis (see Section 2.2) and (2) using sensory descriptive analysis (see Section 2.3). Finally, some surprising discoveries from the sensory analysis were pursued in more detail by a new sensory evaluation, using a lower-dimensional experimental design and a 51 × 51 cell lattice.
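As a rough illustration of the two bookkeeping steps in Section 2.1, the sketch below enumerates a full factorial design over the seven parameters and maps a steady-state Notch level to a printable grey shade. The number of levels per parameter and every level value are hypothetical placeholders (the text does not list them), and the 8-bit grey scale is an assumption about how "white for N close to zero, black for N close to 1" could be encoded.

```python
from itertools import product


def full_factorial(levels):
    """Return one dict per run, covering every combination of the given levels."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


def notch_to_grey(n_level):
    """Map Notch activity in [0, 1] to 8-bit grey: 0 -> white (255), 1 -> black (0)."""
    clipped = min(max(n_level, 0.0), 1.0)
    return round(255 * (1.0 - clipped))


# Hypothetical two-level settings for the five model parameters and the
# two perturbation parameters; the values are placeholders only.
hypothetical_levels = {
    "hD": [0.3, 0.7], "hN": [0.3, 0.7],
    "pD": [2, 10], "pN": [2, 10],
    "l": [0.5, 2.0],
    "g": [1, 10],      # perturbation size, per cent of steady-state level
    "s": [+1, -1],     # perturbation direction, up or down
}
design = full_factorial(hypothetical_levels)
```

With two levels for each of seven parameters the design has 2^7 = 128 runs, which shows why a reduced, fractional design becomes attractive once each image also has to be profiled by a sensory panel.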

2.2. Computer-derived image analysis

The .pdf files were submitted to automated computer-derived image analysis profiling, based on a range of standard image analyses (intensity grey-scale summaries, spatial auto-correlation summaries, spatial clustering summaries, etc.). An overview of the most important descriptors and their definitions is given in Table 1. For the present purpose, the two most important groups of descriptors (Martens et al., 2009) refer to grey-scale intensity statistics and spatial auto-correlation statistics. The former simply refers to the distribution of the blackness of the cells (blackness representing activity level of Notch), while the latter contains information on periodicity in the pattern, and is similar to co-occurrence matrices (e.g., Haralick, Shanmugam, & Dinstein, 1973). As the model was known to exhibit a (local) three-periodic pattern, a descriptor describing the deviation from this pattern was included. Various more advanced cluster-analysis-based summary characterisations were also used (see Table 1 for a brief explanation). The image analysis profiles were related to the known mathematical model parameters and initial conditions. By multivariate data modelling good predictive relationships were found, and


Fig. 2. Illustration of some of the sensory descriptors used as reference samples (see Table 2). (a) TwoHeadedness; (b) Curls; (c) Whiteness; (d) PatternBlack; (e) ThicknessCurls; and (f) MultiShade.

Table 1. Overview and description of extracted features. See text for details (Section 2.2).

| Features | Definition | Interpretation |
|---|---|---|
| Grey-scale | | |
| DMean, DStd, DMax, DMin | Mean, standard deviation, maximum and minimum over all cells | Final value of D |
| NMean, NStd, NMax, NMin | Mean, standard deviation, maximum and minimum over all cells | Final value of N |
| DMinusNMean, DMinusNStd, DMinusNMax, DMinusNMin | Mean, standard deviation, maximum and minimum over all cells | Difference between final levels of D and N in a given cell |
| Auto-correlation | | |
| ALMean, ALp3corr | Mean and closeness to the theoretical expected spatial pattern | L denotes local 4 × 4 matrix for max. three cell shifts |
| AGMean, AGp3corr | Mean and closeness to the theoretical expected spatial pattern | G denotes global 26 × 26 matrix for 25 cell shifts |
| ADMean, ADp3corr | Mean and closeness to the theoretical expected spatial pattern | D denotes distant 11 × 11 matrix for 15–25 cell shifts |
| C3Mean, . . ., BErod | A series of cluster types | Different local patterns |

some parameter combinations proved to be more important than others in predicting the output patterns (not shown here).

2.3. Sensory descriptive analysis

An alternative approach is to take advantage of the human ability to detect and interpret patterns. The human visual system is a

powerful system for parsing and assessing visual stimuli due to its highly parallel architecture and the brain's apparent parallel processing.

In this study, a sensory descriptive analysis (see, e.g., Meilgaard, Civille, & Carr, 1999) was performed on two different data sets (referred to by the year of analysis, 2006 and 2007, respectively). Overall, the analysis of the two data sets followed the same procedure. The analysis was carried out by the sensory panel at Nofima Mat, consisting of 11 trained assessors, with the same judges in 2006 and 2007. The sensory laboratory was designed according to the guidelines in the ISO standard (1988), i.e., with individual booths, a controlled ventilation system, etc. Light intensity at the position of the samples during evaluation was approximately 900 lux. Before and after the sensory profiling, discussions took place in a round-table room designed for focus group discussions.

The .pdf files of each of the images were printed in 11 copies, one set for each assessor, on an HP LaserJet 4650 in a random order (i.e., not in the order of the experimental design). In 2006 the solutions were smoothed spatially prior to printing, by averaging each cell with its six nearest neighbours, in an attempt to reduce the mental load for the assessors. This was considered unnecessary for the 2007 data.

The vocabulary for the descriptive analysis was then developed in three steps:

(i) The images were presented to a group of four scientists within the fields of mathematics, pattern formation modelling, biology and ecology, i.e., non-sensory persons, for a preliminary brainstorming session. In a round-table discussion they came up with approximately 60 terms describing the differences between the different images. This step was dropped in the analysis of the 2007 data.

(ii) The images were then presented to the trained sensory panel, i.e., sensory persons with no professional biological or mathematical knowledge. Each assessor individually wrote down terms that characterised the images.

(iii) In a round-table discussion among the trained assessors, with the different images spread out on the floor, the terms suggested in steps (i) and (ii) were discussed, and the panel agreed upon a consensus list of 14 descriptors. Their definitions and interpretations are given in Table 2.

In order to train and calibrate the sensory panel on the use of the descriptors, a pre-test was carried out on six images. These images were selected to span the variation and were denoted reference samples (Fig. 2). The values of the descriptors were evaluated by using a continuous unstructured scale ranging from the lowest intensity on the left side of the scale to the highest intensity on the right side (translated to values from 1.0 to 9.0, respectively, for the data handling).

The main sensory analyses were carried out during 2 days (four sessions of 1 h each day), according to the following procedure: a short calibration of the reference samples from the pre-test was followed by evaluation of the selected images in the sensory laboratory. Except for a few replacements (as described below), the same descriptors and scale as determined in the pre-test were used both years. The images were coded with a random three-digit number and presented in a randomised order to each assessor. In 2006, two of the images were repeated in three replicates, and in 2007 all the samples were replicated to check assessor performance. Each assessor evaluated the images at an individual speed on the computer system Compusense® version 4.6 (www.compusense.com) for direct recording of data.

In summary, the first study (2006) analysed 64 images selected using results from the data modelling of the computerised image analyses.
Only 32 of these images will be reported here, namely those where all cells had been initially perturbed before the cell differentiation simulation started. (The other 32 images were perturbed at only one central cell and gave similar, but simpler and less informative results.) Abnormal values of some sensory terms (e.g., MultiShade) for some of these images alerted us to inspect the original, unsmoothed images, and revealed several characteristic, but completely unexpected, qualitative types of pattern, for instance one which we subsequently termed TwoHeadedness (Fig. 2a).

The second study (2007) was performed in order to explore the conditions under which TwoHeadedness arises. In this case all the model parameters and initial perturbations were kept constant while the parameter found to be most relevant, hD, was incremented in small steps and the solution images profiled by computerised image analysis. For 32 of these images, sensory profiling was also performed. The smoothing of the solution images was considered unnecessary this time, and a 51 × 51 cell lattice was chosen in order to avoid certain boundary artefacts. Moreover, the sensory descriptor list was slightly changed: Sharpness was integrated in Contrast and KernelSize. StraightLines was omitted and replaced by Objectness as a global descriptor for emergent patterns clearly showing up, and two distinctive, local descriptors, ThicknessCurls and TwoHeadedness, were included (see Table 2).

The present sensory assessments followed the requirements of sensory science, in terms of randomisation of samples, equal surroundings for all the assessors and standard procedures for averaging the panel responses. Thus, the data warranted statistical analysis.

Table 2. Sensory descriptors.

| Name | Description | Low (1.0) | High (9.0) |
|---|---|---|---|
| Whiteness | Average colour (NCS-system) | No white | White |
| MultiShade | How many shades of grey | No shades | Many shades |
| Contrast | How well the pattern is defined | Hardly | Clearly |
| Sharpness | Blurred, indistinct pattern | None | Clear |
| StraightLines | Presence of straight lines, direction is irrelevant | None | Many |
| PatternWhite | White pattern on dark background | No clear white pattern | Clear white pattern |
| PatternBlack | Dark pattern on light background | No clear dark pattern | Clear dark pattern |
| Curls | Presence of connected paths | None | Many |
| KernelSize | Size of area around the centre point | Small (none) | Large |
| Circles | Presence of circular shapes (star, flower, circle) | None | Many |
| Continuous | Degree of continuous regions | None | High |
| Regular | Degree of order | None | High |
| Associations | Degree of associations | None | Many |
| MentalLoad | Visual burden during analysis of image | None | High |
| Added in 2007 | | | |
| Objectness | Appearing global patterns | None | Clear |
| ThicknessCurls | Width of filaments | Narrow | Wide |
| TwoHeadedness | Number of filaments with same features in both ends | None | Clear |

2.4. Multivariate data modelling

First, a note on notation: a sample corresponds to the solution of the mathematical model obtained with a set of parameter combinations from the experimental design, resulting in 32 images (averages across sample replicates and assessors were input to the data modelling).
For the 2006 experiment, the final data set consisted of the following three matrices for each of the sample sets: (i) the 32 × 7 matrix of the seven mathematical model parameter values, (ii) the 32 × 58 matrix of values of the selected, automatically generated computer image analysis descriptors, and finally, (iii) the 32 × 14 matrix of the sensory descriptors (14 terms, given as the average over the values assigned by each of the 11 assessors). The 2007 data were analysed in similar ways.

The data were analysed using bi-linear modelling by Partial Least Squares Regression (PLSR) as implemented in The Unscrambler® version 9.6 (www.camo.com). PLSR (Wold et al., 1983) is a useful method for uncovering relationships between two matrices, X (n × p) and Y (n × q), where n is the number of samples and p and q are the number of variables in the two data sets, respectively. The data approximation model is obtained by extracting the main variation patterns from X that also have relevance for Y. PLSR thereby finds a sequence of PLS components ("PCs") $t_1, t_2, \ldots$ (orthogonal weighted sums of the mean-centred X-variables), each consecutive component describing as much as possible of the remaining covariance with a corresponding linear combination of the Y-variables. For PC # a, this can be written:

$$t_a = (X - \bar{x})\,v_a, \qquad a = 1, 2, \ldots$$

where vector $v_a$ ($p \times 1$) represents the PLS weights. These Y-relevant X-components $t_1, t_2, \ldots$ are then used for modelling both the X- and Y-variables:

$$[X, Y] = [\bar{x}, \bar{y}] + \sum_a t_a\,[p_a', q_a'] + [E, F]$$

Cross-validation is used for determining the optimum number of components to trust, $A$. Y-values in a new sample # i can then be predicted from the data $x_i$ by summing up the contributions from the PCs via $t_{ia} = (x_i - \bar{x})\,v_a$, $a = 1, 2, \ldots, A$; $\hat{y}_i = \bar{y} + \sum_a t_{ia}\,q_a'$, or equivalently via the regression coefficient matrix of rank $A$, $B_A = \sum_a v_a q_a'$, yielding $\hat{y}_i = b_0 + x_i B_A$. For more details and a practical introduction, we refer to Martens and Martens (2001).

All input variables were standardised (i.e., their value was divided by the standard deviation of the respective descriptor). The optimal number of PLS components was determined by leave-one-out cross-validation, while the reliability of the regression coefficients was correspondingly estimated using jack-knifing (Martens & Martens, 2000).
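The two equivalent prediction routes described above (through the scores, or through the rank-A regression coefficients) can be checked numerically with a small sketch. This is not the implementation in The Unscrambler: it is a generic orthogonal-scores PLS written with NumPy, the function names are hypothetical, and the data are random numbers with the same shape as the 32 × 7 parameter matrix. Cross-validation and jack-knifing are left out to keep the example short.

```python
import numpy as np


def pls_fit(X, Y, n_components):
    """Bi-linear PLS regression with orthogonal scores (NIPALS-style deflation)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    Xa = Xc.copy()
    W, P, Q, T = [], [], [], []
    for _ in range(n_components):
        # Weight vector: direction in the (deflated) X with maximal covariance with Y.
        u, _, _ = np.linalg.svd(Xa.T @ Yc, full_matrices=False)
        w = u[:, 0]
        t = Xa @ w                     # score vector t_a
        p = Xa.T @ t / (t @ t)         # X-loadings p_a
        q = Yc.T @ t / (t @ t)         # Y-loadings q_a
        Xa = Xa - np.outer(t, p)       # deflate X before the next component
        W.append(w); P.append(p); Q.append(q); T.append(t)
    W, P, Q, T = (np.array(m).T for m in (W, P, Q, T))
    V = W @ np.linalg.inv(P.T @ W)     # weights giving t_a = (X - x_mean) v_a
    B = V @ Q.T                        # rank-A regression coefficient matrix B_A
    b0 = y_mean - x_mean @ B           # offset so that y_hat = b0 + x B_A
    return dict(x_mean=x_mean, y_mean=y_mean, V=V, Q=Q, T=T, B=B, b0=b0)


def predict_via_scores(model, X_new):
    """y_hat = y_mean + sum_a t_ia q_a', with t_ia = (x_i - x_mean) v_a."""
    T_new = (X_new - model["x_mean"]) @ model["V"]
    return model["y_mean"] + T_new @ model["Q"].T


def predict_via_coefficients(model, X_new):
    """y_hat = b0 + x_i B_A."""
    return model["b0"] + X_new @ model["B"]


# Random stand-in data: 32 samples, 7 predictors, 3 responses.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 7))
Y = X @ rng.normal(size=(7, 3)) + 0.1 * rng.normal(size=(32, 3))
model = pls_fit(X, Y, n_components=3)
```

The offset follows from equating the two routes: $b_0 = \bar{y} - \bar{x} B_A$. Standardising the columns of X and Y beforehand, as described above, changes only the inputs to `pls_fit`.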

3. Results and discussion

3.1. Prediction of sensory patterns from computer-derived descriptors

It would be desirable to have calibration models that can predict the more expensive but informative sensory descriptors from the cheaper but less interpretable automatic image analysis descriptors, thereby allowing fast, but interpretable profiling of future images of the same type. The analysis in Martens et al. (2009) of the 2006 data gave an indication of the relations between sensory descriptors and computer-derived descriptors. More information is explored here from a model in which the computer-derived descriptors are the predictors (X) and the sensory descriptors are the responses (Y). The significance and sign of the regression coefficients for this model are shown in Fig. 3. Naturally, several of the computer-derived descriptors related to grey-scale features are highly significant for the corresponding sensory descriptors (for example Whiteness and PatternBlack). Note also that the regression model for Curls includes several of the grey-scale descriptors. The same applies to the negative correlation with the auto-correlation descriptors. This was further studied in the 2007 data.

As shown in Fig. 4, displaying predicted versus measured data for the 32 samples submitted for sensory analysis, the regression model suggests that the computer-derived descriptors may be good predictors of several of the sensory descriptors. For instance, sensory descriptor Curls is predicted well, and looking at the images this seems to make sense. It was also apparent from these

images that Curls is positively associated with Contrast, Sharpness and PatternBlack.

Nevertheless, the combinations of computer-derived image analysis descriptors were difficult to interpret. Even if a combination of various grey-scale-level and auto-correlation statistics may discriminate between different model parameters, the interpretation of a high-dimensional linear combination is too complicated to give intuitive, simple meaning. In contrast, sensory descriptive profiling yields intuitively simple and directly interpretable quantitative descriptions. Moreover, a priori chosen mathematical feature extraction methods reflect the scientists' prior bias and may fail to identify unexpected features. In contrast, the judges' ability to invent new descriptor terms as needed during the sensory profile development ensures that even unexpected qualitative variation types will be quantified. The powerful human eye/mind combination, employed impartially and inter-subjectively by a professionally trained sensory panel, can detect global as well as local patterns and textures. Hence, even though qualitative/quantitative sensory descriptive analysis is somewhat slower and more labour intensive than automatic computer image analysis, the sensory profiling is more complete, and yields reliable and valid results that are easier to interpret.

3.2. Relation between mathematical model parameter data and sensory data

Knowing the values of the seven model parameters allowed us to predict the sensory descriptors. Martens et al. (2009) show some results from the general-purpose 2006 experiment. In the present paper we show more details from the 2007 experiment, which was more focused on local patterns: Fig. 5 gives an overview of two of the predictive dimensions involved, in terms of the correlation loadings for the first two components from a PLSR model where X is the matrix of mathematical model parameters, and Y is the matrix of sensory descriptors.
Initially, all the interaction terms were also included in X; however, in Fig. 5 only the significant interactions are shown. The optimal number of PLSR components, obtained by full cross-validation, was five. Several of the sensory descriptors

Fig. 3. Statistical summary of the PLS regression model of Notch pattern variations, where X = computer-derived descriptors (32 × 58), summarising grey-scale statistics, spatial auto-correlations and various clusterings in the images. Y = the most important sensory descriptors (32 × 12) of the same images. The optimal number of principal components obtained by full cross-validation is five, predicting 70% of the variation in these sensory descriptors (PC1: 22.73%, PC2: 56.07%, PC3: 64.51%, PC4: 62.62%, and PC5: 70.56%). The figure shows the significance level and sign of the rank-5 PLS regression coefficient matrix $B_{A=5}$.


Fig. 4. Prediction of sensory appearance from computer image analysis. Measured values (abscissa) of the sensory descriptors Y versus the same variables predicted (ordinate) from linear combinations of the computer-derived descriptors, $\hat{Y}$. The predictions were made using the linear model $\hat{Y} = b_0 + X B_{A=5}$ (Fig. 3). The straight line indicates the ideal of 100% prediction.
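The linear prediction model named in the Fig. 4 caption, and the "percentage of variation predicted" quoted for Fig. 3, can be illustrated with a minimal sketch. It is not the authors' software: the function names `predict` and `explained_variance` and the toy numbers are hypothetical, and the coefficients are simply given rather than estimated by PLSR. To mimic a cross-validated figure, the predictions passed to `explained_variance` would have to come from models fitted with the corresponding samples held out.

```python
def predict(X, b0, B):
    """Return Y_hat = b0 + X B, with samples as rows (plain lists of lists)."""
    n_y = len(b0)
    return [
        [b0[k] + sum(x_j * B[j][k] for j, x_j in enumerate(row)) for k in range(n_y)]
        for row in X
    ]


def explained_variance(Y, Y_hat):
    """Fraction of the mean-centred variation in Y that Y_hat reproduces."""
    n, n_y = len(Y), len(Y[0])
    means = [sum(row[k] for row in Y) / n for k in range(n_y)]
    ss_total = sum((row[k] - means[k]) ** 2 for row in Y for k in range(n_y))
    ss_resid = sum(
        (y[k] - yh[k]) ** 2 for y, yh in zip(Y, Y_hat) for k in range(n_y)
    )
    return 1.0 - ss_resid / ss_total


# Hypothetical toy data: 3 images, 2 image-analysis descriptors, 1 sensory descriptor.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b0 = [0.5]              # assumed offset vector
B = [[2.0], [-1.0]]     # assumed regression coefficients (not fitted here)
Y_hat = predict(X, b0, B)
```

Points lying exactly on the straight line in a plot such as Fig. 4 correspond to `explained_variance` returning 1.0.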

[Fig. 5 plot: PLSR correlation loadings (X and Y), PC1 (abscissa) versus PC2 (ordinate), both axes from -1.0 to 1.0. Labelled variables include MentalLoad, Whiteness, MultiShade, Curls2, PatternLight, ThicknessCurls, TwoHeaded, NeighbourRelat., PatternDark, Objectness, NumberCurls, Curls1, LengthCurls and Contrast. X-expl: 42%, 41%; Y-expl: 17%, 6%.]

Fig. 5. How particularly interesting sensory variables relate to the main mathematical model parameters in the 2007 experiment: correlation loadings for the two first PLS components in the model $[X, Y] = [\bar{x}, \bar{y}] + t_1[p_1', q_1'] + t_2[p_2', q_2'] + [E, F]$, i.e., correlation coefficients between the individual input variables in [X, Y] and PCs t1 and t2. X-variables = sensory descriptors; variables with enlarged names were up-weighted in this PLSR. Y-variables: hD = mid-level parameter of protein Delta in the sigmoid stimulus–response curve (Fig. 1); pD = sigmoidal steepness parameter for Delta. Detailed notation: e.g., hD7, pDHigh = interaction term between indicator variables representing hD = 0.7 and pD = high (i.e., 10). D* and N* = initial homogeneous equilibrium levels of Delta and Notch (Fig. 1). Inner and outer circles: squared correlation coefficients of 0.5 and 1.0 (i.e., 50% and 100% explained variance) by the first two PCs; with the current aspect ratio these appear as ellipses.
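As a small illustration of how the correlation loadings described in the Fig. 5 caption can be read, the sketch below computes the correlation of one variable with two score vectors and the sum of the squared correlations. The names `pearson` and `correlation_loading` and all numbers are hypothetical. The sketch assumes the two score vectors are mutually uncorrelated; only then does the summed squared correlation equal the fraction of the variable's variance explained by the two components, which is what the 50% and 100% circles mark.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equally long sequences (no zero-variance check)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def correlation_loading(variable, t1, t2):
    """Return (r1, r2, r1^2 + r2^2) for one variable against scores t1 and t2."""
    r1, r2 = pearson(variable, t1), pearson(variable, t2)
    return r1, r2, r1 ** 2 + r2 ** 2


# Hypothetical mean-centred, mutually uncorrelated score vectors for four images.
t1 = [1.0, -1.0, 1.0, -1.0]
t2 = [1.0, 1.0, -1.0, -1.0]
# Hypothetical descriptor built as 1.5 + 1.0*t1 + 0.5*t2, so it is fully explained.
whiteness = [3.0, 1.0, 2.0, 0.0]
r1, r2, explained = correlation_loading(whiteness, t1, t2)
```

A variable with `explained` above 0.5 would fall outside the inner circle of such a plot; this toy variable lies on the outer circle.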

were well modelled by the model parameters (i.e., they appear outside the ellipse denoting 50% explained variance already after two components; the elliptical form comes from the software package used, reflecting PC1 as the main component). These first two PLSR components were mainly related to variance in the two threshold values hD and hN: hD is correlated with descriptors for whiteness (Whiteness, PatternWhite) and anti-correlated with blackness (PatternBlack) as well as Curls and Contrast. Referring to


the 2006 data, hN was then found to be correlated with Continuous, Regular, and StraightLines (Veflingstad, 2006).

Finally, it should be mentioned that when modelling each of the sensory descriptors separately, a variable describing the order in which the images were printed was included in the X-matrix. This print order showed no significant influence on any of the sensory descriptors (data not shown), and thus we can conclude that the variation in image profiling actually was a result of variation in parameter values and not an artefact of the image generation.

Moreover, some intriguing pattern types were discovered. For instance, in the first study, the sensory term MultiShade was badly modelled by the first two PLSR components. It was found to stand out with unique variance and hence dominated the third PC. When inspecting the images standing out in this third dimension, a recurrent pattern of “two-headed worms” (see, e.g., Fig. 2a) was discovered. We then needed to study these and other features, to characterise them in more detail and to check under what conditions they arise.

The more detailed studies in 2007 of the peculiar “two-headed worms” are summarised in Fig. 6 in terms of statistical image properties (protein activity levels) and three of the most informative sensory descriptors, as functions of the stimulus–response threshold parameter hD. Fig. 6A shows three image examples: (a), (b) and (c), obtained with hD = 0.4, 0.7 and 0.9, respectively, and with pD high (i.e., 10). For each of the hD levels used in the simulations, the relative frequency of Notch levels in the resulting 51 × 51 cell lattice was recorded. The diameter of the spots in Fig. 6B represents the logarithm of the relative number of cells with a given Notch level at each parameter setting; this multivariate histogram summary was termed a “differentiation diagram” by Martens et al. (2009). The sensory responses in Fig. 6C represent the mean of the panel of 11 sensory assessors.

In total, Fig. 6 shows that the peculiar image patterns (ThicknessCurls and TwoHeadedness) discovered previously were now found to follow the non-linear dynamic model parameter hD in a highly systematic way, along with peculiar distribution patterns in the differentiation diagram. At hD = 0.7, the centre of the “TwoHeadedness” region, four distinct Notch levels are prevalent. Moving hD up or down from 0.7 leads to increased distribution complexity as well as increased computation time, and the loss of TwoHeadedness. ThicknessCurls, on the other hand, arose at two other parameter values, around hD = 0.4 and 0.9, but with patterns of very different Whiteness levels, and other Notch differentiation patterns. This may reflect regions of intensive cell differentiation. Work is now in progress to study the processes leading from random initial noise to these highly structured steady-state solutions.

In order to communicate about ‘what is going on’ during cell differentiation, resulting in these structured steady-state solutions, a generic vocabulary of descriptors that aims at translating biological phenomena into interpretable patterns is needed.

3.3. Development of a sensory morphological wheel

In general, two important criteria in vocabulary development are that the descriptors should pick up systematic differences between the samples, and be cognitively clear, i.e., showing high reliability and validity (Martens, 1999). In the present case, the 14 sensory descriptors seem to fulfil these criteria: The sensory

Fig. 6. Correspondence between spatial image pattern, computerised statistical image summaries and sensory descriptors in the 2007 experiment. (A) Examples of solution images from the mathematical model: three settings of the most important model parameter, (a), (b) and (c): hD = 0.4, 0.7 and 0.9, respectively, under a standard condition (pD = high, i.e., 10, pN = low, i.e., 3, hN = low, i.e., 0.1, l = 0.5). (B) Differentiation diagram: distribution of Notch level as a function of parameter hD. (C) Three of the sensory descriptors as functions of parameter hD.
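The differentiation diagram of Fig. 6B is described in the text as a histogram summary: for each hD level, the relative frequency of Notch levels over the cell lattice, shown on a logarithmic scale. A minimal sketch of that bookkeeping follows. The function names, the choice of ten bins, the assumption that Notch levels lie in [0, 1], and the toy 2 × 2 lattices are all hypothetical; the mapping from log frequency to spot diameter is not specified in the text and is therefore left out.

```python
import math
from collections import Counter


def differentiation_column(lattice, n_bins=10):
    """Log10 relative frequency of binned Notch levels for one simulated lattice.

    Assumes every level lies in [0, 1]; a level of exactly 1.0 goes in the top bin.
    """
    cells = [level for row in lattice for level in row]
    counts = Counter(min(int(level * n_bins), n_bins - 1) for level in cells)
    total = len(cells)
    return {b: math.log10(c / total) for b, c in sorted(counts.items())}


def differentiation_diagram(lattices_by_hd, n_bins=10):
    """One histogram column per hD setting, keyed by the hD value."""
    return {
        hd: differentiation_column(lattice, n_bins)
        for hd, lattice in lattices_by_hd.items()
    }


# Hypothetical 2 x 2 lattices standing in for the 51 x 51 lattices of the study.
toy = {
    0.4: [[0.05, 0.05], [0.95, 0.55]],
    0.7: [[0.0, 1.0], [0.0, 1.0]],
}
diagram = differentiation_diagram(toy)
```

Each inner dictionary lists only the occupied bins, so the number of keys per hD value gives a rough count of the distinct Notch levels present at that setting.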


profiling was successfully performed at two different points in time and under different conditions (in 2006 on smoothed-lattice images from a low-resolution but high-dimensional design, and in 2007 on raw lattice images from a high-resolution but low-dimensional design). Internal sensory replication showed good repeatability, and cross-validation between mathematical parameter combinations showed strong multivariate correlations between the sensory profile data and the computer image analysis as well as the parameter settings. Overall, this gave confidence in the results, and the prediction models revealed understandable underlying relationships.

Furthermore, it is relevant for modelling, interpretation and understanding of complex systems (e.g., within systems biology as well as sensory perception processing) that a link between semantics and mathematical foundations could be established. In the approach taken here, we have drawn inspiration from various aspects of the science of ‘morphology’, including a semantic/linguistic, a biological and a mathematical part. Bookstein (2009) argues for morphometrics as the statistics of biological size and shape. Investigating visual pattern recognition in a sensory scientific context has much to gain from biologists by integrating morphometrics into a well-known hierarchical set-up as manifested through the traditional sensory wheels. Sensory wheels, which have been developed for a large number of foods and beverages, first for example for beer (Meilgaard, Dalgliesh, & Clapperton, 1979) and wine (see Noble, 1990), meet the need for an inter-subjective language across disciplines. Characteristic of such ‘wheels’ is that they start with an inner circle describing an overall picture and then show details in an increasing hierarchical order. In Fig.
7 we show the first version of a similar type of sensory ‘wheel’, designed to systematise the descriptive terminology for texture, in this case descriptors for images from mathematical modelling of cell differentiation. Here we have attempted to group the 14 sensory terms according to their domain of information: The inner circle represents a high-level classification into ‘intensity’ and ‘spatial’ as well as ‘mental process’ descriptors, the latter being a subjective evaluation. At the outer circle, the spatial terms were split into more detailed groups representing symmetrical terms (primarily applicable for cases where one single cell was initially perturbed; not pursued here) and non-symmetrical terms (applicable for the present case where all cells were initially perturbed). Fig. 3 indicated that the ‘intensity’ descriptors are representative of the computer-derived ‘grey-scale’ features, while the spatial descriptors correspond more to the ‘auto-correlation’ and clustering features. Using this wheel might facilitate communication of the biological phenomena behind the words. At least, the mathematicians, biologists, data analysts and sensory scientists involved in the present study found it valuable to have developed a way to communicate.

The approach that we have taken here was not settled beforehand. It most closely resembles the meaning of ‘serendipity’: when ‘one discovers something fortunate while looking for something else’. This leads to an important discussion around explorative mathematics to discover underlying phenomena in science (Martens & Martens, 2008).

[Fig. 7 wheel; inner-circle labels: Spatial, Intensity, Mental process.]

Fig. 7. A preliminary sensory morphological wheel based on visual pattern recognition of images from a mathematical model of a complex biological system.

4. Conclusions

A complex biological system, as in this case, invites a step towards understanding cell differentiation and ways to grasp ‘what is going on’ in the cells. Mathematical models characterised by being high-dimensional, non-linear and dynamic had previously been found to generate a bewildering variety of spatial patterns. This paper shows how the behaviour of the system could be explored by a sensory scientific approach to visual pattern recognition. The methodology comprised a series of investigations with systematic use of statistical experimental planning of computer simulations. The resulting patterns were profiled by computer image analysis and by human sensory analysis. The results were interpreted by multivariate data modelling with predictive validation.

We believe there are three important lessons from the study undertaken here. Firstly, although the computer-derived image analysis profiling was difficult to interpret per se, it was found to give good predictors of many of the important sensory descriptors. This illustrates that once statistical prediction models have been developed, a prediction of human perception of future images of the same general kind may be based on the automated image analysis. This could allow a larger set of images to be analysed intelligibly in a shorter time than evaluation by a sensory panel would require. However, if images with completely new pattern types arise, for which no prediction model has been developed, the resulting sensory profile prediction would be erroneous; such an error can be guarded against by automatic outlier detection (Martens et al., 2009).

Secondly, the systematic exploration of the high-dimensional non-linear dynamic model in terms of sensory analysis enabled us to detect and quantify several types of global patterns, most of which were unknown to us and in the literature prior to the analysis.
The presence of Curls is one example of this, while straight lines and circular structures also fall under the category of global patterns. TwoHeadedness is another example. These unexpected pattern types were difficult to detect in a meaningful way by automatic mathematical image analysis alone; only after their discovery and characterisation by human visual assessment could we have designed dedicated mathematical filters to detect and quantify them. Moreover, a priori chosen mathematical feature extraction methods reflect the scientists’ prior bias and may fail in identifying unexpected features. In contrast, the judges’ ability to invent new descriptor terms as needed during the sensory profile development ensures that even unexpected variation types will be quantified. Thus, the benefit of using a sensory panel here was that the panel was trained in using their senses and language capabilities to report what they sense in an interpretable way.


Finally, the sensory descriptive analysis gave valuable qualitative and quantitative information regarding the effects of the different parameters. The variation in sensory descriptors was well explained by the mathematical model parameters, representing intricate patterns of cell differentiation. This discovery represents a completely new insight into the behavioural repertoire of the complex system under investigation. In addition, the sensory descriptors provided valid and extensive profiling of the observed patterns, resulting in a preliminary sensory morphological wheel, to be further developed in the future.

Acknowledgements

This research was supported by The National Programme for Research in Functional Genomics in Norway (FUGE) in the Norwegian Research Council. Further support came from the Agriculture Research Foundation of Norway. We would like to thank Asgeir Nilsen at Nofima Mat for scientific and economic support to the sensory analysis.

References

Bookstein, F. L. (2009). Measurement, explanation, and biology: Lessons from a long century. Biological Theory, 4(1), 6–20.
Brodatz, P. (1966). Textures. A photographic album for artists and designers. New York: Dover Publications Inc.
Bundesen, C., & Habekost, T. (2008). Principles of visual attention. New York: Oxford University Press Inc.
Collier, J. R., Monk, N. A. M., Maini, P. K., & Lewis, J. H. (1996). Pattern formation by lateral inhibition with feedback: A mathematical model of Delta–Notch intercellular signalling. Journal of Theoretical Biology, 183, 429–446.
Gibson, J. J. (1950). The perception of the visual world. Cambridge, MA: The Riverside Press.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin.
Haralick, R. M., Shanmugam, K., & Dinstein, I. (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621.
Harper, R. (1972). Human senses in action. Edinburgh: Churchill Livingstone, Longman Group Ltd.
ISO (1988). Sensory analysis – Methodology – General guidance for the design of test rooms. Report no. 8589. Geneva, Switzerland: International Organization for Standardization.
Langton, M., & Hermansson, A. M. (1996). Image analysis of particulate whey protein gels. Food Hydrocolloids, 10, 191.
Martens, M. (1999). A philosophy for sensory science. Food Quality and Preference, 10, 233–244.
Martens, H., & Martens, M. (2000). Modified Jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (PLSR). Food Quality and Preference, 11, 5–16.
Martens, H., & Martens, M. (2001). Multivariate analysis of quality. An introduction. Chichester, UK: John Wiley & Sons Ltd.
Martens, M., & Martens, H. (2008). The senses linking mind and matter. Mind & Matter, 6(1), 51–86.
Martens, H., Thybo, A. K., Andersen, H. J., Karlsson, A. H., Dønstrup, S., Stødkilde-Jørgensen, H., et al. (2002). Sensory analysis for magnetic resonance-image analysis: Using human perception and cognition to segment and assess the interior of potatoes. Food Science and Technology, 35, 70–79.
Martens, H., Veflingstad, S. R., Plahte, E., Martens, M., Bertrand, D., & Omholt, S. W. (2009). The genotype–phenotype relationship in multicellular pattern-generating models – The neglected role of pattern descriptors. BMC Systems Biology, 3, 87.
Martens, M., & Tschudi, F. (submitted for publication). Human senses in action: Multivariate measurement of quality.
Meilgaard, M., Civille, G. V., & Carr, B. T. (1999). Sensory evaluation techniques (3rd ed.). USA: CRC Press Inc.
Meilgaard, M. C., Dalgliesh, C. E., & Clapperton, J. F. (1979). Beer flavor terminology. American Society of Brewing Chemists, 37(1), 47–52.
Noble, A. (1990). .
Veflingstad, S. R. (2006). The search for relations between structure and behaviour in models of gene regulatory networks (p. 17). PhD thesis. Ås: Norwegian University of Life Sciences. ISBN: 82-575-0719-9.
Wold, S., Martens, H., & Wold, H. (1983). The multivariate calibration problem in chemistry solved by the PLS method. In A. Ruhe & B. Kågström (Eds.), Proceedings of the conference on matrix pencils, March 1982, Lecture notes in mathematics (pp. 286–293). Heidelberg: Springer-Verlag.
