Identifying emerging hotel preferences using Emerging Pattern Mining technique

June 8, 2017 | Author: Huy Quan Vu | Category: Marketing, Tourism Management, Tourism

Tourism Management 46 (2015) 311–321

Contents lists available at ScienceDirect

Tourism Management journal homepage: www.elsevier.com/locate/tourman

Identifying emerging hotel preferences using Emerging Pattern Mining technique

Gang Li a,1, Rob Law b,2, Huy Quan Vu a,3, Jia Rong a,4, Xinyuan (Roy) Zhao c,*

a School of Information Technology, Deakin University, Vic 3125, Australia
b School of Hotel & Tourism Management, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region
c School of Business, Sun Yat-Sen University, Guangzhou, Guangdong, 510275, China

Highlights

• A novel means for online review analysis identifies features of interest in hotel selection.
• Emerging Pattern Mining is utilized to identify those features.
• A dataset of 118,000 hotel reviews in Asia Pacific destinations was collected from TripAdvisor.

Article info

Article history:
Received 11 November 2013
Accepted 25 June 2014
Available online

Keywords: Hotel preference; Data mining; Travel behavior; Emerging pattern mining; Natural language processing

Abstract

Hotel managers continue to find ways to understand traveler preferences, with the aim of improving their strategic planning, marketing, and product development. Traveler preference is unpredictable; for example, hotel guests used to prefer having a telephone in the room, but now favor fast Internet connection. Changes in preference influence the performance of hotel businesses, thus creating the need to identify and address the demands of their guests. Most existing studies focus on current demand attributes and not on emerging ones. Thus, hotel managers may find it difficult to make appropriate decisions in response to changes in travelers' concerns. To address these challenges, this paper adopts Emerging Pattern Mining technique to identify emergent hotel features of interest to international travelers. Data are derived from 118,000 records of online reviews. The methods and findings can help hotel managers gain insights into travelers' interests, enabling the former to gain a better understanding of the rapid changes in tourist preferences. © 2014 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +86 20 84112721; fax: +86 20 84036924.
E-mail addresses: [email protected] (G. Li), [email protected] (R. Law), [email protected] (H.Q. Vu), [email protected] (J. Rong), zhaoxy22@mail.sysu.edu.cn, [email protected] (X. (Roy) Zhao).
1 Tel.: +61 3 9251 7434; fax: +61 3 9251 7604.
2 Tel.: +852 3400 2181; fax: +852 2362 9362.
3 Tel.: +61 432 411 359.
4 Tel.: +61 3 925 17711.
http://dx.doi.org/10.1016/j.tourman.2014.06.015
0261-5177/© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Most people prioritize accommodation when planning a trip, spending most of their planning time and effort on selecting the right option. Travelers have different expectations and/or preferences, depending on their destination, purpose and mode of travel, as well as previous accommodation experience (Liu, Law, Rong, Li, & Hall, 2013; Liu, Shi, & Hu, 2013). A comprehensive understanding of customer requirements can help hotel managers gain a lead in the market in terms of strategic planning, marketing, and product development (Wilkins, 2010). However, it is difficult to identify such crucial knowledge due to the complex decision-making process and the wide range of selection criteria (Li et al., 2013). Of these criteria, the most important has to do with hotel features (i.e., attributes or factors) that most travelers seriously consider. The most valuable hotel features that significantly affect a traveler's selections include location, price, facilities, and cleanliness (Lockyer, 2005). Other features, such as the size and type of building, quality of service and a quiet environment, are important to some people (Albaladejo-Pina & Diaz-Delfa, 2009; Merlo & de Souza Joao, 2011). Merlo and de Souza Joao (2011) examine specific hotel features, such as air conditioning in bedrooms. Sohrabi, Vanami, Tahmasebipur, and Fazil (2012) present another list of important hotel features, including promenade, comfort, security, network, pleasure, news, recreational information, expenditure, room facilities, and car parking.


G. Li et al. / Tourism Management 46 (2015) 311–321

Advances in Internet technology enable travelers to share their travel-related experiences, opinions, and concerns on many online platforms (Mack, Blose, & Pan, 2008). Thus, researchers are now shifting their attention to this data source as a way of mining traveler preference in a cheap, efficient, and nonintrusive manner. For instance, Stringam and Gerdes (2010) use a corpus-based approach to analyze guest comments on online hotel distribution sites as well as to identify frequently used words, patterns of word usage, and their relationship to hotel feature ratings. Furthermore, descriptive statistics are used to assess the importance and effect on ratings of several features, including location, size of guest rooms, staff, facilities, and breakfast offerings (Stringam, Gerdes, & Vanleeuwen, 2010). Chaves, Gomes, and Pedron (2012) show that room, staff, location, cleanliness, friendliness, and helpfulness are the most frequently used words in online reviews of small and medium hotels in Portugal. Liu, Law, et al. (2013) and Liu, Shi, et al. (2013) analyze comments collected from TripAdvisor.com and examine changes in hotel customers' expectations according to travel mode, using the association rule mining technique. Li et al. (2013) utilize the Choquet integral, a method of fuzzy decision support, to analyze the selection preferences of different groups of travelers in terms of several hotel features.

Researchers have yet to meet the increasing demand of hotel managers for more accurate knowledge on the hotel preferences of travelers. Several limitations that prevent researchers from identifying such knowledge are listed below.

Identifying Emerging Features

Traveler preference is unpredictable and dynamic. For example, travelers once preferred having a telephone in their room. At that time, charging for telephone usage was a significant source of revenue, but usage has declined to the point where investing in this facility results in losses for many hotels that offer it (Huettel, 2010). Today, hotels gain significant customer satisfaction by offering free wireless Internet (Bulchand-Gidumal, Melian-Gonzalez, & Lopez-Valcarcel, 2011). These changes in travelers' concerns can affect the performance of hotel businesses. As such, managers must effectively identify features that are becoming important to travelers. However, efforts to address this issue have been limited.

User Identification for Feature Improvement

Different types of travelers have different expectations of hotel features (Liu, Law, et al., 2013; Liu, Shi, et al., 2013). Some aspects may be important to all travelers, whereas others may be significant only to a specific subgroup. A clear picture of such differences could benefit hotel managers. For instance, if travelers from Western countries prefer clubbing facilities, managers can design appropriate business solutions to improve those features and meet the specific expectations of this group. However, this aspect has received little research attention.

The identification of emerging features differs from traditional approaches to hotel feature analysis, because analysts have no prior knowledge of which features should be included in the study. Large data samples are also required to identify emerging changes in customer preference patterns. Traditional research methods, such as surveys, opinion polls, or focus groups, are inadequate for this purpose. Therefore, resorting to available online data, such as online reviews generally expressed as textual comments, is necessary. These reviews contain abundant information on user opinions, experiences, or concerns, and are considered potential goldmines from which tourism researchers can gain insights into the behavior of travelers (Pan, MacLaurin, & Crotts, 2007).

The analysis of hotel features treats each feature as an item, and a set of hotel features associated with a traveler as an item set. Identifying emerging changes in traveler response to such features can then be formulated as a problem of Emerging Pattern Mining (EPM). Originally proposed by Dong and Li (1999), EPM can capture emerging trends in time-stamped databases or sharp contrasts between data sets or groups. This technique is mainly applied in bioinformatics (Li, Liu, Downing, Yeoh, & Wong, 2003; Li & Wong, 2002) and remains an active topic in computer science (Li & Yang, 2007; Yu, Chen, & Tseng, 2011). By using EPM, researchers can identify emerging hotel features.

This study aims to fill this research gap by introducing the EPM technique to identify emerging hotel features. The term “hotel features” includes any entity or concept that concerns travelers when reviewing a hotel. In our case study, we first construct a comprehensive list of candidate hotel features from a large collection of text-based online reviews (N ≈ 118,000). We use the EPM technique to identify emerging features that currently receive more attention from international travelers. We also construct a set of user profiles to assist hotel managers in improving the features available in their properties. The method and the findings of this study are potentially valuable to hotel managers who want to gain insights into travelers' concerns and find ways to adapt to rapid changes in the tourism market.

The rest of the paper is organized as follows. Section 2 summarizes the methods used and attempts made to analyze hotel features. Section 3 presents the review framework used for creating a hotel features list from textual comments, and a detailed description of EPM concepts used to identify emerging features. Section 4 demonstrates the effectiveness of the proposed method in a case study.
Finally, Section 5 concludes our study and offers suggestions for future research directions.

2. Related work

This section reviews existing studies that utilize hotel features to explore traveler preferences. We also present a critical analysis of the limitations of these studies and our research objectives.

2.1. Hotel features analysis

Several studies have analyzed hotel features to acquire knowledge on traveler preference. A popular method for data collection is the survey questionnaire, wherein hotel features are represented by short-answer questions or a set of keywords. Table 1 summarizes the hotel features included in traditional studies.

Table 1. Predefined hotel features used in existing studies.

Hotel features | Related work
Cleanliness, Location, Room, Service, Sleep Quality, Value | (Liu, Law, et al., 2013; Liu, Shi, et al., 2013)
Shabby Bed, Clean Rats, Friendly Staff, Limited Parking, Good Room | (Bjorkelund, Burnett, & Norvag, 2012)
Personalization, Warm Welcome, Special Relationship, Straight from the Heart, Comfort | (Ariffin & Maghzi, 2012)
Promenade and Comfort, Security and Protection, Network Services, Pleasure, Hotel Staff and Their Services, News and Recreational Information, Cleanliness and Room Comfort, Expenditure, Room Facilities, Parking | (Sohrabi et al., 2012)
Location, Size and Diversity, Characteristics of the Lobby, Characteristics of the Rooms, Parking | (Merlo & de Souza Joao, 2011)
Type of Building, Location, Number of Bedrooms, Price per Room, Horses for Hire, Play Area, Meal Service, Swimming Pool, Sports Facilities, MiniFarm, Bathroom, Type of Rent, ‘Q’ Quality Award, Booking | (Albaladejo-Pina & Diaz-Delfa, 2009)
Problem-Solving Abilities by Service Personnel, Price Level, Sanitary Hot Spring Environment, Convenience of Traffic Route/Shuttle, Special Promotions, Convenience of Reservation Procedure, Food and Beverages Service | (Hsieh, Lin, & Lin, 2008)
Location, Price, Facilities, Cleanliness | (Lockyer, 2005)
Staff Service Quality, Room Quality, General Amenities, Business Services, Value, Security, IDD Facilities | (Choi & Chu, 2001)
Cleanliness, Location, Room Rate, Security, Service Quality, Reputation of Hotel | (Ananth, DeMicco, Moreo, & Howey, 1992)

Due to the increasing interaction among travelers, studies are increasingly utilizing observation data collected from online resources (e.g., blogs, travel websites, and social media) through online reviews. Traditional, statistics-based data analysis models are ineffective in extracting information from text-based reviews and comments. Thus, new techniques and approaches have been proposed. For instance, manual content analysis is used to study traveler characteristics and communications about Australia as a tourism destination (Carson, 2008; Wenger, 2008). Another study employs narrative structure analysis to identify key marketing elements, including characterization, space categorization, and evaluation of the product experience (Tussyadiah & Fesenmaier, 2008). Manual methods are time consuming and incapable of capturing the overall differences in traveler preferences. Although automated approaches, such as corpus-based semantic analysis (Rayson & Garside, 2000) and stance-shift analysis (Davidson & Skinner, 2010), are also employed, such methods require users to have a background in linguistics and access to expensive software (Capriello, Mason, Davis, & Crotts, 2013). Thus, tourism researchers are also unable to take advantage of such methods for efficient analysis of online reviews.

In studying traveler preference, Lehto, Park, Park, and Lehto (2007) investigate online product and service reviews to identify critical and prominent (dis)satisfaction factors for virtual travel agencies. Tsou (2010) identifies the geographic information of travelers embedded on tourism web pages. Bulchand-Gidumal et al. (2011) use online reviews to verify the hypothesis that free wireless Internet connection can improve traveler satisfaction. Ye, Law, Li, and Li (2011) analyze online reviews to extract features about travel destinations of Chinese customers. Bosangit, Dulnuan, and Mena (2012) use travel blogs to examine the post-consumption behavior of tourists, while Banyai (2012) analyzes the blog content of travelers to Stratford, Canada to identify popular topics and tourist perceptions of the destination.

2.2. EPM applications

The original purpose of EPM was to capture emerging trends in time-stamped databases or to explore differentiating characteristics between groups of data (Dong & Li, 1999). The research on emerging patterns focuses on the use of the discovered patterns for classification purposes, such as in works on emerging patterns (Li, Dong, & Ramamohanarao, 2000) and jumping emerging patterns (Li, Dong, & Ramamohanarao, 2001). Advanced classification techniques for emerging patterns have been proposed based on Bayesian approaches (Fan & Ramamohanarao, 2003) and bagging methods (Fan, Fan, Ramamohanarao, & Liu, 2006). The development of EPM techniques remains an ongoing effort (Gan & Dai, 2009; Liu, Law, et al., 2013; Liu, Shi, et al., 2013; Liu et al., 2014; Yu et al., 2011).

Emerging patterns are mainly applied in bioinformatics. For instance, Li and Wong (2002) attempt to find groups of genes using EPM and apply these on a colon tumor data set. Li et al. (2003) develop an interpretable classifier on an acute lymphoblastic leukemia microarray data set. Wang, Zhao, Zhao, Wang, and Qiao (2010) adopt the EPM technique to mine local conserved clusters from gene expression data. Park, Lee, and Park (2010) apply an incremental EPM module to ECG signal data for automatic diagnosis of cardiovascular diseases. Sherhod, Gillet, Judson, and Vessey (2012) utilize jumping EPM to develop a method for automatic toxicity alerts, while Huang, Gan, Lu, and Huan (2013) use this technique in mining the changes of medical behavior for clinical pathways of bronchial lung cancer.

Studies have also shown the importance of emerging patterns in other application areas. Kim, Song, and Kim (2005), for example, adopt emerging pattern concepts in developing a methodology to detect changes in customer segments between time-stamped data sets. Tsai and Shieh (2009) explore emerging sequential patterns to identify trends in consumer behavior. Shie, Yu, and Tseng (2013) focus on discovering user behavior patterns from mobile commerce environments using mobile sequential pattern mining. However, although these studies have explored the capabilities of EPM as a technique, its strong potential has yet to be fully utilized. Moreover, the concept of emerging patterns has yet to be used in the tourism and hospitality industries.

2.3. Problem definition and research objective

The majority of existing studies have focused on identifying and analyzing the most valuable hotel features that significantly affect the hotel selection process for travelers. These studies examine several commonly mentioned hotel features, such as price, location, room quality, staff, and service (Chaves et al., 2012; Liu, Law, et al., 2013; Liu, Shi, et al., 2013; Stringam et al., 2010). However, the natural assumption that only popular or frequently mentioned hotel features are worth studying is one limitation of these studies. Infrequently mentioned hotel features may also be interesting, because they may be new and already gaining interest from travelers. The wireless Internet facility is one such feature (Bulchand-Gidumal et al., 2011). Wireless technology has only become widely used in recent years, with many wireless Internet-enabled devices available to users. The demand for wireless Internet is growing significantly, even though the traditional hotel features remain more popular. Nevertheless, most studies have not focused on discovering hotel features that are emerging as important to travelers.

Identifying target users for hotel features is important for tourism managers, as it helps them improve strategies to meet traveler expectations. Existing studies have focused on identifying features that are frequently mentioned in online reviews (Chaves et al., 2012; Stringam & Gerdes, 2010). This approach is ineffective due to the challenges involved in enhancing popular features in ways that can impress travelers. A feasible solution that ensures user satisfaction is difficult to develop. These features are generally standardized and managers are well aware of them, thereby limiting the scope for market competition. In contrast, some features concern only certain groups of travelers. Specific improvement plans must be developed to address these concerns. Identifying relevant features and their target users can provide hotel managers the knowledge they need to create more competitive hotel products. This aspect has not been studied; in the hotel features analysis of the current study, such features are referred to as “features of specific interest”.

This study uses the EPM concept to detect changes in travelers' concerns and demonstrates its effectiveness through an analysis of emerging hotel features. We also identify a number of features of specific interest and their target users to develop appropriate business solutions for improving hotel services.


Fig. 1. Hotel feature identification process.

3. Methodology

Several challenges must be addressed to analyze the changes in travelers' concerns about hotel features using online reviews. Table 1 shows how researchers label hotel features differently despite shared similarities. Selecting keywords that appropriately and accurately describe hotel characteristics is a challenging task. A possible solution would be to incorporate text-mining techniques into the analysis of online reviews. This approach can extract useful knowledge from unstructured text and then transform the information into structured data for analysis, thereby revealing relationships, patterns, or trends from textual data (Singh, Hu, & Roehl, 2007).

Statistical tests are commonly used in studies that examine changes and trends in traveler preference (see, e.g., Bulchand-Gidumal et al., 2011). However, when a large number of variables are considered, such analysis on every available feature becomes inefficient and costly. Therefore, we present a text-processing framework that can automatically identify hotel features from review comments. Next, we introduce the EPM concept and describe how it can be used to discover emerging hotel features.

3.1. Review processing

We employ General Architecture for Text Engineering (GATE), one of the most powerful software packages capable of solving most text-processing problems (http://gate.ac.uk/). This software has been used in a wide range of applications, such as mining sequence information from protein structure databases in bioinformatics (Witte & Baker, 2005) as well as extracting and mining patient information from free-text clinical records in healthcare (Tseng, Lin, & Lin, 2007; Zhou & Han, 2006). In tourism applications, GATE has been used in building Tourist Face, a contents system based on the concept of a freebase that provides access to cultural-tourist information (Munoz Gil et al., 2011), in developing a tourism recommender system (Varga & Groza, 2011), and in other tourism applications (Ruiz-Martinez, Minarro-Gimenez, Castellanos-Nieves, Garcia-Sanchez, & Valencia-Garcia, 2011). An advantage of this tool is the availability of several language databases, especially the English lexicon. This software contains a comprehensive list of English vocabulary terms that can be used to identify hotel features from online reviews.

Suppose that we collect a data set, $R$, with $m$ review comments, $R = \{r_1, r_2, \ldots, r_m\}$. The process of identifying the hotel features mentioned in the reviews, $r_i \in R$, is performed in two major stages, as shown in Fig. 1.

3.1.1. Text processing

This step transforms the unstructured textual material into a more useful data format. In more detail, each review, $r_i$, is first loaded into a text tokenizing algorithm, wherein the stream of text is broken into words, phrases, symbols, or other meaningful elements called “tokens”. The tokens of each review are passed through a text filter, wherein capital letters are normalized to lower case. Tokens containing symbols or numbers are removed, because they are considered irrelevant to feature analysis. The remaining tokens are passed to a stemming process that reduces inflected words to their stem, base, or root form. For instance, a stemming algorithm can reduce the words “cleans”, “cleaning”, “cleanliness”, and “cleaned” to the root word “clean”. This process allows the user to extract features that have been mentioned using different word forms. A stemmed token list, $S^{(i)} = \{s_1^{(i)}, s_2^{(i)}, \ldots\}$, is constructed for each review, $r_i$, $i = 1, \ldots, m$, and saved to a processed-document database.

In hotel feature identification, a natural assumption is that English nouns are commonly used to refer to entities, such as hotel features (e.g., room, location, view, service, and staff). Therefore, we identify and construct a list of the stemmed nouns, $N = \{n_1, n_2, \ldots, n_o\}$, which appear in the review corpus, using the English lexicon of GATE. The lexicon resource is used as a lookup database, which contains approximately 63,000 commonly used English words. Each word is accompanied by a set of tags that helps determine the word type, such as noun, verb, or adjective. The list of nouns is used to identify candidate hotel features in the next stage.

3.1.2. Hotel feature candidate identification

In this step, we select interesting nouns, $n_j \in N$, for further analysis as potential hotel features. Specifically, a binary vector, $v^{(i)} = \{v_1^{(i)}, v_2^{(i)}, \ldots, v_o^{(i)}\}$, is constructed for each stemmed token list, $S^{(i)}$, where $v_j^{(i)}$ takes the value of 1 if $n_j \in S^{(i)}$, or 0 otherwise. The degree of interest of each noun, $n_j \in N$, is evaluated by a support value given by

\[
\mathrm{supp}(n_j) = \frac{\mathrm{count}(n_j)}{|R|}, \tag{3.1}
\]

where $\mathrm{count}(n_j)$ is the number of vectors $v^{(i)}$, $i = 1, \ldots, m$, with $v_j^{(i)} = 1$, and $|R|$ is the total number of records in the data set $R$. We use a user-specified minimum support, or support threshold, $\delta_s$, to measure the significance of the nouns in the review corpus. If a noun, $n_j$, satisfies $\mathrm{supp}(n_j) \geq \delta_s$, that noun is selected into the hotel feature candidate list; otherwise, it is removed.

The advantage of this method is that users do not need to provide a set of predefined keywords to identify and extract hotel features for analysis. Instead, a list of candidates is automatically constructed from the review comments. All potential features mentioned in the reviews are considered, and interesting candidates are returned. The support threshold, $\delta_s$, is set to eliminate insignificant features while retaining potentially interesting ones for subsequent analysis.

3.2. EPM

Efficiently discovering changes or trends in travelers' concerns is a challenging task for hotel managers, given the large number of features mentioned in online reviews. This section introduces the EPM concept to address this issue.

Emerging patterns are defined as item sets whose support increases significantly from one data set to another (Dong & Li, 1999). Let $F = \{f_1, f_2, \ldots, f_o\}$ be a set of items. A subset $X \subseteq F$ is called a $k$-item set, where $k = |X|$. Given a number of groups, $\{G_1, G_2, \ldots\}$, the support for an item set, $X \subset G_i$, is denoted as $\mathrm{supp}(X, G_i)$, which reflects how frequently $X$ appears in this group. The change in the support for $X$ from a group $G_i$ to a group $G_j$ is measured by a growth rate metric defined as

\[
\mathrm{GrowthRate}(X, G_i, G_j) =
\begin{cases}
0, & \text{if } \mathrm{supp}(X, G_i) = 0 \text{ and } \mathrm{supp}(X, G_j) = 0, \\
\infty, & \text{if } \mathrm{supp}(X, G_i) = 0 \text{ and } \mathrm{supp}(X, G_j) \neq 0, \\
\dfrac{\mathrm{supp}(X, G_j)}{\mathrm{supp}(X, G_i)}, & \text{otherwise.}
\end{cases}
\tag{3.2}
\]

Given $\delta_e > 1$ as a growth rate threshold, an item set, $X$, is called an emerging pattern if it satisfies the condition given by

\[
\max_{(i,j)} \mathrm{GrowthRate}(X, G_i, G_j) \geq \delta_e, \tag{3.3}
\]

where $\max_{(i,j)}$ means that groups $G_i$ and $G_j$ can be taken in either order, but only the order with the largest growth rate is compared with the growth rate threshold. When $\mathrm{GrowthRate}(X, G_i, G_j) = \infty$, $X$ is called a jumping emerging pattern, because it appears in one group, but not in the other.

Several issues must be considered when using EPM in hotel features analysis. Interpreting a hotel feature on its own, rather than as part of an item set containing many others, is easier and more meaningful. When the number of features is large, the use of item sets $X$ with a single hotel feature ($k = 1$) is suggested. Equation (3.3) does not take into account the order of groups $G_i$ and $G_j$, although this order is important in detecting changes in travelers' concerns over time. For example, suppose that the hotel reviews are grouped by year, so that $G_1$ contains reviews created in 2012 and $G_2$ contains reviews created in 2013. A growth rate computed from $G_1$ to $G_2$ reflects an increase in attention from 2012 to 2013, whereas the same growth rate computed from $G_2$ to $G_1$ reflects a decrease. Therefore, we define the concepts of positive and negative emerging patterns to distinguish between increasing and decreasing amounts of change in the concerns expressed by travelers.

Let $G_i$ be an initial group, and $G_j$ a target group ($i \neq j$). An item set $X$ is a positive emerging pattern if it satisfies Equation (3.3) and $\mathrm{supp}(X, G_i) < \mathrm{supp}(X, G_j)$. The growth rate, $\mathrm{GrowthRate}(X, G_i, G_j) = \mathrm{supp}(X, G_j)/\mathrm{supp}(X, G_i)$, is termed a positive growth rate. On the contrary, $X$ is a negative emerging pattern if it satisfies Equation (3.3) and $\mathrm{supp}(X, G_i) > \mathrm{supp}(X, G_j)$. The growth rate, $\mathrm{GrowthRate}(X, G_j, G_i) = \mathrm{supp}(X, G_i)/\mathrm{supp}(X, G_j)$, is termed a negative growth rate and is reported with a negative sign ($-$).

Below is an example that illustrates the use of EPM in the hotel context.

Example 1. A set of hotel features is defined to include price, room, service, telephone, and wireless Internet connection.
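Before turning to the sample data, the definitions in Equations (3.2) and (3.3), together with the positive/negative distinction, can be summarized in a few lines of code. The following is a minimal sketch for single-feature item sets, not the authors' implementation; the function names `growth_rate` and `classify` and the label strings are hypothetical, and the supports passed in at the end are illustrative values only.

```python
import math
from fractions import Fraction


def growth_rate(supp_i, supp_j):
    """Equation (3.2): growth rate of an item set from group G_i to group G_j."""
    if supp_i == 0 and supp_j == 0:
        return 0
    if supp_i == 0:
        return math.inf
    return supp_j / supp_i


def classify(supp_i, supp_j, delta_e):
    """Label a pattern given its supports in the initial (G_i) and target (G_j) groups.

    Returns (label, signed growth rate). Negative emerging patterns are reported
    with a negative sign, following the convention described in the text.
    """
    forward = growth_rate(supp_i, supp_j)   # G_i -> G_j
    backward = growth_rate(supp_j, supp_i)  # G_j -> G_i
    # Equation (3.3): only the larger of the two orders is compared with delta_e.
    if max(forward, backward) < delta_e:
        return "not emerging", forward
    if supp_i < supp_j:
        label = "jumping (positive)" if forward == math.inf else "positive"
        return label, forward
    label = "jumping (negative)" if backward == math.inf else "negative"
    return label, -backward


# Illustrative supports, written as exact fractions to avoid rounding noise.
print(classify(Fraction(2, 5), Fraction(4, 5), delta_e=2))
print(classify(Fraction(3, 5), Fraction(1, 5), delta_e=2))
print(classify(0, Fraction(3, 5), delta_e=2))
```

Exact fractions are used here only so that ratios such as 0.6/0.2 compare cleanly against the threshold; plain floats would work with a small tolerance.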
Next, a sample data set is constructed (Table 2). Each record is represented as a vector, $v^{(i)}$, where each element, $v_j^{(i)}$, takes a value of 1 if its corresponding feature is mentioned in review $r_i$ and 0 otherwise. The year attribute represents the dates of the reviews, and indicates the group to which a record belongs, namely, 2012 or 2013. The reviews are grouped by year, and the support for each hotel feature is computed in each group. Growth rates are also computed to reflect changes in traveler concerns (Table 3).

Table 2. A sample data set.

ID    Price  Room  Service  Telephone  Wireless  Year
r1    1      1     1        1          0         2012
r2    1      0     0        1          0         2012
r3    1      1     0        0          0         2012
r4    1      1     1        1          0         2012
r5    1      1     0        0          0         2012
r6    1      0     1        0          1         2013
r7    1      1     1        0          0         2013
r8    1      1     0        1          1         2013
r9    1      1     1        0          0         2013
r10   1      1     1        0          1         2013

In this example, we set the emerging threshold to $\delta_e = 2$. Table 3 shows the results. Hotel service shows an emerging increase in interest from 2012 to 2013, as indicated by its growth rate of 2.0. Room telephone has received significantly less attention from travelers over time, as indicated by a negative growth rate of -3.0. A jumping emerging pattern is shown for the wireless facility in 2013, with a growth rate of $\infty$. Price and room are not emerging features, because their growth rates are less than the emerging threshold.

Hotel features such as telephone and wireless connection are mentioned less often in the reviews; such features do not receive much attention in the traditional approach, because those studies focus on popular features. EPM targets any major change in the support values between groups rather than the values themselves. Thus, EPM can effectively identify interesting changes in travelers' attention. Standard chi-square ($\chi^2$) tests can also be applied to verify the significance of group differences in real-life applications.

3.3. Summary

The overall analysis flow for hotel features is summarized in this section. The review processing method is applied to an online review corpus to construct a set of features. EPM is then applied to identify the emerging hotel features in the list. As discussed in Section 2.3, hotel managers are also interested in specifically targeted features, which enable them to create products that are more competitive. We group the reviews that mention a feature into a subset and then perform segmentation analysis. We then examine the proportion of these reviews across the demographic characteristics of travelers to identify and report the features of specific interest.
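The review-processing part of this flow (Sections 3.1.1 and 3.1.2) can be sketched end to end as follows. This is a simplified illustration under stated assumptions rather than the GATE pipeline used in the study: the regular-expression tokenizer, the toy suffix-stripping `stem` function, and the small `NOUN_LEXICON` set are hypothetical stand-ins for GATE's tokenizer, stemmer, and English lexicon, and the three reviews are invented.

```python
import re
from collections import Counter

# Hypothetical stand-in for the GATE English lexicon: a handful of (stemmed) nouns.
NOUN_LEXICON = {"room", "staff", "location", "service", "price", "wifi", "breakfast"}


def stem(token):
    """Toy suffix stripper (an assumption, not the stemmer used in the paper)."""
    for suffix in ("liness", "ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def stemmed_tokens(review):
    """Tokenize, lower-case, drop tokens containing symbols or digits, then stem."""
    tokens = re.findall(r"\w+|[^\w\s]", review.lower())
    return {stem(t) for t in tokens if t.isalpha()}


def candidate_features(reviews, delta_s):
    """Equation (3.1): keep nouns whose support is at least delta_s."""
    counts = Counter()
    for review in reviews:
        # Set intersection plays the role of the binary vector v(i):
        # a noun is counted at most once per review.
        counts.update(stemmed_tokens(review) & NOUN_LEXICON)
    total = len(reviews)
    return {noun: c / total for noun, c in counts.items() if c / total >= delta_s}


reviews = [
    "The rooms were clean, and the staff was friendly!",
    "Great location; room 402 had free wifi.",
    "Staff cleaned the room daily.",
]
print(candidate_features(reviews, delta_s=0.5))
```

Counting each noun once per review mirrors the binary vectors of Section 3.1.2, so a support value is the share of reviews mentioning the noun rather than the share of tokens.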
For instance, the users are grouped according to their travel mode, namely, business, couple, family, friend, and solo. In this way, a hotel manager can identify facilities, such as the Internet and telephones, as most interesting to business travelers. Given a feature, $f_i$, whose users are segmented into $n$ groups with proportions (in percent) $\{G_1^{(i)}, G_2^{(i)}, \ldots, G_n^{(i)}\}$, $f_i$ is identified as a feature of specific interest if $\max_{j=1,\ldots,n} \{G_j^{(i)}\} \geq \lambda$, where $\lambda$ is a specific-interest threshold subject to the constraint $\lambda > 100/n$. Here, $G_j^{(i)}$ is the proportion of group $j = 1, \ldots, n$ for feature $f_i$, and only the group with the largest proportion is compared against the threshold, $\lambda$. The effectiveness of the proposed method is demonstrated in a case study reported in the subsequent section.

Table 3. Supports and growth rates of hotel features.

Feature     Support (2012)  Support (2013)  Growth rate
Price       1.0             1.0             1.0
Room        0.8             0.8             1.0
Service     0.4             0.8             2.0
Telephone   0.6             0.2             -3.0
Wireless    0.0             0.6             ∞

4. Experiment and analysis

This section first describes our experimental data set, which we collected from online hotel reviews. It also describes the experimental design and analysis, presents a summary, and analyzes the managerial implications. Finally, it presents suggestions to hotel managers to help them improve their products and services. In turn, these will enable the managers to better meet the expectations of international travelers.

4.1. Data collection and experimental design

We collect the data set used in this paper from TripAdvisor (www.tripadvisor.com), one of the most popular travel review websites. This website has been widely used as a data resource for research on hotel selection criteria (Liu, Law, et al., 2013; Liu, Shi, et al., 2013; Li et al., 2013). We use the professional data extraction software, Visual Web Ripper (www.visualwebriper.com), to extract review content. We focus the data extraction process on reviews of hotels located in Hong Kong, Singapore, Shanghai, Bangkok, and Sydney, because these cities are major Asia Pacific international tourist destinations. A total of 1740 hotels are included for extraction, ranging from one- to five-star hotels based on TripAdvisor's ratings. The software navigates each travel review and extracts its text and date of posting, together with demographic data about the traveler, such as travel mode (i.e., business, couple, family, friends, or solo) and country of origin.
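With travel mode available for each extracted review, the feature-of-specific-interest criterion from Section 3.3 can be sketched as below. This is one reading of the criterion, assuming that the group proportions are percentage shares of the reviews mentioning a feature (which is what the constraint $\lambda > 100/n$ suggests); the function name `specific_interest` and the review counts are hypothetical.

```python
def specific_interest(group_counts, lam):
    """Return (group, share in percent) if the largest group share reaches lam, else None.

    group_counts maps each traveler group to the number of reviews in that group
    that mention the feature; lam is the specific-interest threshold in percent.
    """
    n = len(group_counts)
    # The constraint lam > 100/n rules out thresholds that an even split would meet.
    if lam <= 100 / n:
        raise ValueError("lam must be greater than 100/n")
    total = sum(group_counts.values())
    if total == 0:
        return None
    shares = {group: 100 * count / total for group, count in group_counts.items()}
    top_group = max(shares, key=shares.get)
    if shares[top_group] >= lam:
        return top_group, shares[top_group]
    return None


# Invented counts of reviews mentioning one feature, split by travel mode.
mentions = {"business": 60, "couple": 15, "family": 10, "friends": 10, "solo": 5}
print(specific_interest(mentions, lam=40))
```

Raising an error when the threshold does not exceed 100/n is a design choice of this sketch; the text only states the constraint, not how a violation should be handled.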
In tourism research, culture is a well-known and important factor that influences behavior and decision-making (Reisinger & Turner, 1997; Tsai, Yeung, & Yim, 2011), particularly in hotel evaluation (Leung, Lee, & Law, 2011). International travelers from different continents are also likely to have different backgrounds (Khadaroo & Seetanah, 2008). Therefore, we group reviewers according to their continent of origin, which is convenient for the analysis. The majority of reviewers in our data set come from North America, Europe, Asia, and Oceania, so we consider only these continents in this study. Most of the collected reviews were posted in recent years (i.e., 2010–2013), with a few posted in 2009 or earlier. For convenience, we refer to the latter reviews as the 2009 group. We remove records with missing attributes, which reduces the data set to 118,300 records. A detailed description of the data set is given in Table 4. Our case study is organized into several parts to address the research objectives.

Table 4. Description of the collected data set.

Attribute   Description                                               Value             Percentage (%)
COMMENT     Text comment of each hotel review                         Review text       100
YEAR        Year when the reviews were posted online                  2013              26.74
                                                                      2012              39.85
                                                                      2011              17.50
                                                                      2010              7.5
                                                                      2009 and before   8.39
ORIGIN      Location of travelers according to continents of origin   Asia              31.26
                                                                      Europe            16.71
                                                                      North America     21.20
                                                                      Oceania           30.03
GROUP       Travel mode of travelers                                  Business          24.73
                                                                      Couple            35.91
                                                                      Family            19.36
                                                                      Friends           10.66
                                                                      Solo              9.34

4.1.1. Hotel feature list construction

We first demonstrate the effectiveness of the proposed text processing method in automatically identifying hotel features from text reviews. Given that hotel managers are interested in travelers' current concerns, we perform this experiment only on the reviews posted in 2013 (i.e., 31,639 records). From these reviews, a list of critical hotel features is constructed.

4.1.2. Emerging hotel feature identification

Gaining insights into the behavioral tendencies of customers is important for hotel managers, as they can use such information in business planning and decision-making. We first identify hotel features that have been subject to changes in travelers' attention over the past five years (from 2009 to 2013). The reviews are grouped according to the year in which they were posted, and we apply EPM to these groups to detect emerging hotel features. The emerging threshold is set to $\delta_e = 1.1$. A small $\delta_e$ value is used because our data set contains many records, so even a slight change in the growth rate corresponds to a large change in the number of records. The $\chi^2$ test is used to validate the significance of the changes detected by the EPM algorithm. We then compute the growth rate of the identified emerging features for each pair of consecutive years from 2009 to 2013 to identify trends.

4.1.3. Specifically interested features identification

This case study also identifies features of specific interest that are relevant to a particular group of travelers. Given that travel mode and origin influence traveler preference, we use the attributes ORIGIN and GROUP in Table 4 to construct user profiles. The reviews are then grouped according to these demographic profiles, and a segmentation analysis is applied as described in Section 3.3. The $\lambda$ value is set to 30%. Reviews from 2012 and 2013 are used in this analysis because they represent the most recent data.

4.2. Results and analysis

4.2.1. Hotel feature list construction

Here, we apply the proposed review processing method to the reviews posted in 2013. A stemmed noun list is constructed based on the English lexicon. The support threshold $\delta_s$ is used to determine whether a noun in the list is interesting enough for further analysis. We first examine the effect of different support thresholds between 0 and 0.1 on the number of hotel features returned as candidates. Fig. 2 shows that the algorithm returns 5523 features with $\delta_s = 0$, which is the total number of nouns in the stemmed list. The number of hotel feature candidates drops gradually to 419 when $\delta_s$ is set to 0.01, then decreases slightly with higher support thresholds. When $\delta_s = 0.1$, only 47 feature candidates are returned.
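The thresholding step described above can be illustrated with a minimal sketch. It assumes that the support of a noun is the fraction of reviews that mention it at least once, and it replaces the paper's stemming and lexicon lookup with naive whitespace tokenization. The function names `feature_supports` and `select_candidates`, the sample reviews, and the noun list are hypothetical and serve only to show how a threshold $\delta_s$ prunes the candidate list.

```python
from collections import Counter

def feature_supports(reviews, noun_list):
    """Return the fraction of reviews that mention each noun at least once."""
    nouns = set(noun_list)
    if not reviews:
        return {noun: 0.0 for noun in nouns}
    counts = Counter()
    for text in reviews:
        # Build a set so that a noun counts once per review, however often it recurs.
        tokens = {tok.strip(".,!?;:()\"'").lower() for tok in text.split()}
        counts.update(tokens & nouns)
    total = len(reviews)
    return {noun: counts[noun] / total for noun in nouns}

def select_candidates(supports, delta_s):
    """Keep nouns whose support reaches delta_s, highest support first."""
    kept = [(noun, s) for noun, s in supports.items() if s >= delta_s]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical toy reviews and noun list (not taken from the paper's data set).
reviews = [
    "Great location, friendly staff.",
    "The room was clean and the staff were helpful.",
    "Room had a river view; location is close to the station.",
    "Breakfast was average but the location is perfect.",
]
nouns = ["location", "staff", "room", "station", "breakfast", "pool"]

supports = feature_supports(reviews, nouns)
candidates = select_candidates(supports, delta_s=0.5)
print(candidates)
```

Setting `delta_s=0` in this sketch returns every noun in the list, which mirrors the behavior reported for a zero threshold; raising the threshold keeps only the most frequently mentioned nouns.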
Notably, candidate generation is an automated process; thus, users may examine the output further to select their features of interest. This approach should be feasible in practice because the number of candidates is usually small. The advantage of this work is that hotel features of interest are identified directly from reviews rather than from a predefined set, which allows a more comprehensive and relevant list of hotel features to be constructed.

Fig. 2. Output hotel feature candidates.

Next, we select $\delta_s = 0.05$ to generate a list of hotel feature candidates for further analysis in our case study. This value is commonly used for evaluating the extent to which items in a data set are deemed interesting (Law, Rong, Vu, Li, & Lee, 2011; Li, Law, Rong, & Vu, 2010). This analysis resulted in 111 candidates. We inspect the list and find that several popular hotel features are included (Fig. 3), including room, staff, location, breakfast, service, and cleanliness. This result is reasonably consistent with previous studies (Chaves et al., 2012; Stringam & Gerdes, 2010), which indicates the effectiveness of our approach. Several other features of interest to travelers are also identified. These include detailed hotel aspects, such as the lobby, lounge, door, coffee, and tea, and elements of the surrounding environment, such as roads, streets, parks, rivers, and spaces. Factors such as stations, airports, taxis, trains, and access are particularly significant to reviewers. Their support values are similar to or higher than those of some popular features, such as the Internet, clubs, receptions, prices, and bars. These results are interesting because these terms concern transportation, which is not directly related to the hotel domain, yet they are important in the hotel evaluation process of international travelers. Prior work on hotel features has focused mainly on popular hotel features and less on such ancillary aspects. We use the list of 39 features (Fig. 3) for further analysis.

4.2.2. Identification of emerging hotel features

Next, we identify features for which the level of interest among travelers has changed over the past five years. The year 2009 is set as the initial group, whereas 2013 is set as the target group. The data sets for these groups are encoded as input to the EPM algorithm. Only features whose growth rates exceed the emerging threshold $\delta_e = 1.1$ are considered.
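The growth-rate filter can be illustrated with a short sketch. It assumes the ratio-of-supports definition of growth rate from Dong and Li (1999), divides the larger support by the smaller one so that the ratio is at least 1 in either direction, and records the direction as a separate label. The function names are hypothetical, and the sample supports are the rounded 2009 and 2013 values printed in Table 5 plus one invented stable feature; because the printed supports are rounded, the computed ratios differ slightly from the published growth rates.

```python
import math

def growth_rate(support_initial, support_target):
    """Return (ratio, direction) for a feature between two groups.

    ratio is the larger support divided by the smaller one; direction is
    "positive" if the target group has the larger support, "negative" if
    the initial group does, and "none" if the supports are equal.
    """
    if support_initial == support_target:
        # Covers both "equal and nonzero" (ratio 1) and "both zero" (ratio 0).
        return (0.0 if support_initial == 0 else 1.0), "none"
    if support_target > support_initial:
        ratio = math.inf if support_initial == 0 else support_target / support_initial
        return ratio, "positive"
    ratio = math.inf if support_target == 0 else support_initial / support_target
    return ratio, "negative"

def emerging_features(supports, delta_e):
    """Keep features whose growth rate exceeds delta_e in either direction."""
    result = {}
    for name, (s_initial, s_target) in supports.items():
        ratio, direction = growth_rate(s_initial, s_target)
        if direction != "none" and ratio > delta_e:
            result[name] = (round(ratio, 3), direction)
    return result

# (2009 support, 2013 support); club and price use rounded Table 5 values.
supports = {
    "club": (0.039, 0.077),
    "price": (0.185, 0.122),
    "stable_example": (0.200, 0.205),  # hypothetical, not from the paper
}
found = emerging_features(supports, delta_e=1.1)
print(found)
```

With these inputs the sketch keeps club as a positive pattern and price as a negative one, and it drops the stable example because its ratio stays below the threshold.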
If a feature has a smaller support value in the 2009 group than in the 2013 group, then it is selected as a positive emerging pattern; otherwise, it is selected as a negative emerging pattern and denoted with a negative sign (¬) in front of its growth rate value. This approach produces 16 emerging hotel features (Table 5). Table 5 shows some interesting patterns in travelers' concerns as reflected in their reviews. The growth rate metric is the factor of interest for detecting change in the EPM context. The $\chi^2$ test results, with p-values ≤ 0.05, demonstrate statistically significant changes in the support values between 2009 and 2013.

Fig. 3. Identified popular hotel features.

Table 5. Identified emerging hotel features.

Feature    Support (2009)   Support (2013)   Growth rate (¬ = negative)   χ²       p-value   Pattern ID
Club       0.039            0.077            1.994                        176.96   0.000     P1
Lounge     0.065            0.089            1.383                        60.43    0.000     P2
River      0.039            0.053            1.351                        30.17    0.000     P3
Pool       0.186            0.216            1.159                        39.94    0.000     P4
Service    0.306            0.353            1.155                        75.39    0.000     P5
Dinner     0.050            0.058            1.150                        8.16     0.004     P6
Food       0.207            0.230            1.109                        22.30    0.000     P7
View       0.176            0.194            1.105                        16.79    0.000     P8
Price      0.185            0.122            ¬1.520                       256.43   0.000     P9
Street     0.143            0.096            ¬1.495                       177.57   0.000     P10
Taxi       0.133            0.094            ¬1.413                       123.32   0.000     P11
Bathroom   0.245            0.176            ¬1.396                       235.54   0.000     P12
Park       0.078            0.056            ¬1.383                       60.66    0.000     P13
Clean      0.371            0.293            ¬1.265                       212.05   0.000     P14
Bed        0.224            0.190            ¬1.174                       52.13    0.000     P15
Location   0.479            0.426            ¬1.125                       86.84    0.000     P16

Some interesting findings for the positive emerging patterns are presented here. First, international travelers focus more on the club feature, as shown by the growth rate of 1.994 (Pattern P1); specifically, the proportion of reviews mentioning this facility has nearly doubled since 2009. Some hotel areas, such as the lounge and pool, have also received more attention (P2 and P4, respectively). Rivers and views appear to attract more interest from travelers in 2013, with growth rates of 1.351 and 1.105 (P3 and P8, respectively). Pattern P5 shows that travelers are now more concerned about service. Slightly increased attention to dinner and food provided by hotels was also identified. These positive patterns can direct the attention of hotel managers to features that are becoming "hot" among international travelers, so that managers can make appropriate changes to attract more customers.

In comparison, negative emerging patterns can help hotel managers save effort and direct investment away from areas that matter less to customers. Some popular features, such as price and cleanliness, are of less concern over time, as indicated by the negative growth rates in P9 and P14. Interest in indoor facilities [e.g., bathrooms and beds (P12 and P15, respectively)] and outdoor factors [e.g., streets, taxis, and parks (P10, P11, and P13, respectively)] has also decreased.

Given these emerging features, hotel managers need to know, for planning purposes, whether trends exist in the areas that travelers focus on. Thus, we also compute the growth rate of each feature for each pair of consecutive years from 2009 to 2013. This process follows the same procedure as that used in the previous case. The results are shown in Table 6. Here, the signs of the growth rates, rather than the values, are of interest. A feature is marked as having an increasing trend if the growth rates for all year pairs are positive, and as having a decreasing trend if they are all negative. Features with mixed signs are marked as having no clear trend.

We summarize the findings for the emerging trends here. The trend for clubs and food is increasing, as indicated by the positive growth rates across all year pairs (T1 and T7, respectively). Clear downward trends are found for the negative emerging hotel features (T9 to T16), as shown by the negative growth rates over time. A slight drop for lounges, dinners, and views can be observed from 2009 to 2010; however, they have gained more attention since then.

4.2.3. Identification of specifically interested features

To identify features of specific interest, a subset of the data is generated for each of the 39 features on the list constructed earlier. Reviews posted in 2012 and 2013 are considered because these represent the most recent data. Features whose segmentations satisfy $\max_{j=1,\ldots,n} G_j^{(i)} \geq \lambda$ are selected because they are of special interest to a particular group of travelers. We first perform a segmentation analysis of these features according to traveler origin, which results in the identification of six features (Fig. 4). Asian and Oceanian travelers are most concerned with bed quality, at 67%. Lobby areas of hotels receive particular interest from Asians. Reception areas are the top concern of Oceanian travelers, whereas very few North American travelers focus on this feature. Oceanian travelers are also the group most concerned about noise (42%), parking (37%), and dinner (35%). We then perform segmentation according to travel mode and identify six features (Fig. 5).
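The selection rule applied in this subsection can be sketched as follows. The sketch assumes that $G_j^{(i)}$ is the percentage share of group $j$ among the reviews that mention feature $f_i$; the function name `specific_interest` and the review counts are hypothetical and chosen only to illustrate the rule with $\lambda = 30$ and five travel-mode groups.

```python
def specific_interest(group_counts, lam):
    """Return (group, share) if the largest group share reaches lam percent, else None."""
    n = len(group_counts)
    if lam <= 100 / n:
        # A uniform split already gives every group 100/n percent,
        # so a threshold at or below that level would select every feature.
        raise ValueError("lam must exceed 100/n")
    total = sum(group_counts.values())
    if total == 0:
        return None
    shares = {group: 100 * count / total for group, count in group_counts.items()}
    top_group = max(shares, key=shares.get)
    if shares[top_group] >= lam:
        return top_group, round(shares[top_group], 1)
    return None

# Hypothetical numbers of reviews mentioning each feature, by travel mode.
mentions = {
    "internet": {"business": 360, "couple": 280, "family": 150, "friends": 110, "solo": 100},
    "breakfast": {"business": 210, "couple": 220, "family": 200, "friends": 190, "solo": 180},
}
results = {feature: specific_interest(counts, lam=30) for feature, counts in mentions.items()}
print(results)
```

In this toy input, the Internet feature is selected for business travelers because their share exceeds the threshold, whereas breakfast is not selected because its mentions are spread almost evenly across the five groups.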
Couples focus more on outdoor features, including rivers and views, as well as some indoor aspects, such as dinner, tea, and bed. Internet facilities receive high levels of attention from business travelers (36%) and couples (28%).

Table 6. Emerging trends of hotel features.

Feature    Growth rate (¬ = negative)                                Trend            χ²       p-value   Trend ID
           2009 vs. 2010  2010 vs. 2011  2011 vs. 2012  2012 vs. 2013
Club       1.111          1.434          1.205          1.039          Increasing       309.02   0.000     T1
Lounge     ¬1.047         1.196          1.221          ¬1.009         No clear trend   168.80   0.000     T2
River      ¬1.040         1.050          1.142          1.171          No clear trend   78.69    0.000     T3
Pool       ¬1.014         ¬1.003         1.134          1.039          No clear trend   126.37   0.000     T4
Service    1.072          1.007          1.074          ¬1.003         No clear trend   126.65   0.000     T5
Dinner     ¬1.151         1.139          1.138          1.020          No clear trend   43.72    0.004     T6
Food       1.041          1.012          1.049          1.003          Increasing       36.40    0.000     T7
View       ¬1.142         1.064          1.103          1.076          No clear trend   122.16   0.000     T8
Price      ¬1.152         ¬1.111         ¬1.091         ¬1.090         Decreasing       311.90   0.000     T9
Street     ¬1.145         ¬1.117         ¬1.097         ¬1.066         Decreasing       224.38   0.000     T10
Taxi       ¬1.206         ¬1.030         ¬1.082         ¬1.051         Decreasing       141.39   0.000     T11
Bathroom   ¬1.141         ¬1.094         ¬1.041         ¬1.074         Decreasing       269.61   0.000     T12
Park       ¬1.072         ¬1.161         ¬1.086         ¬1.023         Decreasing       92.64    0.000     T13
Clean      ¬1.126         ¬1.068         ¬1.038         ¬1.012         Decreasing       262.14   0.000     T14
Bed        ¬1.056         ¬1.079         ¬1.002         ¬1.032         Decreasing       63.72    0.000     T15
Location   ¬1.065         ¬1.053         ¬1.003         ¬1.001         Decreasing       114.25   0.000     T16

Fig. 4. User segmentation by travelers' origin.

Fig. 5. User segmentation by travel mode.

5. Discussion and managerial implications

Section 4.2 describes a list of 39 hotel features that are currently of concern to international travelers. The features identified are in line with those emerging from previous studies, which demonstrates the effectiveness of our proposed method. Several outdoor features (e.g., streets, parks, and rivers) and transportation features (e.g., airports, taxis, and trains) are also perceived as very important by travelers when evaluating hotels. With such information, hotel managers can develop more effective marketing campaigns by presenting these features in their advertisements (e.g., showing pictures of beautiful outdoor settings or mentioning the convenience of local transportation).

The identification of emerging hotel features also highlights changes in the topics of concern to travelers over the past five years. Hotel facilities, such as clubs, lounges, and pools, now attract significantly more attention. Therefore, managers can develop investment plans focused on these positive emerging features, while directing efforts and resources away from those that receive less attention. The decision-making abilities of managers can be further enhanced by trend analysis. For instance, long-term plans can be made for adding club facilities or improving food quality based on the apparent increasing trend in travelers' interest in these features.

Section 4 also focuses on some features of specific interest, which can help managers develop effective and targeted improvement plans. For instance, bed designs can be modified to fit the expectations of Asian or Oceanian travelers, because they are most concerned about this issue. Outdoor features, such as rivers or landscape views, should be given special consideration when arranging accommodations for couples. Internet facilities can also be improved to meet the expectations of their main traveler groups, specifically business travelers and couples.

The support threshold used in our case study generates relatively few hotel feature candidates. Nevertheless, the list encompasses most of the features of current concern to travelers. Depending on users' needs and aspirations, a lower support threshold can be set to generate a list that includes more hotel features. The present case study demonstrates the use of EPM in the context of the tourism industry. The method is applied to a large-scale data set to identify general changes and trends in travelers' hotel preferences in the Asia Pacific region. The user profile analysis of previously reported hotel features is conducted on only two popular attributes, namely, traveler origin and travel mode. In the future, more attributes and more detailed groupings can be used to construct profiles in real-life applications.

6. Conclusions

Hotel managers who are looking at product design and development should understand travelers' concerns so that they can enhance business performance. Managers are interested in emerging issues or trends so that they can make appropriate adjustments to their plans, which can save internal resources and maximize returns on investment. Hotel managers also need to identify features of interest to specific groups to enable them to develop more efficient hotel improvement plans and meet their guests' expectations. Despite considerable efforts made by researchers, generating insights that help tourism managers address travelers' concerns and create a competitive hotel industry remains a challenging task. Current research has been unable to demonstrate an effective method for addressing such demands comprehensively. To fill this research gap, we applied the EPM concept to discover changes and trends in travelers' attention. A set of features of specific interest and their target users have been identified. The analysis reported in this paper is based on a large-scale data set of online reviews, which is a promising data source because the concerns expressed by travelers on these websites closely reflect those in real life.
An extension of this work could identify which hotel features are usually of interest to a specific type of traveler. This way, managers can design their travel packages to target different groups. More attributes can also be considered to construct more detailed user profiles. The method introduced in this paper is a general technique that can identify specific features from online reviews. This means that the proposed method can be used to pinpoint issues of concern to travelers in other tourism contexts, such as airlines, restaurants, or other attractions. Finally, surveys in future research with a sufficient number of respondents can, and probably should, investigate hotel managers' views in terms of industrial applications. Industry practitioners with various personal and business backgrounds will likely have different views on this approach.

Acknowledgment

This project was partly supported by a research grant funded by the Hong Kong Polytechnic University, Hong Kong Scholars Program, and a research grant funded by the National Natural Science Foundation of China (71361007).

References

Albaladejo-Pina, I. P., & Diaz-Delfa, M. T. (2009). Tourist preferences for rural house stays: evidence from discrete choice modeling in Spain. Tourism Management, 30(6), 805–811.
Ananth, M., DeMicco, F. J., Moreo, P. J., & Howey, R. M. (1992). Marketplace lodging needs of mature travellers. The Cornell Hotel and Restaurant Administration Quarterly, 33(4), 12–24.
Ariffin, A. A. M., & Maghzi, A. (2012). A preliminary study on customer expectations of hotel hospitality: influences of personal and hotel factors. International Journal of Hospitality Management, 31(1), 191–198.
Banyai, M. (2012). Travel blogs: a reflection of positioning strategies? Journal of Hospitality Marketing and Management, 21(4), 421–439.
Bjorkelund, E., Burnett, T. H., & Norvag, K. (2012). A study of opinion mining and visualization of hotel reviews. In Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services, Bali, Indonesia (pp. 229–238).
Bosangit, C., Dulnuan, J., & Mena, M. (2012). Using travel blogs in examining postconsumption behavior of tourists. Journal of Vacation Marketing, 18(3), 207–219.
Bulchand-Gidumal, J., Melian-Gonzalez, S., & Lopez-Valcarcel, B. G. (2011). Improving hotel ratings by offering free wi-fi. Journal of Hospitality and Tourism Technology, 2(3), 235–246.
Capriello, A., Mason, P. R., Davis, B., & Crotts, J. C. (2013). Farm tourism experiences in travel reviews: a cross-comparison of three alternative methods for data analysis. Journal of Business Research, 66(6), 778–785.
Carson, D. (2008). The ‘blogosphere’ as a market research tool for tourism destinations: a case study of Australia's northern territory. Journal of Vacation Marketing, 14(2), 111–119.
Chaves, M. S., Gomes, R., & Pedron, C. (2012). Analysing reviews in the web 2.0: small and medium hotels in Portugal. Tourism Management, 33(5), 1286–1287.
Choi, T. Y., & Chu, R. (2001). Determinants of hotel guests’ satisfaction and repeat patronage in the Hong Kong hotel industry. International Journal of Hospitality Management, 20(3), 277–297.
Davidson, L., & Skinner, H. (2010). I spy with my little eye: a comparison of manual versus computer-aided analysis of data gathered by projective techniques. Qualitative Market Research: An International Journal, 13(4), 441–459.
Dong, G., & Li, J. (1999). Efficient mining of emerging patterns: discovering trends and differences. In KDD '99 Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, CA, USA (pp. 43–52).
Fan, H., Fan, M., Ramamohanarao, K., & Liu, M. (2006). Further improving emerging pattern based classifier via bagging. In Proceeding of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining (PAKDD), Singapore (pp. 91–96).
Fan, H., & Ramamohanarao, K. (2003). A Bayesian approach to use emerging patterns for classification. In Proceeding of the 14th Australasian database conference (ADC-03), Adelaide, Australia (pp. 39–48).
Gan, M., & Dai, H. (2009). Efficient mining of top-k breaker emerging sub-graph patterns from graph data set. In Proceeding of the 8th Australasian Data Mining Conference (AusDM), Melbourne, Australia (pp. 183–191).
Hsieh, L.-F., Lin, L.-H., & Lin, Y.-Y. (2008). A service quality management architecture for hot spring hotels in Taiwan. Tourism Management, 29(3), 429–438.
Huang, Z., Gan, C., Lu, X., & Huan, H. (2013). Mining the changes of medical behaviors for clinical pathways. Studies in Health Technology and Informatics, 192(1–2), 117–121.
Huettel, S. (2010). Technology, consumer preferences changing hotel business, Westin Tampa Bay manager says. Florida, United States: Tampa Bay Times. URL http://www.tampabay.com/news/business/tourism/technology-consumer-preferences-changing-hotel-business-westin-tampa-bay/1125506. Retrieved 13 July 2013.
Khadaroo, J., & Seetanah, B. (2008). The role of transport infrastructure in international tourism development: a gravity model approach. Tourism Management, 29(5), 831–840.
Kim, J. K., Song, H. S., & Kim, H. K. (2005). Detecting the change of customer behavior based on decision tree analysis. Expert Systems, 22(4), 193–205.
Law, R., Rong, J., Vu, H. Q., Li, G., & Lee, H. A. (2011). Identifying changes and trends in Hong Kong outbound tourism. Tourism Management, 32(5), 1106–1114.
Lento, X., Park, J., Park, O., & Lehto, M. R. (2007). Text analysis of consumer reviews: the case of virtual travel firms. In M. J. Smith, & G. Salvendy (Eds.), Human interface and the management of information. Methods, techniques and tools in information design (pp. 490–499). Berlin Heidelberg: Springer.
Leung, D., Lee, H. A., & Law, R. (2011). The impact of culture on hotel ratings: analysis of star-rated hotels in China. Journal of China Tourism Research, 7(3), 243–262.
Li, J., Dong, G., & Ramamohanarao, K. (2000). Instance-based classification by emerging patterns. In Proceedings of the 14th European Conference on Principles and Practice of Knowledge Discovery in Database (PKDD-2000), Lyon, France (pp. 191–200).
Li, J., Dong, G., & Ramamohanarao, K. (2001). Making use of the most expressive jumping emerging patterns for classification. Knowledge and Information Systems, 3(2), 1–29.
Li, G., Law, R., Rong, J., & Vu, H. Q. (2010). Incorporating both positive and negative association rules into the analysis of outbound tourism in Hong Kong. Journal of Travel & Tourism Marketing, 27(8), 812–828.
Li, G., Law, R., Vu, H. Q., & Rong, J. (2013). Discovering the hotel selection preferences of Hong Kong inbound travelers using the Choquet integral. Tourism Management, 36, 321–330.
Li, J., Liu, H., Downing, J. R., Yeoh, A. E.-J., & Wong, L. (2003). Simple rules underlying gene expression profiles of more than six subtypes of acute lymphoblastic leukemia (all) patients. Bioinformatics, 19(1), 71–78.
Li, J., & Wong, L. (2002). Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns. Bioinformatics, 18(5), 725–734.
Li, J., & Yang, Q. (2007). Strong compound-risk factors: efficient discovery through emerging patterns and contrast sets. IEEE Transactions on Information Technology in Biomedicine, 11(5), 544–552.
Liu, S., Law, R., Rong, J., Li, G., & Hall, J. (2013). Analyzing changes in hotel customers' expectations by trip mode. International Journal of Hospitality Management, 34, 359–371.
Liu, Q., Shi, P., & Hu, Z. (2013). Fast algorithms for mining strong jumping emerging patterns using the contrast pattern tree. ICIC Express Letters, Part B: Applications, 4(1), 121–128.
Liu, Q., Shi, P., Hu, Z., & Zhang, Y. (2014). A novel approach of mining strong jumping emerging patterns based on BSC-tree. International Journal of System Science, 45(3), 598–615.
Lockyer, T. (2005). Understanding the dynamics of the hotel accommodation purchase decision. International Journal of Contemporary Hospitality Management, 17(6), 481–492.
Mack, R., Blose, J. E., & Pan, B. (2008). Believe it or not: credibility of blogs in tourism. Journal of Vacation Marketing, 14(2), 133–144.
Merlo, E. M., & de Souza Joao, I. (2011). Consumers attribute analysis of economic hotels: an exploratory study. African Journal of Business Management, 5(21), 8410–8416.
Munoz Gil, R., Aparicio, F., De Buenaga, M., Gachet, D., Puertas, E., Giraldez, I., et al. (2011). Tourist face: a content system based on concepts of freebase for access to the cultural-tourist information. In Proceedings of the 16th International Conference on Applications of Natural Language to Information System (NLDB 2011), Alicante, Spain (pp. 300–304).
Pan, B., MacLaurin, T., & Crotts, J. C. (2007). Travel blogs and the implications for destination marketing. Journal of Travel Research, 46(1), 35–45.
Park, J. H., Lee, H. G., & Park, J. H. (2010). Real-time diagnosis system using incremental emerging pattern mining. In Proceedings of the 5th International Conference on Ubiquitous Information Technologies and Applications (CUTE2010), Sanya, China (pp. 1–5).
Rayson, P., & Garside, R. (2000). Comparing corpora using frequency profiling. In Proceedings of the Workshop on Comparing Corpora. Hong Kong, China (pp. 1–6).
Reisinger, Y., & Turner, L. (1997). Cross-cultural differences in tourism: Indonesian tourists in Australia. Tourism Management, 18(3), 139–147.
Ruiz-Martinez, J. M., Minarro-Gimenez, J. A., Castellanos-Nieves, D., Garcia-Sanchez, F., & Valencia-Garcia, R. (2011). Ontology population: an application for the E-tourism domain. International Journal of Innovative Computing, Information and Control, 7(11), 6115–6183.
Sherhod, R., Gillet, V. J., Judson, P. N., & Vessey, J. D. (2012). Automatic knowledge discovery for toxicity prediction using jumping emerging pattern mining. Journal of Chemical Information and Modeling, 52(11), 3074–3087.
Shie, B. E., Yu, P. S., & Tseng, V. S. (2013). Mining interesting user behavior patterns in mobile commerce environments. Applied Intelligence, 38(3), 418–435.
Singh, N., Hu, C., & Roehi, W. (2007). Text mining a decade of progress in hospitality human resource management research: identifying emerging thematic development. International Journal of Hospitality Management, 26(1), 131–147.
Sohrabi, B., Vanami, I. R., Tahmasebipur, K., & Fazil, S. (2012). An exploratory analysis of hotel selection factors: a comprehensive survey of Tehran hotels. International Journal of Hospitality Management, 31(1), 96–106.
Stringam, B. B., & Gerdes, J. J. (2010). An analysis of word-of-mouse ratings and guest comments of online hotel distribution sites. Journal of Hospitality Marketing and Management, 19(7), 773–796.
Stringam, B. B., Gerdes, J. J., & Vanleeuwen, D. M. (2010). Assessing the importance and relationships of ratings on user-generated traveler reviews. Journal of Quality Assurance in Hospitality and Tourism, 11(2), 73–92.
Tsai, C. Y., & Shieh, Y. C. (2009). A change detection method for sequential patterns. Decision Support Systems, 46(2), 501–511.
Tsai, H., Yeung, S., & Yim, P. H. L. (2011). Hotel selection criteria used by mainland Chinese and foreign individual travelers to Hong Kong. International Journal of Hospitality & Tourism Administration, 12(3), 252–267.
Tseng, Y.-H., Lin, C.-J., & Lin, Y.-I. (2007). Text mining techniques for patent analysis. Information Processing and Management, 43(5), 1216–1247.
Tsou, M.-C. (2010). Geographic information retrieval and text mining on Chinese tourism web pages. International Journal of Information Technology and Web Engineering, 5(1), 56–75.
Tussyadiah, I. P., & Fesenmaier, D. R. (2008). Marketing places through first person stories – an analysis of Pennsylvania road tripper blog. Journal of Travel and Tourism Marketing, 25(3/4), 299–311.
Varga, B., & Groza, A. (2011). Integrating DBpedia and SentiWordNet for a tourism recommender system. In Proceedings of the 7th International Conference on Intelligent Computer Communication and Processing (ICCP 2011), Cluj-Napoca, Romania (pp. 133–136).
Wang, G., Zhao, Y., Zhao, X., Wang, B., & Qiao, B. (2010). Efficient mining local conserved cluster from gene expression data. Neurocomputing, 73(7), 1425–1437.
Wenger, A. (2008). Analysis of travel bloggers' characteristics and their communication about Austria as a tourism destination. Journal of Vacation Marketing, 14(2), 169–176.
Wilkins, H. (2010). Using importance-performance analysis to appreciate satisfaction in hotels. Journal of Hospitality Marketing and Management, 19(8), 866–888.
Witte, R., & Baker, C. J. O. (2005). Combining biological databases and text mining to support new bioinformatics applications. In Natural Language Processing and Information Systems: 10th International Conference on Applications of Natural Language to Information Systems. Alicante, Spain (pp. 310–321).
Ye, Q., Law, R., Li, S., & Li, Y. (2011). Feature extraction of travel destinations from online Chinese-language customer reviews. International Journal Services Technology and Management, 15(1/2), 106–118.
Yu, H.-H., Chen, C.-H., & Tseng, V. S. (2011). Mining emerging patterns from time series data with time gap constraint. International Journal of Innovative Computing Information and Control, 7(9), 5515–5528.
Zhou, X., & Han, H. (2006). Approaches to text mining for clinical medical records. In Annual ACM Symposium on Applied Computing 2006, Technical Tracks on Computer Applications in Health Care. Dijon, France (pp. 235–239).

Gang Li, Ph.D., IEEE Senior Member, is a Senior Lecturer at the School of Information Technology, Deakin University. His research interests are machine learning, data mining, and technology applications to tourism and hospitality. He has coauthored four articles that received best paper awards. He has served as a PC member for 80+ international conferences, and is a regular reviewer for international journals in relevant research areas.

Rob Law, Ph.D. is a Professor at the School of Hotel and Tourism Management, the Hong Kong Polytechnic University. His research interests are information management and technology applications.

Huy Quan Vu is currently a PhD student at Deakin University. His research interests include machine learning, data mining, and their applications in tourism.

Jia Rong, Ph.D., is a research associate at the School of Information Technology, Deakin University. Her research interests are data mining, multimedia data analysis, and technology applications to tourism and hospitality. She was awarded The Professor of Information Technology Award (2010) for the most academically outstanding PhD student, School of IT, Deakin University, Australia.

Xinyuan (Roy) Zhao, Ph.D., is an Associate Professor in Hospitality Management at Business School, Sun Yat-Sen University (SYSBS). His research has been published widely in top-tier tourism and hospitality journals, and has been funded by the National Natural Science Foundation of China, the Chinese Department of Education, the Guangdong Social Science Foundation, and the Guangzhou Social Science Foundation.
