Exploring Tourists’ Points of Interest through Social Media Reviews: A Study Based on “Datang Furong” Park in Xi’an

Article information

J. People Plants Environ. 2025;28(1):63-79
Publication date (electronic) : 2025 February 28
doi : https://doi.org/10.11628/ksppe.2025.28.1.63
1Ph.D. Candidate, Department of Landscape Architecture, Jeonbuk National University, Republic of Korea
2Assistant Professor, Department of Landscape Architecture, Jeonbuk National University, Republic of Korea
*Corresponding author: Byungsun Yang, byang@jbnu.ac.kr
First author: Xu Chu, chuxu798@gmail.com
Received 2024 December 3; Revised 2025 January 2; Accepted 2025 January 23.

Abstract

Background and objective

Currently, urban parks are important green leisure spaces that play crucial roles in enhancing citizen satisfaction and improving urban living environments. Therefore, this study utilized web crawler technology to collect and analyze user comments from online media platforms to assess visitor satisfaction and points of interest of “Datang Furong Garden” in Xi’an, China.

Methods

Through lexical analysis and sentiment classification, we identified tourists’ preferences and focal points in terms of time, sentiment categories, and geographic regions and compared the experiential differences between local and non-local visitors.

Results

The results showed that tourists were most interested in night scenery. Night scenery elicited positive emotions, whereas ticket prices led to negative emotions. In recent years, tourist interest in night scenery and historical and cultural elements of this site has significantly increased. Local tourists showed a stronger interest in the park’s historical and cultural elements, whereas non-local tourists were more attracted to children’s facilities and parent-child activities. Additionally, there was no significant difference in the emotional experiences of local and non-local visitors when visiting the park (p = 0.223 > 0.05). Therefore, night scenery was the core factor attracting visitors.

Conclusion

Night scenery has consistently been the core factor attracting visitors to the park. In the future, further development in historical and cultural aspects is needed. Local tourists are more interested in history, while non-local tourists are more interested in the development of children’s facilities. Additionally, the cost-effectiveness of tickets should be improved to enhance visitor satisfaction. There is currently no evidence to suggest that the park discriminates against non-local visitors or engages in any exclusionary practices.

Introduction

In the fast-paced era of globalization, urban residents face increasing work pressure, which has significantly reduced their leisure time. This has heightened the importance of developing urban recreational spaces to relieve stress. The growing demand for outdoor recreation among urban residents underscores the need for effective landscape management strategies (Sandak et al., 2019). Urban parks are ideal recreational destinations that offer diverse health and social benefits, including promoting physical activity, alleviating stress, and fostering social cohesion (Wolch et al., 2014; Pampel et al., 2010). Despite these benefits, various issues often arise regarding the use of urban parks, such as poorly planned routes and a lack of appealing themes. Therefore, park planners need to adopt user-centered designs to meet the growing needs of visitors, making it essential to collect visitors’ feedback to better understand their interests and preferences.

Collecting urban residents’ opinions to strengthen the construction of urban parks

To strengthen the construction and improvement of urban parks, understand people’s personal perceptions and interests in urban parks, and realize people-oriented design concepts, it is important to collect the opinions of general users, such as urban residents. User surveys are useful tools to understand people’s feelings of use and satisfaction to better meet the needs of the public. Numerous studies have shown that visitor satisfaction plays an important role in motivating visitor behavior, such as giving positive comments, return visits, and recommendations (Dwivedi et al., 2021). Therefore, residents’ satisfaction plays an indispensable role in encouraging repeat visits by tourists and influencing the visitation behavior of other potential visitors. A variety of factors contribute to visitor satisfaction (e.g., ticket price, service quality, and type of activity); satisfaction therefore has a multidimensional structure consisting of different aspects or sub-structures (Dwivedi et al., 2022). To capture the opinions and ideas of urban residents efficiently, it is necessary to adopt a tourist-focused survey method that can quickly and accurately capture respondents’ opinions while identifying the multifaceted factors influencing visitors’ satisfaction with urban parks.

Utilizing user-generated content (UGC) on social media to collect feedback from urban residents

In current research on urban park satisfaction surveys, previous studies have relied on traditional fieldwork survey methods such as on-site observations, surveys, questionnaires, and interviews. Although these methods can also be used to obtain data and information about park visits, they have some drawbacks, including small sample sizes, time-consuming and expensive data collection processes, and difficulty in comparing and evaluating the characteristics of different groups across space and time (Assunção et al., 2014; Malaysia, 2020).

As the use of social media has increased dramatically worldwide, the Internet has facilitated the rapid growth of online user-generated content (UGC), where city dwellers can share experiences of their excursions as well as specific suggestions (e.g., the service attitude of the staff, accessibility of the surrounding transportation, cleanliness of the environment, etc.) (Grimett, 2022; Roberts et al., 2009). UGC can be considered as the consumers’ provision for spontaneous, passionate, and authentic feedback that is free or low-cost and can be easily accessed anytime and anywhere; it also has the advantage of being informative, providing clear and specific feedback that can be recorded over a long time span. Research suggests that UGC is one of the most popular forms of data in green-space research (Dwivedi et al., 2023; Kang et al., 2019; Karlsen and Stavelin, 2013; Su, 2019). As an increasing number of data mining applications are being applied to environmental and urban sustainability issues (Cordell et al., 2009), sentiment analysis is being used to quantify and analyze UGC to explain the different levels of the distribution characteristics of parks to study park satisfaction or popularity based on visitation feelings (Godovykh et al., 2019; Song et al., 2023). Although there has been limited research on UGC satisfaction in Chinese urban parks, an increasing number of studies focusing on these sources have demonstrated that such UGC can be a reliable data source for green space or park visit studies (Wilson et al., 2020). Therefore, to reduce the limitations of traditional research methods, we utilized social media platforms (Ctrip.com) to efficiently obtain online review data and analyze urban residents’ ideas and needs regarding urban parks. 
As Ctrip (ctrip.com) is not only one of China’s leading social media platforms but also the largest online travel service platform in the country (Bloom et al., 2011; Bloom et al., 2015), it allows visitors to directly book tickets for parks in the study area and quickly share their authentic travel experiences. As a result, social media reviews on this platform are more reliable and credible than those on other social media platforms. In addition, Ctrip offers advantages such as a large volume of data, high timeliness, and easy accessibility. By contrast, other social media platforms have fewer authentic visitors, and their review data may compromise the accuracy of the results. Additionally, data collected from different platforms lacks uniformity in format. Therefore, Ctrip was selected as the sole data collection source.

In our literature review, we found that some scholars have begun using social media review data to identify the factors that significantly influence tourist satisfaction. For example, Liu and Xiao (2020) discovered that elements such as signage systems, mosquitoes, and air quality significantly affected tourist satisfaction. Other researchers have used social media reviews to evaluate the cultural ecosystem services of urban parks, identifying seven types of cultural services: aesthetics, recreation, sports, inspiration, education, cultural heritage, and spiritual fulfillment (Dai et al., 2019). Additionally, some scholars have analyzed emotional patterns in urban parks using social media review data, finding that internal park quality factors influence visitors’ emotional experiences more strongly than external quality factors (Liang et al., 2022). However, in China, most studies have focused on overall tourist satisfaction with parks or points of interest, with limited research exploring the differences in interests between local and non-local visitors and their comparative emotional experiences during visits. Particularly rare are studies that conduct multidimensional analyses of tourist interests from perspectives such as emotions, time periods, and regions. This paper was developed to address this gap.

Research objectives and hypotheses

Existing research indicates that nightscapes, as an integral part of tourist attractions, often evoke positive emotions in visitors. These emotional experiences are closely associated with tourists’ overall satisfaction and the perceived quality of their visits (Rahmani et al., 2019). However, ticket price is considered a significant factor influencing tourists’ emotions. Excessively high ticket prices can lead to a sense of relative deprivation among visitors, thereby reducing their willingness to purchase and diminishing their overall tourism experience (Xu et al., 2024; Yin and Jung, 2024). Studies on urban parks have shown that cultural and historical heritage constitutes a vital component of a destination’s attractiveness. Modern tourists increasingly prefer destinations that offer rich cultural and historical experiences (Dašić and Savić, 2020). Local tourists are generally more interested in the natural landscape and the historical culture of parks. This is likely due to their stronger sense of identity and belonging to the area’s history and culture, leading to greater attention to the historical and cultural elements of the park. Moreover, research suggests that Chinese children typically travel with their nuclear families, emphasizing family bonding and physical activities, including parent-child interactions, during trips (Wu et al., 2019). When analyzing social media comments, we also identified instances of remarks such as “discrimination against non-local tourists” and “unfriendliness towards non-local visitors.” These observations prompt us to address the following specific questions: (1) What are the primary concerns of tourists regarding urban parks, and what factors contribute to both positive and negative emotions? (2) What are the future trends and directions in tourists’ interests in urban parks? (3) Are there significant differences between local and non-local tourists’ interests and emotional tendencies concerning urban parks?

Based on the aforementioned research findings, as well as existing issues and observed phenomena, the following hypotheses are proposed: H1: The primary factor that evokes positive emotions among tourists is the “night view,” whereas the primary factor that triggers negative emotions is the “ticket price.” H2: In the future, tourists will show greater interest in the “historical” and “cultural” elements of urban parks. H3: Local tourists are more interested in the “historical” and “cultural” elements of urban parks, while non-local tourists show greater interest in “child activities.” H4: There are significant differences in the emotional experiences of local and non-local tourists regarding urban parks.

Research Methods

Study area

The study site is located at “Datang Furong” Park (Fig. 1; https://www.tangparadise.cn/), southeast of Da Yan Pagoda in Yanta District, Xi’an City, Shaanxi Province, China, encompassing an area of 66 ha and a water area of 20 ha. The study area has a warm temperate monsoon climate with four seasons: cold, warm, dry, and wet. It is in the transition zone from the humid climate of the southeast coast to the arid climate of the northwest inland, with two types of climatic features in the region; the winds throughout the year are mostly from the northeast, followed by the northwest.

Fig. 1

Overview of the research area.

“Datang Furong” Park is China’s first large-scale royal garden-style cultural theme park that showcases the Tang Dynasty in an all-round manner. In addition, as early as 2011, the “Datang Furong” Park was recognized as a “National 5A-level Tourist Attraction” (https://www.yanta.gov.cn/), indicating that this scenic spot is highly popular and generates large amounts of social media data, making it an ideal research area.

Data source

We chose Ctrip.com (https://www.ctrip.com/) as the source of the online review data. As a mainstream ticketing and review platform in China, its review data include information such as tourist usernames, ratings, review texts, pictures, and review times and locations (Fig. 2). For this study, we were able to accurately and abundantly obtain data from the above information source.

Fig. 2

Schematic diagram of scraping online comment information.

Data acquisition

Web scraping is a computer software technology that extracts information from websites (Arora et al., 2015). Web data collection programs simulate the behavior of a browser and can extract any data that can be displayed on the browser; therefore, they are also called screen-scraping programs (McPherson et al., 2016; Acar et al., 2020). Web data collection involves collecting the required unstructured information from a specified website, analyzing and processing it, and then storing it as a local data file in a unified format or directly in a local database (Hu et al., 2014).

To retrieve comment information quickly and accurately, we used a web data collection program called Octoparse, which is a user-friendly and powerful web data scraping tool developed by Shenzhen Visual Web Information Technology Co., Ltd. in China, to extract and classify the data. This tool simulates human browsing behavior and allows for the creation of automated data-scraping workflows through simple page selection (Pu et al., 2022).

Using Octoparse, we automated the process of retrieving and categorizing comments on the Datang Furong Garden from the Ctrip website. By pre-configuring the categories of information to be collected, the number of entries, and enabling the auto-pagination feature, we were able to extract the desired data, such as comment texts, timestamps, and locations. The retrieved data were then converted into structured formats such as Excel files for further analysis. This approach significantly enhances the efficiency of data collection, while ensuring accuracy and completeness. Therefore, Octoparse was chosen as the preferred tool to quickly and accurately obtain large volumes of data relevant to the study area.

Analysis of tourist points of interest

In this study, we utilized the word frequency analysis feature of the GooSeeKer tool to analyze the tourist comment data. Comments were processed in descending order of word frequency to identify the overall focal points of tourist attention within the study area. GooSeeKer is a multifunctional text analysis software capable of performing tasks, such as word frequency statistics, sentiment analysis, and sentiment intensity evaluation. Based on natural language processing (NLP) technology, the tool breaks down comment texts into individual word units and calculates the frequency of each word across the dataset, producing a measure known as “word frequency.” This step is referred to as “word frequency analysis.” By applying various analytical conditions (e.g., time, location, and user identity), we were able to identify key topics or points of interest within different categories of data, based on word frequency.
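The word frequency step described above can be illustrated with a minimal sketch. It is not the GooSeeKer implementation; it assumes the comments have already been segmented into word units, and the sample comments and the function name `word_frequencies` are hypothetical.

```python
from collections import Counter

# Hypothetical, already-segmented comments (each comment is a list of word units).
# Real Chinese review text would first need a word segmenter; that step is omitted.
comments = [
    ["night view", "beautiful", "architecture"],
    ["night view", "ticket", "expensive"],
    ["architecture", "night view", "performance"],
]

def word_frequencies(tokenized_comments):
    """Count how often each word unit appears across all comments."""
    counts = Counter()
    for tokens in tokenized_comments:
        counts.update(tokens)
    return counts

freq = word_frequencies(comments)
# most_common() returns (word, count) pairs in descending order of frequency.
top_words = freq.most_common(2)
print(top_words)
```

Restricting `comments` to a subset (for example, one time period or one visitor group) before counting gives the per-category rankings used in the later analyses.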

In word frequency analysis, we primarily focused on nouns and manually excluded irrelevant or redundant words with similar meanings to avoid duplication. Following word frequency analysis, we filtered out nouns based on the word types automatically identified by the GooSeeKer tool. Additionally, meaningless words (e.g., “Oh!” or “Yes!”) were automatically excluded by using the built-in meaningless word dictionary of the tool. We further refined the results by manually excluding words unrelated to park elements (e.g., “Weather” or the names of specific locations within the park). To minimize the impact of synonyms on the results, we used the Baidu Dictionary (Baidu Dictionary, 2023) to map synonyms in the comments. For instance, synonyms such as “mood,” “happiness,” and “joy” were unified under the category “emotion.” Following this approach, we created a matrix of synonym categories (partially presented in Table 1) to ensure that words with different expressions in comments could be grouped under the same point of interest.
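As a rough illustration of how a synonym matrix folds different expressions into one point of interest, the sketch below merges word counts through a lookup table. The three "emotion" entries follow the example in the text; the other entries, the counts, and the function name `merge_synonyms` are illustrative assumptions rather than the study's actual matrix.

```python
from collections import Counter

# Hypothetical synonym matrix: each surface word maps to one category.
# "mood", "happiness", and "joy" follow the example in the text;
# the "night view" entries are invented for illustration only.
SYNONYMS = {
    "mood": "emotion",
    "happiness": "emotion",
    "joy": "emotion",
    "night scene": "night view",
    "lighting": "night view",
}

def merge_synonyms(word_counts, synonyms):
    """Sum the counts of all words that share a synonym category."""
    merged = Counter()
    for word, count in word_counts.items():
        # Words without an entry in the matrix keep their own label.
        merged[synonyms.get(word, word)] += count
    return merged

# Made-up raw counts, not data from the study.
raw_counts = Counter({"mood": 4, "joy": 3, "night view": 10, "lighting": 2, "ticket": 5})
merged_counts = merge_synonyms(raw_counts, SYNONYMS)
print(merged_counts.most_common())
```

Merging before ranking keeps a single concept from being split across several low-frequency words.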

Synonym classification matrix

Analysis of points of interest for different types of emotions

In this study, sentiment analysis was employed to determine the emotional polarity (positive or negative) of comment data, classify and summarize comments, and, in combination with word frequency analysis, identify high-frequency words for each type of comment. This approach allowed us to infer key underlying factors. For sentiment type identification, the GooSeeKer tool was used for the analysis. This software utilizes the BERT model, a deep learning algorithm, trained extensively on textual data. Using this trained model, segmented words were classified into positive, negative, and neutral sentiment groups. Sentiment analysis is a powerful natural language processing (NLP) technique that identifies emotional tendencies in target texts (Jaidka et al., 2020; Zheng et al., 2019) and can calculate the sentiment intensity of each comment based on the number of words in each sentiment category.

To determine the positive and negative factors affecting tourist satisfaction, we automatically classified all comment data into positive, negative, and neutral comment groups according to the sentiment determination dictionary that comes with the GooSeeKer software (which determines statements as positive, negative, or neutral). To make the sentiment determination dictionary more consistent with our research data, we manually adjusted it several times, after which the validity of the sentiment judgment lexicon was verified. A random sample of 30 comments was selected as the test case, and the adjusted sentiment lexicon was applied to evaluate the sentiment of the words in the comment texts. The results were compared with those of manual judgment, achieving an accuracy rate of 98.4%. Therefore, the sentiment-judgment lexicon is effective and suitable for further use.
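The validation step amounts to comparing the lexicon's labels with manual labels and reporting the share that agree. A minimal sketch, with made-up labels and a hypothetical helper name `lexicon_accuracy`:

```python
def lexicon_accuracy(predicted, manual):
    """Fraction of items where the lexicon label matches the manual label."""
    if len(predicted) != len(manual):
        raise ValueError("label lists must have the same length")
    if not manual:
        raise ValueError("at least one labeled item is required")
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)

# Illustrative labels only; these are not the study's test cases.
predicted = ["positive", "negative", "neutral", "positive", "positive"]
manual = ["positive", "negative", "neutral", "negative", "positive"]

accuracy = lexicon_accuracy(predicted, manual)
print(f"accuracy = {accuracy:.1%}")
```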

Subsequently, we used GooSeeKer to perform word frequency analysis on the positive and negative comment groups, and generated word frequency analysis tables for each group. By comparing high-frequency words in positive and negative comments, we identified the key factors underlying tourists’ positive and negative reviews, respectively.

Analysis of points of interest in different time periods

By analyzing tourists’ comments in different time periods, we can understand how tourists’ concerns change over time, which provides a reasonable basis for predicting subsequent trends. Using the GooSeeKer software, we divided the raw comment data from 2015 to 2023 into three time periods: 2015–2017, 2018–2020, and 2021–2023, and created a separate review group for each period. Following the same method as above, we selected nouns, merged synonyms (based on the synonym matrix), and manually deleted irrelevant words. We then conducted a word frequency analysis to identify the main concerns of tourists regarding “Datang Furong” Park in each period.
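The period grouping can be sketched as follows. The three period boundaries come from the text; the review records, the label format, and the function names are hypothetical, and the reviews are assumed to be already segmented and filtered to nouns.

```python
from collections import Counter, defaultdict

# Period boundaries as stated in the text (inclusive years).
PERIODS = [(2015, 2017), (2018, 2020), (2021, 2023)]

def period_label(year):
    """Return the period a review year falls into, or None if out of range."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    return None

def frequencies_by_period(reviews):
    """Build one word-frequency counter per period from (year, tokens) pairs."""
    grouped = defaultdict(Counter)
    for year, tokens in reviews:
        label = period_label(year)
        if label is not None:  # reviews outside 2015-2023 are skipped
            grouped[label].update(tokens)
    return dict(grouped)

# Made-up (year, tokens) records for illustration.
reviews = [
    (2016, ["night view", "architecture"]),
    (2019, ["culture", "night view"]),
    (2019, ["culture", "history"]),
    (2022, ["night view", "history"]),
    (2014, ["ticket"]),
]

by_period = frequencies_by_period(reviews)
for label, counts in sorted(by_period.items()):
    print(label, counts.most_common(5))
```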

Analysis of points of interest for tourists from different origins

By analyzing comments from tourists of different origins, we can identify differences in their points of interest. Using GooSeeKer software, we processed 310 location-based comments, categorizing them into local and non-local visitor groups. Word frequency analysis was conducted separately for each group with irrelevant or meaningless words removed. Similarly, by ranking words based on frequency, we identified the differences in points of interest between local and non-local tourists.

Analysis of emotional differences among tourists from different origins

To explore whether there was a significant difference in the sentiment experience of the study area across different origins (local and non-local tourists), we selected 310 geographically grouped comments to evaluate the intensity of the sentiment bias. We assigned +1 point for every positive sentiment word that appeared in a comment and −1 point for every negative sentiment word. Using this scoring method, we created a sentiment analysis matrix for the comments of local and non-local tourists using the GooSeeKer tool.
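The +1/−1 scoring rule can be written in a few lines. This is a sketch of the rule only; the two word sets stand in for the adjusted sentiment lexicon and are invented for illustration.

```python
# Hypothetical stand-ins for the positive and negative entries of the lexicon.
POSITIVE_WORDS = {"beautiful", "stunning", "worth"}
NEGATIVE_WORDS = {"expensive", "crowded", "disappointing"}

def sentiment_score(tokens):
    """Add 1 for each positive word and subtract 1 for each negative word."""
    score = 0
    for token in tokens:
        if token in POSITIVE_WORDS:
            score += 1
        elif token in NEGATIVE_WORDS:
            score -= 1
        # Words in neither set are treated as neutral and do not change the score.
    return score

comment = ["night view", "beautiful", "stunning", "ticket", "expensive"]
score = sentiment_score(comment)
print(score)
```

Under this rule, a comment with two positive words and one negative word scores 1, so longer comments can reach larger absolute scores than short ones.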

To compare whether there are significant differences in the tourism experience between local and non-local visitors, we imported the sentiment intensity data of local and non-local tourists into the SPSS software. Normality tests were conducted on the data, and independent sample tests were used to analyze the differences between the two groups. Owing to the relatively small sample size (only 150 entries), we employed the Shapiro-Wilk test to assess whether the data followed a normal distribution (Razali and Wah, 2011), supported by Q-Q plots and P-P plots for further normality checks. Since the sentiment score data for both groups did not follow a normal distribution, we used the Mann-Whitney U test, which is more suitable for this type of data, to examine whether there were significant differences in the emotional experience between local and non-local tourists (Okeh, 2009). This indirectly assessed whether there was any discrimination against non-local visitors to the park.
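For readers unfamiliar with the Mann-Whitney U test, the sketch below computes the U statistics for two independent samples from their pooled ranks, giving tied values the average of their ranks. It does not reproduce the SPSS procedure and does not compute a p-value; the sample scores and function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) computed from the rank sum of sample_a."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b

# Made-up sentiment scores for illustration, not the study's data.
local_scores = [3, 2, 4, 1]
non_local_scores = [2, 0, 3, 1]
u_local, u_non_local = mann_whitney_u(local_scores, non_local_scores)
print(u_local, u_non_local)
```

Because the test works on ranks rather than raw scores, it does not require the scores to be normally distributed, which is why it suits the data described above.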

Results and Discussion

Data collection results

We collected comments related to “Datang Furong” Park and removed invalid comments (e.g., comments with obvious promotional intent or meaningless comments). Valid comments accounted for 98% of the total number of comments, and according to research needs, we categorized all comment data based on different sentiment tendencies, time periods, and places of origin of tourists. To facilitate the subsequent analysis of sentiment scores, we split each complete comment into multiple individual sentences and classified and counted the sentiment tendency of each sentence. The data were compiled (Table 2).

Statistical status of online comment data

During the data collection process, 3,000 pieces of data from “Ctrip” were used. The sample covered over a hundred different points of interest, and significant variations were observed in terms of both emotional types and temporal changes. Overall, the sample size of this study was sufficient. These review texts span 2015 to 2023 and cover all four seasons (spring, summer, autumn, and winter), providing a comprehensive temporal range. However, these data only represent the urban parks in Xi’an, China. Furthermore, 310 of these reviews contained location information, with tourists coming from cities such as Xi’an, Shanghai, and Beijing. Local tourists (from Xi’an) accounted for approximately 50% of these reviews, so the geographic distribution is unlikely to introduce bias into the analysis. Overall, this data source was considered robust and suitable for further analysis.

We found that, overall, positive reviews dominated (accounting for 74% of the total), indicating that tourists generally evaluated this scenic area positively. The number of reviews from non-local tourists is almost equal to that of local tourists, indicating that the attraction draws a significant number of visitors from outside the region.

Analysis of the overall concerns of tourists

We extracted the top five words from the text of the online comments related to the “Datang Furong” Park (Table 3). Thus, the main concerns of tourists were visualized through the frequency of words. Through a comprehensive lexical analysis of social media comments about the “Datang Furong” Park, we identified the top five elements that visitors were most interested in: night view, architecture, performances, ticket price, and time. To visualize the frequency of each term, we created a word cloud for the overall visitor online review data for the study area (Fig. 3).

Frequency distribution table of top 30 words

Fig. 3

Word cloud of overall online comment data.

We observed that “night view” and “architecture” were the main concerns of the visitors in the study area. Many researchers agree that the design of night lighting can significantly enhance tourists’ visual experience, thereby increasing their satisfaction with the attraction (Guo et al., 2011; Faragallah, 2021). Our study further supports this view, with night lighting emerging as the most frequently mentioned theme in this research, cited 973 times. Additionally, studies have shown that architecture with historical and cultural styles is highly popular in tourism as it can attract visitors and enhance cultural experiences (Ananeva and Zharkova, 2020). This is consistent with our findings, as the “architecture” element was mentioned 820 times in the overall comments, making it one of the main points of interest for tourists. Overall, based on the analysis of the data, the “night view” and “architecture” elements emerged as the primary focal points for tourists.

Analysis of concerns for different emotion types

We obtained 4,790 comment sentences and calculated the proportions of different sentence types (Fig. 4). We extracted the word frequency distribution tables of the top five nouns in the positive and negative comment groups (Tables 4 and 5, respectively), and created their word clouds (Figs. 5 and 6, respectively).

Fig. 4

Distribution of sentiment tendency in main text.

Frequency distribution table of top 30 positive words

Frequency distribution table of top 30 negative words

Fig. 5

Word cloud of positive online comment data.

Fig. 6

Word cloud of negative online comment data.

In the positive comments, the frequency of the word “night view” was markedly higher than that of the other words, indicating that the night view was the most important reason for tourists to give favorable comments. This was similar to the results of the analysis of the overall reviews.

In the negative comments, “ticket” was mentioned more frequently than other factors, with comments such as “not good value for money,” “ticket price is too expensive,” and “too little content, not good value for money” accounting for more than half of the comments. This suggests that high ticket prices may be the main reason for negative comments from tourists. Perhaps visitors felt that ticket prices did not match the value of the various aspects of the study area. Research has shown that tourists often express dissatisfaction with high entrance fees and associated costs, and this dissatisfaction is a common theme in negative reviews (Manci and Tengilimoğlu, 2021). This aligns with the findings of the present study. Therefore, based on the analysis, we accept hypothesis H1: the main factor contributing to positive evaluations from tourists is the “night view,” while the primary factor leading to negative evaluations is the “ticket price”.

Analysis of concerns at different periods of time

For each of the three time periods, we separately extracted the top five high-frequency words and organized them into a table (Table 6). For the main concerns of tourists in different periods, we categorized all comments into three periods: “2015–2017,” “2018–2020” and “2021–2023.” This resulted in three groups of reviews. We then conducted word frequency analysis to reveal the changing trend of interest points in each period. To visualize the trend of visitors’ interest points over time and the focus of each period, we created a bar chart based on the high-frequency words in each period (Fig. 7).

Vocabulary frequency statistics for each period

Fig. 7

Vocabulary change situation for each period.

Overall, night view and architecture were consistently the main points of interest for visitors from 2015 to 2023. Therefore, night views and architectural elements have been important factors in attracting visitors in recent years. Additionally, interest in culture and history has gradually increased; however, from 2021 to 2023, interest in culture suddenly declined and even dropped out of the top five. This decline may have been influenced by the COVID-19 pandemic, during which all cultural activities were reduced or even canceled (Wu et al., 2022), leading to decreased attention. However, considering the growth trend of the “culture” element from 2015 to 2020, as well as the further development of the social economy, which has led people to pay more attention to spiritual and cultural needs (Liu et al., 2021), we believe that the “culture” element will continue to attract more visitors in the future. Moreover, attention to the “history” element has gradually increased. Based on these findings, we accept hypothesis H2. In conclusion, visitors will be more interested in the “history” and “culture” elements of urban parks in the future.

Analysis of concerns according to tourists’ origins

To explore the differences in the interests of tourists from different places of origin in the study area, a table of vocabulary frequency analyses of local and non-local tourists’ comments (Tables 7 and 8, respectively) was created. We also created word clouds for local (Fig. 8) and non-local tourists (Fig. 9) to better represent the interests of different types of tourists.

Vocabulary frequency table of local tourist comments

Vocabulary frequency table of non-local tourist comments

Fig. 8

Word cloud map of local tourist.

Fig. 9

Word cloud map of non-local visitor.

Overall, the “night view” garnered significant attention from both local and non-local visitors. However, when comparing the points of interest between the two groups, we found that local visitors showed a greater interest in the “history” element of the park. This can be attributed to their familiarity with the local city and their deeper understanding of its cultural heritage. On the other hand, non-local visitors appeared more concerned with the practical costs of travel, as reflected in their focus on “ticket prices.” They may have higher expectations regarding the balance between cost and value. However, since Xi’an is one of China’s more economically developed cities, the focus on ticket prices may be influenced by varying consumption levels across regions, and is therefore excluded from further analysis. Additionally, non-local visitors showed a greater interest in “children,” suggesting that they often travel with families and exhibit a strong preference for family-friendly activities. In conclusion, these findings support hypothesis H3: local visitors are more interested in “history” and “culture” factors, while non-local visitors are more interested in “children and family activities.” Currently, research comparing the differences in interests of local and non-local visitors to urban parks or attractions is scarce, and our study helps fill this gap in the field.

Analysis of emotional differences among tourists of different origins

We derived the results of sentiment score analysis for local (Table 9) and non-local tourist comments (Table 10).

Main text sentiment analysis results - local tourist comments

Main text sentiment analysis results - non-local tourist comments

For comment data with location information, we collected 156 local and 154 non-local comments. After calculating the sentiment scores of each comment group using sentiment analysis, we created a boxplot of the sentiment scores of each comment group for easy observation (Fig. 10). We then deleted two kinds of outlier comments from the two groups: comments with particularly high or low scores, as flagged by the boxplot, and comments that were outliers in practice (meaningless repetition of the same word within a single comment). After processing, we imported the sentiment analysis results for the remaining 150 local and 150 non-local comments into SPSS for further analysis.

Fig. 10

Boxplot of sentiment scores for local and non-local visitor comment groups.
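The boxplot-based screening of extreme scores can be sketched as follows. The study does not state which fence rule was applied, so the common 1.5 × IQR convention is used here as an assumption; the function names and the toy scores are hypothetical, and the manual removal of meaningless repetitive comments is not covered.

```python
import statistics


def tukey_fences(scores, k=1.5):
    """Return the (lower, upper) boxplot fences, Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(scores, k=1.5):
    """Keep only the scores that lie within the boxplot fences."""
    low, high = tukey_fences(scores, k)
    return [s for s in scores if low <= s <= high]


# Hypothetical sentiment scores for one comment group.
scores = [2, 1, 3, 2, 0, 4, 2, 3, 1, 15]
kept = drop_outliers(scores)
print(tukey_fences(scores))
print(kept)
```

Different quartile conventions give slightly different fences, so borderline comments may be flagged by one tool and not by another.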

First, we analyzed the descriptive statistics for both sets of data (Table 11). The results showed that the average sentiment score for local comments was 2.50, which was slightly higher than the average sentiment score (2.11) for non-local comments.

Sentiment analysis statistics of main text for local and non-local tourists

To obtain more rigorous results, we tested the difference between the two datasets. Before doing so, we tested the normality of the data (Table 12). Owing to the relatively small sample size of each group (150), we used the Shapiro-Wilk test. The results showed p = 0.01 < 0.05. Combined with the histogram and Q-Q plot of the data distribution, this indicated that the data did not conform to a normal distribution. Therefore, we chose a non-parametric test, which is more robust for non-normally distributed data. Since local and non-local tourists were two independent samples that did not interfere with each other, we used the Mann-Whitney U test to determine whether there was a significant difference between the two sets of data.

Normality test

According to the test results, there was no significant difference between the sentiment scores of the local and non-local tourist groups (p = .223 > .05) (Table 13). This indicates that there is insufficient evidence of a difference between the sentiment scores of local and non-local tourists. We therefore reject hypothesis H4 and conclude that the two groups do not differ much in their satisfaction with the study area; in other words, there is insufficient evidence of discrimination against out-of-town tourists in the study area.
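For readers who want to see how such a comparison is computed, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test. It assumes the normal approximation with a tie correction and no continuity correction; the analysis in this study was run in SPSS, whose output may differ in detail. The function name and the toy scores are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Return (U, z, two-sided p) for two independent samples.

    Uses average ranks for ties and the tie-corrected normal approximation.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)

    n = n1 + n2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all observations identical
        return u, 0.0, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, z, p


# Hypothetical sentiment scores, not the study's data.
local_scores = [2, 3, 1, 4, 2, 5]
non_local_scores = [1, 2, 2, 3, 0, 2]
print(mann_whitney_u(local_scores, non_local_scores))
```

Because sentiment scores are small integers, ties are frequent, which is why the tie correction is included in the variance.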

Summary of independent samples Mann-Whitney U test

Conclusion

This study combines social media reviews with text analysis to collect and analyze tourists’ opinions. It comprehensively compares differences in tourists’ interests across multiple dimensions (e.g., emotional categories and temporal trends), predicts future trends in tourists’ interests, and examines the differences in interests and emotional experiences between local and non-local tourists. This study fills a gap in existing studies and presents the following research conclusions and specific recommendations for park management policies.

We collected and analyzed online media reviews of “Datang Furong” Park and, overall, the evaluations tended to be positive. The elements of “night view” and “architecture” emerged as the primary points of interest for tourists. To maintain and enhance the park’s appeal, priority should be given to dynamic lighting designs such as using color-changing lights, integrating light animations, and incorporating diverse lighting combinations to improve the visual effects of nighttime illumination. Introducing an artificial intelligence lighting system could enable interactive experiences between visitors and lighting effects, thereby enhancing their sense of immersion and interaction. Additionally, designing dedicated night sightseeing routes that integrate stage performance with dynamic lighting effects could encourage visitors to stay longer. For the park’s architecture, a preservation and restoration mechanism should be established, including the development of long-term protection plans. Regular inspection of the surrounding environment and annual monitoring of changes in building damage should be conducted. Restoration efforts should prioritize the use of original materials to avoid the stylistic inconsistencies caused by modern materials. Visitors should be encouraged to participate in maintaining historical buildings, and residents should be engaged in the protection and supervision of these structures, fostering a sense of community involvement and increasing visitor participation in preservation efforts.

The “night view” element was consistently identified as the core factor driving positive reviews, while “ticket price” was the primary reason for negative feedback. Therefore, to make ticket prices more acceptable to a wider range of visitors, differentiated pricing strategies could be adopted based on different time periods, seasons, and visitor groups, aligning more closely with the personalized needs of the public. Group tickets or family tickets can be introduced, accompanied by appropriate discount policies. Additionally, the value for money of tickets could be enhanced by linking discounts to on-site activities, which could further increase visitor satisfaction. For instance, visitors can be encouraged to complete specific tasks, such as exploring designated attractions within the park, to earn discounts. Furthermore, visitors who share promotional content about the park via social media platforms could receive ticket price reductions. These initiatives would enrich the visitor experience and garner positive feedback.

Over different periods, visitors’ focus shifted, but night views and architecture consistently remained points of interest. Historical and cultural elements have recently emerged as new points of interest for visitors. Therefore, to attract more visitors in the future, apart from enhancing the appeal of the park’s night view and architecture, it is recommended to introduce cultural activities related to the park’s theme, such as Tang Dynasty-themed handicraft festivals. Unique cultural experiences, such as ancient role-playing and microfilm shooting, can be developed to enhance visitors’ sense of cultural pride and enjoyment. Additionally, during the park renovation process, efforts should be made to incorporate more “historical” elements, such as constructing small-scale structures representative of the Tang Dynasty and building a history museum.

We also found that local tourists were more concerned about the development of historical and cultural aspects, whereas tourists from other regions were more interested in facilities for children and family activities. Therefore, for local tourists, the abovementioned policies to enhance historical and cultural features can be implemented. For non-local tourists, strategies should focus on increasing facilities for children and promoting family-friendly activities. Examples include constructing specialized play areas for children, introducing family handicraft workshops, and developing forest adventure trails where children can learn and grow through exploration. Although local tourists tended to be slightly more satisfied with the park than non-local tourists, there was no evidence of a substantial difference between the two groups. A careful cross-check of the review content suggests that this slight satisfaction gap could be due to differences in income or spending levels between local and non-local visitors. To address this minor gap, activities that allow non-local tourists to experience local culture could be offered, and the cost-effectiveness of tickets could be further enhanced to increase the appeal for non-local visitors.

In comparison with previous research findings, which highlighted factors such as signage systems, mosquitoes, and the environment as influencing visitor satisfaction, our study newly identified “night view” and “architecture” as significant contributors to visitor satisfaction. Additionally, we provide reasonable predictions about future changes in visitors’ interests. Furthermore, research on the differences in interests and emotional experiences between local and non-local tourists is scarce, and our study fills this gap. However, this study has certain limitations, including the potential bias associated with using data collected solely from the “Ctrip” platform. Additionally, this study focuses on only one representative urban park, which imposes regional constraints on the findings. Although this study focuses on the “Datang Furong” Park, its research framework or methods could be useful for studying other parks or cultural sites. Future research should expand the sample size of urban parks and collect data from multiple social media platforms in order to enhance the generalizability and robustness of the results. Moreover, future studies could explore new research directions, such as differences in tourist interests across various types of parks or among visitors from different regions and cultural backgrounds.

References

Acar G., Englehardt S., Narayanan A.. 2020;No boundaries: data exfiltration by third parties embedded on web pages. Proceedings on Privacy Enhancing Technologies 2020(4):220–238. https://doi.org/10.2478/popets-2020-0070.
Ananeva Y., Zharkova A.. 2020;Architectural copies and imitations of historical cultural styles in tourism industry 4:38–44. https://doi.org/10.32340/2414-9101-2020-4-38-44.
Arora S.K., Li Y., Youtie J., Shapira P.. 2015;Using the wayback machine to mine websites in the social sciences: A methodological resource. Journal of the Association for Information Science and Technology 67(8):1904–1915. https://doi.org/10.1002/asi.23503.
Assunção M.D., Calheiros R.N., Bianchi S., Netto M.A., Buyya R.. 2014;Big data computing and clouds: Trends and future directions. Journal of Parallel and Distributed Computing 79–80:3–15. https://doi.org/10.1016/j.jpdc.2014.08.003.
Baidu Dictionary. 2023. Baidu dictionary Retrieved February 16, 2024, Retrieved from https://dict.baidu.com.
Bloom N., Liang J., Roberts J., Ying Z.J.. 2015;Does working from home work? Evidence from a Chinese experiment. The Quarterly Journal of Economics 130:165–218.
Cordell D., Drangert J., White S.. 2009;The story of phosphorus: Global food security and food for thought. Global Environmental Change 19(2):292–305. https://doi.org/10.1016/j.gloenvcha.2008.10.009.
Dašić D., Savić M.. 2020. The influence of cultural and historical heritage on the attractiveness of a tourist destination Bastina: https://doi.org/10.5937/BASTINA30-27671.
Dai P., Zhang S., Chen Z., Gong Y., Hou H.. 2019;Perceptions of cultural ecosystem services in urban parks based on social network data. Sustainability 11(19):5386. https://doi.org/10.3390/su11195386.
Dwivedi Y.K., Ismagilova E., Hughes D.L., Carlson J., Filieri R., Jacobson J., Jain V., Karjaluoto H., Kefi H., Krishen A.S., Kumar V., Rahman M.M., Raman R., Rauschnabel P.A., Rowley J., Salo J., Tran G.A., Wang Y.. 2021;Setting the future of digital and social media marketing research: Perspectives and research propositions. International Journal of Information Management 59:102168. https://doi.org/10.1016/j.ijinfomgt.2020.102168.
Dwivedi Y.K., Laurie H., Abdullah M.B., Samuel R.-N., Mihalis G., Mutaz A.-D., Denis D., Bhimaraya M., Dimitrios B., Christy M.K.C., Kieran Conboy C., Ronan D., Rameshwar D., Vincent D., Reto Felix F., Goyal D.P., Anders G., Chris H., Ikram J., Marijn J., Kim Y.-G., Kim J., Koos S., Kreps D., Kshetri N., Kumar V., Ooi K.-B., Papagiannidis S., Pappas I.O., Polyviou A., Park S.-M., Pandey N., Queiroz M.M., Raman R., Rauschnabel P.A., Shirish A., Sigala M., Spanaki K., Wei-Han Tan G., Tiwari M.K., Viglia G., Wamba S.F.. 2022;Metaverse beyond the hype: Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy. International Journal of Information Management 66:102542. https://doi.org/10.1016/j.ijinfomgt.2022.102542.
Dwivedi Y.K., Kshetri N., Hughes L., Rana N.P., Baabdullah A., Kar A., Koohang A., Ribeiro-Navarrete S., Belei N., Balakrishnan J., Basu S., Behl A., Davies G., Dutot V., Dwivedi R., Evans L., Felix R., Foster-Fletcher R., Giannakis M., Gupta A., Hinsch C., Jain A., Patel N.J., Jung T., Juneja S., Kamran Q., Mohamed S., Pandey N., Papagiannidis S., Raman R., Rauschnabel P.A., Tak P., Taylor A., tom Dieck M.C., Viglia G., Wang Y., Yan M.. 2023;Exploring the darkverse: A multi-perspective analysis of the negative societal impacts of the Metaverse. Information Systems Frontiers 25(5):2071–2114. https://doi.org/10.1007/s10796-023-10400-x.
Faragallah R.. 2021;Enhancing social engagement through nightscape in qaitbay promenade in alexandria, Egypt. Riham Nady Faragallah :129–146. https://doi.org/10.4018/978-1-7998-7004-3.CH009.
Godovykh M., Milman A., Tasci A.. 2019;Theme park experience: Factors explaining amount of pleasure from a visit, time allocation for activities, perceived value, queuing quality, satisfaction, and loyalty. Journal of Tourism and Leisure Studies 4(2):1–21. https://doi.org/10.18848/2470-9336/cgp/v04i02/1-21.
Grimett L.. 2022;The status of coastal marine tourism in KwaZulu Natal in 2022. American Journal of Industrial and Business Management 12(12):1725–1760. https://doi.org/10.4236/ajibm.2022.1212095.
Guo Q., Lin M., Meng J., Zhao J.. 2011;The development of urban night tourism based on the nightscape lighting projects--a Case Study of Guangzhou. Energy Procedia 5:477–481. https://doi.org/10.1016/J.EGYPRO.2011.03.083.
Hu H., Wen Y., Chua T., Li X.X.. 2014;Toward scalable systems for big data analytics: A technology tutorial. IEEE Access 2:652–687. https://doi.org/10.1109/access.2014.2332453.
Huang C., Chen S.. 2015;Smart tourism: Exploring historical, cultural, and delicacy scenic spots using visual-based image search technology. Applied Mechanics and Materials 764–765:1265–1269. https://doi.org/10.4028/www.scientific.net/AMM.764-765.1265.
Jaidka K., Giorgi S., Schwartz H.A., Kern M.L., Ungar L.H., Eichstaed J.C.. 2020;Estimating geographic subjective well-being from Twitter:a comparison of dictionary and data-driven language methods. Proceedings of the National Academy of Sciences 17(19):10165–10171. https://doi.org/10.1073/pnas.1906364117.
Kang Y., Jia Q., Gao S., Zeng X., Wang Y., Angsuesser S., Liu Y., Ye X., Fei T.. 2019;Extracting human emotions at different places based on facial expressions and spatial clustering analysis. Transactions in GIS 23(3):450–480. https://doi.org/10.1111/tgis.12552.
Karlsen J., Stavelin E.. 2013;Computational journalism in Norwegian newsrooms. Journalism Practice 8(1):34–48. https://doi.org/10.1080/17512786.2013.813190.
Leung D., Law R., Lee H.A.. 2011;The perceived destination image of Hong Kong on Ctrip. com. International Journal of Tourism Research 13:124–140. https://doi.org/10.1002/jtr.803.
Liang H., Yan Q., Yan Y., Zhang L., Zhang Q.. 2022;Spatiotemporal study of park sentiments at metropolitan scale using multiple social media data. Land 11(9):1497. https://doi.org/10.3390/land11091497.
Liu R., Xiao J.. 2021;Factors affecting users’ satisfaction with urban parks through online comments data: Evidence from Shenzhen, China. International Journal of Environmental Research and Public Health 18(1):253. https://doi.org/10.3390/ijerph18010253.
Shan Liu, La Yun B., Huang L.. 2021;Application of image style transfer technology in interior decoration design based on ecological environment. Journal of Sensors :1–7. https://doi.org/10.1155/2021/9699110.
Malaysia M.. 2020;Planning Malaysia :18. https://doi.org/10.21837/pm.v18i13.798.
Manci A., Tengilimoğlu E.. 2021;TripAdvisor Ziyaretçi Yorumlarının İçerik Analizi: Göbeklitepe Örneği. IEEE Transactions on Reliability 5(2):1525–1545. https://doi.org/10.26677/TR1010.2021.779.
McPherson R., Houmansadr A., Shmatikov V.. 2016;Covertcast: Using live streaming to evade internet censorship. roceedings on Privacy Enhancing Technologies 2016(3):212–225. https://doi.org/10.1515/popets-2016-0024.
Noffsinger W.B., Niedbalski R., Blanks M., Emmart N.. 1998;Legacy object modeling speeds software integration. Communications of the ACM 41(12):80–89. https://doi.org/10.1145/290133.290153.
Okeh U.. 2009;Statistical analysis of the application of Wilcoxon and Mann-Whitney U test in medical research studies. Biotechnology and Molecular Biology Reviews 4(6):128–131.
Ostrom E.. 1996;Crossing the great divide: Coproduction, synergy, and development. World Development 24(6):1073–1087. https://doi.org/10.1016/0305-750x(96)00023-x.
Pampel F.C., Krueger P.M., Denney J.T.. 2010;Socioeconomic disparities in health behaviors. Annual Review of Sociology 36(1):349–370. https://doi.org/10.1146/annurev.soc.012809.102529.
Pu X., Jiang Q., Fan B.. 2022;Chinese public opinion on Japan’s nuclear wastewater discharge: a case study of Weibo comments based on a thematic model. Ocean and Coastal Management 225:106188. https://doi.org/10.1016/j.ocecoaman.2022.106188.
Rahmani K., Gnoth J., Mather D.D.. 2019;A psycholinguistic view of tourists’ emotional experiences. Journal of Travel Research 58:192–206. https://doi.org/10.1177/0047287517753072.
Razali N., Wah Y.. 2011;Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics 2(1):21–33.
Roberts N.S., Chavez D.J., Lara B.M., Sheffield E.A.. 2009;Serving culturally diverse visitors to forests in California: a resource guide. :76. https://doi.org/10.2737/psw-gtr-222.
Ryan R.M., Deci E.L.. 2000;Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist 55(1):68–78. https://doi.org/10.1037/0003-066x.55.1.68.
Sandak A., Sandak J., Brzezicki M., Kutnar A.. 2019;In environmental footprints and eco-design of products and processes. Bio-based building skin https://doi.org/10.1007/978-981-13-3747-5.
Song Y., Yang R., Lu H., Fernandez J., Wang T.. 2023;Why do we love the high line? A case study of understanding long-term user experiences of urban greenways. Computational Urban Science 3(1)https://doi.org/10.1007/s43762-023-00093-y.
Su C.. 2019;Changing dynamics of digital entertainment media in China https://doi.org/10.5204/thesis.eprints.130744.
Wilson B., Tierney S., Toland B., Burns R., Steiner C., Adams C., Nixon M., Khan R., Ziegler M., Osburg J., Chang I.. 2020;Small unmanned aerial system adversary capabilities. RAND Corporation eBooks https://doi.org/10.7249/rr3023.
Wolch J.R., Byrne J., Newell J.P.. 2014;Urban green space, public health, and environmental justice: The challenge of making cities ‘just green enough.’. Landscape and Urban Planning 125:234–244. https://doi.org/10.1016/j.landurbplan.2014.01.017.
Wu M., Wall G., Zu Y., Ying T.. 2019;Chinese children’s family tourism experiences. Tourism Management Perspectives https://doi.org/10.1016/J.TMP.2018.11.003.
Wu Y., Yin G., Zhang Y.. 2022;Experience and perceptions of Chinese university students regarding the COVID-19 pandemic: A qualitative analysis. Frontiers in public health 10:872847. https://doi.org/10.3389/fpubh.2022.872847.
Xu X., Xue K., Li F.. 2024;The effect of price perception on tourists’ relative deprivation and purchase intention. Current Issues in Tourism 27:59–75. https://doi.org/10.1080/13683500.2022.2150153.
Yin X., Jung T.. 2024;Analysing the causes of tourists’ emotional experience related to tourist attractions from a binary emotions perspective utilising machine learning models. Asia Pacific Journal of Tourism Research 29:699–718. https://doi.org/10.1080/10941665.2024.2343077.
Zheng S., Wang J., Sun C., Zhang X., Kahn M.E.. 2019;Air pollution lowers Chinese urbanites’ expressed happiness on social media. Nature Human Behaviour 3:237–243.

Article information Continued

Fig. 1

Overview of the research area.

Fig. 2

Schematic diagram of scraping online comment information.

Fig. 3

Word cloud of overall online comment data.

Fig. 4

Distribution of sentiment tendency in main text.

Fig. 5

Word cloud of positive online comment data.

Fig. 6

Word cloud of negative online comment data.

Fig. 7

Vocabulary change situation for each period.


Table 1

Synonym classification matrix

Tags Synonyms
Emotion Mood, happiness, joy, delight, sadness, regret, sorrow
Tourists Visitors, people, travelers, spectators, citizens, everyone
Artificial landscapes Artificial rockeries, artificial landscapes, artificial lakes
Staff Staff, service attitude, attitude

Table 2

Statistical status of online comment data

Sentence/Text Type Subcategory Quantity Percentage (%) Total
Sentence Emotion Positive Sentences 2853 60 4790
Sentence Emotion Neutral Sentences 1493 31 4790
Sentence Emotion Negative Sentences 444 9 4790
Text Time Period 2015–2017 103 8 1321
Text Time Period 2018–2020 640 48 1321
Text Time Period 2021–2023 578 44 1321
Text Location Local (Shaanxi Province) 156 50 310
Text Location Non-local 154 50 310
Text Emotion Positive Reviews 2172 74 2937
Text Emotion Neutral Reviews 513 17 2937
Text Emotion Negative Reviews 252 9 2937
Text Source Data Valid Data 2937 98 3000
Text Source Data Duplicate Data 63 2 3000

Table 3

Frequency distribution table of top 30 words

No. Tag Frequency Document Frequency Word Type
1 Night view 973 719 Noun
2 Architecture 820 366 Noun
3 Performance 566 322 Noun
4 Ticket price 503 384 Noun
5 Time 432 307 Noun

Table 4

Frequency distribution table of top 30 positive words

No. Tag Frequency Document Frequency Word Type
1 Night view 694 560 Noun
2 Performance 364 241 Noun
3 Building 351 241 Noun
4 Ticket prices 251 198 Noun
5 Culture 245 125 Noun

Table 5

Frequency distribution table of top 30 negative words

No. Tag Frequency Document Frequency Word Type
1 Ticket prices 162 122 Noun
2 Night view 83 72 Noun
3 Time 67 53 Noun
4 Performance 46 30 Noun
5 Building 30 22 Noun

Table 6

Vocabulary frequency statistics for each period

Label Word (2015–2017) Word Frequency (2015–2017) Label Word (2018–2020) Word Frequency (2018–2020) Label Word (2021–2023) Word Frequency (2021–2023)
Night View 18 Night View 220 Night View 214
Building 14 Building 210 Building 84
Ticket prices 14 Culture 166 History 69
Time 13 History 126 Performance 55
Culture 10 Performance 114 Ticket prices 54

Table 7

Vocabulary frequency table of local tourist comments

No. Label Word Frequency Document Frequency Word Type
1. Night View 92 66 Noun
2. building 42 30 Noun
3. History 37 25 Noun
4. Performance 30 15 Noun
5. Time 29 23 Noun

Table 8

Vocabulary frequency table of non-local tourist comments

No. Label Word Frequency Document Frequency Word Type
1. Night View 58 49 Noun
2. Ticket prices 17 12 Noun
3. building 16 11 Noun
4. Performance 15 11 Noun
5. Children 13 8 Noun

Table 9

Main text sentiment analysis results - local tourist comments

Text Words (Points) Score
Went in the evening for the night view and the view was pretty, but it was really cold, and we were about to freeze to death. Pretty (+1) 1
Went there at night and the night lighting was great, and many people took pictures, perfect for photo ops. Great (+1); Perfect (+1) 2
... ... ...
... ... ...
... ... ...
February 1, 2023. Travelogue. The “Datang Hibiscus Garden” was the least worthwhile attraction I visited in Xi’an. Strongly do not recommend it. It is just a park with no performance and only a few lights, and I wasted a hundred dollars. Least worthwhile (−1); Do not recommend (−1); Only a few lights (−1) −3

Table 10

Main text sentiment analysis results - non-local tourist comments

Text Words (Points) Score
Not bad for an evening light show. Not bad (+1) 1
The night view was still good. But it’s a bit dark at night, so if you don’t bring a fill light, you can only get black faces in pictures of people. Good (+1); A bit dark (−1) 0
... ... ...
... ... ...
... ... ...
The overall feeling is good and very recommendable; if you choose the package that is more cost-effective, the show tickets to choose the general ticket will be fine. Good (+1); Recommendable (+1); cost-effective (+1) 3
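The scoring illustrated in Tables 9 and 10, where each positive expression adds one point and each negative expression subtracts one, can be sketched as a small lexicon lookup. The word lists below are hypothetical and contain only a few expressions visible in the example rows; the study's full sentiment dictionary is not reproduced, and counting each expression at most once per comment is an assumption.

```python
# Hypothetical mini-lexicon built from a few expressions in Tables 9 and 10.
POSITIVE = ["pretty", "great", "perfect", "good", "not bad"]
NEGATIVE = ["a bit dark", "do not recommend", "least worthwhile"]


def sentiment_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (score, matched) where score = positive hits - negative hits.

    Each expression counts at most once per comment and is matched as a
    case-insensitive substring (a simplification for illustration).
    """
    lowered = text.lower()
    matched = []
    score = 0
    for phrase in positive:
        if phrase in lowered:
            matched.append((phrase, +1))
            score += 1
    for phrase in negative:
        if phrase in lowered:
            matched.append((phrase, -1))
            score -= 1
    return score, matched


comment = (
    "The night view was still good. But it's a bit dark at night, so if you "
    "don't bring a fill light, you can only get black faces in pictures of people."
)
score, matched = sentiment_score(comment)
print(score, matched)
```

Substring matching is fragile (for example, a positive word inside a negated phrase would still count), so a production lexicon would need negation handling and proper tokenization.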

Table 11

Sentiment analysis statistics of main text for local and non-local tourists

Calculation Items Local non-local
Valid 150 150
Missing 150 150
Average 2.5 2.11
Median 2 2
Standard Deviation 2.41 1.92
Variance 5.78 3.67
Minimum Value −4 −3
Maximum Value 10 7

Table 12

Normality test

Category Group K-S Statistic K-S df K-S Significance S-W Statistic S-W df S-W Significance
Emotion score Local 0.14 150 0.00 0.97 150 0.00
Emotion score Non-local 0.16 150 0.00 0.96 150 0.00

K-S: Kolmogorov-Smirnov test with Lilliefors corrected significance probability; S-W: Shapiro-Wilk test; df: degrees of freedom.

Table 13

Summary of independent samples Mann-Whitney U test

Test statistic Emotion score
Mann-Whitney U 10344.50
Wilcoxon W 21669.50
Z −1.22
Asymp. Sig. (2-tailed) 0.22
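As a rough consistency check on the values in Table 13, the statistics can be related to one another by hand. This sketch assumes 150 comments per group, assumes that Wilcoxon W is the rank sum of the group with the smaller rank sum, and uses the tie-free standard deviation; the reported Z of −1.22 is slightly larger in magnitude, which would be consistent with a tie-corrected (smaller) standard deviation.

```latex
\begin{align*}
W &= U + \frac{n_1(n_1+1)}{2}
   = 10344.5 + \frac{150 \cdot 151}{2}
   = 10344.5 + 11325 = 21669.5, \\
\mu_U &= \frac{n_1 n_2}{2} = \frac{150 \cdot 150}{2} = 11250, \\
\sigma_U &= \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}
          = \sqrt{\frac{150 \cdot 150 \cdot 301}{12}} \approx 751.2, \\
Z &\approx \frac{U - \mu_U}{\sigma_U}
   = \frac{10344.5 - 11250}{751.2} \approx -1.21 .
\end{align*}
```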