Where to open a Coffee Shop in London? Mining Online Location-based Social Networks to tackle the site selection problem in urban areas

Christoph Besel
University College London, WC1E 6BT London, UK

Abstract. This paper describes the implementation and evaluation of a network science approach to the site selection problem, leveraged by data from online location-based social networks. The analysis includes static geographic data as well as dynamic user mobility data, both mined from Foursquare. The spatial interactions between business activities and the mobility patterns of customers in London are studied. It is shown how this data can be used to predict optimal retail store locations, for example for opening a new coffee shop in an urban area.

1 Introduction

"Open a new coffee shop in one street corner and it may thrive with hundreds of customers. Open it a few hundred meters down the road and it may close in a matter of months" [6].

1.1 Problem Statement

Choosing a good location for a new business is of prime importance for its success. Consequently, the process of site selection has been researched by a broad spectrum of disciplines for many years. Simple methods such as checklists, as well as more sophisticated statistical models, have been available in land economics and business administration [3] for at least 50 years. Traditional approaches use demographics, census data, consumer surveys and human flow statistics, which are all time-intensive and expensive to obtain. However, with the growing popularity of online location-based social networks like Foursquare, fine-grained mobility data of hundreds of thousands of users has become attainable. The present paper investigates how this new source of data can be exploited to improve the site selection process for a new retail business and, by way of example, answers the introductory question of where to open a new coffee shop in London.
1.2 Main contributions

The main objective of this work is to apply a network-based approach inspired by the work of Jensen [5, 4] to a large-scale Foursquare dataset containing long-term global check-in data, and to leverage it for predicting the optimal location of (new) retail stores in London. The results show how a network science approach can be leveraged by data mined from online location-based social networks to tackle the site selection problem. The work also provides insights into, and empirical evidence for, human behaviour in urban environments that could be used for urban development. For instance, it offers an approach to explain why certain retail shops are opened and visited more frequently in some locations than in others, or why tech startups are particularly highly clustered.

2 Definitions and Background

2.1 The site selection problem

Given a set $L$ of potential areas in which a company considers opening a new retail shop, the site selection problem consists in identifying the area $l \in L$ that will generate the highest revenue, which usually is equivalent to attracting the most customers.

2.2 Spatial Graphs

The nodes and edges of spatial graphs are embedded in metric space. This means that space is relevant, and the network topology alone does not contain all the information. In most cases weighted edges convey the spatial information. Well-known examples of spatial graphs are transportation and mobility networks, the electrical power grid and social contact networks. Spatial networks can further be classified as planar or non-planar. The former can be drawn in a plane such that edges intersect only at their endpoints, which is not possible for non-planar graphs. Highways and urban road networks are typical examples of planar graphs, whereas mobile phone networks and the aviation network are usually non-planar.

2.3 Online Location-based Social Networks

Online Location-based Social Networks allow users to share their geographic location.
Users can check in to venues in their immediate geographic proximity and share this information with businesses (for example to get loyalty discounts) and friends. As mobile phones with cheap but high-quality GPS sensor modules became widespread, location-based networks emerged and became increasingly popular. The most popular location-based social network is Foursquare, which was launched in New York seven years ago and attracts about 50 million monthly active users today. Even though check-in data is not publicly available, users can post their check-ins on Twitter (which can also be set as a default option), which allows the data to be collected through the Twitter Streaming API. The dataset used in the present research project was collected with this approach; see Section 4.1 for more details.

2.4 Static geographic versus dynamic mobility data

To offer its service, Foursquare has to provide one of the most extensive point of interest (POI) databases (besides Google Maps), with exact geo-locations and a fine-grained hierarchical category system, which is a valuable source of (static) geographic information. Foursquare datasets not only reveal places and their number of check-ins, but also provide temporal data, as subsequent check-ins can be seen as user movement/mobility. Following [6], this paper distinguishes between static geographic (spatial) data and dynamic mobility (temporal-spatial) data.

3 Related Work

3.1 The Site Selection Problem in economics

The selection of sites for new businesses has been researched in business administration and land economics for at least 50 years (see for example [2, 1]). The proposed approaches range from simple checklists and scoring models following an economic rationale to more complex statistical models. However, as Hernandez and Bennison (2000) [3] point out, despite the existence of such models most business owners in practice take their retail location decisions based on intuition and gut feeling, seeing site selection more as an "art" than a science. Other reasons are the prohibitive costs and the time it takes to acquire all the data the models require.

3.2 Application of Network Science to the site selection problem

In his seminal work [5, 4], Jensen analysed the distribution of retail store locations in the city of Lyon (France).
The main rationale of his work is that the deviation of the empirical distribution of retail store locations from a purely random distribution, which by definition is non-interacting, makes it possible to infer the interactions between retail store activities and thereby gain a deeper understanding of the principles governing the structure of their locations. Points in the two-dimensional geographic space represent retail stores/venues of a particular category, for example "Coffee Shop", "Restaurant", "Shoe Shop", et cetera. One can be interested in the interaction of points among themselves (for example "do coffee shops open next to each other or do they avoid each other?") or with other points ("open coffee shops next to restaurants or offices?"). Two different coefficients are introduced to quantify these two kinds of interaction: the intra coefficient and the inter coefficient of categories.
Given points of two different categories A and B, there are $N_t$ sites in total, of which $N_A$ are of type A and $N_B$ are of type B. For a given site $S$, the total number of other sites within a radius $r$ around this site is denoted by $N_t(S, r)$. The numbers of sites of type A and B within that radius are denoted by $N_A(S, r)$ and $N_B(S, r)$ respectively.
Intra Coefficient

The intra coefficient measures the distribution of retail stores of category A around retail stores of the same category ("how are other coffee shops spatially distributed around a coffee shop?"). Under the reference law, which is a purely random distribution, the local concentration of stores of type A around a given store $A_i$ of type A, represented by the ratio $N_A(A_i, r)/N_t(A_i, r)$, should on average not depend on the presence of this store, and should thus be (almost) equal to the global concentration $N_A/N_t$ of stores of type A in the whole town. Any statistically significant deviation from this distribution represents a structural pattern of retail store placement, which is captured and quantified by the following intra coefficient:
$$M_{AA} = \frac{N_t - 1}{N_A (N_A - 1)} \sum_{i=1}^{N_A} \frac{N_A(A_i, r)}{N_t(A_i, r)}$$
The expected value of the intra coefficient is 1, which leads us to deduce the following qualitative behaviour: If the observed value of the intra coefficient is greater than 1, we assume that A stores tend to aggregate, whereas a lower value would indicate a dispersion tendency.
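The intra coefficient can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this work: sites are assumed to be `(x, y, category)` tuples in planar coordinates (e.g. metres), the function name `intra_coefficient` is hypothetical, and sites of type A without any neighbour inside the radius are simply skipped, since their ratio is undefined.

```python
import math

def intra_coefficient(sites, category, r):
    """Sketch of M_AA for sites given as (x, y, category) tuples.

    Assumptions (not from the paper): planar coordinates, Euclidean
    distance, and A sites with no neighbour within r contribute 0.
    """
    n_t = len(sites)
    a_idx = [i for i, s in enumerate(sites) if s[2] == category]
    n_a = len(a_idx)
    if n_a < 2:
        raise ValueError("need at least two sites of the category")

    total = 0.0
    for i in a_idx:
        x0, y0, _ = sites[i]
        # All other sites within radius r of site i: N_t(A_i, r)
        near = [s for j, s in enumerate(sites)
                if j != i and math.hypot(s[0] - x0, s[1] - y0) <= r]
        if not near:
            continue  # ratio undefined, skipped by assumption
        # Neighbours of the same category: N_A(A_i, r)
        n_a_near = sum(1 for s in near if s[2] == category)
        total += n_a_near / len(near)

    return (n_t - 1) / (n_a * (n_a - 1)) * total


# Two A sites next to each other, two B sites far away: A aggregates.
sites = [(0, 0, "A"), (1, 0, "A"), (10, 0, "B"), (11, 0, "B")]
print(intra_coefficient(sites, "A", 2.0))
```

In this toy layout every A site sees only another A site nearby, so the coefficient is well above 1, which matches the aggregation reading given above.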
Inter Coefficient

The definition of the inter coefficient is very similar to that of the intra coefficient. However, this time the average spatial distribution of type B points in the local surroundings of type A points is compared to a uniform random distribution ("are coffee shops more frequent around restaurants than elsewhere in town?"). The deviation from a random distribution is quantified by the following inter coefficient:
$$M_{AB} = \frac{N_t - N_A}{N_A N_B} \sum_{i=1}^{N_A} \frac{N_B(A_i, r)}{N_t(A_i, r) - N_A(A_i, r)}$$
As the expected value of the inter coefficient is 1 again, we can deduce the following qualitative behaviour: If the observed value is greater than 1, we assume that A stores have a tendency to attract stores of type B, whereas a lower value indicates a rejection tendency.
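A matching sketch for the inter coefficient is given below. It assumes the normalisation $(N_t - N_A)/(N_A N_B)$, which is the prefactor that makes the expected value equal to 1 under a random distribution, and the same hypothetical data layout as before: `(x, y, category)` tuples in planar coordinates. The function name is illustrative only.

```python
import math

def inter_coefficient(sites, cat_a, cat_b, r):
    """Sketch of M_AB for sites given as (x, y, category) tuples.

    Assumptions (not from the paper): planar coordinates, Euclidean
    distance, cat_a != cat_b, and A sites without any non-A neighbour
    within r contribute 0 because their ratio is undefined.
    """
    n_t = len(sites)
    a_sites = [s for s in sites if s[2] == cat_a]
    n_a = len(a_sites)
    n_b = sum(1 for s in sites if s[2] == cat_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both categories must be present")

    total = 0.0
    for x0, y0, _ in a_sites:
        # Non-A sites within r: N_t(A_i, r) - N_A(A_i, r)
        non_a_near = [s for s in sites
                      if s[2] != cat_a
                      and math.hypot(s[0] - x0, s[1] - y0) <= r]
        if not non_a_near:
            continue  # ratio undefined, skipped by assumption
        # B sites within r: N_B(A_i, r)
        n_b_near = sum(1 for s in non_a_near if s[2] == cat_b)
        total += n_b_near / len(non_a_near)

    return (n_t - n_a) / (n_a * n_b) * total


# Every A site has a B site next door; one unrelated C site far away.
sites = [(0, 0, "A"), (1, 0, "B"), (10, 0, "A"), (11, 0, "B"), (50, 0, "C")]
print(inter_coefficient(sites, "A", "B", 2.0))
```

Because the only non-A neighbours of A sites are B sites, the value exceeds 1 here, i.e. A appears to attract B in this toy example.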
Analysing Retail Store Interaction with Network Science

From the coefficients defined above a network of retail store structure can be derived, where the nodes represent the different retail store categories (e.g. Coffee Shop, Italian Restaurant, Shoe Store, etc.) and the weighted links are given by $a_{AB} = \log(M_{AB})$. This results in a graph with positive links and negative "anti-links", the latter being repulsive links between nodes.
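The step from coefficients to link weights can be illustrated as follows. The sketch assumes the natural logarithm (the base is not stated in the text) and a hypothetical input dictionary mapping category pairs to their coefficient values; the numbers in the usage example are made up for illustration and are not results of this work.

```python
import math

def link_weights(coefficients):
    """Map {(A, B): M_AB} to {(A, B): log(M_AB)}.

    Pairs with a non-positive coefficient are dropped because the
    logarithm is undefined for them (an assumption of this sketch).
    """
    return {pair: math.log(m) for pair, m in coefficients.items() if m > 0}

def split_links(weights):
    """Separate attractive links (weight > 0) from repulsive anti-links."""
    attract = {p: w for p, w in weights.items() if w > 0}
    repel = {p: w for p, w in weights.items() if w < 0}
    return attract, repel


# Illustrative values only.
coefficients = {
    ("Bakery", "Coffee Shop"): 1.5,
    ("Park", "Coffee Shop"): 0.4,
}
attract, repel = split_links(link_weights(coefficients))
print(sorted(attract), sorted(repel))
```

A coefficient above 1 yields a positive weight and a coefficient below 1 a negative one, so the sign of the link directly encodes attraction versus repulsion.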
Based on the graph a clustering algorithm can be applied in order to identify communities. Jensen showed that more than 90 % of positive interactions could be captured inside the groups. This strong cohesion shows the high quality of the clustering. Interestingly, the resulting clusters also reflect categories of commonly used commercial classification systems, which validates the approach further. It also shows how much structural information can be recovered from location data alone.

A network-based quality index for retail store locations

Based on the defined coefficients and the derived network, a quality index for retail store locations can be defined. The basic idea is that the more places attracting stores of the given category exist in the area, the better the quality of the location. For this reason a graph based on the inter-category coefficient is built (corresponding to the intra coefficient). The quality $Q_A$ of a given location/site $S$ and corresponding radius $r$ for opening a new retail store of type A is the sum, over all categories, of the deviation of the number of stores of each category around the location from its average, weighted by the corresponding in-link in the inter-category graph:

$$Q_A(S, r) = \sum_{X \in T} a_{XA} \left( N_X(S, r) - \overline{N_X}(S, r) \right)$$

where $T$ is the set of all retail store categories and $\overline{N_X}(S, r)$ is the average number of type X stores in the area of A stores.

3.3 Leveraging Jensen's models with online location-based networks

Jensen published his latest work on the network-based quality index for retail store locations in the same year Foursquare was launched. Even though geographic information systems (GIS) had already been around for some years at that time, it was quite expensive to collect the relevant data even for a small city like Lyon, and this often had to be done manually. However, with the advent of online location-based social networks like Foursquare, the necessary geographical and user mobility data became available at a scale and level of detail never seen before.
As it became clear that this data could be useful for scientific research, several publications explored these new possibilities.

A model of urban mobility patterns

In their widely noted work, Noulas et al. [8] analysed the movement of people in urban environments around the globe using Foursquare data. They found that, even though human movements vary between cities, this is only due to a different distribution of places. They proposed a universal law for human mobility that is based on the number of places (and their distribution) between origin and destination rather than on pure physical distance (as in other works), and which could accurately capture
the mobility patterns in 34 different cities around the globe. With their work the authors were among the first to show the extensive opportunities that datasets of online location-based social networks offer to scientific research.

Predicting retail store placements with Foursquare

Foursquare itself allows business owners to register for a business account and view basic statistics about the check-ins to their business (temporal patterns, basic demographics of users, etc.). Besides that, Karamshuk et al. [6] proposed a machine learning approach to predict optimal retail store locations based on a Foursquare dataset. For this they mined a broad range of static geographic features, including one metric proposed by Jensen, and dynamic mobility features. In their evaluation the proposed supervised learning approach was able to identify the real location of a retail store (ground truth) in more than 8 out of 10 cases. However, they analysed only three particular retail chains, and the dataset they used was restricted to New York and rather small. Apart from this, they did not use the full network-based approach of Jensen, but only took the inter coefficient as one feature of their machine learning model. Nevertheless, their work proved the relevance of data from online location-based social networks for predicting optimal retail store locations.

4 Methodology

The main objective of this research project is to leverage the work of Jensen [5, 4] with a large Foursquare dataset and apply it to London. The following section gives a brief description of the Foursquare dataset, followed by an overview of the analysis performed and the metrics used.

4.1 Yang et al. Foursquare dataset

The dataset used was published by Yang et al. [10, 9] and includes long-term (about 18 months, from April 2012 to September 2013) global-scale check-in data collected from Foursquare.
It contains 33,278,683 check-ins by 266,909 users at 3,680,126 venues (in 415 cities in 77 countries), making it the most extensive publicly available Foursquare dataset (to the best of the author's knowledge). The 415 cities, including London, are the most checked-in cities in the world (with at least 10,000 check-ins each). The dataset is split into three files. The first file contains all collected check-ins, each with an anonymised user ID, a Foursquare venue ID and a corresponding timestamp. The second file contains the data of all checked-in venues, including their unique venue ID, their geo-location (longitude and latitude), the Foursquare venue category name (e.g. "Coffee Shop") and a country code. The last file lists the 415 cities in which the check-ins took place, with their geo-location and country code.
Following the classification introduced in section 2.4, the first data file contains dynamic user mobility (temporal-spatial) data and the second contains the static geographic (spatial) data.

4.2 Filtering the dataset and summary statistics

In a first step, only the relevant data was extracted from the dataset by filtering the venue locations within the geographical boundaries of London (coordinates of the rectangle containing Greater London: 51°15'36.0"N 0°29'24.0"W and 51°41'24.0"N 0°15'36.0"E). This led to a remaining number of 28,687 venues and 188,530 corresponding check-ins. As the distribution of the fine-grained Foursquare venue categories follows a power law with a long tail of categories covering only a single (or very few) venues, we consider only categories that cover at least 120 venues/places. This results in 50 different categories that still cover about two thirds of all venues in London (Jensen distinguished 55 different categories; note, however, that not all of our categories are retail trades in the strict sense).

To get a first overview of the dataset, a few summary statistics are then calculated. Following other authors' good practice, a visualisation of the data is generated, as it often gives a very quick overview of the underlying structure of spatial distributions. In this case a hexbin plot [7] with a grid size of 400 is drawn. This type of bivariate histogram plot is often more informative than a simple scatter plot, especially if the dataset is very large. For a similar summary statistic of the dynamic mobility data, the distribution of check-ins is analysed. To this end, the Complementary Cumulative Distribution Function (CCDF) is calculated for all check-ins and for the check-ins to venues of particular categories.
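The filtering and the CCDF described above can be sketched as follows. The bounding box is read here as 51°15'36.0"N 0°29'24.0"W to 51°41'24.0"N 0°15'36.0"E, which is an interpretation of the coordinates given in the text; all function names are hypothetical, and the CCDF is defined as P(X >= x), one of two common conventions.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

# Bounding box around Greater London (assumed reading of the coordinates).
SOUTH = dms_to_decimal(51, 15, 36.0, "N")
WEST = dms_to_decimal(0, 29, 24.0, "W")
NORTH = dms_to_decimal(51, 41, 24.0, "N")
EAST = dms_to_decimal(0, 15, 36.0, "E")

def in_london(lat, lon):
    """True if the point lies inside the bounding box."""
    return SOUTH <= lat <= NORTH and WEST <= lon <= EAST

def ccdf(counts):
    """Return sorted (x, P(X >= x)) pairs for a list of check-in counts."""
    n = len(counts)
    return [(x, sum(1 for c in counts if c >= x) / n)
            for x in sorted(set(counts))]


# Usage with made-up data: (venue_id, lat, lon) and check-ins per venue.
venues = [("v1", 51.5074, -0.1278), ("v2", 48.8566, 2.3522)]
london = [v for v in venues if in_london(v[1], v[2])]
print([v[0] for v in london])
print(ccdf([1, 1, 2, 5]))
```

Plotting the CCDF pairs on doubly logarithmic axes is the usual way to check visually for the power-law behaviour mentioned in the results.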
4.3 Mining online location-based social networks to find the optimal retail store location

The main part of the research project consists in the analysis of static geographic (spatial) and dynamic mobility (temporal-spatial) data to predict optimal retail store locations in London (for example for opening a new coffee shop).

Static geographic (spatial) data analysis

For the static geographic (spatial) data, the analysis mainly consists in calculating Jensen's inter and intra coefficients as introduced in section 3.2. Subsequently, a spatial graph with the 50 different venue categories as nodes and positive/negative link weights based on the corresponding coefficients is generated as described in section 3.2.

Dynamic mobility (temporal-spatial) data analysis

For the dynamic mobility data, the user movement between venues of the different categories (business activities) is analysed. More precisely, this movement is represented by the users' transitions between places, inferred from their temporally consecutive
check-ins to the venues. If the same user checks in to a venue of category A and after some time to a venue of category B in the same city, this is seen as an urban user movement (as defined in [8]). Based on the user movements a transition ratio r is defined. It is the ratio of the transition probability of a given pair of venue categories to the random transition probability between any pair of categories [6]:
$$r_{AB} = (|\Gamma| - 1) \cdot \frac{|\{(o, d) \mid (o, d) \in T \wedge (o \in A \wedge d \in B)\}|}{|\{(o, d) \mid (o, d) \in T \wedge d \in B\}|}$$

where $T$ is the set of all transitions (consecutive check-ins of the same user, going from origin $o$ to destination $d$) and $\Gamma$ is the set of all venue categories ($A, B \in \Gamma$). Once again, the calculated transition ratio r can be used to weight the links of a category graph (whose nodes are the venue categories). The higher the weight of an incoming edge, the higher the probability of attracting visitors coming from this type of venue.
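The transition ratio can be sketched as below. The sketch reads the ratio as (number of categories minus 1) times the share of transitions into B that originate in A, and assumes check-ins are available as `(user_id, timestamp, category)` tuples; both function names and the sample data are hypothetical. No time window between consecutive check-ins is enforced here.

```python
from collections import defaultdict

def extract_transitions(checkins):
    """Build (origin_category, destination_category) pairs from
    consecutive check-ins of the same user, ordered by timestamp."""
    by_user = defaultdict(list)
    for user, timestamp, category in checkins:
        by_user[user].append((timestamp, category))

    transitions = []
    for visits in by_user.values():
        visits.sort()
        for (_, origin), (_, dest) in zip(visits, visits[1:]):
            transitions.append((origin, dest))
    return transitions

def transition_ratio(transitions, cat_a, cat_b, n_categories):
    """Share of transitions into cat_b that start in cat_a, scaled by
    (n_categories - 1) so that a uniform baseline gives roughly 1."""
    into_b = sum(1 for _, d in transitions if d == cat_b)
    if into_b == 0:
        return 0.0
    a_to_b = sum(1 for o, d in transitions if o == cat_a and d == cat_b)
    return (n_categories - 1) * a_to_b / into_b


# Made-up check-ins for three users.
checkins = [
    ("u1", 1, "Bakery"), ("u1", 2, "Coffee Shop"), ("u1", 3, "Bakery"),
    ("u2", 1, "Park"), ("u2", 3, "Coffee Shop"),
    ("u3", 1, "Bakery"), ("u3", 2, "Coffee Shop"),
]
transitions = extract_transitions(checkins)
print(transition_ratio(transitions, "Bakery", "Coffee Shop", n_categories=3))
```

Keeping the raw count `a_to_b` alongside the ratio is useful, since a high ratio can rest on very few observed transitions.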
5 Results

This section describes the results of this work in more detail. A first overview of the dataset is given by summary statistics. Subsequently, the static geographic (spatial) data and the dynamic mobility data are analysed in turn.
5.1 Summary statistics

For an overview of the spatial distribution of different venue types, corresponding hexbin plots were generated. As can be seen from fig. 1, taken all together the venues reflect the urban structure of London (parks, rivers and population density are clearly visible); however, the different venue types each have a distinctive spatial distribution. Coffee Shops tend to aggregate strongly, forming a cluster in central London. This becomes especially clear when they are compared to Cafés (there is an equal number of both in London), which are far less clustered and more similar to the reference distribution of all venues in London. This holds to an even greater extent for Pubs, which are the most frequent point of interest type in London. As Jensen [5] already found for Lyon, Clothing Stores are highly clustered, but in small clusters spread over the whole city. The clusters usually represent large shopping centres like Westfield or shopping streets such as Oxford Street. As expected, train stations follow a very regular pattern (reflecting the London transport network). It is, however, more interesting to note that where Coffee Shops are located outside of the large central cluster, their location correlates strongly with train stations.

As is clearly visible from fig. 2, the cumulative distribution of check-ins follows a power law spanning up to five orders of magnitude. Just like the spatial distribution, the different categories of venues have a distinctive check-in distribution. Even though the number of Pubs and Cafés in London is higher than (or equal to) the number of Coffee Shops, Coffee Shops have a significantly higher check-in volume, and the CCDF of check-ins to coffee shops consequently features a longer tail. This means that either single coffee shops attract a much higher number of visitors or that customers of Coffee Shops are more likely to check in on Foursquare. These results also confirm the findings of Karamshuk et al. [6] regarding the check-in distribution of Starbucks coffee shops in New York City.

Fig. 1. Hexbin plots, gridsize = 400. From the top left to the bottom right: all venues, Coffee Shops, Cafés, Pubs, Clothing Stores and train stations in London

Fig. 2. Complementary Cumulative Distribution Function (CCDF) of check-ins per place for all venues in the dataset (left) and for selected categories (right: Coffee Shops, Pubs, Cafés). Horizontal axis: check-in volume.
5.2 Static geographic (spatial) data analysis Regarding the analysis of the static geographical data the Foursquare dataset provides, the intra and inter coefficients introduced by Jensen [5] and defined in section 3.2 are calculated, which are used to build weighted location category graphs subsequently.
Jensen's Intra coefficient graph The intra-coefficient conveys the following qualitative behavior: If the observed value of it is greater than 1, we assume that the stores of the given category tend to aggregate, whereas a lower value indicates a dispersion tendency.
Table 1. Selected results of Jensen's intra coefficients (r = 200 m)

Category          intra coefficient
Bar                4.3603
Pub                3.1014
Café               4.1271
Coffee Shop        2.9905
Clothing Store    11.2942
Home (private)    15.0596
Tech Startup      22.0613
Table 1 lists selected values of Jensen's intra coefficient, calculated with a radius of 200 meters. All of the business activities show a tendency towards aggregation, although to different extents. In economic theory a higher local concentration of the same business activity has two counteracting effects: the overall attractiveness of the area increases (a so-called positive external network effect), which attracts more people from outside, but the demand is also divided among more businesses (competition). Among venues offering food and drinks, bars show the strongest local aggregation, which implies that they profit from the proximity to other bars and that the higher demand generated thereby compensates for the higher competition. Interestingly, Cafés and Pubs show a higher local aggregation than Coffee Shops, which from an economic point of view seems entirely reasonable as
Coffee Shops sell products that are far more standardised (so-called homogeneous goods) than those of Cafés and Pubs, which often occupy niches and therefore offer a more heterogeneous range of products. Contrary to what one might think, this does not conflict with the different spatial distributions of Coffee Shops and Cafés, as the intra coefficient only measures local aggregation (in this case within a radius of 200 meters). Indeed, if the radius is increased, the overall spatial distributions of Coffee Shops (more central) and Cafés (more dispersed) are also reflected in the intra coefficients (1.3 vs. 1.4 for a radius of 1 kilometer). The high intra coefficient of private homes reflects the definition of strictly residential areas by urban planners. It is more interesting to note that Tech Startups show one of the highest intra coefficients, providing striking evidence for the heavily researched question of positive external network effects for these business activities (keywords "Silicon Valley", "Silicon Roundabout").

Jensen's Inter coefficient graph

The inter coefficient measures the interaction of different business activities by calculating the deviation of their local distributions from a reference distribution, see section 3.2. The type of graph shown in fig. 3 is generated by linking the categories as nodes with the log value of the corresponding inter coefficients. If we observe a positive value, we assume that the stores of the two categories attract each other, whereas negative values indicate a rejection tendency. Table 2 shows the highest and lowest weighted incoming links for coffee shops, representing the categories that attract and repel coffee shops most, respectively.
The location categories/business activities that attract coffee shops most, and therefore form the ideal environment for opening a new coffee shop, include clothing stores, train stations (Platform), bank branches, sandwich places, bakeries and Italian restaurants. On the other hand, it would not be a good idea to open a new coffee shop in a purely residential area, in parks/gardens or next to a church. The category graph shown in fig. 3 visualizes this information. A heatmap centered on the Coffee Shop category node represents the distance of the neighbor nodes (in terms of the inter coefficient) with a color code. The same analysis could be carried out for all other location categories. Some of the most attracting and repelling categories can be found in the appendix.

5.3 Dynamic mobility (temporal-spatial) data analysis

The mobility data captured by consecutive Foursquare check-ins adds a completely new, dynamic perspective to the static geographic type of data used in Jensen's work. A first analysis of the traveled distances reveals a power-law distribution, which is shown in fig. 4. This confirms the findings regarding urban mobility patterns in [8] and justifies the strong focus on the local area around venues in the static geographical analysis.
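The traveled distance between two consecutive check-ins can be approximated from the venue coordinates. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6371 km; this choice of distance measure, like the function names, is an assumption for illustration and is not taken from the analysis itself.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def ecdf(values):
    """Return sorted (x, P(X <= x)) pairs, the empirical CDF."""
    n = len(values)
    return [(x, sum(1 for v in values if v <= x) / n)
            for x in sorted(set(values))]


# Distance between two made-up check-in locations in central London.
d = haversine_km(51.5074, -0.1278, 51.5155, -0.0922)
print(round(d, 2), "km")
```

Applying `haversine_km` to every pair of consecutive check-ins of a user and passing the resulting list to `ecdf` gives the kind of distribution summarised in the distance figure.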
Table 2. Top 10 Coffee Shop Attractors/Distractors

Attractors                 inter coefficient
Clothing Store              0.565090
Platform                    0.510201
Bank                        0.408108
Sandwich Place              0.319318
Sushi Restaurant            0.318352
Bakery                      0.289402
Italian Restaurant          0.285620
Burger Joint                0.233673
Fast Food Restaurant        0.206584
Salon / Barbershop          0.201619

Distractors                inter coefficient
Home (private)             -1.279373
Park                       -1.036917
Other Great Outdoors       -0.860271
Garden                     -0.857003
Residential Building       -0.640580
Bus Line                   -0.569095
Church                     -0.509075
Road                       -0.480610
Neighborhood               -0.400648
Gym / Fitness Center       -0.308092
Fig. 3. Category graph (inter-coefficient weighted links), weight sensitive heatmap centered around Coffee Shop node, color displays attraction/distraction
The analysis of user movements between the different venue types provides further insights for selecting the optimal store location. Table 3 lists the location categories with the highest outgoing transition ratio and absolute number of transitions towards coffee shops. The transition ratio measures the normalised probability of users moving to a given destination. A customer is six times as likely to visit a coffee shop after having been to a Salon / Barbershop (compared to a random baseline model). Many of the main sources of visitors, including bakeries, sandwich places, banks and restaurants, are also among the categories with the highest inter coefficients, which supports Jensen's static geographic approach and the urban mobility pattern found in [8]. However, some business activities appear disproportionately often in proximity to coffee shops but are not a source of visitors (for example clothing stores). Conversely, plausible sources of visitors that are not always in geographic proximity to coffee shops, such as Tech Startups and offices, can now be identified.

This becomes particularly clear if we include the absolute number of transitions in our analysis. Even though the transition probability of one location category might be high, the absolute number could be very small. This fact, which is not dealt with in [6], results in a significantly different ranking. Now we can see that most visits to coffee shops come from stations (train and subway), other coffee shops (possibly users who only check in to coffee shops) and offices.

Fig. 4. Empirical Cumulative Distribution Function of traveled distances

6 Discussion

Key results

The present work has shown that methods from network science, leveraged by data mined from online location-based social networks like Foursquare, are a powerful approach to tackle the site selection problem.
This was shown by analysing the best places to open a new coffee shop in London as an example, however, this could have been done with any type of venue.
Table 3. Location categories with highest outgoing transition towards Coffee Shops

Category                 transition ratio
Salon / Barbershop       5.990220
Burger Joint             5.144686
Coffee Shop              4.996506
Café                     4.664361
Bakery                   4.489279
Gym / Fitness Center     4.396619
Sandwich Place           4.342638
Sushi Restaurant         4.336283
Tech Startup             4.177620
Restaurant               4.132138

Category                 number of transitions
Train Station
Coffee Shop
Interesting observations

Many of the fundamental results in related works could also be reproduced for London. It is interesting to note that the static (spatial data) and dynamic (temporal-spatial data) perspectives partly reflect each other, but are also complementary.

Lessons Learned and Recommendations

During the analysis it proved useful to visualise the data first (even though it took some time to find suitable graphs), so that the basic (spatial) distributions became clear. Moreover, proper normalisation of the calculated metrics (coefficients and ratios) was particularly important in this work, as the uneven distributions (of venues as well as of user movements) would otherwise be misleading. The same applies to calculating different types of metrics to capture different (and complementary) perspectives of the data. For example, if one considered only the transition ratios and ignored the absolute number of movements (or the other way round), wrong conclusions would be drawn.

Limitations and Future work

The validity of the results is obviously limited by the validity of the Foursquare dataset itself. It is doubtful whether Foursquare users are a valid sample of the whole population, or even of all customers. Additionally, users might be biased towards checking in to venue categories that make above-average use of Foursquare for marketing campaigns and loyalty programs (like coffee shops). The data quality could be further improved by implementing more sophisticated ways of extracting transitions. For example, only consecutive check-ins within a specific period of time could be considered. This, however, would further reduce the available data. It would also be interesting for future research to compare the results of the site selection problem in different cities around the globe, to see whether there are common patterns like those found for urban mobility in [8].
References

[1] M. F. Goodchild. "ILACS: A Location Allocation Model for Retail Site Selection". In: Journal of Retailing 60 (1984), pp. 84-100.
[2] G. Heald. "The application of the automatic interaction detector (AID) programme and multiple regression techniques to the assessment of store performance and site selection". In: Journal of the Operational Research Society 23.4 (1972), pp. 445-457.
[3] T. Hernandez and D. Bennison. "The art and science of retail location decisions". In: International Journal of Retail & Distribution Management 28.8 (2000), pp. 357-367.
[4] P. Jensen. "Analyzing the localization of retail stores with Complex Systems tools". In: International Symposium on Intelligent Data Analysis. Springer. 2009, pp. 10-20.
[5] P. Jensen. "Network-based predictions of retail store commercial categories and optimal locations". In: Physical Review E 74.3 (2006), p. 035101.
[6] D. Karamshuk et al. "Geo-spotting: mining online location-based services for optimal retail store placement". In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM. 2013, pp. 793-801.
[7] N. Lewin-Koh. "Hexagon binning: an overview". 2011.
[8] A. Noulas et al. "A tale of many cities: universal patterns in human urban mobility". In: PLoS ONE 7.5 (2012), e37027.
[9] D. Yang, D. Zhang, and B. Qu. "Participatory cultural mapping based on collective behavior in location based social networks". In: ACM Transactions on Intelligent Systems and Technology (2015). In press.
[10] D. Yang et al. "NationTelescope: Monitoring and visualizing large-scale collective behavior in LBSNs". In: Journal of Network and Computer Applications 55 (2015), pp. 170-180.

File: where-to-open-a-coffee-shop-in-london-mining-online-location.pdf
Published: Tue Sep 5 17:36:25 2017
Pages: 15
File size: 1.98 Mb

