Define Footfall Using GPS Data

Location-based GPS data provides accurate location points for individuals within a study area. This project was completed as a master's dissertation in cooperation with CBRE to define footfall in four popular shopping areas in London. Highlights: comparing four machine learning methods for building a footfall model that takes weather, special events, and holidays into account; and deploying a semantic trajectory model to define an attractiveness score for each retail unit and to calculate similarity scores across demographic profiles.

Introduction

With the rapid development of wireless communications and smart mobile devices, recorded trajectories of moving objects create opportunities to discover new insights into human behaviour. Because GPS trajectory data carries both temporal and spatial information, it can be applied to urban planning, smart transportation, and commercial development. Traditional human behaviour analysis relies on manually collected data such as questionnaires, which limits both efficiency and accuracy. With the spread of smartphones, however, individual activity chains can be studied in detail. Retailers are aware that this technology can generate new commercial insights, and every year they try to adopt new technologies that give them a deeper understanding of where people visit and stay in shopping areas.

Methodology

This study explores GPS data covering the whole of 2017 for four shopping areas: Oxford Street, Westfield Shopping Centre, Kensington High Street, and Stratford Westfield City. To identify the critical factors that contribute to the footfall count, several machine learning methods (random forest, XGBoost, neural network, and deep learning) are used to predict footfall in the four shopping areas from a set of temporal factors and weather conditions.
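As a rough illustration of the kind of input such a model might take, the sketch below turns one area-day record into temporal and weather features. The field names (`rain_mm`, `temp_c`, `has_event`), the holiday set, and the choice of features are assumptions made for illustration, not the schema used in the study.

```python
from datetime import date

# Hypothetical holiday calendar; the calendar used in the study is not given.
HOLIDAYS_2017 = {date(2017, 1, 1), date(2017, 12, 25), date(2017, 12, 26)}


def build_features(day, rain_mm, temp_c, has_event=False):
    """Return a flat feature dict for one area-day (all names are assumed)."""
    features = {
        "is_weekend": int(day.weekday() >= 5),  # Saturday=5, Sunday=6
        "is_holiday": int(day in HOLIDAYS_2017),
        "has_event": int(has_event),
        "rain_mm": float(rain_mm),
        "temp_c": float(temp_c),
    }
    # One-hot encode the month so a model can isolate, e.g., a December effect.
    for m in range(1, 13):
        features[f"month_{m}"] = int(day.month == m)
    return features


# Illustrative values only, not observations from the dataset.
row = build_features(date(2017, 12, 25), rain_mm=2.4, temp_c=6.0)
```

A list of such rows, paired with daily footfall counts, could then be passed to any of the regressors listed above.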
Furthermore, to analyse the semantic trajectory of each individual in the GPS data, a retail attractiveness score is defined from the stay points detected in each individual’s trajectory. Trajectory similarity is then calculated from space, time, and semantics to test whether people who come from the same ward in London are more likely to have similar trajectories.
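The stay-point detection method is not specified here. A common approach, sketched below under that caveat, marks a stay wherever consecutive points remain within a distance threshold of an anchor point for at least a minimum duration. The 50 m and 5 minute thresholds, the function names, and the sample track are illustrative assumptions.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def detect_stay_points(points, dist_m=50.0, min_secs=300.0):
    """points: time-ordered list of (timestamp_seconds, lat, lon).

    Returns a list of (mean_lat, mean_lon, arrive_ts, leave_ts).
    The default thresholds are assumptions, not values from the study.
    """
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Extend the window while points stay within dist_m of the anchor.
        while j < n and haversine_m(points[i][1], points[i][2],
                                    points[j][1], points[j][2]) <= dist_m:
            j += 1
        duration = points[j - 1][0] - points[i][0]
        if duration >= min_secs:
            window = points[i:j]
            mean_lat = sum(p[1] for p in window) / len(window)
            mean_lon = sum(p[2] for p in window) / len(window)
            stays.append((mean_lat, mean_lon, points[i][0], points[j - 1][0]))
            i = j  # resume after the detected stay
        else:
            i += 1
    return stays


# Made-up track: six minutes near one spot, then movement away.
track = [(0, 51.5150, -0.1420), (120, 51.5151, -0.1420),
         (240, 51.5150, -0.1421), (360, 51.5151, -0.1421),
         (420, 51.5200, -0.1420), (480, 51.5250, -0.1420)]
stays = detect_stay_points(track)
```

Each detected stay could then be matched to the nearest retail unit to give the trajectory its semantic label.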

Figure 1. Methodology Overview

Key Findings

Figure 2 shows the influence of each variable. December is the factor that contributes most to the footfall count across the four shopping areas. Interestingly, however, the December boost has little impact on Kensington High Street. There, weather conditions dominate the footfall count instead, and Oxford Street shows the next-strongest weather effect. A likely reason is that Oxford Street and Kensington High Street are the two open-air shopping areas, and shoppers pay more attention to the weather when they choose an open-air place.

Figure 2. Feature Importance
feature importance.png

As for the semantic trajectory analysis, the attractiveness of each retail shop in the four study areas is defined for different time periods using stay points. Figure 3 shows the density of the calculated stay points in the four shopping areas.

Figure 3. Stay Points Heatmap
stay-points-heatmap.png

Table 1 shows, as an example, the attractiveness scores of 91 shops on Oxford Street. Clothes and shoe stores account for nearly half of the overall score during the daytime, and their share gradually drops from 6 p.m. to 9 p.m. Meanwhile, the share of coffee shops and supermarkets rises steadily from 9 p.m. to 8 a.m. In other words, the crowd tends to visit clothes or shoe stores during the day and evening, and is more likely to cluster around coffee shops or supermarkets such as Costa and Retail 24 at night.

Table 1. Retail attractiveness score of shops on Oxford Street
table.jpeg
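The exact formula behind the attractiveness score is not given here. One plausible reading, sketched below, treats it as each shop's share of the stay points recorded in a time period. The period boundaries follow the times mentioned above (daytime, 6 p.m. to 9 p.m., 9 p.m. to 8 a.m.); the shop labels and visit data are hypothetical.

```python
from collections import Counter, defaultdict


def time_period(hour):
    """Map an hour (0-23) to a coarse period; the labels are assumptions."""
    if 8 <= hour < 18:
        return "day"
    if 18 <= hour < 21:
        return "evening"
    return "night"


def attractiveness_scores(stay_visits):
    """stay_visits: iterable of (shop_name, hour_of_day), one per stay point.

    Returns {period: {shop: share of that period's stay points}}.
    """
    counts = defaultdict(Counter)
    for shop, hour in stay_visits:
        counts[time_period(hour)][shop] += 1
    scores = {}
    for period, shop_counts in counts.items():
        total = sum(shop_counts.values())
        scores[period] = {shop: c / total for shop, c in shop_counts.items()}
    return scores


# Made-up visits, not data from the study.
visits = [("Shop A", 10), ("Shop A", 11), ("Shop B", 12), ("Cafe C", 14),
          ("Cafe C", 22), ("Cafe C", 23), ("Shop A", 19)]
scores = attractiveness_scores(visits)
```

Normalising within each period makes shares comparable across busy and quiet hours, which matches the way the table is read above.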

The trajectory similarity values for the four shopping areas are calculated for 1 January 2017. According to the analysis shown in Figure 4, Oxford Street has the highest median similarity value (0.48) and Westfield Shopping Centre has the lowest (0.41). On this evidence, it is not reasonable to infer that people who come from the same place (at ward scale) share highly similar trajectories, and further analysis is needed to explore this hypothesis.

Figure 4. Number of individuals from the same ward with a similarity threshold of 0.5
similarity-chart.jpeg
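The summary statistics above can be reproduced from a list of pairwise similarity values. The sketch below assumes each pair carries the home ward of both individuals and a similarity in [0, 1]; it does not show how the similarity itself is computed. The ward labels and values are invented for illustration.

```python
from statistics import median


def summarise_similarity(pairs, threshold=0.5):
    """pairs: list of (ward_a, ward_b, similarity in [0, 1]).

    Returns the overall median similarity and the number of same-ward
    pairs whose similarity meets the threshold.
    """
    values = [s for _, _, s in pairs]
    same_ward_hits = sum(1 for a, b, s in pairs if a == b and s >= threshold)
    return median(values), same_ward_hits


# Made-up pairs; "W1" and "W2" are placeholder ward labels.
pairs = [("W1", "W1", 0.62), ("W1", "W2", 0.35), ("W2", "W2", 0.48),
         ("W1", "W2", 0.51), ("W2", "W2", 0.55)]
med, hits = summarise_similarity(pairs)
```

Comparing the same-ward count against the count for different-ward pairs at the same threshold would be one simple way to start testing the hypothesis.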

Research Value

Traditional urban planning methods use rule-based logic that may not adapt to or benefit from modern technology, which captures data at a finer resolution in space and time. With the rapid development of GPS technology, it is now easier to obtain a user’s current location information, such as latitude, longitude, time, speed, and direction. Hence, this study offers a distinctive perspective on retail insights using GPS data.