- Wed 26 February 2020
- data
In this text, I attempt to analyse the parking situation in Marburg, the city I currently live in. Traffic is a major aspect of moving through cities these days, as one is surrounded by cars at all times. When this traffic is not flowing, it is typically parked. Hence my motivation to study the parking situation in Marburg: what are the cars in Marburg doing when they are not driving?
This article is mostly about quantitative analyses based on (sort of) publicly available data. The city of Marburg maintains a system that summarises the currently free parking spots on a website, available at https://pls.marburg.de/.
If you are curious about the analyses and the corresponding figures, please read on: first, I give an overview of the data acquisition process, so that you understand what data I am actually using and how I obtained it. Second, I continue with the actual analyses, which cover descriptive analytics as well as predictions.
Data acquisition
I acquired the parking data in Marburg with a set of programming tools. The official “Parkleitsystem Marburg”, roughly translated as “parking guidance system Marburg”, serves as the data source. It is available online at https://pls.marburg.de/ and is shown below in Fig. 1 for reference.
The city website is updated every five minutes and lists parking decks spread across Marburg. These parking decks serve popular shopping locations and other places of interest, such as:
- Ahrens
- City parking deck
- Erlenring-Center
- Furthstraße
- Furthstraße - Parkdeck
- Hauptbahnhof
- Lahncenter
- Marktdreieck
- Marktdreieck - Parkdeck
- Oberstadt
For each of the parking decks, several data fields are given:
- Column “PARKHAUS”: The name of the parking deck.
- Column “FREI”: The number of free parking spots.
- Column “ROUTE”: A Google Maps link to the parking deck location.
- Column “max. Einfahrtshöhe”: The permitted vehicle height.
This website and its content serve as the data source for this study. The software for data acquisition is written in Python and uses Scrapy to download the data from the website. Scrapy also stores the data in a machine-readable format to facilitate the post-processing and analysis steps. The whole software stack runs as a Docker container on a small Linux server. The data is saved every three minutes in order to sample the website's five-minute update interval densely enough not to miss any updates. Note that this time scale is chosen to yield sufficiently dense data points without overloading the website servers; the latter aspect in particular should be given high priority when working on projects such as this one. Since only the table on the city website is processed and stored, the storage requirements on the Linux server are minimal. Figure 2 summarises the data acquisition pipeline.
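For illustration, here is a minimal sketch of what such a Scrapy spider could look like. The CSS selectors, field names and table structure are assumptions for illustration; the real page layout may differ.

```python
import scrapy


class ParkingSpider(scrapy.Spider):
    """Minimal sketch of the scraping step; selectors are hypothetical."""

    name = "pls_marburg"
    start_urls = ["https://pls.marburg.de/"]

    def parse(self, response):
        # One item per table row; the real page structure may differ.
        for row in response.css("table tr"):
            cells = [c.strip() for c in row.css("td::text").getall()]
            if len(cells) >= 2:
                yield {
                    "parkhaus": cells[0],  # column "PARKHAUS"
                    "frei": cells[1],      # column "FREI"
                }
```

Such a spider could be run with `scrapy runspider parking_spider.py -o snapshot.jl`, writing one JSON line per parking deck as the machine-readable output mentioned above.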
The analysis pipeline is also written in Python and builds on the scientific Python stack. In particular, the Pandas package has been used extensively to arrange and evaluate the data. The “data for analysis” shown in Fig. 2 is exactly one such Pandas DataFrame. The data structure for the analysis is a table with timestamps along the rows and parking decks along the columns, as shown in Fig. 3. Over the course of the data acquisition period, I accumulated around 100,000 snapshots of the website.
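As a sketch of this arrangement, long-format records as produced by the scraper can be pivoted into exactly such a table; the records shown here are hypothetical:

```python
import pandas as pd

# Hypothetical long-format records: one row per (timestamp, parking deck) pair.
records = pd.DataFrame({
    "timestamp": pd.to_datetime(["2020-01-01 12:00"] * 2 + ["2020-01-01 12:03"] * 2),
    "parkhaus": ["ahrens", "oberstadt"] * 2,
    "frei": [120, 80, 118, 82],
})

# Wide table as in Fig. 3: timestamps along the rows, decks along the columns.
data = records.pivot_table(index="timestamp", columns="parkhaus", values="frei")
print(data)
```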
Analysis of parking situation in Marburg
The data analysis is divided into four parts. We start with an introduction that verifies that the data is reasonable. The second and third parts analyse the parking demand summed over all parking decks in Marburg and for each parking deck separately, respectively. While the second part only takes temporal information into account, the third part also considers spatial information. Finally, the last part uses a machine learning method for spatial interpolation that comes with the benefit of returning uncertainty measures for its predictions.
Introduction
This introduction serves to get a first grip on the acquired data. Here, we look into the parking deck locations and into how the metric that quantifies “parking demand” is computed. It turns out that the data preprocessing steps presented in this section are necessary to correct corrupted data from the website.
Each of the snapshots consists of a timestamp, the number of free spots, the corresponding colour, the maximal vehicle height and a Google Maps link. The latter two are confirmed to remain unchanged throughout the whole dataset for each parking deck, and the link is used to derive the parking deck locations. The locations linked in the table are parsed from the corresponding URLs and shown in Fig. 4 by the red markers. It turns out that the coordinates obtained from the links on the website are wrong. Google Maps seems to correct for that automatically and redirects to the correct coordinates. I searched for the correct coordinates manually and indicated them by green markers in Fig. 4.
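A sketch of how such coordinates could be parsed from the links; the exact URL format, a `q=latitude,longitude` query parameter, is an assumption here:

```python
from urllib.parse import parse_qs, urlparse


def coords_from_maps_link(url: str) -> tuple[float, float]:
    # Assumes a "q=latitude,longitude" query parameter in the link.
    query = parse_qs(urlparse(url).query)
    lat, lon = query["q"][0].split(",")
    return float(lat), float(lon)


# Hypothetical example link, not taken from the real website:
print(coords_from_maps_link("https://maps.google.com/maps?q=50.81,8.77"))
```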
These corrected coordinates are used for a quick warm-up experiment. Using OpenStreetMap, I compile a list of all restaurants, pubs and beer gardens in Marburg. The list comprises classic Marburg pubs like the “Sudhaus” and “Quod” but also incorporates the “Mensa”, taking a whopping 153 items into account. For each of the parking decks, the distances to all of the restaurants and pubs are computed and summed up. The accumulated distances are shown in Fig. 5 and guide you to the parking deck you should choose whenever you are hungry: pick “oberstadt”, “lahncenter”, “city” or “ahrens” if you are particularly hungry, as they have the smallest accumulated distance to restaurants and pubs.
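The accumulated distances can be computed with the haversine formula, the standard great-circle distance; the coordinates below are placeholders, not the real dataset:

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


# Accumulated distance from one deck to all POIs (coordinates are placeholders).
deck = (50.810, 8.774)
pois = [(50.809, 8.770), (50.807, 8.772)]
accumulated = sum(haversine_km(*deck, *poi) for poi in pois)
```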
With the parking deck positions in place, let’s focus on the metric. “Parking demand” is quantified here as the number of used parking spots. However, the website only states the number of free parking spots. The capacity of each parking deck is therefore estimated as the maximum number of free parking spots observed over the whole dataset. To obtain the number of used parking spots, the number of free parking spots is subtracted from this capacity. The capacities determined by this procedure are shown in the corresponding column of the following table:
| Parking deck | Determined capacity | Documented capacity |
|---|---|---|
| ahrens | 213 | 225 |
| city | 195 | 160 |
| erlenring-center | 402 | 409 |
| furthstraße | 118 | 204 |
| furthstraße - parkdeck | 97 | n.a. |
| hauptbahnhof | 264 | 288 |
| lahncenter | 170 | 168 |
| marktdreieck | 190 | 280 |
| marktdreieck - parkdeck | 97 | n.a. |
| oberstadt | 209 | 235 |
These empirically determined values can be compared to the officially documented ones, shown in the table column “Documented capacity”. The two sets of numbers are close to each other. Since the empirically computed parking capacities are more granular (e.g. they distinguish between “Marktdreieck” and “Marktdreieck - Parkdeck”) and seem to be more up to date (empirically, the “City” parking deck was found to have had 195 free parking spots at maximum, despite only having 160 parking spots according to the official documentation), I will be using the empirically determined capacities for the analyses that follow.
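In code, the capacity estimation described above boils down to a column-wise maximum followed by a subtraction; sketched here on a toy free-spots table with hypothetical values:

```python
import pandas as pd

# Toy free-spots table standing in for the scraped data (values hypothetical).
data = pd.DataFrame({"ahrens": [120, 5, 60], "city": [195, 30, 0]})

# Capacity per deck: the maximum number of free spots ever observed.
capacity = data.max()

# Used spots per snapshot: capacity minus the currently free spots.
used = capacity - data
```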
With the coordinates and used parking spots determined, we turn towards the main analysis, which begins with the aggregate signal.
Aggregate parking analysis
The aggregate data is obtained as shown by the red arrow in Fig. 3: the columns of all parking decks are summed up to yield the total number of used parking spots.
Given the timestamps of the data, the number of used parking spots can be plotted against time. Since the data acquisition interval of 3 min is very short on the scale of the data collection period of several months, the data is resampled to hours, days and weeks, as shown in Fig. 6.
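A sketch of the summation and resampling with Pandas, using a synthetic stand-in for the real per-deck table:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the per-deck "used spots" table derived above.
idx = pd.date_range("2019-11-01", "2020-02-01", freq="3min")
rng = np.random.default_rng(0)
used = pd.DataFrame({"ahrens": rng.integers(0, 213, idx.size),
                     "oberstadt": rng.integers(0, 209, idx.size)}, index=idx)

# Aggregate signal: sum over all decks, then resample to coarser grids.
total_used = used.sum(axis=1)
hourly = total_used.resample("1H").mean()
daily = total_used.resample("1D").mean()
weekly = total_used.resample("1W").mean()
```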
The green line shows the weekly average. It drops during the time between Christmas and New Year’s Eve. Furthermore, the regularity of the data becomes obvious. There seem to be two types of periodicity: first, a periodicity per day (barely visible in the data resampled to one hour) and second, a periodicity per week (clearly visible in the data resampled to one day). The figure is nice to look at, but it does not quantify these periodicities any further.
To quantify the periodicity, the autocorrelation of the signal in Fig. 6 is computed. The autocorrelation function measures how similar a signal is to itself when shifted by some time delay; these delays make up the x-axis of autocorrelation plots. At time delays where the autocorrelation function has maxima, the signal is similar to itself. Hence, periodicity can be quantified by exactly these maxima of the autocorrelation function. Figure 7 shows the autocorrelation function of the parking demand in Marburg.
As suggested by the aggregate signal in Fig. 6, the autocorrelation shown in Fig. 7 peaks at seven days, confirming that the signal has a period of seven days. The daily periodicity is not visible in this plot as it is based on the daily resample of the data; the small initial peak at a shift of one in Fig. 7 corresponds exactly to the sampling interval of the resampled signal and therefore does not carry any relevant information about the periodicity of the signal. All other peaks correspond to multiples of the identified 7-day period.
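The autocorrelation itself can be computed with NumPy; sketched below on a synthetic signal with a built-in 7-day period:

```python
import numpy as np


def autocorrelation(x: np.ndarray) -> np.ndarray:
    """Normalised autocorrelation of a 1D signal for all non-negative lags."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[x.size - 1:]
    return acf / acf[0]


# Synthetic daily signal with a built-in 7-day period, standing in for Fig. 6.
days = np.arange(120)
signal = 1000.0 + 300.0 * np.sin(2 * np.pi * days / 7)
acf = autocorrelation(signal)
# acf now peaks at lags 7, 14, 21, ... (in days), as in Fig. 7.
```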
Finally, to understand the full scope of the periodicity of the signal, I evaluated the day-hour histogram. This type of histogram visualises the number of used parking spots in Marburg against the hour and the day of the week, which makes up a two-dimensional matrix, as shown in Fig. 8.
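Such a day-hour matrix can be built with a Pandas pivot table; the series below is a synthetic stand-in for the aggregate signal:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the aggregate "used spots" series.
idx = pd.date_range("2020-01-01", periods=5000, freq="3min")
total_used = pd.Series(np.random.default_rng(1).integers(0, 1500, idx.size), index=idx)

# Day-hour matrix: mean used spots per (weekday, hour) cell, as in Fig. 8.
df = total_used.to_frame("used")
df["weekday"] = df.index.dayofweek  # 0 = Monday, 6 = Sunday
df["hour"] = df.index.hour
day_hour = df.pivot_table(index="weekday", columns="hour", values="used", aggfunc="mean")
```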
It becomes obvious that the peak hours of parking demand during the working week are between 11am and 4pm. Also, notably, the weekends are less busy, as people probably do not go into the city by car. On Saturdays, the peak hours of parking demand are shorter, only from around 12pm to 3pm. On Sundays, there is significantly less parking demand in Marburg compared to all other six days. Interestingly, the relaxed Sunday mood smears into Monday mornings, which are much less busy than regular mornings during the working week. Finally, it can be observed that even late at night, from around 10pm to 4am, there are still cars using the parking decks.
Now that we know the periodicity of the parking demand in Marburg, let’s focus on all of the parking decks separately.
Separate parking deck analysis
Now we turn to the more detailed analyses that take the different parking decks into account. First, Fig. 9 shows the parking demand against time for each parking deck separately.
Despite being overloaded with information, this figure points towards a few aspects. First, the individual signals seem to be periodic as well, and second, the parking decks seem to vary significantly in how many of their parking spots are used. The integral parking demand quantifies exactly that: it measures how many parking spots a parking deck has provided over the whole measurement period. Figure 10 shows the results.
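In code, the integral parking demand amounts to a per-deck sum over time; a sketch with synthetic data, including the weekday/weekend split shown in Fig. 10:

```python
import numpy as np
import pandas as pd

# Synthetic per-deck "used spots" table standing in for the real data.
idx = pd.date_range("2020-01-01", periods=5000, freq="3min")
rng = np.random.default_rng(2)
used = pd.DataFrame({"erlenring-center": rng.integers(0, 402, idx.size),
                     "lahncenter": rng.integers(0, 170, idx.size)}, index=idx)

# Integral demand per deck, split into weekday and weekend contributions.
weekend = used.index.dayofweek >= 5
integral_weekday = used[~weekend].sum()
integral_weekend = used[weekend].sum()
```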
The integral parking demand shows that the “Erlenring” parking deck is by far the most used one. The second most important parking deck, “Lahncenter”, provides around half the parking spots that “Erlenring” provides. The proportion of parking spots provided on weekends relative to the overall number of provided parking spots is roughly equal for almost all parking decks. Only “Marktdreieck - Parkdeck” differs, as it provides significantly fewer parking spots on weekends.
The integral parking demand can be enriched with temporal information by rendering a video. The following video shows the ratio of parking spots that are used as the hours pass, with the hour and date running in the title of the figure. As before, the number of used parking spots is normalised to the maximal usage across all parking decks and, additionally, across all times.
This video clearly visualises the periodicity in the signal. The parking deck bars disappear when there is no data, as certain parking decks close down during the night and on some days of the weekend. Most of the parking decks reach 100% occupancy and then drop below 5% during the night. The notable exceptions are the parking decks “Lahncenter” and “Erlenring”, which only drop to around 40% to 50% occupancy at night. That might explain the earlier observation in Fig. 8 of cars that remain parked during the night. At the beginning of the video some data is missing, as my data acquisition pipeline broke down during that time.
The dependence on time is studied further. Figure 11 shows the histograms of used parking spots for each parking deck separately. Not only that, it also shows how these histograms depend on the time of the day.
The temporal histograms shown in Fig. 11 visualise a number of results. For some of the parking decks it is clearly visible how the peak of parking demand shifts with time: parking decks like “ahrens”, “oberstadt”, “marktdreieck”, “marktdreieck - parkdeck”, “furthstraße” or “hauptbahnhof” all show little parking demand in the mornings, a maximal parking demand around noon and a decaying parking demand in the evening. In contrast, parking decks like “city”, “erlenring-center” or “lahncenter” do not seem to depend on the time of day as much.
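A sketch of how such time-of-day histograms could be computed for a single deck; the data and the exact hour bins are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Synthetic single-deck series standing in for one column of the data.
idx = pd.date_range("2020-01-01", periods=5000, freq="3min")
deck = pd.Series(np.random.default_rng(3).integers(0, 213, idx.size), index=idx)

# One histogram of used spots per coarse time-of-day slot.
hours = deck.index.hour
slots = {"morning (6-11h)": deck[(hours >= 6) & (hours < 11)],
         "noon (11-16h)": deck[(hours >= 11) & (hours < 16)],
         "evening (16-22h)": deck[(hours >= 16) & (hours < 22)]}
histograms = {name: np.histogram(values, bins=20) for name, values in slots.items()}
```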
After these analyses about the acquired data, we finally turn towards predictions.
Predictive analysis
As the last part of this article, I attempt a predictive analysis of the parking data in Marburg. The parking decks indicate parking demand at their locations. With the following spatial prediction, I would like to estimate parking demand where no parking decks are available.
Since I am also interested in the uncertainties of the predictions, I use Gaussian Process Regression (GPR) for the spatial predictions. GPR builds on Gaussian Processes (GPs), which generalise multivariate Gaussian distributions to function space. Being a Bayesian method, GPR returns not only a prediction but also the uncertainty of that prediction.
Although GPR is a non-parametric statistical method, it still needs hyperparameters. While I do not want to give an introduction to GPR in this article, the analysis details are noted here for the sake of transparency. Please consult the links provided above, and in particular the excellent book by Rasmussen and Williams, to understand the following technical details; a code sketch follows the list:
- Contrary to the typically chosen zero-mean GP prior, I use the mean of the data as GP prior mean.
- A radial basis function kernel is used as correlation function of the GP prior.
- The kernel hyperparameters, magnitude and length scale, are set heuristically based on visual fit quality and the characteristic length scale of Marburg.
- Very nearby parking decks (“marktdreieck” and “marktdreieck - parkdeck” as well as “furthstraße” and “furthstraße - parkdeck”) are merged in order to avoid discontinuities.
- I use a noise-free GP prior as the integer measurements from the website do not come with noise.
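To make these choices concrete, here is a minimal GPR sketch using scikit-learn; the article does not name its GP library, and the coordinates, usage values and kernel hyperparameters below are placeholders:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Placeholder training data: (lon, lat) of the merged decks and their mean usage.
X = np.array([[8.774, 50.810], [8.766, 50.807], [8.771, 50.802]])
y = np.array([0.55, 0.40, 0.35])

# RBF kernel; magnitude and length scale fixed heuristically (optimizer=None).
kernel = ConstantKernel(0.1) * RBF(length_scale=0.01)

# Subtracting the data mean emulates a GP prior mean equal to the data mean;
# the tiny alpha corresponds to the (almost) noise-free prior.
gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-10, optimizer=None)
gpr.fit(X, y - y.mean())

# Predict usage and uncertainty at an arbitrary point, adding the mean back.
X_query = np.array([[8.770, 50.806]])
mu, sigma = gpr.predict(X_query, return_std=True)
mu += y.mean()
```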
With these technical details in place, it is time to look at the GPR results. The following video shows the parking deck usage averaged over the whole measurement period. It is a 3D plot in which the plane spans the geographical coordinates of Marburg, the height of the upper subplot shows the predicted parking deck usage and the height of the lower subplot shows the prediction uncertainty of the parking deck usage.
The usage values of the parking decks are indicated by black dots. The maximum of the prediction, indicated by height as well as by the yellow colour, is located close to the “erlenring” parking deck. As expected from the GPR algorithm, the prediction uncertainties decrease close to the measurement positions. In order to fully see the spatial predictions, the 3D plots are rotated. At the end of the video, the viewport is rotated to a top view.
The second video uses exactly this top-down view to visualise the spatial predictions. However, it also introduces a temporal component by showing the spatial predictions as days pass; the data shown is the daily average. On the left of the following video, the colour-encoded top-down view introduced in the previous video is shown. The black dots correspond to the locations of the parking decks that served as training data. The red dots, however, correspond to points of interest (POIs) between the parking decks. Hence, the GPR is used to compute the predicted parking demand at these POIs. The right subplot shows the corresponding values of these POIs together with their names: I evaluated the parking demand as fitted by the GPR at the “mensa”, “physik”, “UB” (university library), “bahnhof” and “kino” in Marburg. These predictions do come with uncertainties, but they are very small. If you look very closely, you might spot the error bars in the right subplot.
This video shows that the location “mensa” is the place with the highest parking demand. “Physik”, “UB” and “bahnhof” have roughly the same parking demand and “kino” fluctuates the most across the days that are shown.
Conclusion
In this article I showed how publicly available parking data in Marburg is saved, analysed and finally used for spatial predictions. The statistical analysis comprises several aspects that are interesting both for people looking for free parking spots and for the city of Marburg. For the city, the usage statistics of the parking decks could be utilised to ultimately improve the parking situation in Marburg, e.g. by incorporating load balancing or by re-thinking the necessity of more or fewer parking decks. The spatial prediction yields statements about where particularly many or particularly few parking spots are required in Marburg.
You might ask why I compiled this study. First, I like to do such things. Second, I believe that open data can improve the quality of life for all of us, as existing systems can be analysed and optimised. This is daily business in the corporate sector but has not fully reached the public sector yet. This article is meant to foster the spirit of open data in Marburg, in the hope that open data improves the quality of life in Marburg in the future.
The more open data is available, the more open data enthusiasts like me can compile similar analyses for the city of Marburg. Dear Marburg, start providing us with exciting open data! :-)
If you have questions about this article or would like to know more about the analyses, please feel free to contact me via email.