Cycling in Marburg (1/4): Project introduction and data source

Cycling has a number of benefits for health and the environment. People who cycle get fitter, spend less money on fuel for their cars and thereby reduce CO2 emissions. Because of these benefits, cycling is becoming more and more popular, be it through e-bikes that drastically extend the usable range, heavy-duty bikes for carrying large items, or cities installing more and more bike lanes. The list of news items from the city of Marburg that deal with bikes is so long that I cannot even remotely replicate it here. Each word in the previous sentence links to one official news article about bikes in Marburg, demonstrating the large interest in bikes as a means of transportation.

This blog post is the first in a series of articles covering the usage of rentable bikes in an urban context. I use the city of Marburg as a demonstration case and tell a story about bike usage through quantitative analyses. Ultimately, this series is motivated by the current trend of using bikes to travel around a city and by my personal interest in doing so safely. Hopefully, it motivates the city council to improve the status quo of traveling by bike in Marburg even further.

All analyses in this series are based on data from the company Nextbike.

After this series, you will know how people ride bikes in the city of Marburg. Here are a few of the questions the articles will answer:

  • Which stations are the most occupied, and how are the stations distributed across Marburg?
  • How do the bikes move between the stations? What is the typical distribution of bikes in Marburg?
  • What is the social and ecological impact of cycling in Marburg?

This first blog article introduces the underlying data, i.e. where it comes from, how I obtained it and what kind of data is available. The data is obtained as follows:

I query the Nextbike API at regular intervals using a scraper written in Python that runs inside a Docker container. The responses are parsed and saved to disk for subsequent analyses.
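To make the setup more concrete, here is a minimal sketch of what such a scraper could look like. It is not the scraper used for this series: the endpoint URL is a placeholder, the payload shape, the file name `snapshots.csv` and the five-minute interval are assumptions chosen for illustration, and all function names are hypothetical.

```python
import csv
import json
import time
from datetime import datetime, timezone
from pathlib import Path
from urllib.request import urlopen

# Placeholder endpoint (assumption): replace with the real API URL.
API_URL = "https://example.invalid/nextbike-live.json"


def fetch_payload(url=API_URL):
    """Download and decode one JSON snapshot."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def parse_snapshot(payload, timestamp):
    """Flatten one payload into (timestamp, place, is_station, bike) rows.

    Assumed payload shape (hypothetical, not taken from the real API):
    {"places": [{"name": str, "is_station": bool, "bike_numbers": [...]}]}
    """
    rows = []
    for place in payload.get("places", []):
        for bike in place.get("bike_numbers", []):
            rows.append(
                (timestamp, place["name"], place.get("is_station", True), str(bike))
            )
    return rows


def append_rows(path, rows):
    """Append parsed rows to a CSV file, writing a header on first use."""
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["timestamp", "place", "is_station", "bike"])
        writer.writerows(rows)


def run_forever(path="snapshots.csv", interval_seconds=300, fetch=fetch_payload):
    """Poll at a fixed interval; this loop is what would run in the container."""
    while True:
        now = datetime.now(timezone.utc).isoformat()
        append_rows(path, parse_snapshot(fetch(), now))
        time.sleep(interval_seconds)
```

Keeping the parsing separate from the download makes it possible to test the parser on a saved response without touching the network.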

The collected data contains several pieces of information that I will use in the next articles of this series. For each point in time, the bikes at each Nextbike station, including their specific bike numbers, are saved. Bikes parked outside the regular Nextbike stations are saved as well. I later use this data to compute bike trajectories as well as the occupation level of each station. Over the course of a few months, I collected well over 600,000 data points.
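The following sketch illustrates how occupation levels and trajectories could in principle be derived from such snapshots. It assumes each snapshot is a dictionary mapping a place name to the set of bike numbers parked there. This representation and the function names are hypothetical, and the later articles may well use a different method.

```python
def occupation_levels(snapshot):
    """Return the number of parked bikes per place for one snapshot."""
    return {place: len(bikes) for place, bikes in snapshot.items()}


def extract_trips(snapshots):
    """Derive trips from a time-ordered list of (timestamp, snapshot) pairs.

    A trip is recorded when a bike reappears at a different place than the
    one where it was last seen. Each trip is a tuple of
    (bike, origin, destination, last seen at origin, first seen at destination).
    """
    last_seen = {}  # bike -> (timestamp, place) of the most recent sighting
    trips = []
    for timestamp, snapshot in snapshots:
        for place, bikes in snapshot.items():
            for bike in bikes:
                previous = last_seen.get(bike)
                if previous is not None and previous[1] != place:
                    trips.append((bike, previous[1], place, previous[0], timestamp))
                last_seen[bike] = (timestamp, place)
    return trips
```

Note that this simple approach cannot see round trips: a bike that leaves a station and returns to the same one never changes its place and therefore produces no trip.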

To give a first impression of the data, let me present two small analyses up front.

First, I show the number of parked bikes across the whole Nextbike system in Marburg over time.

One can see that the number of parked bikes is large at the beginning of the year and decreases towards the summer. The smallest number of parked bikes is reached in summer, at the end of August. The reasons for this behaviour can be manifold and ultimately cannot be determined from statistics alone. Let me guess at some explanations: The beginning of the Corona crisis in March 2020 may have increased the number of parked bikes at the beginning of the year. Then, as Corona became less of an issue in the summer, and also because of the warm temperatures, the number of parked bikes decreased. Lastly, with the cold temperatures and a second Corona wave, the number of parked bikes increased again at the end of 2020.
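A curve like the one described above can be smoothed by averaging the total number of parked bikes per calendar day. The sketch below assumes the data has already been reduced to pairs of an ISO timestamp and a total count. That input format and the function name are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_mean_parked(counts):
    """Average the total number of parked bikes per calendar day.

    `counts` is an iterable of (ISO timestamp string, total parked bikes).
    """
    per_day = defaultdict(list)
    for timestamp, parked in counts:
        per_day[datetime.fromisoformat(timestamp).date()].append(parked)
    return {day: mean(values) for day, values in sorted(per_day.items())}
```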

Now that we know the yearly pattern, let’s focus on how the number of parked bikes depends on the day of the week and the hour of the day. The following figure shows the deviation from the average number of parked bikes.

Many bikes are parked during the early hours of the day and fewer during the later hours. Also, fewer bikes are parked on weekends in general. The fewest bikes are parked on Saturday afternoon. All of these observations seem plausible at first glance.
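The weekday and hour comparison can be sketched as follows. The code groups the total counts by weekday and hour and subtracts the overall mean from each group's mean. Taking the mean over all samples as the reference is an assumption, as are the input format and the function name.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def weekday_hour_deviation(counts):
    """Mean parked bikes per (weekday, hour) minus the overall mean.

    `counts` is an iterable of (ISO timestamp string, total parked bikes).
    Weekdays follow datetime.weekday(): Monday is 0, Sunday is 6.
    """
    counts = list(counts)
    overall = mean(parked for _, parked in counts)
    buckets = defaultdict(list)
    for timestamp, parked in counts:
        moment = datetime.fromisoformat(timestamp)
        buckets[(moment.weekday(), moment.hour)].append(parked)
    return {slot: mean(values) - overall for slot, values in buckets.items()}
```

A positive value means that more bikes than average are parked in that slot, which in turn means fewer bikes are being ridden.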

These two first analyses show what kind of conclusions can be drawn from the data at hand. There is much, much more in the data, though! Read on to the next articles in the series if you are interested in more specific questions that are relevant for users of Nextbike bikes, the city council or people interested in machine-learning-based predictions.

This article belongs to a series of articles. This is the list of all articles:
