The data camp is a joint event organized by the Central Bureau for Statistics of the Netherlands (CBS) and the University of Twente (UT).
During the camp, a group of CBS data analysts and UT researchers will answer research questions about statistics using big data technologies.
On Monday, the participants will be given overview presentations on the research questions and technologies.
The data camp participants will work in small, mixed teams in an informal setting. Experienced data scientists will assist the teams with mini-workshops and hands-on support. The hope is that intense engagement with the research questions in an informal and spontaneous environment will produce valuable and innovative answers.
Example Research Questions
Participants can work on one of the following pre-defined research questions or propose their own.
Day trips of Dutch people: What information does social media provide on day trips of people living in the Netherlands?
Tourist movements: What places in the Netherlands do tourists visit?
Selectivity of people producing tweets with GPS coordinates: What are the characteristics of Dutch people active on Twitter who create tweets with GPS coordinates?
Selectivity of people participating in online campaigns: What are the characteristics of Dutch people active on Twitter who participate in online campaigns?
Economic activity of companies: Can the economic activity of companies be classified using data available on their websites?
Economic growth: Do changes in traffic intensity indicate changes in economic growth?
Map tick bite risks: Can we find a way to “normalize” the amount of tick bites according to the “population at risk”?
Map phenological development: Can we find species that behave in a similar way, or identify similar periodic trends in species development?
Freelancers/Independent contractors (‘ZZP’): Can social media be used to identify and determine the number of freelancers/independent contractors (ZZPers) in the Netherlands?
Location data of ships: What information does AIS data provide on travel and harbour visits of ships?
Social media usage in the Netherlands: How active are people really on the various social media platforms available in the Netherlands?
The agenda for the data camp can be downloaded as a PDF here.
Monday 14:15-15:30 (DesignLab) Erik Tjong Kim Sang; Twiqs.nl: Searching in billions of Dutch tweets
Twiqs.nl is a website which can be used for searching in billions of
Dutch tweets. We present the methods used for collecting, processing and
storing the tweets, and explain how we visualize search results.
Additionally, we show examples of research that might be done by using
this large collection of tweets.
Erik Tjong Kim Sang is a researcher at the Meertens Institute in
Amsterdam. The institute studies and documents Dutch language and culture.
Erik holds a Master's degree from Delft University of Technology and a PhD
from the University of Groningen. Previously he was employed by the
universities of Uppsala, Antwerp, Tilburg and Amsterdam (UvA), and by the
Dutch eScience Center. Erik organized several editions of the popular
CoNLL shared task, including the tasks on chunking and named entity
recognition. He started collecting Dutch tweets five years ago and
currently manages the largest collection of Dutch tweets in the
Thursday 10:45-12:30 (DesignLab) David González; Visualizing (Geospatial) Data: Experiences at Vizzuality and CartoDB
This presentation will be an illustrated tour of the various projects, the work, and the experiences at Vizzuality and CartoDB.
Many of Vizzuality’s projects use a combination of maps and graphs to display the data. CartoDB is geospatial software, created by Vizzuality, that runs in the cloud. CartoDB allows non-expert users to easily create maps and applications that include geospatial visualization and analysis. One recent example of Vizzuality’s projects is Global Forest Watch, an online forest monitoring system that unites satellite technology, open data and crowdsourcing, and that provides near-real-time alerts about suspected locations of recent tree cover loss.
David González is Head of Technology and Co-founder of Vizzuality. He graduated as an Agricultural Engineer from the Universidad Politécnica de Madrid, where he participated in several research projects on computer modeling of plant growth, sustainability and precision agriculture.
David also holds a Master's degree in Digital Communication, Culture and Citizenship from the Universidad Rey Juan Carlos, in cooperation with MediaLab Prado Madrid. David is involved in several research groups, think tanks and initiatives focused on neo-cartographies, urban spaces, commons and citizen peer-to-peer strategies.
If you are tweeting about the Data Camp, please use the hashtag #datacamp2015.
Data Camp 2015 is organized by Robin Aly (UT), Yuri Engelhardt (UT), Djoerd Hiemstra (UT), Raul Zurita Milla (UT), Piet Daas (CBS), and Barteld Braaksma (CBS).