HisGIS/CLARIAH for ATM – building a geo infrastructure for time travel

The CLARIAH board awarded the Amsterdam Time Machine (ATM) project a grant of €251,000 to help realize the geographical infrastructure. Thanks to this grant, HisGIS, the geo-infrastructure of the Fryske Akademy, will become available to ATM and to the general CLARIAH infrastructure. Part of the grant will be used for three use cases, one in linguistics, one in social and economic history and one in media studies, which will demonstrate the possibilities of the CLARIAH infrastructure.

The Fryske Akademy (FA) provides a solid GIS infrastructure for the Amsterdam Time Machine and its use cases. This infrastructure had to be based on precisely identified and localized historical addresses, since these function as the key to a large amount of civil and fiscal data: until well into the twentieth century, citizens were registered by address. A provisional dataset of historical addresses was available at the Amsterdam City Archive, but it had proven to be inaccurate and incomplete. The FA therefore built a new spatial infrastructure, linked to the already vectorized Napoleonic cadastre from the years 1811-1832. It was completed in May 2019. As the essential geolocational underpinning of the ATM project, it can be used for future projects and serve all kinds of purposes, for academic as well as non-academic researchers.

For this new GIS system, Thomas Vermaut introduced the principle of geographical coordinates as anchor points for all historical data with a spatial component that cannot be tied to specific geometries of buildings or plots. These location points had already been introduced as the main element of the Time Machine for the Frisian cities, which the Fryske Akademy officially launched in May 2017 with Dokkum as its pilot project. Using location points avoids the most common pitfall of taking the geometries of parcels or buildings as the starting point for linking historical addresses: because of historical mutations such as the merging, demolition, splitting or aggregation of houses and their addresses, such links eventually become fuzzy and inaccurate.

Mark Raat created an entirely new and accurate concordance of historical house and parcel numbers, linked to the location point coordinates in a GIS infrastructure. In view of the research area and period of the different use cases, it was decided to bring at least four historical house identifiers from the nineteenth and early twentieth centuries into this system. These are:

  1. the 1832 cadastral parcel numbers;
  2. the 1853 so-called district numbers (wijknummers in Dutch, in which each house has a number within a district or a city quarter in a continuous order);
  3. the 1876 system in which the house numbers are linked to streets via an odd and even principle;
  4. the later mutations and newly added numbers of the urban extensions from the period between 1876 and 1909. The numbers of 1876 are still in use today for a large part of Amsterdam.

Per address, the four identifiers were linked by visually comparing the 1832 cadastral maps, the 1853 district maps, the so-called ‘Looman’ maps of 1876, and the 1909 Publieke Werken maps. In this way, a total of approximately 51,500 location points was established for the entire city, with the 1909 mapping as the provisional end situation.
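To make the idea of the concordance concrete, the following minimal Python sketch models a location point that carries the four identifiers and an index that translates between them. It is an illustration only, not the HisGIS data model: the class and field names, the coordinates and all identifier values (such as "A 100" or "Examplestraat 1") are invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocationPoint:
    """One anchor point with the house identifiers linked to it (hypothetical model)."""
    point_id: str
    x: float  # arbitrary example coordinates; no reference system is implied
    y: float
    parcel_1832: Optional[str] = None    # cadastral parcel number
    district_1853: Optional[str] = None  # district (wijk) number
    address_1876: Optional[str] = None   # street with odd/even house number
    address_1909: Optional[str] = None   # later mutation or newly added number


class Concordance:
    """Index location points by (identifier system, identifier value)."""

    SYSTEMS = ("parcel_1832", "district_1853", "address_1876", "address_1909")

    def __init__(self, points):
        self._index = {}
        for point in points:
            for system in self.SYSTEMS:
                value = getattr(point, system)
                if value is not None:
                    self._index.setdefault((system, value), []).append(point)

    def lookup(self, system, value):
        """Return all location points that carry this identifier."""
        return list(self._index.get((system, value), []))

    def translate(self, system, value, target):
        """Translate an identifier into the matching identifiers of another system."""
        found = {getattr(p, target) for p in self.lookup(system, value)}
        return sorted(v for v in found if v is not None)


# Invented example: one 1832 parcel that later holds two separately numbered houses.
points = [
    LocationPoint("lp-1", 10.0, 20.0, "A 100", "KK 12", "Examplestraat 1", "Examplestraat 1"),
    LocationPoint("lp-2", 10.5, 20.2, "A 100", "KK 13", "Examplestraat 3", "Examplestraat 3"),
]
concordance = Concordance(points)

houses_on_parcel = concordance.translate("parcel_1832", "A 100", "address_1876")
district_of_house = concordance.translate("address_1876", "Examplestraat 3", "district_1853")
print(houses_on_parcel)
print(district_of_house)
```

Because every identifier hangs on a point rather than on a parcel geometry, a parcel that was later split simply resolves to several points, each with its own later house number.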

After the new concordance was completed, some extra effort was put into exploratory research on the requirements and possibilities for extending the concordance with the eighteenth- and seventeenth-century ‘verponding’ numbers.

To connect the data sources in this project, we relied on a technique called ‘Linked Open Data’. In Linked Open Data, pieces of information are represented as unique ‘web addresses’ (URIs, to be technically correct). Statements are made by combining three URIs that represent a subject, a predicate and an object (in that order). When this basic three-part statement, or ‘triple’, is combined with additional triples, complex information can be represented. A second advantage is that this representation is a web-based technique, meaning that the data can be shared and linked to any other source that is available online. A third advantage of Linked Open Data is that the triple patterns can be used regardless of the type of data. This allows us to retrieve spatial, textual, image and even audio sources in a single query.
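The triple idea can be sketched in a few lines of plain Python. This is a toy illustration, not the project's setup: in practice such data lives in a triple store and is queried with SPARQL, and all URIs below (under example.org) and predicate names are invented for the example.

```python
# Hypothetical namespace; not the vocabulary used by the project.
EX = "https://example.org/atm/"

# Each statement is a (subject, predicate, object) tuple of URIs.
triples = {
    (EX + "point/1", EX + "hasAddress1876", EX + "address/examplestraat-1"),
    (EX + "point/1", EX + "hasParcel1832", EX + "parcel/A-100"),
    (EX + "photo/7", EX + "depicts", EX + "point/1"),
    (EX + "recording/3", EX + "recordedAt", EX + "point/1"),
}


def match(store, s=None, p=None, o=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# One pattern retrieves every source that points at the same location,
# whether it is an image or an audio recording.
linked_sources = [s for s, _, _ in match(triples, o=EX + "point/1")]
# A second pattern lists everything stated about the location itself.
point_statements = match(triples, s=EX + "point/1")
print(linked_sources)
print(len(point_statements))
```

The same `match` pattern works for every kind of source, which is the point made above: the query does not need to know whether a subject is a map, a text, an image or a recording.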

From the CLARIAH tools, we specifically used CoW/Cattle to convert structured data into Linked Data, grlc to store queries, and Druid to store, visualise and browse Linked Data.

As mentioned in the section on Infrastructure, the spatial-temporal dimension of our data sources requires a specific way of modeling the data. We implemented the Time Geography model (Keßler & Farmer, 2015), which, in short, assumes that any observation has a time and a space component along with an array of values. When modeled in this way, data from various sources can be aligned via the time and space dimensions. In plain English: observations within a specified period or area can be retrieved, regardless of the dataset they originate from.
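The retrieval principle can be illustrated with a small Python sketch. It is a simplified stand-in for the Linked Data implementation, assuming that each observation has a single date and a single point location; the `Observation` class, the dataset names, the coordinates and the values are all invented for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Observation:
    """An observation with a time component, a space component and values."""
    dataset: str   # source the observation comes from (hypothetical names)
    when: date     # time component
    x: float       # space component, arbitrary example coordinates
    y: float
    values: dict = field(default_factory=dict)


def select(observations, start, end, bbox=None):
    """Return observations within [start, end] and, optionally, within a bounding box.

    bbox is (min_x, min_y, max_x, max_y); None means no spatial filter.
    """
    selected = []
    for obs in observations:
        in_period = start <= obs.when <= end
        in_area = bbox is None or (
            bbox[0] <= obs.x <= bbox[2] and bbox[1] <= obs.y <= bbox[3]
        )
        if in_period and in_area:
            selected.append(obs)
    return selected


observations = [
    Observation("dataset_a", date(1853, 5, 1), 10.0, 20.0, {"occupation": "baker"}),
    Observation("dataset_b", date(1860, 1, 15), 11.0, 21.0, {"rent": 120}),
    Observation("dataset_b", date(1900, 3, 3), 10.5, 20.5, {"rent": 300}),
    Observation("dataset_c", date(1855, 7, 7), 50.0, 60.0, {"word": "example"}),
]

# Everything observed between 1850 and 1870, from any dataset.
in_period = select(observations, date(1850, 1, 1), date(1870, 12, 31))
# The same period, restricted to one area.
in_period_and_area = select(
    observations, date(1850, 1, 1), date(1870, 12, 31), bbox=(0.0, 0.0, 30.0, 30.0)
)
print([o.dataset for o in in_period])
print([o.dataset for o in in_period_and_area])
```

The selection never inspects the `dataset` field, so observations from different sources are aligned purely by when and where they were made.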

All this work will be made available shortly!

Author: Ivo Zandhuis; Julia Noordegraaf; Kristel Doreleijers; Marieke van Erp; Mark Raat; Nicoline van der Sijs; Richard Zijdeman; Thomas Vermaut; Claartje Rasterhoff; Vincent Baptist
Affiliation: Fryske Akademy; KNAW Humanities Cluster; University of Amsterdam
Datasources: HisGIS Amsterdam