In the net! A net being woven, 1942. Photo: E. Nurmi, The Military Museum. CC BY 4.0.

The Institute’s aim is to open its data resources by digitizing its linguistic corpora and making them available online to a broad audience.

Licensing

The Institute’s current and future electronic language corpora that are public and free of legislative or contractual restrictions on their use will be opened up as public data resources, under a Creative Commons licence and in machine-readable format. In accordance with recommendation JHS 189, which is based on the Finnish Act on the Openness of Government Activities, the primary licence is Creative Commons Attribution 4.0 (CC BY 4.0). The Institute has previously used GNU and EUPL licences. The licences currently in use also include CLARIN licences.

In addition to the open data resources, the Institute has scientific corpora that are subject to licence, usually in the interests of protecting personal information.

Metadata from the Institute’s archives are available via the following services


Electronic material

This is a list of the Institute’s corpora and other material available online free of charge. Some corpora have their own interface, while others can be accessed via common platforms.

Corpora of Old Literary Finnish

Corpora of Modern Finnish

Onomastic Corpora

Dialect Corpora

Etymology Corpora

Corpora of Other Uralic Languages
