Finnish Wikipedia 2017, Korp

Suomenkielinen Wikipedia 2017, Korp

wikipedia-fi-2017-korp

Persistent Identifier of this resource:

http://urn.fi/urn:nbn:fi:lb-2018060401

The Finnish Wikipedia 2017 Corpus will be available in the concordance tool Korp.
The corpus contains all Finnish-language articles of the online encyclopedia Wikipedia that were available on 1 January 2018.
The text parts of the articles have been extracted from [Wikipedia Dumps](https://dumps.wikimedia.org/) with [WikiExtractor](https://github.com/attardi/wikiextractor).
The corpus has been tokenized and annotated with morphosyntactic analyses produced by the [Turku Dependency Parser](http://turkunlp.github.io/Finnish-dep-parser/).

The corpus covers the body text of the Finnish-language Wikipedia articles as of the end of 2017. The texts were extracted from the language-specific full dumps provided by Wikipedia (https://dumps.wikimedia.org/). The corpus is divided into articles, paragraphs, and sentences. The sentences have been morphosyntactically parsed with the Turku Dependency Parser (http://turkunlp.github.io/Finnish-dep-parser/).