The state of Wikimedia projects in South Africa and Africa – Dec 2008

I took a look at the South African language Wikimedia projects again today, to see whether there’s been much progress since May.

Wikipedia – number of articles

Language         1/10/2007  10/11/2007  9/12/2007  1/5/2008  16/12/2008
Afrikaans             8374        8608       8731      9679      11 285
Zulu                   107         109        121       141         182
Tsonga                  10          16         37        71         150
Swati                   56          66         76       116         146
Venda                   43          80        101       112         120
Xhosa                   66          80         88       100         109
Tswana                  40          41         43        66         102
Sotho                   43          44         43        53          68
Northern Sotho*          0           0          0       230         301
Ndebele                  0           0          0         0           0

* – incubator

The Afrikaans Wikipedia continues to develop well, shooting past the 10 000 milestone, and has a strong community. The quality of the articles tends to be quite high as well, as there’s a preference in the community for quality over quantity.

Otherwise, Tsonga has also had some momentum recently. Last year it was the smallest of the official South African language Wikipedias, while now it’s up to 3rd, and closing in on Zulu.

The Northern Sotho project is still in the Incubator rather than being an official project, although it seems to be doing quite well there, and would be second behind Afrikaans if it were official. Getting out of the Incubator seems unlikely though, as the requirements have changed since the early free-for-all, and the requirement of an active community, as currently interpreted, seems to set the bar a bit high.

With the low level of activity, statistics can be easily skewed by one or two contributors. The Tsonga Wikipedia has benefited from a particularly active contributor, while the Venda Wikipedia, which was the 3rd South African language to reach the milestone of 100 articles, has seen only 19 new articles in the last year.
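For anyone who wants to check the rankings and the year-on-year changes, here is a minimal Python sketch using the figures from the table above. The names `counts`, `rank` and `new_articles` are just illustrative, and I have assumed that "official" means everything in the table except Northern Sotho (Incubator) and Ndebele (no articles).

```python
# Article counts copied from the Wikipedia table above.
# Assumption: Northern Sotho (Incubator) and Ndebele (no articles)
# are left out, so only the official Wikipedias are compared.
# Tuple layout: (count on 9/12/2007, count on 16/12/2008).
counts = {
    "Afrikaans": (8731, 11285),
    "Zulu": (121, 182),
    "Tsonga": (37, 150),
    "Swati": (76, 146),
    "Venda": (101, 120),
    "Xhosa": (88, 109),
    "Tswana": (43, 102),
    "Sotho": (43, 68),
}


def rank(snapshot_index):
    """Return the languages ordered from largest to smallest at one snapshot."""
    return sorted(counts, key=lambda lang: counts[lang][snapshot_index], reverse=True)


def new_articles(lang):
    """Return the number of articles added between the two snapshots."""
    before, after = counts[lang]
    return after - before


for lang in rank(1):
    print(f"{lang:<10} {counts[lang][1]:>6}  (+{new_articles(lang)})")
```

With so few articles in most of these projects, a difference of a few dozen is enough to reshuffle the order, which is why a single active contributor shows up so clearly.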

Wiktionary – number of entries

Language   9/12/2007  1/5/2008  16/12/2008
Afrikaans       9312    11 168      13 036
Sotho           1381      1383        1383
Tsonga           166       347         347
Zulu             102       102         124
Swati             31        41          46
Tswana             0         0          22
Xhosa             11        11      Closed

The Afrikaans Wiktionary continues to thrive, and a Tswana Wiktionary has sprung, or rather, crawled, into existence, but everything else really has ground to a halt. The Xhosa project has been closed due to lack of activity, and moved to the Incubator.

Wikipedia – African languages

Language   1/1/2007  16/12/2008
Afrikaans      6149      11 285
Swahili        2980        7807
Yoruba          517        6246
Amharic         742        3251
Lingala         292        1074

Afrikaans remains the largest African language Wikipedia (and 79th largest overall), remaining comfortably ahead of Swahili. At 2007’s growth rate, Swahili looked likely to pass Afrikaans this year, but 2008 has seen new momentum for Afrikaans. Yoruba though has seen rapid growth, and since Jan 2007, there have been more new articles in Yoruba than in any other African language. Amharic and Lingala are the only other African languages with more than 1000 articles.
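The comparison of new articles since January 2007 is simple arithmetic on the table above, and can be sketched in Python as follows. The names `african`, `gain` and `growth_factor` are illustrative only, and the growth factor is just the later count divided by the earlier one, which is my own choice of measure.

```python
# Article counts copied from the African-languages table above.
# Tuple layout: (count on 1/1/2007, count on 16/12/2008).
african = {
    "Afrikaans": (6149, 11285),
    "Swahili": (2980, 7807),
    "Yoruba": (517, 6246),
    "Amharic": (742, 3251),
    "Lingala": (292, 1074),
}


def gain(lang):
    """Return the number of articles added between the two dates."""
    start, end = african[lang]
    return end - start


def growth_factor(lang):
    """Return how many times larger the project is than at the start."""
    start, end = african[lang]
    return end / start


# Language with the most new articles, and the current Afrikaans lead.
fastest = max(african, key=gain)
lead = african["Afrikaans"][1] - african["Swahili"][1]

for lang in sorted(african, key=gain, reverse=True):
    print(f"{lang:<10} +{gain(lang):>5}  x{growth_factor(lang):.1f}")
print("Most new articles:", fastest)
print("Afrikaans lead over Swahili:", lead)
```

On these figures Yoruba gained 5729 articles against 5136 for Afrikaans and 4827 for Swahili, and Afrikaans is 3478 articles ahead of Swahili.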

It may seem like we’re a long way from achieving the dream of a world in which every single human being can freely share in the sum of all knowledge. But, perhaps, if we consider that of all the many years of human history, just 3 short years ago, an instant between the ticking of the galactic clock, most of these projects didn’t even exist, we’ll realise we’re doing quite well.

  1. Whoops, I should have read more thoroughly. I didn’t see the distinction was “South African languages”, and that there is a subsection for “African languages” that does include Amharic. But now I see it. Please disregard last message.

  2. Nice overview. Please also include localisation of MediaWiki in further analysis. My current observation is that little to no localisation is being done for the above languages (Afrikaans may be an exception, with Yoruba gaining a little speed). In Betawiki we facilitate in-wiki localisation that is updated on the Wikimedia farm very quickly.

    We also have an ‘end of year translation rally’ at the moment, so all translators are most welcome!

  3. Thank you for your comments and particularly your comments about the Northern Sotho Wikipedia. I was one of the first promoters of it but due to time constraints, I have not been very active recently. We need also to get far more substance into the articles as many of them at this stage are what in English would be called stubs. This also begs the question of language standardisation and the development of appropriate terminology in the African languages. We need also to develop bilingual corpora which could feed statistical translation engines. Unfortunately the human input to check these translations will still be considerable. Please send me an email so that we can talk further.
