October 2024 African language Wikipedia update

Image from Wikimedia Commons

African Language Wikipedias

I’m at the 2024 Wiki Indaba, and it’s been inspiring to see the progress in the African Wikimedia space. I was at the very first Wiki Indaba, back in 2014, and the event has grown dramatically, with many first-time attendees.

This has encouraged me to update my ‘annual’ African Wikipedia stats. Annual being somewhat loosely defined, as my last look at the South African languages was in 2022, and my last look at African languages in general was all the way back in 2020.

I’ve been very positively surprised by what I’ve found. Many languages are showing great signs of activity, and I must point out Dagbani as a particular inspiration. Dagbani is spoken in Ghana and Northern Togo and has about 1.2 million native speakers. Its Wikipedia launched in June 2021 and has already raced to over 10,000 articles. From speaking to one of the Ghanaian participants, it seems a big part of its success has been engaging linguistic experts, as well as students, from the beginning.

With the exception of the South African languages, the cutoff for this analysis is 1,000 articles. A number of languages spoken in Ghana are knocking on the door. These include Kusaal (994), spoken in Ghana and Burkina Faso; Farefare (also called Frafra, or Gurenne) on 913, which is closely related to Dagbani and also spoken in Ghana and Burkina Faso; and Mooré (759), spoken in Burkina Faso, Ghana, Côte d’Ivoire, Benin, Niger, Mali, Togo and Senegal.

So the success in Ghana is something to be further explored.

And amongst the South African languages, Swati has passed Tsonga in the race to reach 1,000 articles.

Here is the full list of all African languages with more than 1,000 articles, and all 11 original South African official languages. There are far more languages than before, and this has been written while still trying to be present at the conference, and published while rushing to make the bus to Soweto, so mistakes are likely – please let me know about them!

As always, the number of articles is a poor metric on its own. Some languages have been populated by bots, or editors have focused on article creation rather than detail. I’ve also listed each project’s depth metric, which is one attempt to measure article quality. Higher is better.

| Language | Oct 2024 | Nov 2020 | % change | Active editors (5 edits) | Active users (1 edit) | Depth | Admins |
|---|---|---|---|---|---|---|---|
| Egyptian Arabic | 1,625,117 | 1,170,158 | +38.8% | 43 | 196 | 0 | 7 |
| Afrikaans | 118,766 | 94,975 | +25.0% | 43 | 182 | 45 | 14 |
| Malagasy | 98,140 | 93,416 | +5.1% | 5 | 49 | 10 | 3 |
| Swahili | 83,821 | 60,143 | +39.4% | 50 | 131 | 9 | 14 |
| Hausa | 50,462 | 6,501 | +679.0% | 112 | 235 | 2 | 11 |
| Igbo | 36,382 | 1,968 | +1748.7% | 29 | 69 | 0 | 4 |
| Yoruba | 34,371 | 33,196 | +3.5% | 4 | 60 | 4 | 3 |
| Tumbuka | 18,688 | Not recorded | | 1 | 19 | 3 | 1 |
| Amharic | 15,371 | 14,902 | +3.1% | 2 | 43 | 33 | 1 |
| Zulu | 11,531 | 5,125 | +125.0% | 5 | 45 | 6 | 1 |
| Shona | 11,443 | 6,390 | +79.1% | 2 | 31 | 3 | 1 |
| Moroccan Arabic | 10,296 | Not recorded | | 12 | 40 | 190 | 5 |
| Dagbani | 10,090 | Not recorded | | 22 | 48 | 5 | 2 |
| Northern Sotho | 8,701 | 8,217 | +5.9% | 1 | 16 | 0 | 1 |
| Kinyarwanda | 7,804 | 1,954 | +299.4% | 13 | 55 | 8 | 2 |
| Somali | 8,090 | 5,888 | +37.4% | 13 | 69 | 48 | 1 |
| Kabyle | 6,856 | 5,065 | +35.4% | 2 | 27 | 13 | 1 |
| Fula | 4,449 | Not recorded | | 14 | 37 | 38 | 2 |
| Twi | 4,146 | Not recorded | | 10 | 35 | 13 | 2 |
| Lingala | 4,070 | 3,178 | +28.1% | 2 | 19 | 31 | 4 |
| Luganda | 3,324 | 1,214 | +173.8% | 14 | 33 | 7 | 2 |
| Standard Moroccan Amazigh | 2,777 | Not recorded | | 13 | 28 | 23 | 3 |
| Xhosa | 2,107 | 1,062 | +98.4% | 6 | 23 | 15 | 1 |
| Fon | 2,058 | Not recorded | | 5 | 7 | 3 | 2 |
| Ghanaian Pidgin English | 2,020 | Not recorded | | 5 | 13 | 49 | 2 |
| Shilha | 1,900 | Not recorded | | 2 | 18 | 33 | 3 |
| Tswana | 1,888 | 712 | +165.2% | 8 | 22 | 27 | 1 |
| Oromo | 1,875 | Not recorded | | 2 | 22 | 26 | 1 |
| Kikuyu | 1,787 | 1,366 | +30.8% | 1 | 17 | 5 | 1 |
| Dagaare | 1,714 | Not recorded | | 10 | 12 | 5 | 1 |
| Wolof | 1,703 | 1,628 | +4.6% | 2 | 20 | 95 | 2 |
| N’Ko | 1,521 | Not recorded | | 2 | 14 | 4 | 2 |
| Gun | 1,346 | Not recorded | | 2 | 9 | 10 | 1 |
| Kongo | 1,366 | 1,216 | +12.3% | 2 | 20 | 95 | 2 |
| Nigerian Pidgin | 1,237 | Not recorded | | 3 | 20 | 8 | 4 |
| Tyap | 1,226 | Not recorded | | 8 | 18 | 76 | 2 |
| Chewa (Nyanja) | 1,035 | Not recorded | | 2 | 14 | 114 | 1 |
| Sotho | 1,018 | 794 | +28.2% | 2 | 20 | 95 | 2 |
| Swati | 988 | 520 | +90.0% | 3 | 15 | 32 | 1 |
| Tsonga | 908 | 699 | +29.9% | 6 | 13 | 58 | 2 |
| Venda | 805 | 370 | +117.6% | 12 | 15 | 32 | 1 |
| Ndebele (incubator) | 139 | 11 | +1163.6% | | | | |
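
For anyone wanting to check or extend the % change column, here is a minimal sketch in Python. It assumes the column is plain relative growth in article count between the Nov 2020 and Oct 2024 snapshots, rounded to one decimal place; that is a reading of the table rather than something stated in the post, and `percent_change` is a name made up for illustration.

```python
def percent_change(old: int, new: int) -> float:
    """Relative growth from old to new, as a percentage rounded to one decimal.

    Assumption: this mirrors how the '% change' column was derived.
    """
    if old <= 0:
        # Languages marked 'Not recorded' have no baseline to compare against.
        raise ValueError("old count must be a positive number")
    return round((new - old) / old * 100, 1)


# (Nov 2020, Oct 2024) article counts copied from the table above.
counts = {
    "Swati": (520, 988),
    "Zulu": (5125, 11531),
    "Venda": (370, 805),
}

for language, (old, new) in counts.items():
    print(f"{language}: {percent_change(old, new):+.1f}%")
```

Languages listed as “Not recorded” for Nov 2020 have no baseline, which is why their % change cell is empty; the sketch raises an error in that case rather than guessing.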

Egyptian Arabic has by far the most articles, but also a depth metric of zero, which suggests average quality is very low. Afrikaans is in good shape, with a steady increase, a relatively high number of active editors, and good depth.

Hausa and Igbo have shown rapid increases in article count, but at the expense of depth.

Overall, some languages are showing rapid growth, while a few have stagnated, but the picture is far more positive than the last time I looked. One of the figures I’m most happy about is the increase in the number of Ndebele articles. Still in the Incubator, and stuck for years on 11 articles, it’s now up to 139. I’m not sure how close it is to meeting the criteria for leaving the Incubator (UPDATE: I’ve heard it’s very close!), but signs of life are welcome, and hopefully, with some inspiration from Thando Mahlangu, the Ndebele activist who gave the keynote on Saturday morning, others will feel moved to contribute.

2 comments

  1. Thank you for the information it will keep us on our toes to do more. The conference was amazing for me, my first time ever. I hope to attend more🤔. It is not easy to be involved in Wikimedia projects however, hope is not lost.
