Automatic Speech Recognition for Maltese - Part 2
September 26, 2024
This is part 2 of my effort to create an open-weights Automatic Speech Recognition model for my native language, Maltese.
Before the how, let’s start with why: Put simply, the Maltese language is dying.
With a fertility rate of 1.1 and an ageing population, every conceivable trend shows that the native Maltese population is slowly but surely heading towards 0.
As a result, there will come a day when the last Maltese speaker dies. On that day, the language will die with them. Anything that isn’t written down will become akin to an alien language.
In recent years the population has actually grown due to (mainly economic) immigration. About 25% of the population is now foreign-born. With this increase, English (also an official language of Malta) is becoming the de facto standard for everyday communication.
These two factors provide a consistent downward pressure on the use of the Maltese language.
So for me this is not just a personal project, but also a national one. We might all pass away, but if a computer can mathematically understand Maltese, then the language will live on for generations to come.
People have asked me whether I’m doing this as part of my Master’s degree in AI. Apart from exploring the area in an assignment, the answer is actually no. There already are researchers much better than me working on this problem, who have had better results than I have.
Academia’s aim tends to be to produce knowledge. This is super commendable and should continue! But unfortunately the fact that the MASRI datasets are not available for commercial use limits the impact of this research. Commercial entities should definitely contribute back to academia to make it sustainable, but when they cannot use the results of this research, interest is very quickly lost. In the end, progress stalls and everyone loses.
If we could get a decent ASR model into everyone’s hands, we’d experience a Cambrian explosion of applications and services centered around Maltese, providing upward pressure against the decline of its use:
- Transcribe old Rediffusion shows so they can be enjoyed in written form
- Accelerate research by speeding up transcription of interviews for theses
- Improve the quality of Maltese subtitles for YouTube videos and other media
- Provide real time subtitles for live events and podcasts
- Build tools around learning Maltese and its proper pronunciation
- and much, much more!
I’m not limited by compute, but by data. There are a few readily available transcribed audio corpora, but copyright issues prevent me from using them. An ideal way forward would be to create a new dataset from scratch, one where every Maltese speaker is encouraged to contribute their voice, so that our machines may actually understand them in Maltese.