ASTT is AI-based software that converts spoken Albanian into written text.
The software converts speech from any audio-visual format, or live speech, into text in real time with high accuracy (over 90%).
The idea behind ASTT was to bring change to the Albanian community by improving accessibility for people with disabilities and by shortening speech transcription time in different institutions.
Why use ASTT?
ASTT offers numerous benefits to individuals as well as to different industries and institutions. These areas include:
Media - subtitling of TV shows to provide accessibility for all viewers, and transcription of interviews for journalists.
Institutions (parliament, ministries, municipal assemblies) - transcription of meetings and sessions, and intelligent data (insights) on participation and discussion.
Law enforcement institutions - transcription of conversations, with text search across audio-visual content.
Other - innovative solutions for language learning, reading, and the like.
Other functionalities of ASTT include:
- Support for various content formats.
- Text structuring (punctuation marks, standard language).
- Automatic speaker recognition.
- Flexibility in implementation.
Why was ASTT developed?
Before ASTT, Albanian was among the few languages that machines did not recognize. Technology helps with many human activities and is especially valuable for people with disabilities, so ASTT aims to close this gap using the latest technologies.
Technologies we used
ASTT consists of a deep neural network (DNN) and a language model. The core engine of the DNN is a recurrent neural network built on top of TensorFlow. The DNN and the language model are trained extensively and separately, and ASTT then combines the two to achieve excellent speech recognition.
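To illustrate how a separately trained acoustic network and language model can be combined, here is a minimal sketch of one common approach: rescoring candidate transcripts with a weighted sum of both scores. This is not ASTT's actual implementation. The candidate transcripts, acoustic scores, toy bigram table, and the names `lm_logp`, `pick_best`, and `alpha` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy bigram language model (log-probabilities).
# A real system would train this on a large Albanian text corpus.
BIGRAM_LOGP = {
    ("<s>", "unë"): math.log(0.5),
    ("unë", "jam"): math.log(0.6),
    ("jam", "mirë"): math.log(0.4),
}
UNKNOWN_LOGP = math.log(1e-4)  # penalty for word pairs the model has not seen

# Hypothetical output of the acoustic network: candidate transcripts
# with their acoustic log-probabilities (higher is better).
CANDIDATES = [
    ("unë jam mirr", -4.0),  # slightly better acoustically, but not a valid phrase
    ("unë jam mirë", -4.2),  # "I am well"
]


def lm_logp(text):
    """Sum the bigram log-probabilities of a transcript."""
    words = ["<s>"] + text.split()
    return sum(
        BIGRAM_LOGP.get(pair, UNKNOWN_LOGP)
        for pair in zip(words, words[1:])
    )


def pick_best(candidates, alpha):
    """Return the transcript that maximizes acoustic + alpha * language score."""
    return max(
        candidates,
        key=lambda cand: cand[1] + alpha * lm_logp(cand[0]),
    )[0]


if __name__ == "__main__":
    print(pick_best(CANDIDATES, alpha=0.0))  # acoustic score only
    print(pick_best(CANDIDATES, alpha=0.8))  # acoustic + language model
```

With `alpha` set to zero only the acoustic score counts, so the acoustically closer but misspelled candidate wins. With a positive weight the language model penalizes the unseen word pair and the valid phrase is selected, which is the general benefit of combining the two models.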