Machine learning for te reo Māori 

Organisation

Te Hiku Media

Location

Wellington or Auckland, location negotiable.

Project description

Te Hiku Media’s Kōrero Māori project aims to develop natural language processing tools for te reo Māori. Te Hiku has a spoken corpus of over 370 hours and has built the first Māori speech-to-text engine, available at koreromaori.io. As an intern, you will have the opportunity to study this corpus and apply it to other applications, improve current machine learning (ML) models by investigating the parameter space of the frameworks Te Hiku uses, and test new ideas for building ML tools from a te reo Māori perspective (e.g. many frameworks are built for English, an approach that is not very effective for phonetic languages). Your work will help create digital te reo Māori tools such as Māori-speaking personal assistants, language-learning apps that use ML to give learners immediate feedback, and transcription tools that will enhance access to thousands of hours of native speaker archives.

Skills required

Computer programming skills