Details

  • Name

    Gabriel Jesus
  • Role

    Research Assistant
  • Since

    29th September 2021
  • Nationality

    Timor-Leste
  • Contacts

    +351222094000
    gabriel.jesus@inesctec.pt
Publications

2023

Text Information Retrieval in Tetun

Authors
de Jesus, G

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III

Abstract
Tetun is one of Timor-Leste's official languages, alongside Portuguese. It is a low-resource language with over 932,000 speakers that started developing when Timor-Leste restored its independence in 2002. Newspapers mainly use Tetun, and more than ten national online news websites publish news in Tetun every day. However, since information retrieval-based solutions for Tetun do not exist, finding Tetun information on the internet and digital platforms is challenging. This work aims to investigate and develop solutions that enable the application of information retrieval techniques to build search solutions for Tetun, using Tetun INL and focusing on the ad-hoc text retrieval task. As a result, we expect to deliver effective search solutions for Tetun and to contribute to innovation in information retrieval for low-resource languages, including by making Tetun datasets available to future researchers.