INESC TEC wants to digitise Portugal's largest documentary collection

The main goal of EPISA (Entity and Property Inference for Semantic Archives), a new INESC TEC project starting in January 2019, is to bring the largest collection of sources for the history of Portugal into the digital world.

10th December 2018

Based on an analysis of the existing records of the National Archive of Torre do Tombo (ANTT), EPISA focuses on producing new representations of the documents that link them to open data networks, in response to the growth in online access. To that end, the project will develop tools both for archivists producing new records and for citizens carrying out research.

ANTT is responsible for preserving the documents of the Portuguese State, covering the entire history of the country. In addition to the National Archives, it includes most of the District Archives. It manages a collection of about 20 million digital representations, along with analogue documentation that, laid end to end, would extend for about 100 km. ANTT's documentary heritage has been progressively digitised and also incorporates born-digital documents, with a total of 1.3 million records of available documents. All this information is systematically recorded and described according to internationally established rules that were designed for a paper-dominated context.

EPISA uses natural language processing, entity recognition, and machine learning methods to explore the records of documents and, where they exist in digital form, the documents themselves. Entities and relationships will be extracted from the descriptions produced by archivists and used to populate a description model that is semantically richer than the current one and easier to process automatically.
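As a rough illustration of what extracting entities from an archival description and populating a richer model could look like, here is a minimal Python sketch. It is not EPISA's method: it replaces trained entity recognition with a simple dictionary lookup, and the gazetteer, the predicate names, the record identifier and the sample description are all invented for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One statement in a toy semantic description model."""
    subject: str
    predicate: str
    obj: str


# Hypothetical gazetteer (surface form -> entity type). A real system
# would rely on trained entity recognition rather than a fixed lookup.
GAZETTEER = {
    "Afonso Henriques": "Person",
    "Coimbra": "Place",
    "Mosteiro de Santa Cruz": "Organisation",
}

# Four-digit years from 1000 to 2099, as a crude date detector.
YEAR = re.compile(r"\b1[0-9]{3}\b|\b20[0-9]{2}\b")


def extract_triples(record_id: str, description: str) -> list[Triple]:
    """Turn a free-text description into subject-predicate-object triples."""
    triples = []
    # Dictionary-based entity recognition: plain substring matching.
    for surface, entity_type in GAZETTEER.items():
        if surface in description:
            triples.append(Triple(record_id, "mentions" + entity_type, surface))
    # Attach every year found in the text as a "dated" statement.
    for year in YEAR.findall(description):
        triples.append(Triple(record_id, "dated", year))
    return triples


# Invented sample description, for illustration only.
sample = "Letter from Afonso Henriques to the Mosteiro de Santa Cruz in Coimbra, 1146."
for triple in extract_triples("rec-001", sample):
    print(triple)
```

Each triple links a record to a typed entity or a date, which is the kind of structure that can later be connected to open data networks. The approach is deliberately naive: it ignores spelling variants, ambiguity and relationships between entities.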

The project was submitted in May 2018, alongside 48 other proposals, to the Foundation for Science and Technology (FCT) under the 2018 Call for Projects of Scientific Research and Technological Development in Data Science and Artificial Intelligence in Public Administration, a joint initiative of the Ministry of Science, Technology and Higher Education and the Ministry of Presidency and Administrative Modernisation. Of all the proposals, fifteen were approved for funding. EPISA received a budget of approximately EUR 300,000.

In addition to INESC TEC's Centre for Information Systems and Computer Graphics (CSIG), the lead proposer, EPISA's partners are the University of Évora and the Directorate-General for Books, Archives and Libraries (DGLAB), which is responsible for the National Archive of Torre do Tombo.

Photo credits: ANTT