NLP Tools for Knowledge Extraction from Italian Archaeological Free Text

Achille Felicetti, Daniel Williams, Ilenia Galluccio, Douglas Tudhope, Franco Niccolucci

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review



This paper deals with the development of advanced tools and technologies for creating relevant information and suitable metadata out of textual documentation produced by Italian archaeological research. A set of Natural Language Processing tools was developed to recognize and annotate various archaeological entities in Italian-language textual reports. The CIDOC CRM is the ontology chosen for encoding the resulting output, allowing for a maximum degree of standardisation of the produced metadata to guarantee interoperability with archaeological information already existing in other semantically enabled digital archives. The work took place as part of the development of the TEXTCROWD platform for the European Open Science Cloud for Research Pilot Project.
Original language: English
Title of host publication: 2018 3rd Digital Heritage International Congress (DigitalHERITAGE) held jointly with 2018 24th International Conference on Virtual Systems & Multimedia (VSMM 2018)
Editors: Alonzo C. Addison, Harold Thwaites
Publisher: Institute of Electrical and Electronics Engineers
Number of pages: 8
ISBN (Electronic): 978-1-7281-0292-4, 978-1-7281-0293-1
Publication status: Published - 11 Dec 2018
Event: Digital Heritage 2018 - 3rd International Congress & Expo: New Realities: Authenticity & Automation in the Digital Age - San Francisco, United States
Duration: 26 Oct 2018 to 30 Oct 2018


Conference: Digital Heritage 2018 - 3rd International Congress & Expo
Abbreviated title: DH2018
Country/Territory: United States
City: San Francisco


Keywords:
  • NLP
  • NER
  • Italian language archaeology
  • textual documents
  • Grey Literature
  • Metadata integration
  • Standards


