Semantic Graph Generation for Indonesian Text with Machine Learning
Researcher Name (Team Leader)

Masayu Leylia Khodra



Activity Summary

Abstract Meaning Representation (AMR) is a robust semantic representation that condenses many semantic aspects of a sentence into a single graph, rather than handling each task (e.g., coreference resolution, named entity detection) one by one. The current state-of-the-art AMR parsing system, developed by Zhang et al., uses a deep learning approach to parse English sentences into their AMR form. It achieved a SMATCH score of 76.3% on the LDC2017T10 dataset, which contains 39,260 sentences. That is far more data than the current Indonesian AMR dataset provides, and research on AMR parsing for Indonesian sentences is fairly limited. This research proposes a system that parses an Indonesian sentence into its AMR form using a machine learning approach. Based on the work of Zhang et al., our system consists of three steps: pair prediction, label prediction, and graph construction. Pair prediction uses a dependency parsing component to obtain the edges between words for the AMR. The result of pair prediction is passed to label prediction, which uses a supervised learning algorithm to predict the label of each AMR edge. We conclude that an AMR parsing system for Indonesian can be built with a machine learning approach in three steps inspired by the work of Zhang et al.: pair prediction, label prediction, and postprocessing. The proposed system produces decent results on simply structured sentences but still struggles with more complex sentence structures.
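
The three steps named in the summary can be illustrated with a minimal Python sketch. This is not the project's implementation: the example sentence, the hard-coded dependency parse, the rule table standing in for the trained label classifier, and all function names are assumptions made purely for illustration. The sketch also assumes the dependency edges form a tree, so it does not handle re-entrant nodes.

```python
from collections import defaultdict

# Hypothetical dependency parse of "Saya makan nasi" ("I eat rice"),
# written as (head, dependent, dependency relation). A real system would
# obtain this from a dependency parser; it is hard-coded here.
TOY_PARSE = [("makan", "saya", "nsubj"), ("makan", "nasi", "obj")]

# Stand-in for a trained label classifier: a fixed lookup from dependency
# relation to AMR edge label. This table is an assumption, not a model.
TOY_LABEL_RULES = {"nsubj": ":ARG0", "obj": ":ARG1"}


def predict_pairs(parse):
    """Step 1: take the dependency edges as candidate AMR word pairs."""
    return [(head, dep, rel) for head, dep, rel in parse]


def predict_labels(pairs):
    """Step 2: assign an AMR label to each pair (lookup instead of a model)."""
    return [(head, dep, TOY_LABEL_RULES.get(rel, ":mod")) for head, dep, rel in pairs]


def build_graph(root, labeled_edges):
    """Step 3: assemble the labeled edges into a PENMAN-style string."""
    children = defaultdict(list)
    for head, dep, label in labeled_edges:
        children[head].append((label, dep))
    # One variable name (x0, x1, ...) per node; a tree has edges + 1 nodes.
    counter = iter(range(len(labeled_edges) + 1))

    def render(node):
        text = f"(x{next(counter)} / {node}"
        for label, child in children[node]:
            text += f" {label} {render(child)}"
        return text + ")"

    return render(root)


def parse_to_amr(root, parse):
    """Run the three steps in sequence on one parsed sentence."""
    return build_graph(root, predict_labels(predict_pairs(parse)))


print(parse_to_amr("makan", TOY_PARSE))
```

For the toy input, the sketch prints `(x0 / makan :ARG0 (x1 / saya) :ARG1 (x2 / nasi))`. In a trained system, the lookup table in step 2 would be replaced by a supervised classifier over features of each word pair.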



Achievements

Application of Written Work



Community Testimonials

Research on AMR parsing for Indonesian sentences is fairly limited.