Volume 6, Issue 1, 2016

New Automatic Search and Update Algorithms of Vietnamese Abbreviations

Nguyen Nho Tuy, Vietnam Posts and Telecommunications Group (VNPT), Danang Branch, Danang City, Vietnam.
Phan Huy Khanh, University of Science and Technology, The University of Danang, Danang City, Vietnam.

Abstract- Abbreviations are widely used in documents across many fields and in many languages, including Vietnamese. In practice, abbreviations are often duplicated and used ambiguously, while demand for them keeps increasing. This calls for a rich source of abbreviations that is conveniently stored and used, easily updated, and consistently exploited. In this article, we propose several algorithms that search for abbreviations on the Internet and automatically update a database of Vietnamese abbreviations, which can serve many purposes in language processing and database exploitation.

Keywords- abbreviation; acronym; database; abbreviation searching programs; automatic search for Vietnamese abbreviations.


Speculation and Negation Annotation for Arabic Biomedical Texts: BioArabic Corpus

Fatima T. AL-Khawaldeh
Department of Computer Science, Al-Albayt University, Al-Mafraq, Jordan.

Abstract- Negation and speculation are two common linguistic phenomena in natural language processing that require deeper semantic understanding of text. They are used to determine the factuality of text: negation expresses the opposite of a statement, and speculation indicates its degree of certainty. Biomedical text mining is the main natural language processing application concerned with negation and speculation, because it must distinguish facts from uncertain or negated information in biomedical text. To our knowledge, there is no previous research on annotating Arabic biomedical text to identify negative or speculative expressions, and there are no publicly available standard corpora of suitable size for evaluating tools that automatically detect negation and speculation and determine their scope. This paper presents a corpus, with its main annotations, that handles negation and speculation in Arabic biomedical texts (we call this corpus the BioArabic corpus). The goal of building the BioArabic corpus is to help biologists and computational linguists who develop tools for identifying negation and speculation to train and evaluate these tools, since assumptions, experimental results, and negative results are used extensively in biomedical texts. We report statistics on corpus size and the consistency of annotations.

Keywords- Arabic NLP; negation; speculation; biomedical (medical and biological); cues; certainty.

