https://arxiv.org/abs/1803.07828v1
https://github.com/AKSW/KG2Vec

Tommaso Soru¹, Stefano Ruberto², Diego Moussallem¹, Edgard Marx¹, Diego Esteves³, and Axel-Cyrille Ngonga Ngomo⁴
¹ AKSW, University of Leipzig, D-04109 Leipzig, Germany
{tsoru,moussallem,marx}@informatik.uni-leipzig.de
² Gran Sasso Science Institute, INFN, I-67100 L'Aquila, Italy
stefano.ruberto@gssi.infn.it
³ SDA, University of Bonn, D-53113 Bonn, Germany
esteves@cs.uni-bonn.de
⁴ Data Science Group, Paderborn University, D-33098 Paderborn, Germany
axel.ngonga@upb.de
Abstract. Knowledge Graph Embedding methods aim at representing entities
and relations in a knowledge base as points or vectors in a continuous vector
space. Several approaches using embeddings have shown promising results on
tasks such as link prediction, entity recommendation, question answering, and
triplet classification. However, only a few methods can compute low-dimensional
embeddings of very large knowledge bases. In this paper, we propose KG2VEC,
a novel approach to Knowledge Graph Embedding based on the skip-gram model.
Instead of using a predefined scoring function, we learn it using Long Short-Term
Memories. We evaluate the quality of our embeddings on knowledge graph
completion and show that KG2VEC is comparable in quality to the scalable
state-of-the-art approach RDF2Vec and can process large graphs, parsing more
than a hundred million triples in less than 6 hours on common hardware.
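
To make the skip-gram idea concrete, the following is a minimal sketch of how (center, context) training pairs could be generated from triples. It assumes each triple is treated as a three-token sentence, which the abstract does not spell out; the function name `skipgram_pairs`, the `window` parameter, and the example triple are hypothetical and not taken from the KG2Vec code base.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def skipgram_pairs(triples: Iterable[Triple], window: int = 2) -> List[Tuple[str, str]]:
    """Turn triples into (center, context) pairs, as a skip-gram model would see them.

    Assumption: every triple is read as a tiny sentence of three tokens
    (subject, predicate, object).
    """
    pairs = []
    for triple in triples:
        tokens = list(triple)
        for i, center in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a token is never its own context
                    pairs.append((center, tokens[j]))
    return pairs


# Hypothetical example triple, written with DBpedia-style prefixes.
example_triples = [("dbr:Leipzig", "dbo:country", "dbr:Germany")]
pairs = skipgram_pairs(example_triples)
for center, context in pairs:
    print(center, "->", context)
```

With a window of 2, every token of a triple sees the other two as context, so entities and relations end up in the same vector space. The learned LSTM scoring function mentioned in the abstract is not covered by this sketch.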

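Yankai Lin¹, Shiqi Shen¹, Zhiyuan Liu¹˒² *, Huanbo Luan¹, Maosong Sun¹˒²
¹ Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, National Lab for Information Science and Technology, Tsinghua University, Beijing, China
² Jiangsu Collaborative Innovation Center for Language Competence, Jiangsu, China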

Abstract

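Distantly supervised relation extraction has been widely used to find novel relational facts in text. However, distant supervision inevitably suffers from the wrong-label problem, and this noisy data substantially hurts the performance of relation extraction. To alleviate the issue, we propose a sentence-level attention-based model for relation extraction. In this model, we use convolutional neural networks to embed the semantics of sentences. We then build sentence-level attention over multiple instances, which is expected to dynamically reduce the weights of noisy instances. Experimental results on real-world datasets show that our model can make full use of all informative sentences and effectively reduce the influence of wrongly labelled instances. Compared with the baselines, our model achieves significant and consistent improvements on relation extraction. The source code of this paper is available at https://github.com/thunlp/NRE.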
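
The sentence-level attention described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that sentence embeddings are already available (for example from a CNN encoder) and that each sentence is scored by a plain dot product with a relation query vector. The abstract does not state the scoring form, and the names `attend`, `sentence_vecs`, and `relation_query`, as well as the numbers, are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(sentence_vecs: Sequence[Sequence[float]],
           relation_query: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Weight the sentences of one entity-pair bag and return (weights, bag vector).

    Assumption: the score of a sentence is its dot product with the relation query.
    """
    scores = [sum(x * q for x, q in zip(vec, relation_query)) for vec in sentence_vecs]
    weights = softmax(scores)
    dim = len(relation_query)
    # The bag representation is the attention-weighted sum of the sentence vectors.
    bag = [sum(w * vec[d] for w, vec in zip(weights, sentence_vecs)) for d in range(dim)]
    return weights, bag


# Three toy sentence embeddings; the third stands in for a noisy, mislabelled instance.
sentence_vecs = [[0.9, 0.1], [0.8, 0.2], [-0.5, 0.7]]
relation_query = [1.0, 0.0]
weights, bag_vec = attend(sentence_vecs, relation_query)
print(weights)
print(bag_vec)
```

Because the third vector points away from the relation query, it receives the smallest weight and contributes least to the bag vector, which is the intended effect of down-weighting noisy instances.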

1 Introduction

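In recent years, large-scale knowledge bases (KBs) such as Freebase (Bollacker et al., 2008), DBpedia (Auer et al., 2007), and YAGO (Suchanek et al., 2007) have been built and widely used in many natural language processing (NLP) tasks, including web search and question answering. These KBs consist mostly of relational facts in triple format, e.g., (Microsoft, founder, Bill Gates). Although existing KBs contain a massive number of facts, they are still far from complete compared with the unbounded set of real-world facts. To enrich KBs, much effort has been invested in automatically finding unknown relational facts. Relation extraction (RE), the process of generating relational data from plain text, is therefore a crucial task in NLP.

* Corresponding author: Zhiyuan Liu (liuzy@tsinghua.edu.cn)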
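
The triple format and the wrong-label problem named in the abstract can be illustrated with a small sketch of the usual distant-supervision heuristic: every sentence that mentions both entities of a KB fact is labelled with that fact's relation. The heuristic is stated here as an assumption, since this excerpt does not define it. The function `distant_label`, the one-entry `kb`, and the example sentences are hypothetical.

```python
from typing import Dict, List, Tuple

# A one-fact knowledge base: (head entity, tail entity) -> relation.
kb: Dict[Tuple[str, str], str] = {("Microsoft", "Bill Gates"): "founder"}


def distant_label(sentences: List[str],
                  kb: Dict[Tuple[str, str], str]) -> List[Tuple[str, str, str, str]]:
    """Label every sentence that mentions both entities of a KB fact.

    Returns (sentence, head, relation, tail) tuples. Matching is naive
    substring search, which is enough for this illustration.
    """
    labelled = []
    for sent in sentences:
        for (head, tail), rel in kb.items():
            if head in sent and tail in sent:
                labelled.append((sent, head, rel, tail))
    return labelled


sentences = [
    "Bill Gates founded Microsoft in 1975.",           # really expresses 'founder'
    "Bill Gates stepped down as Microsoft chairman.",  # mentions both, but not 'founder'
    "Paul Allen co-founded the company.",              # mentions neither entity name
]
labelled = distant_label(sentences, kb)
for sent, head, rel, tail in labelled:
    print(f"({head}, {rel}, {tail}) <- {sent}")
```

The second sentence is labelled `founder` even though it does not express that relation. This is the kind of noisy instance that sentence-level attention is meant to down-weight.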

 

Most existing …