Language Models are Open Knowledge Graphs (Paper Explained)

Yannic Kilcher
#ai #research #nlp

Knowledge Graphs (KGs) are structured databases that capture real-world entities and their relations to each other. KGs are usually built by human experts, which costs considerable time and money. This paper hypothesizes that language models, whose performance has increased dramatically in the last few years, contain enough knowledge to construct a knowledge graph from a given corpus, without any fine-tuning of the language model itself. The resulting system can uncover new, previously unknown relations and outperforms all baselines in automated KG construction, even trained ones!
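
To make the idea of reading facts off a language model more concrete, here is a minimal Python sketch of a candidate-fact search over an attention matrix. It is only an illustration, not the authors' code: the function name `find_candidate_fact`, the example sentence, and the attention weights are all invented, and the search is a simple greedy walk where a real system would take the attention weights from a pre-trained model and would likely use a wider search with more constraints.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def find_candidate_fact(
    tokens: List[str],
    attention: List[List[float]],
    head: int,
    tail: int,
) -> Optional[Tuple[Triple, float]]:
    """Greedy left-to-right walk from the head token to the tail token.

    Assumption (hypothetical convention): attention[i][j] is the weight
    between token i and a later token j. At every step we jump to the
    token on the right with the largest weight, never past the tail.
    Tokens visited between head and tail form the relation phrase.
    """
    if not 0 <= head < tail < len(tokens):
        return None

    position = head
    relation: List[str] = []
    score = 0.0
    while position != tail:
        # Candidates are strictly to the right, up to and including the tail.
        candidates = range(position + 1, tail + 1)
        nxt = max(candidates, key=lambda j: attention[position][j])
        score += attention[position][nxt]
        if nxt != tail:
            relation.append(tokens[nxt])
        position = nxt

    if not relation:
        # Head and tail were linked directly, so there is no relation phrase.
        return None
    return (tokens[head], " ".join(relation), tokens[tail]), score


# Invented toy input: in practice the weights would come from one forward
# pass of a pre-trained language model over the sentence.
tokens = ["Dylan", "is", "a", "songwriter"]
attention = [
    [0.0, 0.5, 0.125, 0.375],
    [0.0, 0.0, 0.25, 0.75],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]

result = find_candidate_fact(tokens, attention, head=0, tail=3)

# A knowledge graph stored as: head entity -> set of (relation, tail entity).
knowledge_graph: Dict[str, Set[Tuple[str, str]]] = {}
if result is not None:
    (h, r, t), score = result
    knowledge_graph.setdefault(h, set()).add((r, t))

print(result)           # (('Dylan', 'is', 'songwriter'), 1.25)
print(knowledge_graph)  # {'Dylan': {('is', 'songwriter')}}
```

The walk skips the article "a" because, in this made-up matrix, "is" is linked more strongly to "songwriter" than to "a". The summed weight (1.25 here) can serve as a rough confidence score for ranking or filtering candidate facts.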

OUTLINE:
0:00 - Intro & Overview
1:40 - TabNine Promotion
4:20 - Title Misnomer
6:45 - From Corpus To Knowledge Graph
13:40 - Paper Contributions
15:50 - Candidate Fact Finding Algorithm
25:50 - Causal Attention Confusion
31:25 - More Constraints
35:00 - Mapping Facts To Schemas
38:40 - Example Constructed Knowledge Graph
40:10 - Experimental Results
47:25 - Example Discovered Facts
50:40 - Conclusion & My Comments

Paper: https://arxiv.org/abs/2010.11967

Abstract:
This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision. Popular KGs (e.g., Wikidata, NELL) are built in either a supervised or semi-supervised manner, requiring humans to create knowledge. Recent deep language models automatically acquire knowledge from large-scale corpora via pre-training. The stored knowledge has enabled the language models to improve downstream NLP tasks, e.g., answering questions, and writing code and articles. In this paper, we propose an unsupervised method to cast the knowledge contained within language models into KGs. We show that KGs are constructed with a single forward pass of the pre-trained language models (without fine-tuning) over the corpora. We demonstrate the quality of the constructed KGs by comparing to two KGs (Wikidata, TAC KBP) created by humans. Our KGs also provide open factual knowledge that is new in the existing KGs. Our code and KGs will be made publicly available.

Authors: Chenguang Wang, Xiao Liu, Dawn Song

Links:
YouTube: yannickilcher
Twitter: ykilcher
Discord: discord
BitChute: https://www.bitchute.com/channel/yann...
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: yannic-kilcher-488534136

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannick...
Patreon: yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Published 4 years ago, on 1399/08/12 (Solar Hijri calendar).
36,014 views