Open-IE-Papers

About the developer

NPCai
149 Stars 14 Forks MIT License 30 Commits 0 Opened issues

Description

Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.

Table of Contents

  1. General
  2. Literature Reviews
  3. Papers - Neural Networks
  4. Papers - Parse-based and statistical
  5. Papers - Older papers and legacy systems
  6. Training and Testing Data

General

This README contains OpenIE and ORE papers and resources. Summaries are by @jbecke and @TheodoreChristakis, written to the best of our abilities after reading each paper or testing the system (when available). We welcome pull requests with additional resources, papers, or data.

Literature Reviews

Papers - Neural Networks

  • Learning Open Information Extraction of Implicit Relations from Reading Comprehension Datasets: extracting more implied ("common sense") relations.

Papers - Parse-based and statistical

  • Graphene: generates n-ary extractions with semantic linking labels such as "TEMPORAL" and "CAUSE", as well as open relations.
  • Stanford Open IE: produces maximally shortened tuples. The reported confidence is often 1.0. Available under the GPL or a proprietary license as part of Stanford CoreNLP.
  • OpenIE-X (v4, v5, Allen Institute version): works well with simple statements (see examples in this dataset). Outputs context for extractions and gives good confidence predictions that can be used to balance precision and recall. Note the restrictive license (research purposes only).
  • Open Relation Extraction and Grounding: extracts argument pairs of relation tuples and forms weighted dependency trees between the two arguments. It shows promising results in determining the relative importance of each argument in the tree.
  • Unsupervised Open Relation Extraction: performs unsupervised relation extraction from free text using pretrained word embeddings, with the sentence's dependency parse tree as a foundation.

Papers - Older papers and legacy systems

  • From University of Washington
    • TextRunner - One of the earliest papers addressing open information extraction
    • ReVerb - Improved the extraction to better form the tuple of (argument, relation, argument)
    • OLLIE - Addressed the issue of misleading propositions and non-verb mediated relations
  • CSD-IE - Generation of nested contractions which is especially effective in sentences using subordinating clauses
  • PropS: Syntax Based Proposition Extraction
  • ClausIE - Formed a strong relation between grammatical clauses, propositions, and OIE extractions by defining seven grammatical patterns
  • ReNoun - Used predominantly for noun-mediated relations.

Training and Testing Data

  • 35M sentence-tuple pairs: from the paper Neural Open Information Extraction. The data was generated by OpenIE-4, removing any tuples with confidence below 0.9. Because there is no sample data, I've copied a bit below. As you can see, the data is somewhat noisy. It might be useful as extra training data, but not as a gold dataset.

```
* moving and handling '' ' - a comprehensive course that covers safe handling and transport of casualties . '' ' - a comprehensive course covers safe handling and transport of casualties
this word , adjectival magavan meaning `` possessing maga - '' , was once the premise that avestan maga - and median magu - were co-eval . - '' , was once the premise that avestan maga - and median magu - were co-eval
melora walters as candy ' - a hooker who works for the motel where john person is staying , as a complimentary service to the guests . ' - a hooker works for the motel
* - a hunter who uses bows and arrows instead of guns . - - a hunter uses bows and arrows instead of guns
```
  • TupleInf Open IE Dataset: OpenIE-4 extractions of 8th grade and 4th grade questions. By inspection, these tend to be cleaner than the above dataset because of the simplicity of the language. Confidence values are retained so you can make your own tradeoff between precision and recall. Not suitable for a gold dataset.

```
01 April 1969 The ATM would be a manned solar observatory making measurements of the Sun by telescopes and instruments above 0.96 (The ATM; would be; a manned solar observatory making measurements of the Sun by telescopes and instruments) 0.93 (a manned solar observatory; making; measurements of the Sun)
01 April 1969 The ATM would be a manned solar observatory making measurements of the Sun by telescopes and instruments above the Earth's atmosphere. 0.96 (The ATM; would be; a manned solar observatory making measurements of the Sun by telescopes and instruments above the Earth's atmosphere) 0.93 (a manned solar observatory; making; measurements of the Sun)
01 - Compare the physical properties of ice, liquid, water, and vapor.
01 Earthly Seasons PURPOSE: To show that the seasons are the consequence of the tilt of earth.
0.1% water can lower the melting temperature of peridotite by 100 C. 0.91 (0.1% water; can lower; the melting temperature of peridotite)
( 020 ) Celsius °C The international temperature scale where water freezes at 0 (degrees) and boils at 100 (degrees). 0.89 (water; freezes; at 0 (degrees)
```

  • [Squadie](https://github.com/NPCai/Squadie) (not yet published, expect changes): this is our dataset derived from SQuAD. It uses a JSON format similar to SQuAD's and contains 50,000 tuples. Each tuple can then be matched with the corresponding sentence in the training corpus. Not suitable as a gold corpus. Squadie is useful for extracting implied relations. We have also converted Maluuba NewsQA.

```
{ "question": "Which film did Beyoncé star in 2001 with Mekhi Phifer?", "id": "56d4831f2ccc5a1400d83155", "answer": "Carmen: A Hip Hopera", "tuple": "" },
{ "question": "What was the name of Destiny Child's third album?", "id": "56d4831f2ccc5a1400d83156", "answer": "Survivor", "tuple": "" },
{ "question": "Who filed a lawsuit over Survivor?", "id": "56d4831f2ccc5a1400d83157", "answer": "Luckett and Roberson", "tuple": "" },
{ "question": "When did Destiny's Child announce their hiatus?", "id": "56d4831f2ccc5a1400d83158", "answer": "October 2001", "tuple": "" }
```
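Because the TupleInf extractions keep their confidence values, you can pick your own precision/recall tradeoff by filtering on a threshold. The snippet below is a minimal sketch, not part of the dataset's tooling. It assumes each line looks like the TupleInf sample in this section: a sentence followed by zero or more `confidence (arg1; relation; arg2)` groups, with binary tuples only. The names `Extraction`, `parse_extractions`, and `filter_by_confidence` are made up for illustration.

```python
import re
from typing import List, NamedTuple


class Extraction(NamedTuple):
    confidence: float
    arg1: str
    relation: str
    arg2: str


# Assumed layout: "<confidence> (<arg1>; <relation>; <arg2>)", repeated after the sentence.
# The lookahead ends a tuple only at the next "<confidence> (" group or at end of line,
# so parentheses inside an argument do not cut the tuple short.
_EXTRACTION = re.compile(r"(\d\.\d+) \((.+?); (.+?); (.+?)\)(?= \d\.\d+ \(|$)")


def parse_extractions(line: str) -> List[Extraction]:
    """Return every (confidence, arg1, relation, arg2) group found in one line."""
    return [
        Extraction(float(m.group(1)), m.group(2), m.group(3), m.group(4))
        for m in _EXTRACTION.finditer(line.strip())
    ]


def filter_by_confidence(extractions: List[Extraction], threshold: float) -> List[Extraction]:
    """Keep extractions at or above the threshold (higher threshold favors precision)."""
    return [e for e in extractions if e.confidence >= threshold]


if __name__ == "__main__":
    sample = (
        "0.1% water can lower the melting temperature of peridotite by 100 C. "
        "0.91 (0.1% water; can lower; the melting temperature of peridotite)"
    )
    for extraction in filter_by_confidence(parse_extractions(sample), 0.9):
        print(extraction)
```

Lines without any extraction, such as the "Earthly Seasons" example, simply yield an empty list. Extractions with more than two arguments would need a different pattern.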
