Brainmaker

Nanos gigantium humeris insidentes!

Notes on Universal Networking Language-based Relation Extraction

  • August 6, 2010 4:01 pm

Paper: A Statistical Approach for Universal Networking Language-based Relation Extraction


Abstract: With the assumption that the positions of the phrases in a sentence between which a relation exists have been identified, we focus on the problem of classifying the relation between the given phrases.

The UNL relation classifier was developed using statistical techniques applied to several lexical and syntactic features. In addition to the commonly used features, we also propose a new feature that reflects the actual semantic relation of two phrases, independent of the words lying between them.

1. Introduction

Universal Networking Language (UNL) is defined as an artificial language that is able to represent information and knowledge described in natural languages [13]. It has three components:

  • Universal Words (UWs): UNL’s vocabulary.
  • Relations: define relationships between pairs of UWs.
  • Attributes: describe the subjectivity of sentences, including
    • time with respect to the speaker,
    • speaker’s view of aspect,
    • speaker’s view of reference,
    • speaker’s focus,
    • speaker’s attitudes and
    • speaker’s view point.
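
For illustration, a sentence like “John eats an apple” is encoded as a small set of binary relations over UWs. The sketch below is mine, based on the general UNL notation rel(uw1, uw2), not an example from the paper:

    # Illustrative sketch: UNL relations for "John eats an apple".
    # "agt" (agent) and "obj" (object) are UNL relation labels;
    # attributes follow the "@" sign.
    relations = [
        ("agt", "eat(icl>do).@entry.@present", "John(iof>person)"),
        ("obj", "eat(icl>do).@entry.@present", "apple(icl>food).@indef"),
    ]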

One of the important tasks in creating UNL representations from natural language text is extracting relations between pairs of UWs. The relation extraction task can be divided into two subtasks:

  1. identifying pairs of UWs between which a relation is likely to hold, and
  2. identifying the relation label for those pairs.

In this paper, we focus on the second subtask: that is, we develop a label classifier for pairs of UWs between which a relation is known to exist.

The problem: given an English text and a pair of phrases in the text between which there exists a relation, identify the type of relation for the pair.

In this work, we apply statistical techniques to train our classifier:

  1. extract a number of features of the two phrases from the text to create feature vectors,
  2. count the occurrences of the relations for each feature vector, and
  3. estimate the probability of each relation type given a feature vector.

Dataset: Universal Networking Digital Language Foundation (UNDL)

3. Method

The two phrases in a relation are referred to as the source phrase and the destination phrase.

For each pair of source and destination phrases in a relation of the training set, we extract linguistic features and create a feature vector.

  1. The number of occurrences of each relation label for each feature vector is counted over the whole training set.
  2. At testing time, the probability of each relation label is estimated given the feature vector associated with a pair of phrases.
  3. The relation label with the highest probability is chosen and assigned to the phrases.

A. Feature Selection

Phrase type, head word, voice, dependency path (from the Minipar parser), and syntactic cross path.
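
A feature vector can be represented as a simple record; the field names below are my shorthand for the notation pt1, pt2, hw1, hw2, voice1, voice2, depPath, synCrossPath used in the training algorithm later:

    from collections import namedtuple

    # One feature vector per (source phrase, destination phrase) pair.
    FeatureVector = namedtuple("FeatureVector", [
        "pt1", "pt2",            # phrase types (POS tags of the heads)
        "hw1", "hw2",            # head words
        "voice1", "voice2",      # active / passive / unspecified
        "dep_path",              # dependency path (Minipar)
        "syn_cross_path",        # syntactic cross path (parse tree)
    ])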

Details of the Syntactic Cross Path

The syntactic cross path is the string representing the path from the source position through the syntactic parse tree to the destination position. It is built as follows (a code sketch follows the list):

  • For the source and destination phrases, find the highest nodes that still take their head words from those phrases (call them H1 and H2).
  • Find the lowest common node that covers both phrases (call it C).
  • Assuming the source phrase and the destination phrase are disjoint, there are three cases:
    • C is higher than both H1 and H2: trace upwards from H1 to C, then downwards to H2.
    • H1 is higher than or equal to C: trace downwards from C to H2.
    • H2 is higher than or equal to C: trace upwards from H1 to C.
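
A minimal sketch of this procedure in Python, assuming parse-tree nodes that carry a label and a parent pointer (this node interface is my assumption, not the paper's):

    def ancestors(node):
        # Return the chain [node, parent, ..., root].
        chain = []
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain

    def syntactic_cross_path(h1, h2):
        # h1, h2: the highest nodes still headed by the source and
        # destination head words; each has .label and .parent.
        up = ancestors(h1)
        down = ancestors(h2)
        common = next(n for n in up if n in down)  # lowest common node C
        # If C is above both H1 and H2 we get an upward and a downward
        # half; if H1 == C or H2 == C, one of the halves is empty.
        up_part = [n.label for n in up[:up.index(common) + 1]]
        down_part = [n.label for n in reversed(down[:down.index(common)])]
        return " v ".join(up_part + down_part)

For the example below, this would yield "VP v NP v PP v NP"; the paper's version additionally annotates the preposition, as in PP(to).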

The advantage of the syntactic cross path feature is that it reflects the syntactic structure of the two concepts represented by the two phrases, independent of the words lying between the phrases.

Example:

————————————————

Fragment: “Adopted a recommendation to this effect in 1964” (source phrase “adopted”, destination phrase “effect”)

Phrase type: VBD (POS tag of “adopted”), NN (POS tag of “effect”)
Head word: adopt, effect
Voice: active, unspecified
Dependency path: (V)obj v (N)mod v (Prep)pcomp-n v (N)
Syntactic cross path: VP v NP v PP(to) NP

————————————————

B. Probability Prediction

We count the number of relations for each feature vector in the training data and estimate the probability of each relation given a feature vector at testing time.

The conditional probability can be estimated as follows:

P(R|F) = \frac{\#(R,F)}{\#(F)}

where:

  • R: relation label
  • F: feature vector
  • #(R,F): the number of relations with label R counted for feature vector F in the training set.
  • #(F): the number of relations that receive F as their feature vector, counted in the training set.

The full feature vector is very specific, so many vectors encountered at testing time never occur in the training data, and the formula above is undefined for them. To estimate the probability in such cases, we relax the condition in the probability formula by splitting the general feature vector into smaller vectors. In other words, we reduce the dimension of the feature vector and estimate the conditional probability using the linear interpolation method proposed in [1].
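
The interpolation formula itself is not reproduced in these notes; in the usual formulation (cf. [1]), the smoothed estimate is a weighted combination of the partial estimates, with non-negative weights summing to one:

P(R|F) \approx \lambda_1 P_1(R|pt_1) + \lambda_2 P_2(R|pt_2) + \dots + \lambda_8 P_8(R|pt_1, pt_2, hw_1, hw_2, voice_1, voice_2, depPath, synCrossPath), \quad \textstyle\sum_i \lambda_i = 1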


Training:

  • Input: a training set including full information about the relations
  • Output: the estimated conditional probabilities P1, …, P8
  1. For each training fragment S, with its source phrase, destination phrase, and relation label R, do:
    1. Obtain the syntactic parse tree T from the Charniak parser and the dependency tree D from the Minipar parser for the fragment S.
    2. Based on the trees T and D, extract the features for the two phrases: pt1, pt2, hw1, hw2, voice1, voice2, depPath, synCrossPath.
    3. Increase the following counters by 1:
      1. c1(R, pt1), c2(R, pt2), …, c8(R, pt1, pt2, hw1, hw2, voice1, voice2, depPath, synCrossPath)
      2. t1(pt1), t2(pt2), …, t8(pt1, pt2, hw1, hw2, voice1, voice2, depPath, synCrossPath)
        where ci(R, Fi) = #(R, Fi) and ti(Fi) = #(Fi)
  2. Calculate all the conditional probabilities:
    P_1(R|pt_1)=\frac{c_1(R,pt_1)}{t_1(pt_1)}, ...,
    P_8(R|pt_1, pt_2, hw_1, hw_2, voice_1, voice_2, depPath, synCrossPath) = \frac{c_8(R, pt_1, pt_2, hw_1, hw_2, voice_1, voice_2, depPath, synCrossPath)}{t_8(pt_1, pt_2, hw_1, hw_2, voice_1, voice_2, depPath, synCrossPath)}
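
A compact sketch of this training loop in Python. The helpers extract_features (wrapping the Charniak and Minipar parsers) and subvector (picking the i-th reduced feature vector out of the full one) are assumed names, not the paper's API:

    from collections import Counter

    # c[i][(R, Fi)] = #(R, Fi) and t[i][Fi] = #(Fi), for i = 1..8.
    c = {i: Counter() for i in range(1, 9)}
    t = {i: Counter() for i in range(1, 9)}

    def train(training_set, extract_features, subvector):
        # training_set yields (fragment, src_phrase, dst_phrase, label).
        for fragment, src, dst, label in training_set:
            f = extract_features(fragment, src, dst)
            for i in range(1, 9):
                fi = subvector(i, f)      # reduced feature vector F_i
                c[i][(label, fi)] += 1
                t[i][fi] += 1

    def p_i(i, label, fi):
        # P_i(R | F_i) = c_i(R, F_i) / t_i(F_i).
        return c[i][(label, fi)] / t[i][fi] if t[i][fi] else 0.0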


Testing:

  • Input: a fragment S; a source phrase and a destination phrase; all the conditional probabilities P1, …, P8
  • Output: relation label R
  1. Obtain the syntactic parse tree T from the Charniak parser and the dependency tree D from the Minipar parser for the fragment S.
  2. Based on the trees T and D, extract the features for the two phrases: pt1, pt2, hw1, hw2, voice1, voice2, depPath, synCrossPath
  3. Estimate the probabilities of all relation labels using the linear interpolation described above.
  4. Choose the relation label R for the two phrases:
    R = \arg\displaystyle\max_{r} P(r|F),
    where F is the full feature vector extracted for the two phrases.
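
A matching sketch of the decision rule, reusing p_i and subvector from the training sketch above and assuming interpolation weights lambdas[i] (i = 1..8) that sum to one:

    def classify(fragment, src, dst, labels, lambdas,
                 extract_features, subvector):
        # Return the relation label maximizing the interpolated P(r | F).
        f = extract_features(fragment, src, dst)
        def score(r):
            return sum(lambdas[i] * p_i(i, r, subvector(i, f))
                       for i in range(1, 9))
        return max(labels, key=score)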