Brainmaker

Nanos gigantium humeris insidentes!

Paper progress

  • July 17, 2010 7:12 pm

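Before the evening of the 17th: finish reduplication, and write up in full how the synopsis information is generated

Before the afternoon of the 18th: finish the two basic approaches to document vector generation (one based on morpheme nearest neighbours, one based on co-occurrence); both must be mature approaches

(read those three papers some more)

Evening of the 18th: start writing the paper; the main body must be finished

19th: polish; if possible, actually implement it; paper done

20th: submit

After the 20th: the system must actually be implemented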

A special method for checking results

  • July 12, 2010 1:42 am

Use Google's search result counts to judge whether a sentence is meaningful

the dog barks            About 1,010,000 results

the flower barks         1 result
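
The comparison above can be sketched as a simple threshold on hit counts. This is only a minimal sketch: the counts are the two quoted above, entered by hand, and no search API is called. The function name `looks_meaningful` and the threshold of 1000 are assumptions made for illustration.

```python
def looks_meaningful(hit_count: int, threshold: int = 1000) -> bool:
    """Treat a sentence as meaningful if its hit count reaches the threshold.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return hit_count >= threshold


# Hit counts copied from the two queries above (not fetched live).
hits = {
    "the dog barks": 1_010_000,
    "the flower barks": 1,
}

for sentence, count in hits.items():
    print(sentence, looks_meaningful(count))
```

A fixed threshold is the crudest possible choice; comparing the count against counts for other sentences of similar length would probably be more robust.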

Details on the semi-supervised project

  • July 11, 2010 11:41 pm

Plan

  • Apply semi-supervised learning to a chosen corpus and compare the results; basically similar to what we did in the ML course
  • Extend others’ supervised learning methods to semi-supervised ones
    • 2007-Semi-Supervised Learning for Semantic Parsing using support vector machine.pdf
  • Apply the same semi-supervised method used on the general wiki to the simple wiki

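To organize

  • Organize the corpora
  • Organize the semi-supervised methods

Good book

Semi-supervised learning

Problems

Cannot find a suitable free corpus. To take part in that conference a corpus cannot be used anyway; noisy text is required

If wiki is definitely ruled out, go through the papers found so far and see which corpora they use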

http://www.d.umn.edu/~tpederse/data.html

Ideas

Named Entity Recognition (NER) is a relatively tractable direction


Plans for the two final projects

  • July 11, 2010 4:02 pm
  1. Equivalence between Reductive English and FOPC
  • Labeled as: Eq RE & FOPC
  • Description: Reduce the expressive power of natural language to the level at which it is equivalent to FOPC.
  • Related knowledge:
    • Theory of FOPC / logic
    • Psychology
    • Language
    • Turing machine
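  • Problems
    • Not sure whether this line of thinking is correct
    • The number of sentence patterns involved may be large
    • Hard to prove
  • Plan
    • Work out where FOPC’s expressive power falls short. By July 11
      • Read papers on FOPC’s shortcomings and analyze them from an NLP perspective
    • Reduce natural English to the same level
  • Some thoughts
    • My earlier conclusion was not the equivalence of Reductive English and FOL, but the use of Reductive English to resolve semantic role ambiguity
  2. Semi-supervised Learning on NLP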
  • Labeled as: ML on NLP
  • Description: Improvement or Implementation of Semi-supervised Semantic Role Labeling Using the Latent Words Language Model
  • Related Knowledge
    • Machine learning
    • NLP
  • Directions
    • Apply semi-supervised learning to a chosen corpus and compare the results; basically similar to what we did in the ML course
    • Extend others’ supervised learning methods to semi-supervised ones
      • 2007-Semi-Supervised Learning for Semantic Parsing using support vector machine.pdf
  • Plan:
    • Read surveys from 2005-2010. By Jul 11
      • Look for an improvement or an implementation and settle on a direction

Two project topics

  • July 11, 2010 3:49 pm
  1. Equivalence between Reductive English and FOPC
  • Labeled as: Eq RE & FOPC
  • Description: From the standpoint of logic or mathematics, prove that FOL has the same expressive power as the converter I design
  • Related knowledge:
    • Theory of FOPC / logic
    • Psychology
    • Language
    • Turing machine
  • Pros and Cons
    • It’s a theoretical problem
    • It’s a novel angle
    • It might be too difficult
    • It might not even be correct
  • Evaluation
    • Academic Value: ▲▲▲▲△
    • Feasibility: ▲▲△△△
    • Difficulty: ▲▲▲△△
  2. Semi-supervised Learning on NLP
  • Labeled as: ML on NLP
  • Description: Improvement or Implementation of Semi-supervised Semantic Role Labeling Using the Latent Words Language Model
  • Related Knowledge
    • Machine learning
    • NLP
  • Pros and cons
    • It’s a popular research field and matches the interests of many researchers.
    • There are more potential professors to choose from.
    • I don’t have enough background knowledge in semi-supervised learning.
  • Evaluation
    • Academic value: ▲▲▲△△
    • Feasibility: ▲▲▲▲△
    • Difficulty: ▲▲▲▲△
  3. Probabilistic disambiguation:
  • Labeled as: PD
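  • Description: Let P1 be the context-free probability of a meaning for a word W, and P2 the probability of a meaning for a second word W2. Writing E1 and E2 for the events that W and W2 take those meanings, the probability of W’s meaning in context is the conditional probability P(E1|E2)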
  • Related knowledge
    • Probability
    • Syntax and Grammar
  • Pros and Cons
    • It’s a theoretical problem
    • It looks like obvious work
    • There is some existing work.
  • Evaluation
    • Academic value: ▲▲▲△△
    • Feasibility: ▲▲▲▲△
    • Difficulty: ▲▲▲△△

Comparison: Project 2 vs Project 3. Project 3 is vague and would be part of Project 2, so Project 3 is ruled out.
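
The probabilistic disambiguation idea in project 3 could be sketched by estimating P(E1|E2) from co-occurrence counts. This is a minimal sketch under assumptions: the sense labels and the observation list are made-up toy data, and `prior` and `conditional` are hypothetical helper names, not part of any existing system.

```python
from collections import Counter

# Hypothetical toy data: (sense of W, sense of W2) pairs observed together.
observations = [
    ("bank/river", "water/liquid"),
    ("bank/river", "water/liquid"),
    ("bank/money", "water/liquid"),
    ("bank/money", "loan/debt"),
    ("bank/money", "loan/debt"),
    ("bank/money", "loan/debt"),
]

pair_counts = Counter(observations)
sense_counts = Counter(e1 for e1, _ in observations)
context_counts = Counter(e2 for _, e2 in observations)


def prior(e1):
    """Context-free probability P(E1), estimated by relative frequency."""
    return sense_counts[e1] / len(observations)


def conditional(e1, e2):
    """P(E1 | E2) = count(E1, E2) / count(E2); 0.0 if E2 was never seen."""
    if context_counts[e2] == 0:
        return 0.0
    return pair_counts[(e1, e2)] / context_counts[e2]


print(prior("bank/river"), conditional("bank/river", "water/liquid"))
```

In this toy data the river sense is the less likely one out of context, but becomes the more likely one once the neighbouring word is known to refer to water, which is the effect the project description is after.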

Some reminders

  • July 9, 2010 10:53 am

About probability and statistics

  • Principles of knowledge representation / edited by Gerhard Brewka

About the expressive power equivalence

About the implementation of the FOL base

Other projects

  • UPenn Treebank
  • Stanford part-of-speech tagger
  • Raphael SIR system

Probability

  • July 9, 2010 2:29 am

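Since many NLU problems are empirical, why not express them directly as probabilities?

  • A sub-problem is determined by the current probability
  • If context is taken into account, it becomes the probability of the current event given another event

After borrowing and reading books on probability and natural language, pause for a day and reorganize the earlier ideas

Combination of statistics and predicate calculus

Prolog reasons with every applicable rule; here we attach a probability to each rule and, when reasoning, apply the rule with that probability. Aquist, Hoepelman, and Rohrer 1980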
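
One way to picture rules that carry probabilities is a small forward-chaining loop. This is a minimal sketch, not the method of the cited work: the facts, rules and probabilities are hypothetical, and multiplying premise probabilities by the rule probability assumes the premises are independent.

```python
# Each rule is (premises, conclusion, probability). All values are hypothetical.
rules = [
    (("bird(tweety)",), "flies(tweety)", 0.9),
    (("flies(tweety)",), "has_wings(tweety)", 0.99),
]


def infer(facts, rules):
    """Forward-chain over the rules.

    A conclusion gets the rule probability times the probabilities of its
    premises (independence assumed); the highest value found so far is kept.
    """
    known = dict(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion, p_rule in rules:
            if all(p in known for p in premises):
                p = p_rule
                for premise in premises:
                    p *= known[premise]
                if p > known.get(conclusion, 0.0):
                    known[conclusion] = p
                    changed = True
    return known


result = infer({"bird(tweety)": 1.0}, rules)
for fact, probability in result.items():
    print(fact, round(probability, 3))
```

Unlike Prolog, which either derives a conclusion or does not, every derived fact here ends up with a degree of belief that shrinks as the chain of rules gets longer.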

Plan for problem 2

  • July 6, 2010 10:35 pm
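  1. Resolve the anaphora (reference) structure, then split the article into sentences at full stops
  2. Work out the semantic roles to determine the logical constituents of each sentence
    1. Use the parse tree to work out the syntactic function (syntactic categories) within each sentence
    2. Find a way to map the parse tree to semantic roles (grammatical function). No one-to-one mapping model is currently known; this problem is called Semantic Role Annotation
      1. Decide using more fine-grained sentence patterns
      2. Decide using statistical probabilities
  3. In advance, based on English syntax, generate the first-order logic statement corresponding to every syntactic pattern
  4. Use the rules from step 3 to generate the facts and rules corresponding to the sentences from step 2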
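
The steps above can be sketched end to end on toy input. This is a minimal sketch under heavy assumptions: anaphora resolution is skipped, the "parser" simply assumes subject-verb-object word order after dropping articles, and the template table `FOL_TEMPLATES` holds only two hand-written patterns. All function names are hypothetical.

```python
ARTICLES = {"a", "an", "the"}

# Step 3: hand-written table from a syntactic pattern to a first-order
# logic template. Only two toy patterns are listed here.
FOL_TEMPLATES = {
    ("S", "V"): "{V}({S})",
    ("S", "V", "O"): "{V}({S}, {O})",
}


def split_sentences(text):
    """Step 1 (second half): split an article into sentences at full stops."""
    return [s.strip() for s in text.split(".") if s.strip()]


def toy_roles(sentence):
    """Step 2 stand-in: assume subject-verb-object order after dropping articles.

    A real system would use a parser plus semantic role labeling instead.
    Sentences that are not two or three content words long are not handled.
    """
    words = [w.lower() for w in sentence.split() if w.lower() not in ARTICLES]
    if len(words) not in (2, 3):
        return {}
    return dict(zip(("S", "V", "O"), words))


def to_fol(sentence):
    """Step 4: apply the template matching the sentence's pattern, if any."""
    roles = toy_roles(sentence)
    template = FOL_TEMPLATES.get(tuple(roles))
    return template.format(**roles) if template else None


text = "The dog barks. The cat chases the mouse."
facts = [to_fol(s) for s in split_sentences(text)]
print(facts)
```

Keeping the pattern table separate from the code that applies it mirrors the split between steps 3 and 4: the table can grow as more sentence patterns are catalogued without touching the rest.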


Useful books http://codetopper.com/others/175/reading-plan.html

Syntax:

  • English syntactic structures : functions and categories in sentence analysis / Flor Aarts and Jan Aar
    • Part II: Structure. A very good analysis of syntactic structure

Verb Syntax: the following two books cover the predicate part in detail

  • An empirical grammar of the English verb system / Dieter Mindt
    • The verb system
  • The grammar of English predicate complement constructions [by] Peter S. Rosenbaum       PE1380 .R6 c.2
    • The complement for predicate

Logic & Predicate: predicate logic, may be helpful

  • The semantic foundations of logic / Richard L. Epstein V.1       BC71 .E57 1994
  • Subject and predicate in logic and grammar / P.F. Strawson       B1667.S383 S83 2004

Reading Plan

  • July 6, 2010 5:46 pm

Syntax:

  • An introduction to English sentence structure / Andrew Radford
    • As a reference
  • English syntactic structures : functions and categories in sentence analysis / Flor Aarts and Jan Aar
    • Part II: Structure

Verb Syntax:

  • An empirical grammar of the English verb system / Dieter Mindt
    • The verb system
  • The grammar of English predicate complement constructions [by] Peter S. Rosenbaum       PE1380 .R6 c.2
    • The complement for predicate

Logic & Predicate

  • The semantic foundations of logic / Richard L. Epstein V.1       BC71 .E57 1994
  • Subject and predicate in logic and grammar / P.F. Strawson       B1667.S383 S83 2004


problems-updated to 7-5

  • July 5, 2010 1:39 am

Updated on 7/5

Problems:

  1. Summarize the various syntactic categories of predicates

——————————————————-

Updated on 7/5 morning

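Answers to the earlier questions

Project statement: Build a knowledge base; represent Wikipedia as a Cyc-like base

Problem statement:

  1. How to define the terminological language and facts
    1. How to handle complex sentences: process them according to the rules already summarized
    2. Can a parse tree split a sentence into its constituents?
    3. How to turn the result into FOL after splitting: split by syntax, following the parse tree
    4. How to convert FOL into CG: skip this step for now
  2. How to build the rules and the inference method: rules and facts are both generated from the wiki

New problem statement:

The current plan is:

  1. Resolve the anaphora (reference) structure, then split the article into sentences at full stops
  2. Use the parse tree to work out the constituents of each sentence
  3. In advance, based on English syntax, generate the first-order logic statement corresponding to every syntactic pattern
  4. Use the rules from step 3 to generate the facts and rules corresponding to the sentences from step 2

Two priorities

  1. Anaphora structure
  2. Generating first-order logic from English syntax

Plan for problem 2

  1. Following A Synopsis of English Syntax by Eugene A. Nida, extract all sentence patterns
  2. Write the first-order logic corresponding to each sentence pattern
  3. Write out rules that can be used in programming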


---------

7/1

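Answers to the earlier questions:

  • Do sentences need to be reduced to simple sentences, and is that even possible?
  • How to split out the sentence constituents
    • A parse tree can do it, but it is not what I want
  • How to remove unimportant sentence constituents
  • Semantic network or predicate logic? First-order or second-order logic, and which has enough descriptive power? If second-order logic, is its deductive system well developed and practical?
    • CGs and FOL are equivalent
    • Current techniques for converting to first-order logic are all immature
  • The category an article belongs to is the micro-theory, the article title is a constant, and the knowledge set itself is common sense

Revised problems:

7.1 Problem statement

Project statement: Build a knowledge base; represent Wikipedia as a Cyc-like base

Problem statement:

  1. How to define the terminological language and facts
    1. How to handle complex sentences
    2. Can a parse tree split a sentence into its constituents?
    3. How to turn the result into FOL after splitting
    4. How to convert FOL into CG
  2. How to build the rules and the inference method

Plan:

First tackle items 2 and 3 of problem 1

1.2: consult books on natural language understanding

1.3: consult Sowa’s original work on CGs, and keep looking for books on converting natural language into FOL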

—————————————————–

6.30

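Project statement: convert Wikipedia into a Cyc-like structured knowledge base

Problem statement

  • Do sentences need to be reduced to simple sentences, and is that even possible?
  • How to split out the sentence constituents
  • How to remove unimportant sentence constituents
  • Semantic network or predicate logic? First-order or second-order logic, and which has enough descriptive power? If second-order logic, is its deductive system well developed and practical?
  • The category an article belongs to is the micro-theory, the article title is a constant, and the knowledge set itself is common sense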