<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://wiki.christophchamp.com/index.php?action=history&amp;feed=atom&amp;title=Paraphrase_algorithm</id>
		<title>Paraphrase algorithm - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://wiki.christophchamp.com/index.php?action=history&amp;feed=atom&amp;title=Paraphrase_algorithm"/>
		<link rel="alternate" type="text/html" href="http://wiki.christophchamp.com/index.php?title=Paraphrase_algorithm&amp;action=history"/>
		<updated>2026-04-30T20:44:21Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.26.2</generator>

	<entry>
		<id>http://wiki.christophchamp.com/index.php?title=Paraphrase_algorithm&amp;diff=4127&amp;oldid=prev</id>
		<title>Christoph: /* External links */</title>
		<link rel="alternate" type="text/html" href="http://wiki.christophchamp.com/index.php?title=Paraphrase_algorithm&amp;diff=4127&amp;oldid=prev"/>
				<updated>2007-06-18T03:53:35Z</updated>
		
		<summary type="html">&lt;p&gt;‎&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;External links&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;tr style='vertical-align: top;' lang='en'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;Revision as of 03:53, 18 June 2007&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l57&quot; &gt;Line 57:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 57:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*[http://www.cs.cornell.edu/home/llee/papers/statpar-informal.draft.html An informal explanation of &amp;quot;Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment&amp;quot;]&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*[http://www.cs.cornell.edu/home/llee/papers/statpar-informal.draft.html An informal explanation of &amp;quot;Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment&amp;quot;]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*[http://ejohn.org/projects/javascript-diff-algorithm/ Javascript Diff Algorithm]&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*[http://ejohn.org/projects/javascript-diff-algorithm/ Javascript Diff Algorithm]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[http://aclweb.org/aclwiki/index.php?title=DIRT_Paraphrase_Collection DIRT Paraphrase Collection]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[http://aclweb.org/aclwiki/index.php?title=Distributional_Hypothesis Distributional Hypothesis]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[http://aclweb.org/aclwiki/index.php?title=Statistical_Semantics Statistical Semantics]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;*[http://semantics.isi.edu/ocean/ VerbOcean]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Linguistics]]&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Linguistics]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Christoph</name></author>	</entry>

	<entry>
		<id>http://wiki.christophchamp.com/index.php?title=Paraphrase_algorithm&amp;diff=4125&amp;oldid=prev</id>
		<title>Christoph at 03:32, 18 June 2007</title>
		<link rel="alternate" type="text/html" href="http://wiki.christophchamp.com/index.php?title=Paraphrase_algorithm&amp;diff=4125&amp;oldid=prev"/>
				<updated>2007-06-18T03:32:14Z</updated>
		
		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;This article describes my work on developing a '''paraphrase algorithm''' using Project Gutenberg as my corpus. Paraphrasing is a natural language processing (NLP) task within computational linguistics.&lt;br /&gt;
&lt;br /&gt;
The idea is to first extract comparable sub-corpora from my main corpus and use these as my training set. To start, I will build a small parallel corpus from a subset of them and use it for sentence clustering. The goal is to collect data for inferring templates from sentences that appear similar on a word-by-word level.&lt;br /&gt;
&lt;br /&gt;
==Examples==&lt;br /&gt;
===Sem Experiment===&lt;br /&gt;
See: [http://www.cs.cornell.edu/Info/Projects/NLP/statpar/instr1.html Statistical Paraphrasing Project] from the [http://www.cs.cornell.edu/Info/Projects/NLP/ Cornell Natural Language Processing Group]&lt;br /&gt;
&lt;br /&gt;
===Swords to ploughshares===&lt;br /&gt;
Let &amp;lt;code&amp;gt;A1 = Isaiah 2:4&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;B1 = Micah 4:3&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;C1 = Joel 3:10&amp;lt;/code&amp;gt;, where:&lt;br /&gt;
;A1: And he shall judge among the nations, and shall rebuke many people: and they shall beat their swords into plowshares, and their spears into pruninghooks: nation shall not lift up sword against nation, neither shall they learn war any more.&lt;br /&gt;
;B1: And he shall judge among many people, and rebuke strong nations afar off; and they shall beat their swords into plowshares, and their spears into pruninghooks: nation shall not lift up a sword against nation, neither shall they learn war any more.&lt;br /&gt;
;C1: Beat your plowshares into swords, and your pruninghooks into spears: let the weak say, I am strong.&lt;br /&gt;
&lt;br /&gt;
*sentence clustering:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
A1a: And he shall judge among the nations,&lt;br /&gt;
B1a: And he shall judge among many people,&lt;br /&gt;
&lt;br /&gt;
A1b: and shall rebuke many people:&lt;br /&gt;
B1b: and rebuke strong nations afar off;&lt;br /&gt;
&lt;br /&gt;
A1c: and they shall beat their swords into plowshares,&lt;br /&gt;
B1c: and they shall beat their swords into plowshares,&lt;br /&gt;
&lt;br /&gt;
A1d: and their spears into pruninghooks:&lt;br /&gt;
B1d: and their spears into pruninghooks:&lt;br /&gt;
&lt;br /&gt;
A1e: nation shall not lift up sword against nation,&lt;br /&gt;
B1e: nation shall not lift up a sword against nation,&lt;br /&gt;
&lt;br /&gt;
A1f: neither shall they learn war any more.&lt;br /&gt;
B1f: neither shall they learn war any more.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
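&lt;br /&gt;
The clustering above was done by hand. As a rough sketch of how it might be automated, the Python below splits each verse at clause punctuation and pairs the clauses by position. The names (split_clauses, pair_clauses, word_overlap) and the Jaccard overlap score are illustrative assumptions, not part of the method described here, and positional pairing only works because A1 and B1 happen to split into the same number of clauses.&lt;br /&gt;

```python
import re

# Verses as quoted above (A1 = Isaiah 2:4, B1 = Micah 4:3).
A1 = ("And he shall judge among the nations, and shall rebuke many people: "
      "and they shall beat their swords into plowshares, and their spears "
      "into pruninghooks: nation shall not lift up sword against nation, "
      "neither shall they learn war any more.")
B1 = ("And he shall judge among many people, and rebuke strong nations afar "
      "off; and they shall beat their swords into plowshares, and their "
      "spears into pruninghooks: nation shall not lift up a sword against "
      "nation, neither shall they learn war any more.")


def split_clauses(verse):
    """Split a verse into clauses, each ending at one of , : ; or ."""
    return [c.strip() for c in re.findall(r"[^,:;.]+[,:;.]", verse)]


def pair_clauses(verse_a, verse_b):
    """Pair clauses by position (assumption: equal clause counts)."""
    a, b = split_clauses(verse_a), split_clauses(verse_b)
    if len(a) != len(b):
        raise ValueError("clause counts differ; positional pairing does not apply")
    return list(zip(a, b))


def word_overlap(clause_a, clause_b):
    """Jaccard overlap of the lowercased word sets of two clauses."""
    wa = set(re.findall(r"\w+", clause_a.lower()))
    wb = set(re.findall(r"\w+", clause_b.lower()))
    return len(wa.intersection(wb)) / len(wa.union(wb))


if __name__ == "__main__":
    for label, (a, b) in zip("abcdef", pair_clauses(A1, B1)):
        print(f"A1{label}: {a}")
        print(f"B1{label}: {b}")
        print(f"    overlap = {word_overlap(a, b):.2f}")
```

An overlap of 1.0 marks pairs that use the same words, such as A1c and B1c; lower scores flag the pairs worth inspecting for paraphrase arguments.&lt;br /&gt;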
&lt;br /&gt;
*inducing patterns (arguments in square brackets):&lt;br /&gt;
 {A1c,A1d,A1f} = {B1c,B1d,B1f}&lt;br /&gt;
 A1a: And he shall judge among [the nations],&lt;br /&gt;
 B1a: And he shall judge among [many people],&lt;br /&gt;
 A1b: and [shall rebuke] [many people]:&lt;br /&gt;
 B1b: and [rebuke] [strong nations] [afar off];&lt;br /&gt;
 A1e: nation shall not lift up [sword] against nation,&lt;br /&gt;
 B1e: nation shall not lift up [a sword] against nation,&lt;br /&gt;
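&lt;br /&gt;
One possible way to approximate the bracketing above is a token-level diff of each clause pair. The sketch below uses Python's standard difflib.SequenceMatcher as a stand-in; it is a pairwise diff, not the multiple-sequence alignment of Barzilay and Lee, and the names tokenize, induce_pattern, and the [SLOT] marker are hypothetical.&lt;br /&gt;

```python
import difflib
import re


def tokenize(clause):
    """Split a clause into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", clause)


def induce_pattern(clause_a, clause_b):
    """Return (template, arguments) for two aligned clauses.

    template  : shared tokens, with "[SLOT]" wherever the clauses differ
    arguments : one (text_from_a, text_from_b) pair per slot
    """
    a, b = tokenize(clause_a), tokenize(clause_b)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    template, arguments = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            template.extend(a[i1:i2])
        else:
            # replace, insert, or delete: the differing span becomes a slot
            template.append("[SLOT]")
            arguments.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return template, arguments


if __name__ == "__main__":
    template, arguments = induce_pattern(
        "And he shall judge among the nations,",
        "And he shall judge among many people,",
    )
    print(template)   # ['And', 'he', 'shall', 'judge', 'among', '[SLOT]', ',']
    print(arguments)  # [('the nations', 'many people')]
```

A plain diff finds minimal differences, so for A1e and B1e it reports only the inserted article rather than the [sword] and [a sword] grouping chosen by hand above; widening slots to whole phrases would need extra information such as part-of-speech tags.&lt;br /&gt;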
&lt;br /&gt;
==References==&lt;br /&gt;
*Barzilay R, Lee L (2003). &amp;quot;[http://www.cs.cornell.edu/Info/Projects/NLP/statpar.html Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment]&amp;quot;. ''Proceedings of HLT-NAACL'', pp. 16-23.&lt;br /&gt;
&lt;br /&gt;
==See also==&lt;br /&gt;
*[[wikipedia:Natural language processing]]&lt;br /&gt;
*[[wikipedia:Computational linguistics]]&lt;br /&gt;
*[[wikipedia:Corpus linguistics]]&lt;br /&gt;
*[[wikipedia:Part-of-speech tagging]] (POS tagging or POST; also called grammatical tagging)&lt;br /&gt;
*[http://www.mitpressjournals.org/loi/coli?cookieSet=1 Computational Linguistics (journal)]&lt;br /&gt;
&lt;br /&gt;
==External links==&lt;br /&gt;
*[http://www.gelbukh.com/clbook/ COMPUTATIONAL LINGUISTICS: Models, Resources, Applications] &amp;amp;mdash; free online book&lt;br /&gt;
*[http://www.cs.cornell.edu/home/llee/papers/statpar-informal.draft.html An informal explanation of &amp;quot;Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment&amp;quot;]&lt;br /&gt;
*[http://ejohn.org/projects/javascript-diff-algorithm/ Javascript Diff Algorithm]&lt;br /&gt;
&lt;br /&gt;
[[Category:Linguistics]]&lt;/div&gt;</summary>
		<author><name>Christoph</name></author>	</entry>

	</feed>