Reencoding MediaWiki pages

From the Linux and Unix Users Group at Virginia Tech Wiki

Revision as of 03:08, 9 November 2009

The following script is a work in progress. The end goal is to produce wikitext from the HTML of a downloaded MediaWiki page.

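# Print from the "start content" marker through the next HTML comment, then
# drop comment lines, strip <p> tags, normalize <br>, and rewrite external
# links as [url text]. The g flags handle several matches on one line.
sed -rn -e '/<!-- start content -->/,/<!--/p' page.html | \
sed -r -e '/<!--/d' -e 's%</?p>%%g' \
	-e 's%<br>%<br />%g' \
	-e 's|<a href="([^"]*)" class="external text" title="[^"]*" rel="nofollow">([^<]*)</a>|[\1 \2]|g'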
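
To see what the pipeline does, it can help to run it on a small input. The following is only a sketch: the sample HTML is made up for illustration (real MediaWiki output varies by version and skin), and the function names sample_html and html_to_wikitext are hypothetical. The filters are the ones from the script above, with the g flag on each substitution so that every match on a line is handled, and reading from standard input instead of page.html. It assumes a sed that accepts -r, such as GNU sed.

```shell
#!/bin/sh
# Hypothetical sample input; a real downloaded page will differ in detail.
sample_html() {
cat <<'EOF'
<html><body>
<div id="header">Site header</div>
<!-- start content -->
<p>See the <a href="http://example.org/" class="external text" title="http://example.org/" rel="nofollow">example site</a>.</p>
<p>First line<br>second line</p>
<!-- end content -->
<div id="footer">Site footer</div>
</body></html>
EOF
}

# The filters from the script above, wrapped in a function that reads stdin.
html_to_wikitext() {
    # Keep only the lines from "start content" through the next HTML comment.
    sed -rn -e '/<!-- start content -->/,/<!--/p' |
    # Drop the comment lines, strip <p> tags, normalize <br>,
    # and rewrite external links as [url text].
    sed -r -e '/<!--/d' -e 's%</?p>%%g' \
        -e 's%<br>%<br />%g' \
        -e 's|<a href="([^"]*)" class="external text" title="[^"]*" rel="nofollow">([^<]*)</a>|[\1 \2]|g'
}

sample_html | html_to_wikitext
```

On this sample the header and footer are dropped and two lines remain: "See the [http://example.org/ example site]." and "First line<br />second line". Anything the filters do not know about, such as internal links, headings, or lists, would pass through as raw HTML.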