Live Space Mover

Microsoft has announced an official Live Space to WordPress.com migration, so the recommended way now is to use that official function. If you want to move Live Space to a self-hosted WordPress, create a blog on WordPress.com as a bridge (export from the WordPress.com blog, then import into your self-hosted WordPress).

Thanks to everyone who cared about this script ;)

A Python script for importing blog entries from Live Space into WordPress.

With Google blog converter you may also be able to move from Live Space to a Blogger/TypePad/Movable Type/LiveJournal blog. (See the Google blog converter note below if you want to use this script with it.)

Tested on Python 2.5/2.6 and Windows XP/Ubuntu Linux.

Based on the wonderful HTML parser library BeautifulSoup.

Hosted on Google Code. To check out the source code with svn:

svn checkout http://live-space-mover.googlecode.com/svn/trunk/ live-space-mover

If you run into any problems using this script, feel free to contact me: weiwei9 AT gmail dot com

User Guide

  1. Install the Python runtime and Beautiful Soup. I have tested two combinations:
    1. Python Runtime 2.5.2 and Beautiful Soup 3.0.6
    2. Python Runtime 2.5.1 and Beautiful Soup 3.0.4, plus the small sgmllib fix described in the UnicodeDecodeError note below

    Place the file BeautifulSoup.py in the same directory as live-space-mover.py, or install it into your Python runtime yourself.

  2. Download the newest release zip from the project page and extract it. (Older versions may no longer work because of HTML changes on Live Space.)
  3. Change your Live Space settings
    1. Make sure it is open to anyone (not only to your contacts)
    2. Set the time zone to match your WordPress blog
    3. Set the date format to yyyy/mm/dd or mm/dd/yyyy. The available formats probably depend on the locale setting of your system or browser; the point is to make the year appear in your date. If the program fails and complains about date parsing, use the -t option to specify the date and time format. For example, the time on my space is shown like “9:45 PM”, but if yours is shown like “9:45:15 PM”, you may want to use a command line like this:

      python live-space-mover.py -s http://yourspaceid.spaces.live.com/ -t "%m/%d/%Y %I:%M:%S %p"

      An introduction to the time format parameters is available here.

    4. Set “Blog entry date display” to “Show the blog entry date in the header”
    5. From user feedback, I noticed that Live Space themes differ slightly in structure, which may cause this program to fail. So please change your Live Space theme to “Journey” (the same as my test space).
  4. Run the live-space-mover.py script. On Windows, open the command line (Win+R, type “cmd” and press Enter), change to the directory containing live-space-mover.py (use “c:” or “d:” to change drive and the “cd” command to change directory; please google it if you need help), and run a command like this:

    python live-space-mover.py -s http://yourspaceid.spaces.live.com/

    Replace the example parameter with your own. This will generate an XML file named “export_xxxxx.xml” in the same directory as the script, in WordPress export file format.
  5. Use the import function in WordPress to import the XML file generated in the previous step. Remember to choose the “WordPress” type on the import page, rather than “LiveJournal” or something else.
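
The -t option in step 3 takes a strptime-style format string. Before running the full export, you can check whether a format matches a date copied from your space. The following is only a minimal sketch: the helper name matches_format, the no-seconds format and the sample date string are made up for illustration and are not part of live-space-mover.py.

```python
from datetime import datetime


def matches_format(date_text, fmt):
    """Return the parsed datetime if date_text fits fmt, otherwise None."""
    try:
        return datetime.strptime(date_text, fmt)
    except ValueError:
        return None


# Format taken from the example command in step 3 (time shown with seconds).
WITH_SECONDS = "%m/%d/%Y %I:%M:%S %p"
# Hypothetical variant for a time shown without seconds, like "9:45 PM".
WITHOUT_SECONDS = "%m/%d/%Y %I:%M %p"

sample = "03/11/2009 9:45:15 PM"  # made-up date string for illustration

# The first call parses successfully; the second returns None because the
# sample contains seconds that the format does not expect.
print(matches_format(sample, WITH_SECONDS))
print(matches_format(sample, WITHOUT_SECONDS))
```

If the call returns None for a date string copied from your own space, the format probably needs adjusting before you pass it to -t.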

Notes

  • A known limitation: the script can’t fetch comments beyond the first page!
  • If you get a “UnicodeDecodeError”, that’s probably because your Live Space contains text in Italian or other languages. This is a bug in Python 2.5 and you need to fix it yourself. Yes, fix the Python library with your own hands :P
    If you installed Python to its default path on Windows, change the file C:\Python25\Lib\sgmllib.py. On line 394,
    if not 0 <= n <= 255:
    should be changed to
    if not 0 <= n <= 127:
    That’s all. I learned this from here.
  • If you want to use Google blog converter with this script, the recommended way is to open a new blog on WordPress.com or any other WordPress-powered BSP, import the XML generated by this script, and export a new XML with the built-in export function of WordPress. Then feed this new XML to Google blog converter, because I can’t guarantee that the XML exported by this script meets Google blog converter’s requirements. Also remember to change the time zone to UTC everywhere: Live Space, the WordPress blog, and the machine used to run this script. Thanks to 1nm for sharing experiences with Google blog converter.
  • This mover depends heavily on some very weird and ugly patterns in the HTML and JavaScript code of Live Space, so it may stop working at any time. If that happens, please let me know.
  • From what I have studied, the metaWeblog API in WordPress does not seem to support comments. WordPress also supports two other XML-RPC interfaces, Blogger and MovableType. The Blogger API has been updated to GData, and the old API does not appear to support comments either. The documentation of the MovableType API is so complex that I can’t understand it yet. So it might be much easier to write a mover in PHP that can handle comments.
  • This script may generate a log file and a cache file in the working directory. If you run into errors, it would be very helpful to send me the log file and the error message. Thank you.
  • You can use the command

    python live-space-mover.py -help

    to check the other options of this script.
  • Since version 1.0, the suggested usage is to export an XML file and then import it. The direct posting method through the MetaWeblog interface has been deprecated but is left in the release package for anybody who needs it.
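
As far as I can tell, the sgmllib change in the UnicodeDecodeError note helps because a numeric character reference gets turned into a single byte, and only values from 0 to 127 are valid ASCII; a byte from 128 to 255 fails as soon as something tries to decode it as ASCII. The sketch below only illustrates that ASCII boundary. It is not code from sgmllib or from this script, and the helper name is_ascii_byte is made up.

```python
def is_ascii_byte(n):
    """Return True if the single byte with value n (0..255) decodes as ASCII."""
    try:
        bytes(bytearray([n])).decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# 127 is the last value inside the ASCII range.
print(is_ascii_byte(127))  # True
# 233 is the Latin-1 code for an accented "e", outside the ASCII range.
print(is_ascii_byte(233))  # False
```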

Change Log

* Version 1.8
– CHG: Catch up with changes of live space

* Version 1.7.6
– CHG: Modify exported date format to be compatible with WordPress2Blogger converter.

* Version 1.7.5
– BUG: Handled the weird format of comment date box

* Version 1.7.4
– BUG: Fixed the comments order problem reported by Sun Yue

* Version 1.7.3
– BUG: Fixed the problem when comment author name contains emoticons

* Version 1.7.2
– BUG: Fixed the pubDate of post item for WP 2.7

* Version 1.7.1
– BUG: Fixed the comment author missing

* Version 1.7
– CHG: Catch up with changes of live space in Dec 2008

* Version 1.6
– CHG: Catch up with changes of Live Space
– BUG: Fix the bug “can’t scan domain name with hyphen when comments are more than 20”

* Version 1.5
– CHG: Catch up with changes of Live Space
– BUG: Escaped special chars for XML
– NEW: Improved error logging when parsing error

* Version 1.4
– BUG: Converted unicode numbers (in category name, entry title and comment author) to unicode string. The bug of duplicate categories in WP 2.3 was solved by this.

* Version 1.3
– NEW: Support category exporting by setting header field ‘User-Agent’ to Firefox

* Version 1.2
– Catch up with some changes of live space

* Version 1.1
– BUG: Error when title is empty
– NEW: Add cache and resume ability

* Version 1.0
– Use XML file and import function of WordPress, instead of MetaWeblog and post
– Change some fetching codes according to the code changes of live space
– Fixed a bug of extracting email address of comment author

* Version 0.93
– Add Donate Link

* Version 0.92
– Fix some bugs

* Version 0.9
– NEW: Support moving comments. Add file “my-wp-comments-post.php” for posting comments
– NEW: Add running modes, for only moving posts/comments, or both

* Version 0.2
– BUG: Error when reading live space in Italian or other languages. Actually it’s a bug of Python 2.5.
– BUG: Doesn’t jump out loop after moving the oldest entry.
– NEW: Support date format pattern specifying, added -t option
– NEW: Support starting from a specified entry, added -f option

* Version 0.1
– NEW: Starting, used to move my own live space

Thanks

Many thanks to Michele Nasti and Oliver Diaz Herrera, who used this script, reported bugs to me and helped me solve them. I’m not a patient guy and I don’t have many blogs to test this script on. It’s they, the nice users, who made this script really usable.

It’s so wonderful to cooperate with people all around the world ;-p

204 comments

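  1. I successfully moved from Live Space to Blogger with version 1.75, so I came to say thanks.

    A few problems I ran into:
    First, a slightly odd problem. I specified the format with -t "%m/%d/%Y %I:%M:%S %p"
    and the export went smoothly, but Google blog converter's wordpress2blogger.sh reported an error during conversion:
    ValueError: time data did not match format: data=2009-03-11 09:10 fmt=%Y-%m-%d %H:%M:%S
    After running sed -i 's/\(\|\)\(.*\)\(\|\)/\1\2:00\3/g' export.xml
    the conversion succeeded. Solved.
    Second, a file exported by live space mover and converted directly could not be imported into Blogger. When I was at my wits' end, I thought of wordpress.com: I opened a new blog, imported, exported again, converted, and imported into Blogger. Success!
    Third, another problem: the times were wrong. I looked through Google blog converter's issues and, sure enough, there was issue 22:
    Wordpress2Blogger ignoring timezones on posts/comments.
    I changed the space's time zone to UTC and exported again. The post times were now in UTC, but the comment times were still in JST (are comment times computed from the system time and the parsed "XX hours ago"? I don't know Python well, so forgive me). So I also changed the time zone of the system running live space mover to UTC, ran it again, and everything was OK.

    So in summary, anyone who wants to move from Live Space to Blogger should note the following 2 points:

    1. You need a WordPress blog as a relay.
    2. When moving, set the time zones of Live Space, WordPress, Blogger and the machine running live space mover all to UTC.

    Finally, thanks again to Wei Wei.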

  2. broom9 said,
    March 6, 2009 @ 12:10 am
    @chaos I exported your blog, and there was no problem.

    Could I trouble you to send it to my email, or provide a download link? Many thanks.

  3. Help me with it

    LINE 232 INFO connectiong to source blog http://spyxochavez.spaces.live.com/
    LINE 234 INFO connect successfully, look for 1st Permalink
    LINE 564 ERROR Unexpected error
    Traceback (most recent call last):
    File "live-space-mover.py", line 562, in
    main()
    File "live-space-mover.py", line 475, in main
    permalink = find1stPermalink(srcURL)
    File "live-space-mover.py", line 235, in find1stPermalink
    soup = BeautifulSoup(page)
    File "C:\Archivos de programa\Python25\BeautifulSoup.py", line 1499, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
    File "C:\Archivos de programa\Python25\BeautifulSoup.py", line 1230, in __init__
    self._feed(isHTML=isHTML)
    File "C:\Archivos de programa\Python25\BeautifulSoup.py", line 1263, in _feed
    self.builder.feed(markup)
    File "C:\Archivos de programa\Python25\lib\HTMLParser.py", line 108, in feed
    self.goahead(0)
    File "C:\Archivos de programa\Python25\lib\HTMLParser.py", line 150, in goahead
    k = self.parse_endtag(i)
    File "C:\Archivos de programa\Python25\lib\HTMLParser.py", line 314, in parse_endtag
    self.error("bad end tag: %r" % (rawdata[i:j],))
    File "C:\Archivos de programa\Python25\lib\HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
    HTMLParseError: bad end tag: u"", at line 747, column 188

  4. I can't do it! An error message appears in cmd:

    File "live-space-mover.py", line 70
    raise Exception, "Can´t parse comment data string " + datestr
    Syntax error: invalid syntax

    What can I do?

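  5. Hello, I still don't understand how to export my Live Space. I have changed my blog settings as required (time display and visibility), but I still don't know how to export the XML. Could I trouble you to export it for me? My blog address is http://chaostee.spaces.live.com/
    Thank you. Also, since I want to import it into Blogger, I'm not sure whether this XML file format meets the requirements.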

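  6. Wei wei is really nice..
    I couldn't figure it out, so he did it for me himself.
    Unfortunately I had misunderstood. This can only export to WordPress, and I wanted to move to blogbus.
    So I troubled Wei wei for nothing. (*^__^*)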

  7. Hi Broom, thanks for your help. The files are in the same path. The name of the file shows up as BeautifulSoup-3.0.6.tar.gz. Is this the right file? It won’t open on my computer. I feel I am close to this working and don’t want to give up.

  8. @Onjeinika this script won’t work on Py3.0. In your Python 2.5 case, the reason is probably that BeautifulSoup is not in the right place. Make sure the file BeautifulSoup.py is in the same path as live-space-mover.py

  9. Hi, I’ve tried to do this move using your script. I keep getting errors. I used Python2.5.4 and Beautiful Soup 3.0.6 and got an error message on line 21 of live-space-mover saying

    Import Error :no module named Beautiful Soup.

    I Also tried the move using Python 3.0 and Beautiful Soup 3.1.0.1. I received an error on line 70 of the live-space-mover saying “raise Exception, can’t parse comment date string” +dateStr.
    Can you help me please?

  11. @Betty: thanks, this is exactly what I had been thinking, however I don’t have time to accomplish it yet. Hmm.. maybe I need a schedule

  13. Problem: python is not recognised as a internal or external command, operable program or batch file
    OS: Windows Vista SP1
    Python 2.5.2 (although 2.6 is out) as default c:\Python25 and the BeautifulSoup-3.0.6.py in the LiveSpace (user created) folder on desktop (a program config that supposely tested to work)

    location of the .py files for live-space-mover.py and BeautifulSoup-3.0.6.py is in the C:\Users\Jamie\Desktop\LiveSpace directory (registered in windows .py = python file?

    where next like Shrey (September 12, 2008 @ 8:17 am) i stuck following your instructions above
