Solution to my Encoding Problem

When you analyse over 300K commit messages, changing problematic encodings gets a bit tiresome. This got me motivated to look for a solution and here it is:

obj.text = parsed_log_message.encode('utf-8','replace')

Saving obj will stop DjangoUnicodeDecodeError from squealing and replace problematic characters with a ‘?’. I guess this is part of the zen of Python:

Errors should never pass silently. Unless explicitly silenced.

Add post to:   Delicious Reddit Slashdot Digg Technorati Google
Make comment

Comments

No comments for this post

Required. 30 chars of fewer.

Required.

captcha image Please, enter symbols, which you see on the image