[PonyORM-list] MYSQL Unicode / Latin-1 Encoding Problem

Alexander Kozlovsky alexander.kozlovsky at gmail.com
Sun Aug 17 15:19:54 UTC 2014


Hi Jake!

I think your attribute is defined with the "str" type. Because of this,
Pony handles the attribute value in Python as str rather than unicode. By
default, str attribute values are encoded using latin-1, and in your case
the actual value (it contains the character u'\u2019') cannot be
represented in latin-1.
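
The failure can be reproduced outside of Pony. This is only a minimal
sketch using the character from your traceback; can_encode is a
hypothetical helper written for this illustration, not part of Pony's API:

```python
# The character reported in the traceback: RIGHT SINGLE QUOTATION MARK.
RIGHT_QUOTE = u"\u2019"


def can_encode(text, encoding):
    """Hypothetical helper: True if `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# latin-1 only covers code points 0..255, so U+2019 does not fit.
latin1_ok = can_encode(RIGHT_QUOTE, "latin-1")

# UTF-8 can represent every Unicode code point.
utf8_ok = can_encode(RIGHT_QUOTE, "utf-8")
utf8_bytes = RIGHT_QUOTE.encode("utf-8")
```

Here latin1_ok ends up False and utf8_ok ends up True, which matches the
"ordinal not in range(256)" message in the traceback.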

To solve the problem in the current version of Pony, use the unicode
attribute type instead of str. With the unicode type you get the actual
unicode value from the database without any re-encoding.

The upcoming Pony ORM 0.6 release will support Python 3. Starting with that
release, str attributes will behave exactly as unicode attributes do now,
because the Python 3 str type is equivalent to the Python 2 unicode type
(and we want entity definitions to be source-compatible between Python 2
and Python 3). In other words, from Pony ORM 0.6 onward, str and unicode
will mean the same thing in attribute definitions. For now they behave
differently, and unicode is the type you need in your case.
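
The equivalence between the two types can be sketched in plain Python. This
is only an illustration of the idea, not how Pony implements it; text_type
is a name I made up for the example:

```python
import sys

# Pick the text type for the running interpreter:
# `unicode` on Python 2, `str` on Python 3.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (this name exists only on Python 2)

# A u"..." literal is an instance of the text type on both versions.
apostrophe = u"\u2019"
is_text = isinstance(apostrophe, text_type)
```

On either interpreter is_text is True, which is why one entity definition
can work for both once str and unicode are treated as the same type.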

Regards,
Alexander Kozlovsky


On Sunday, August 17, 2014, Jake Austwick <jake at serpiq.com> wrote:

> Hey,
>
> I am trying to store some data that I'm scraping with Scrapy in MySQL,
> usually I use Postgres however not available this time due to client
> requirements.
>
> The error I'm receiving is:
> File "/Library/Python/2.7/site-packages/pony/orm/core.py", line 1263, in
> check
>     else: return converter.validate(val)
>   File "/Library/Python/2.7/site-packages/pony/orm/dbapiprovider.py", line
> 387, in validate
>     elif isinstance(val, unicode): val = val.encode(converter.encoding)
> exceptions.UnicodeEncodeError: 'latin-1' codec can't encode character
> u'\u2019' in position 1001: ordinal not in range(256)
>
> However I'm unsure why it's trying to encode as latin-1, I've tried
> everything to try and get rid of it:
>
>
>    - My MYSQL database and columns are set to UTF8
>    - I have the following at the top of my python file:
>
>
> import sys
> reload(sys);
> sys.setdefaultencoding("utf8")
>
>
>    - I'm creating my DB connection like this:
>
> db = Database('mysql', db="xxx", user="root", charset='utf8',
> use_unicode=True)
>
> Was wondering if you had any insight on the issue?
>
> Thanks,
> --
> Jake Austwick
>