Cyrillic text extraction in Python/Django

- June 15, 2015

I'm using urllib2 to open a Russian website and extract text from it. However, instead of coming out as "Беллона", it's coming out as "Áåëëîíà". What's the easiest way around this?

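Figure out which encoding the webpage uses (for Russian pages this is usually windows-1251, UTF-8, or ISO 8859-5; output like "Áåëëîíà" is what windows-1251 bytes look like when read as Latin-1), and convert the text to Unicode like this: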

ustring = unicode(read_string, encoding=...)
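To illustrate the conversion, here is a minimal sketch that reproduces the garbled output from the question and then decodes the same bytes correctly. It assumes the page is served as windows-1251, which is an inference from the garbled text, not something the question confirms. The variable names are made up for the example, and the byte string stands in for what `urllib2` would return from `read()`.

```python
# -*- coding: utf-8 -*-
# Stand-in for the raw bytes a server might send (assumed to be windows-1251).
raw = u"Беллона".encode("windows-1251")

# Reading those bytes as Latin-1 gives the garbled text from the question.
garbled = raw.decode("latin-1")

# Decoding with the encoding the page actually uses gives the intended text.
fixed = raw.decode("windows-1251")
```

Calling `raw.decode(name)` is equivalent to `unicode(raw, encoding=name)` on Python 2 and also works on Python 3, where `unicode` no longer exists.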

If you need to determine the encoding of the webpage dynamically, see this answer.
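One common way to pick the encoding at run time is to read the `charset` parameter from the response's `Content-Type` header. The sketch below is only an illustration under that assumption: `charset_from_content_type` and `decode_body` are hypothetical helper names, the UTF-8 fallback is an arbitrary choice, and pages that declare their encoding only in an HTML `<meta>` tag are not handled.

```python
# -*- coding: utf-8 -*-
import re


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header value, or `default`."""
    match = re.search(r"charset\s*=\s*[\"']?([\w.:-]+)", content_type or "", re.I)
    return match.group(1).lower() if match else default


def decode_body(raw_bytes, content_type):
    """Decode a response body using the charset declared in its header."""
    encoding = charset_from_content_type(content_type)
    # "replace" substitutes undecodable bytes instead of raising an error.
    return raw_bytes.decode(encoding, "replace")
```

With `urllib2`, the header value would typically come from `response.info().getheader("Content-Type")` and the body from `response.read()`. If the header names an encoding Python does not know, `decode` raises `LookupError`, so real code may want to catch that and fall back to a default.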
