"how to store arabic text in mysql database using python?" Code Answer

A few things are worth clarifying first, because they will help you in the future as well.

txt = u'arabic (\u0627\u0644\u0637\u064a\u0631\u0627\u0646)'

This is not an Arabic string. It is a unicode object made of Unicode code points. If you simply print it, and your terminal supports Arabic, you get output like this:

>>> txt = u'arabic (\u0627\u0644\u0637\u064a\u0631\u0627\u0646)'
>>> print(txt)
arabic (الطيران)

Now, to get that same output, arabic (الطيران), in your database, you need to encode the string.

Encoding means taking these code points and converting them to bytes so that computers know what to do with them.
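As a small sketch of what that conversion looks like, the snippet below encodes the same string to UTF-8 and decodes it back. The variable names `encoded` and `decoded` are only illustrative.

```python
txt = u'arabic (\u0627\u0644\u0637\u064a\u0631\u0627\u0646)'

# Encoding: code points -> bytes.
encoded = txt.encode('utf-8')

# Decoding with the same codec: bytes -> code points.
decoded = encoded.decode('utf-8')

print(len(txt))        # number of code points
print(len(encoded))    # number of bytes
print(decoded == txt)  # the round trip preserves the text
```

Each Arabic letter takes two bytes in UTF-8, which is why the byte count (23) is larger than the number of code points (16).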

The most common encoding is UTF-8, because it can represent every Unicode character: English, plus all other languages (including Arabic). There are others too; for example, windows-1256 also supports Arabic. Some encodings have no mapping for those numbers (called code points), and when you try to encode with one of them, you'll get an error like this:

>>> print(txt.encode('latin-1'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-14: ordinal not in range(256)

What that is telling you is that some numbers in the unicode object (here, the Arabic letters at positions 8-14) do not exist in the latin-1 table, so the program doesn't know how to convert them to bytes.
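To compare how different codecs handle the same text, here is a minimal sketch. `try_encode` is a hypothetical helper written for this illustration, not part of any library.

```python
txt = u'arabic (\u0627\u0644\u0637\u064a\u0631\u0627\u0646)'

def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        # exc.start / exc.end mark the slice of characters that failed.
        print("%s failed at positions %d-%d" % (codec, exc.start, exc.end - 1))
        return None

for codec in ('utf-8', 'windows-1256', 'latin-1'):
    result = try_encode(txt, codec)
    if result is not None:
        print("%s: %d bytes" % (codec, len(result)))
```

UTF-8 and windows-1256 both succeed, with different byte counts, while latin-1 fails on the Arabic letters.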

Computers store bytes, so whenever you store or transmit information you need to encode and decode it correctly.

This encode/decode discipline is sometimes called the unicode sandwich: everything outside your program is bytes, everything inside is unicode.
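The sandwich idea can be sketched roughly as follows. `clean_field` is a hypothetical function made up for this example, and the whitespace stripping merely stands in for whatever text processing your program does.

```python
def clean_field(raw_bytes, encoding='utf-8'):
    """Decode on the way in, work on text, encode on the way out."""
    # Outside -> inside: decode bytes as soon as they arrive.
    text = raw_bytes.decode(encoding)

    # Inside: operate on unicode text only.
    text = text.strip()

    # Inside -> outside: encode at the last moment.
    return text.encode(encoding)

incoming = u'  \u0627\u0644\u0637\u064a\u0631\u0627\u0646 \n'.encode('utf-8')
outgoing = clean_field(incoming)
print(outgoing.decode('utf-8') == u'\u0627\u0644\u0637\u064a\u0631\u0627\u0646')
```

Keeping the decode and encode steps at the edges means the rest of the code never has to guess which encoding a value is in.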


With that out of the way, you need to encode the data correctly before you send it to your database:

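import MySQLdb

# Parameterized query; the driver fills in the %s placeholders.
q = u"""
    INSERT INTO
       tab1(id, username, text, created_at)
    VALUES (%s, %s, %s, %s)"""

conn = MySQLdb.connect(host="localhost",
                       user='root',
                       password='',
                       db='',
                       charset='utf8',
                       init_command='SET NAMES utf8')
cur = conn.cursor()
# id, user_name, text and date come from your own code.
cur.execute(q, (id.encode('utf-8'),
                user_name.encode('utf-8'),
                text.encode('utf-8'), date))
conn.commit()  # MySQLdb does not autocommit by default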

To confirm that it is being inserted correctly, make sure you are using MySQL from a terminal or application that supports Arabic. Otherwise, even if it is inserted correctly, you will see garbage characters when your program displays it.
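To see why correctly stored text can still look like garbage, here is a small sketch. It assumes the viewer wrongly interprets UTF-8 bytes as latin-1, which is only one of several possible mismatches.

```python
original = u'\u0627\u0644\u0637\u064a\u0631\u0627\u0646'

# What a correct UTF-8 insert stores.
stored = original.encode('utf-8')

# A viewer that assumes latin-1 turns every byte into its own character.
garbled = stored.decode('latin-1')

# The bytes themselves are intact, so the mistake can be reversed.
recovered = garbled.encode('latin-1').decode('utf-8')

print(len(original), len(garbled))  # 7 letters shown as 14 wrong characters
print(garbled == original)
print(recovered == original)
```

In this scenario the data in the database is fine; only the display layer is using the wrong encoding.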

By avs on May 14 2022
