0 votes

Problem :

Please find below my code for your reference.

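import os

for root, dirs, files in os.walk('Path'):
    for file in files:
        if file.endswith('.c'):
            # No encoding is given, so the platform default (cp932 here) is used.
            with open(os.path.join(root, file)) as f:
                for line in f:
                    if 'word' in line:
                        # Body missing in the original post; printing the match is an assumed placeholder.
                        print(os.path.join(root, file), line.rstrip())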

Currently I am getting below error

“UnicodeDecodeError: 'cp932' codec can't decode byte 0xfc in position 6616: illegal multibyte sequence”

I think my files need the Shift JIS encoding. Can I set the encoding once at the start? I have already tried passing it explicitly with open(os.path.join(root, file), 'r', encoding='cp932') as f: but I got the same error.

1 Answer

0 votes

Solution :

You could pass errors='ignore', as given below, but you should first check what encoding your files actually use.

open(os.path.join(root, file),'r', encoding='cp932', errors='ignore')

This does not skip a file completely; it only drops the bytes that cannot be decoded. Maybe only a few files or lines are incorrectly encoded. You can check how many such errors you have by catching the exception and printing the filename.
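The check described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper name find_undecodable_files and its parameters are hypothetical, and 'Path' is the placeholder directory from the question.

```python
import os

def find_undecodable_files(top, suffix='.c', encoding='cp932'):
    """Return (path, error) pairs for files that fail to decode.

    Hypothetical helper: the name and parameters are assumptions
    made for illustration, not part of the original post.
    """
    failures = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(root, name)
            try:
                with open(path, 'r', encoding=encoding) as f:
                    for _ in f:
                        pass  # reading every line forces the decode
            except UnicodeDecodeError as exc:
                failures.append((path, exc))
    return failures

# 'Path' is the placeholder directory used in the question.
for path, exc in find_undecodable_files('Path'):
    print(path, exc)
```

If only a handful of files show up, it is probably safer to fix or re-encode those files than to ignore decode errors everywhere.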


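You can also try io.open as given below. Note that in Python 3 io.open is an alias for the built-in open, so this only makes a difference on Python 2, where the built-in open does not accept an encoding argument: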

io.open(os.path.join(root, file), mode='r', encoding='cp932')
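To check which encoding a file actually uses, as advised at the start of this answer, one option is to try a few candidates on the raw bytes. The sketch below is only an illustration: the function name guess_encoding and the candidate list are assumptions, and a successful decode does not prove that the encoding is the right one.

```python
def guess_encoding(path, candidates=('utf-8', 'cp932', 'euc_jp')):
    """Return the first candidate that decodes the whole file, else None.

    Hypothetical helper: the candidate list is an assumption and
    should be adjusted to the encodings you expect in your files.
    """
    with open(path, 'rb') as f:
        data = f.read()  # read raw bytes so nothing is decoded implicitly
    for enc in candidates:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue
        return enc
    return None
```

The returned name can then be passed as the encoding argument of open for that file.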

One of the above approaches should help you fix the error.
