Problem :

Please find below my code for your reference.

import os

for root, dirs, files in os.walk('Path'):
    for file in files:
        if file.endswith('.c'):
            with open(os.path.join(root, file)) as f:
                for line in f:
                    if 'word' in line:
                        pass  # (handling of the matching line not shown)

Currently I am getting the following error:

“UnicodeDecodeError: 'cp932' codec can't decode byte 0xfc in position 6616: illegal multibyte sequence”

I think my file needs the Shift JIS encoding. Can I set the encoding just once at the start? I have already tried setting it with open(os.path.join(root, file), 'r', encoding='cp932') as f: but got the same error.

1 Answer

Solution :

You could pass errors='ignore', as shown below, but you should still check what encoding your files actually use.

open(os.path.join(root, file),'r', encoding='cp932', errors='ignore')

This does not skip the file entirely; it only drops the bytes that cannot be decoded. Maybe only a few files or lines are incorrectly encoded. You can check how many such errors you have by catching the exception and printing the filename.
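As a rough sketch of that check, the snippet below walks the tree, tries to decode each .c file strictly, and collects the ones that raise UnicodeDecodeError. The function name find_undecodable_files and its parameters are made up for illustration, and 'Path' is the same placeholder as in the question.

```python
import os

def find_undecodable_files(top, suffix='.c', encoding='cp932'):
    """Return (path, error) pairs for files that fail to decode.

    Hypothetical helper, not part of any library.
    """
    failures = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(root, name)
            try:
                with open(path, 'r', encoding=encoding) as f:
                    for _ in f:  # iterating forces the whole file to be decoded
                        pass
            except UnicodeDecodeError as exc:
                failures.append((path, exc))
    return failures

# 'Path' is the placeholder directory from the question.
for path, exc in find_undecodable_files('Path'):
    print(path, exc)
```

The exception message includes the offending byte and its position, which should help you decide whether the file is simply in another encoding or just has a few bad bytes.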


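You can also try io.open from the io library, as given below. Note that in Python 3 io.open is just an alias for the built-in open, so this only makes a difference on Python 2: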

io.open(os.path.join(root, file), mode='r', encoding='cp932')
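To check which encoding a file really uses, as suggested earlier, one simple approach is to read the raw bytes and try a few candidate encodings in turn. This is only a sketch: guess_encoding and the candidate list are assumptions for illustration, and a successful decode does not prove the encoding is the right one (some byte sequences are valid in several encodings).

```python
def guess_encoding(data, candidates=('utf-8', 'cp932', 'euc_jp')):
    """Return the first candidate that decodes data without error, else None.

    Hypothetical helper; the candidate list is an assumption, adjust it
    to the encodings you expect in your project.
    """
    for enc in candidates:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue
        return enc
    return None

# Example with in-memory bytes; for a real file, read it in binary mode
# first, e.g. open(path, 'rb').read().
sample = 'テスト'.encode('cp932')
print(guess_encoding(sample))
```

Put the strictest encoding (usually utf-8) first in the list, since more permissive ones tend to accept bytes that were never meant for them.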

Hopefully one of these approaches helps fix your error.

