
Problem:

I encountered the following error while running "process.py":

python tools/process.py --input_dir data --operation resize --output_dir data2/resize
data/0.jpg -> data2/resize/0.png
Traceback (most recent call last):
  File "tools/process.py", line 235, in <module>
    main()
  File "tools/process.py", line 167, in main
    src = load(src_path)
  File "tools/process.py", line 113, in load
    contents = open(path).read()
  File "/home/user/anaconda3/envs/tensorflow_2/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

What could be causing this error? I am using Python 3.5.2.


2 Answers

0 votes

Solution:

Python is trying to convert a byte array (bytes that it assumes to be a UTF-8-encoded string) to a Unicode string (str). This decoding follows the UTF-8 rules. While doing so, it encounters a byte sequence that is not allowed in UTF-8-encoded strings (namely the 0xff at position 0).

As you did not provide any code that we could look at, we can only guess at the rest.

From the stack trace we can infer that the triggering action was reading from a file (contents = open(path).read()). The input here, data/0.jpg, is an image rather than text, so open it in binary mode, as shown below:

with open(path, 'rb') as f:
    contents = f.read()

The b in the mode specifier passed to open() states that the file must be treated as binary, so contents will remain bytes. No decoding attempt happens this way.
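To see the difference between the two modes, here is a minimal sketch. It assumes the input really is a JPEG (JPEG files begin with the bytes 0xff 0xd8, which matches the 0xff at position 0 in the traceback). The file name and the byte values are made up for illustration; this is not the code from process.py.

```python
import os
import tempfile

# Hypothetical stand-in for data/0.jpg: JPEG files start with 0xff 0xd8.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 8

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "0.jpg")
    with open(path, "wb") as f:
        f.write(fake_jpeg)

    # Text mode: Python tries to decode the bytes as UTF-8 and fails on 0xff.
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode: nothing is decoded, so the bytes come back unchanged.
    with open(path, "rb") as f:
        contents = f.read()

print(type(contents).__name__, "text mode failed:", text_mode_error is not None)
```

If the rest of the script expects a str, it will need its own changes; binary mode only fixes the read itself.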

0 votes

Solution:

Python attempts to convert a byte array (bytes that it assumes to be a UTF-8-encoded string) to a Unicode string (str). This is a decoding step that follows the UTF-8 rules. When it attempts this, it encounters a byte sequence that is not allowed in UTF-8-encoded strings (here, the 0xff at position 0).

Because you did not provide any code we could look at, we can only guess at the rest.

From the stack trace we can assume that the triggering action was reading from a file (contents = open(path).read()). I propose recoding it like this:

with open(path, 'rb') as f:
  contents = f.read()

The b in the mode specifier passed to open() states that the file should be treated as binary, so contents will stay bytes. No decoding attempt will occur this way.

Alternatively, the following strips out (ignores) the undecodable bytes and returns the string without them. Use it only if your requirement is to drop them rather than convert them.

with open(path, encoding="utf8", errors='ignore') as f:
    contents = f.read()

With errors='ignore' you only lose some characters. If you don't care about them (in my case they appeared to be stray characters produced by badly formatted data from the clients connecting to my socket server), this is a simple, direct solution.
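To show what actually gets lost, here is a small sketch comparing errors='ignore' with errors='replace' on some made-up bytes. The sample data is an assumption for illustration only.

```python
# Made-up input: 0xff and 0xfe can never appear in valid UTF-8.
raw = b"\xffhello \xfeworld"

ignored = raw.decode("utf-8", errors="ignore")    # invalid bytes are silently dropped
replaced = raw.decode("utf-8", errors="replace")  # invalid bytes become U+FFFD

print(repr(ignored))
print(repr(replaced))
```

errors='replace' may be the safer choice while debugging, since the replacement characters show where data went missing.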

I had a similar problem and ended up using UTF-16 to decode. My code is below.

with open(path_to_file, 'rb') as f:
    contents = f.read()
# Decode first, then strip: calling rstrip("\n") on bytes raises TypeError in Python 3
contents = contents.decode("utf-16").rstrip("\n")
contents = contents.split("\r\n")

This reads the file contents as raw bytes, decodes them as UTF-16, and then splits the result into lines.
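One hint that a file is UTF-16 is a byte order mark (BOM) at the start: the little-endian BOM is 0xff 0xfe, which also produces "byte 0xff in position 0" when read as UTF-8. Below is a rough sketch of a BOM check. The helper name guess_encoding_from_bom is hypothetical, and this is only a heuristic, since many files have no BOM at all.

```python
import codecs

def guess_encoding_from_bom(data):
    """Return an encoding name if data starts with a known BOM, else None."""
    # UTF-32 is checked first because its LE BOM begins with the UTF-16 LE BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name
    return None

# Made-up sample: UTF-16 LE text with a BOM in front.
sample = codecs.BOM_UTF16_LE + "line one\r\nline two\r\n".encode("utf-16-le")
encoding = guess_encoding_from_bom(sample)
lines = sample.decode(encoding).splitlines()
print(encoding, lines)
```

A JPEG also starts with 0xff, but its second byte is 0xd8, so this check returns None for it.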

Use only

base64.b64decode(a) 

instead of

base64.b64decode(a).decode('utf-8')
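A short sketch of why the extra .decode('utf-8') can fail: base64.b64decode() already returns bytes, and if the payload is binary (an image, say) those bytes are not valid UTF-8. The payload below is made up for illustration.

```python
import base64

# Made-up binary payload (the first bytes of a JPEG), base64-encoded.
encoded = base64.b64encode(b"\xff\xd8\xff\xe0")

raw = base64.b64decode(encoded)  # already bytes; no text decoding involved

# Decoding as UTF-8 only makes sense if the payload was text to begin with.
try:
    raw.decode("utf-8")
    decodes_as_text = True
except UnicodeDecodeError:
    decodes_as_text = False

print(raw, decodes_as_text)
```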

Check the path of the file being read. My code kept raising errors until I changed the path to the current working directory. The error was:

newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

It simply means that the wrong encoding was chosen to read the file.

On Mac, use file -I file.txt to find the actual encoding. On Linux, use file -i file.txt.

You have to set the encoding to latin1 to read this file, as it contains a few special characters. Use the code snippet below to read the file:

import pandas as pd

data=pd.read_csv("C:\\Users\\akashkumar\\Downloads\\Customers.csv",encoding='latin1')

print(data.head())
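A caveat on latin1, as a small sketch: it maps every byte value 0 to 255 to a character, so decoding never raises an error, but if the file is really in another encoding you silently get the wrong characters. The sample strings are made up.

```python
# latin-1 assigns a character to every possible byte, so decoding never fails.
all_bytes = bytes(range(256))
as_latin1 = all_bytes.decode("latin-1")
print(len(as_latin1))

# But if the bytes were really UTF-8, the result is wrong text, not an error.
utf8_bytes = "é".encode("utf-8")        # two bytes: 0xc3 0xa9
misread = utf8_bytes.decode("latin-1")  # two characters instead of one
print(repr(misread))
```

So latin1 is a reasonable fallback when you know the file uses it, but the lack of an error does not prove the encoding is right.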

The exact error is:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

You can't fix this by meddling with the code yourself. It is a bug which, IMO, should be quite simple to fix from the developer's side (change the encoding of the file). Currently, the only way to remove the package is by force, which I don't recommend in any case.

I see that /usr/share/ubuntu-drivers-common/quirks/put_your_quirks_here appears to be a dummy file, and possibly the cause of the problem. Check with file /usr/share/ubuntu-drivers-common/quirks/* whether any of the files there are not UTF-8, like this:

$ file /mnt/usr/share/ubuntu-drivers-common/quirks/*
/mnt/usr/share/ubuntu-drivers-common/quirks/dell_latitude:        ASCII text
/mnt/usr/share/ubuntu-drivers-common/quirks/lenovo_thinkpad:      ASCII text
/mnt/usr/share/ubuntu-drivers-common/quirks/put_your_quirks_here: empty

If any of those files are not ASCII text, consider removing them all, then try to remove the package again.
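The same check can be done from Python if the file command is not at hand. This is only a sketch: the helper name find_non_utf8_files is hypothetical, and the demo runs on a temporary directory with made-up files rather than on the real quirks directory.

```python
import os
import tempfile

def find_non_utf8_files(directory):
    """Return the paths of regular files in directory that are not valid UTF-8."""
    bad = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            data = f.read()
        try:
            data.decode("utf-8")  # ASCII and empty files pass this check too
        except UnicodeDecodeError:
            bad.append(path)
    return bad

# Demo on made-up files; point the function at the real directory instead.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "ok_file"), "wb") as f:
        f.write(b"ASCII text\n")
    with open(os.path.join(tmp, "bad_file"), "wb") as f:
        f.write(b"\xff\xfe\x00")
    with open(os.path.join(tmp, "empty_file"), "wb"):
        pass
    bad_names = [os.path.basename(p) for p in find_non_utf8_files(tmp)]

print(bad_names)
```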
