Solution:
Python tries to convert a byte array (a bytes object which it assumes to be a UTF-8-encoded string) to a Unicode string (str). This process is, of course, decoding according to UTF-8 rules. When it attempts this, it encounters a byte sequence that is not allowed in UTF-8-encoded strings (namely this 0xff at position 0).
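The failure can be reproduced without any file. A minimal sketch, assuming made-up input bytes (0xff can never start a UTF-8 sequence, so decoding stops at the very first byte):

```python
# Hypothetical data: 0xff 0xfe followed by "h" and "i" as UTF-16-LE code units.
data = b"\xff\xfeh\x00i\x00"

try:
    data.decode("utf-8")
    message = "decoded without error"
except UnicodeDecodeError as exc:
    # exc.start is the offset of the offending byte, exc.reason says why it failed
    message = f"{exc.encoding}: byte {data[exc.start]:#x} at position {exc.start}: {exc.reason}"

print(message)
```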
Since you did not provide any code we could look at, we can only guess at the rest.
From the stack trace we can assume that the triggering action was reading from a file (contents = open(path).read()). I suggest recoding this along these lines:
with open(path, 'rb') as f:
    contents = f.read()
The b in the mode specifier in open() states that the file shall be treated as binary, so contents will remain a bytes object. No decoding attempt will occur this way.
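To see that binary mode really skips decoding, here is a small sketch that writes a few non-UTF-8 bytes to a throwaway temporary file (the bytes and the temporary file are assumptions for illustration) and reads them back:

```python
import os
import tempfile

# Write a few bytes that are not valid UTF-8 to a throwaway file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"\xff\xfe\x00\x01")

with open(path, "rb") as f:
    contents = f.read()  # bytes are returned as-is, no decoding happens

os.remove(path)
print(type(contents), contents)
```

If text is needed later, contents can still be decoded explicitly once the real encoding is known.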
Use this solution: it will strip out (ignore) the undecodable characters and return the string without them. Only use it if your requirement is to strip them, not to convert them.
with open(path, encoding="utf8", errors='ignore') as f:
    contents = f.read()
Use errors='ignore'. You'll only lose some characters. However, if you don't care about them, as they appear to be extra characters created by the bad formatting and programming of the clients connecting to my socket server, then it's a simple, direct solution.
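What gets lost can be seen on a small made-up byte string. As a sketch, this compares errors='ignore' with errors='replace', which keeps a visible U+FFFD marker wherever a byte was dropped:

```python
raw = b"\xffhello \xfeworld"  # made-up bytes; 0xff and 0xfe are never valid in UTF-8

ignored = raw.decode("utf-8", errors="ignore")    # invalid bytes silently dropped
replaced = raw.decode("utf-8", errors="replace")  # invalid bytes become U+FFFD

print(repr(ignored))
print(repr(replaced))
```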
I had a problem similar to this and ended up using UTF-16 to decode. My code is below.
with open(path_to_file, 'rb') as f:
    contents = f.read()
# Decode first, then strip: rstrip("\n") on a bytes object raises TypeError in Python 3
contents = contents.decode("utf-16").rstrip("\r\n")
contents = contents.split("\r\n")
This reads the file contents in as raw bytes, which in my case were UTF-16 encoded. From there they are decoded and separated into lines.
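A 0xff byte at position 0 is often the first byte of a UTF-16 little-endian byte order mark (FF FE), which may be why UTF-16 worked here. A minimal sketch for checking this, where sniff_bom is a hypothetical helper name and the sample data is made up:

```python
import codecs

def sniff_bom(raw: bytes):
    """Return a codec name if raw starts with a known BOM, else None."""
    # UTF-32 is checked before UTF-16 because the UTF-32-LE BOM also starts with FF FE.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if raw.startswith(bom):
            return name
    return None

sample = "a\r\nb\r\n".encode("utf-16")  # the "utf-16" codec prepends a BOM when encoding
encoding = sniff_bom(sample)
lines = sample.decode(encoding).splitlines()
print(encoding, lines)
```

A file without a BOM returns None here, so this only covers the case where the writer included one.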
Use only
base64.b64decode(a)
instead of
base64.b64decode(a).decode('utf-8')
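The reason is that base64.b64decode returns bytes, and those bytes are only text if the encoded payload was text to begin with. A short sketch with a made-up binary payload:

```python
import base64

# Hypothetical payload: four bytes resembling the start of a JPEG file, not text.
a = base64.b64encode(b"\xff\xd8\xff\xe0")

payload = base64.b64decode(a)  # bytes, safe for any payload
try:
    payload.decode("utf-8")
    is_text = True
except UnicodeDecodeError:
    is_text = False  # binary data: keep working with the bytes

print(payload, is_text)
```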
Check the path of the file to be read. My code kept giving me errors until I changed the path name to the present working directory. The error was:
newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
It simply means that the wrong encoding was chosen to read the file.
On Mac, use file -I file.txt to find the correct encoding. On Linux, use file -i file.txt.
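If the file command is not available, a rough substitute is to try a few candidate encodings in order. This is only a sketch: guess_encoding is a hypothetical helper, the candidate list is an assumption, and a successful decode does not prove the guess is right.

```python
def guess_encoding(raw: bytes, candidates=("utf-8", "utf-16")):
    """Return the first candidate that decodes raw without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return None

utf16_guess = guess_encoding("héllo".encode("utf-16"))
utf8_guess = guess_encoding("héllo".encode("utf-8"))
no_guess = guess_encoding(b"\xff")  # a lone 0xff is valid in neither candidate
print(utf16_guess, utf8_guess, no_guess)
```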
You have to use latin1 as the encoding to read this file, as there are a few special characters in it. Use the code snippet below to read the file:
import pandas as pd
data=pd.read_csv("C:\\Users\\akashkumar\\Downloads\\Customers.csv",encoding='latin1')
print(data.head())
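One caveat: latin1 maps every possible byte to a character, so it never raises, even when it is the wrong encoding. A small sketch with a made-up string shows what a wrong guess looks like:

```python
raw = "café".encode("utf-8")  # b'caf\xc3\xa9'

as_latin1 = raw.decode("latin1")  # succeeds, but the two-byte é becomes two characters
as_utf8 = raw.decode("utf-8")     # the intended text

# All 256 byte values decode under latin1, so an error is never raised.
all_bytes = bytes(range(256)).decode("latin1")

print(as_latin1, as_utf8, len(all_bytes))
```

So it is worth looking at the special characters in the loaded data to confirm they came out as expected.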
The exact error is here:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
You can't solve it in the sense of meddling with the code. It is a bug which IMO would be quite simple to solve from the developer's perspective (change the encoding of the file). Currently, the only way to remove the package is forcefully, which I don't recommend in any case.
I see that /usr/share/ubuntu-drivers-common/quirks/put_your_quirks_here appears to be a dummy file, and possibly the cause of the problems. You should check with file /usr/share/ubuntu-drivers-common/quirks/* whether any of these files are not UTF-8, like this:
$ file /mnt/usr/share/ubuntu-drivers-common/quirks/*
/mnt/usr/share/ubuntu-drivers-common/quirks/dell_latitude: ASCII text
/mnt/usr/share/ubuntu-drivers-common/quirks/lenovo_thinkpad: ASCII text
/mnt/usr/share/ubuntu-drivers-common/quirks/put_your_quirks_here: empty
If those files are not ASCII text, consider removing them all, then attempt to remove the package again.
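The same check can be sketched in Python by trying to decode each file as UTF-8. The helper name non_utf8_files and the throwaway demo directory are assumptions for illustration; on a real system the quirks directory would be passed instead.

```python
import tempfile
from pathlib import Path

def non_utf8_files(directory):
    """Return the names of regular files in directory that are not valid UTF-8."""
    bad = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path.name)
    return bad

# Demo on a throwaway directory with made-up file names and contents.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good_quirk").write_bytes(b"ASCII text\n")
    Path(tmp, "empty_quirk").write_bytes(b"")
    Path(tmp, "bad_quirk").write_bytes(b"\xff\xfe\x00")
    result = non_utf8_files(tmp)

print(result)
```

ASCII and empty files pass, since both are valid UTF-8.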