1 vote
76 views

Problem:

I am getting the error "TypeError: unhashable type: 'slice'" when I run the program below:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd


dataset = pd.read_csv('50_Startups.csv')
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]


from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])

Your solution would be much appreciated.
Thanks


2 Answers

2 votes

Solution:

In your program X is a DataFrame, and you can't index a DataFrame with NumPy-style slice notation such as X[:, 3]. To solve this problem, access the data via iloc or X.values.
Let's try with X.values:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('50_Startups.csv')
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()

# changed this line: index the underlying NumPy array on both sides
X.values[:, 3] = labelencoder_X.fit_transform(X.values[:, 3])

 

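This removes the TypeError. Note that X.values can return a copy of the data (for example when the columns have mixed dtypes), in which case the assignment does not change X itself. Converting to an array when creating X, as in X = dataset.iloc[:, 0:4].values, avoids this.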
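For reference, here is a minimal sketch that reproduces the failing access and the two working alternatives without the CSV file. The small DataFrame and its column names are made up for illustration and only stand in for 50_Startups.csv. The exact exception also depends on the pandas version, so the sketch catches it broadly.

```python
import pandas as pd

# Hypothetical stand-in for 50_Startups.csv (names and values are assumptions)
dataset = pd.DataFrame({
    "R&D Spend": [165349.2, 162597.7, 153441.5],
    "Administration": [136897.8, 151377.6, 101145.6],
    "Marketing Spend": [471784.1, 443898.5, 407934.5],
    "State": ["New York", "California", "Florida"],
    "Profit": [192261.8, 191792.1, 191050.4],
})
X = dataset.iloc[:, 0:4]

# DataFrame[...] treats the key as a column label. The key (slice, 3) is not
# a valid label. Older pandas reports "unhashable type: 'slice'"; newer
# versions may raise a different exception type for the same mistake.
try:
    X[:, 3]
    raised = False
except Exception as exc:
    raised = True
    print(type(exc).__name__, exc)

# Positional access works through .iloc or through the NumPy array
state_by_iloc = X.iloc[:, 3]
state_by_values = X.values[:, 3]
print(list(state_by_values))
```

Both state_by_iloc and state_by_values hold the fourth column; the first is a pandas Series, the second a NumPy array.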

0 votes

Solution:

There are several possible solutions, but they do not all return the same output:

loc selects by label and includes both ends of the slice, while iloc and plain slicing (without an accessor) select by position: the start bound is included and the end bound is excluded (see the pandas docs on selection by position):

test_inputs = pd.DataFrame(np.random.randint(10, size=(28, 7)))

print(test_inputs.loc[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9
20  3  6  7  3  9  7  1
print(test_inputs.iloc[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9

print(test_inputs[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9

Indexing in pandas can be confusing because it looks like list indexing, but it is not. To index by position you need to use .iloc:

print(test_inputs.iloc[100:200, :])

If you don't need column selection, you can omit it:

print(test_inputs.iloc[100:200])

P.S. Using .loc is not what you want here, because it looks up the row index label rather than the row number (the index can hold anything, not only numbers, and need not be unique). A range in .loc finds the rows with index values 100 and 200 and returns the rows between them. On a freshly created DataFrame, .iloc and .loc may give similar results, but relying on .loc in this case is bad practice: it leads to hard-to-trace bugs once the index changes for some reason (for example, after you select a subset of rows, the row number and the index no longer match).
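A small sketch of the situation described above, where row number and index label stop matching after a subset is taken. The frame and the column name "value" are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30, 40, 50, 60]})

# Filtering keeps the original labels: the subset's index is 2, 3, 4, 5
subset = df[df["value"] > 20]

# .iloc counts rows by position: the first two rows of the subset
by_position = subset.iloc[0:2]

# .loc matches index labels and includes the end label: only label 2
# falls between 0 and 2, so a single row comes back
by_label = subset.loc[0:2]

print(by_position["value"].tolist())
print(by_label["value"].tolist())
```

The same slice 0:2 returns two rows through .iloc and one row through .loc, which is why mixing the two up is easy to miss on a fresh DataFrame and hard to trace later.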

X is a DataFrame and can't be accessed through slice notation like X[:, 3]. You should access it through iloc or X.values. However, the way you created X made it a copy, so I'd use values:

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
# dataset = pd.read_csv('50_Startups.csv')

dataset = pd.DataFrame(np.random.rand(10, 10))
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]

# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()

#  I changed this line
X.values[:, 3] = labelencoder_X.fit_transform(X.values[:, 3])

Use .values either when creating the variable X or when encoding, as shown above:

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
# dataset = pd.read_csv('50_Startups.csv')

dataset = pd.DataFrame(np.random.rand(10, 10))
y=dataset.iloc[:, 4].values
X=dataset.iloc[:, 0:4].values
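Converting at creation time is likely the more reliable of the two options. The sketch below, built on a hypothetical two-column frame with mixed dtypes, suggests why: each call to .values on such a frame builds a new array, so a write through it is lost, while a write into an array you keep is preserved.

```python
import pandas as pd

# Hypothetical mixed-dtype frame (floats and strings)
df = pd.DataFrame({"spend": [1.5, 2.5, 3.5], "state": ["NY", "CA", "FL"]})

# With mixed dtypes, .values returns a fresh object array each time,
# so this assignment modifies a temporary and df stays as it was
df.values[:, 1] = [2, 0, 1]
frame_unchanged = df["state"].tolist() == ["NY", "CA", "FL"]

# Converting once and keeping the array makes the write stick
X = df.values
X[:, 1] = [2, 0, 1]
array_changed = list(X[:, 1]) == [2, 0, 1]

print(frame_unchanged, array_changed)
```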

Use .values when creating the matrix X and the vector y:

X=dataset.iloc[:, 0:4].values
y=dataset.iloc[:, 4].values

Using .values when creating the matrix X and the vector y fixes the problem:

y=dataset.iloc[:, 4].values

X=dataset.iloc[:, 0:4].values

To get rid of this error, use .values either when creating the variable X or when encoding, as mentioned above:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# dataset = pd.read_csv('50_Startups.csv')

# random data used here in place of the CSV file
dataset = pd.DataFrame(np.random.rand(10, 10))
y=dataset.iloc[:, 4].values
X=dataset.iloc[:, 0:4].values
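Putting the pieces together, an end-to-end sketch of the encoding step might look like this. The DataFrame is a made-up stand-in for 50_Startups.csv, so its column names and values are assumptions.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for 50_Startups.csv
dataset = pd.DataFrame({
    "R&D Spend": [165349.2, 162597.7, 153441.5, 144372.4],
    "Administration": [136897.8, 151377.6, 101145.6, 118671.9],
    "Marketing Spend": [471784.1, 443898.5, 407934.5, 383199.6],
    "State": ["New York", "California", "Florida", "New York"],
    "Profit": [192261.8, 191792.1, 191050.4, 182902.0],
})

# .values turns the selections into NumPy arrays, which accept [:, 3]
X = dataset.iloc[:, 0:4].values
y = dataset.iloc[:, 4].values

# Encode the categorical State column in place
labelencoder_X = LabelEncoder()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])

print(list(X[:, 3]))             # integer codes for the State column
print(labelencoder_X.classes_)   # original labels, in sorted order
```

LabelEncoder assigns codes in sorted label order, so here California maps to 0, Florida to 1, and New York to 2.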

 

