Problem:

I am new to Python. I want to do a simple nearest neighbors classification, but I keep getting the error below when I run the following code.

ValueError: Found input variables with inconsistent numbers of samples: [489, 1890]
My code snippet is below:
myneigh = KNeighborsClassifier(n_neighbors=3)
myneigh.fit(X_bus, y_bus)

How can I fix the above error?

1 Answer

Solution:

I have run into this error before. It is telling you that X_bus and y_bus do not contain the same number of samples. Revisit your train/test split and make sure you are unpacking its results correctly. You can do it as shown below:

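from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# train_test_split returns X_train, X_test, y_train, y_test, in that order
X_bus, X_test, y_bus, y_test = train_test_split(X, y)
myneigh = KNeighborsClassifier(n_neighbors=3)
myneigh.fit(X_bus, y_bus)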
Unpacking the results in the wrong order will produce the same error: "ValueError: Found input variables with inconsistent numbers of samples".
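
A quick way to catch this before calling fit is to compare the number of rows in both arrays. The sketch below uses NumPy with made-up data that mimics the sizes in the error message; same_number_of_samples is a hypothetical helper name used only for illustration, not part of scikit-learn.

```python
import numpy as np


def same_number_of_samples(X, y):
    """Return True when X and y have the same number of rows (samples)."""
    return np.asarray(X).shape[0] == np.asarray(y).shape[0]


# Made-up arrays (assumption): 489 feature rows, as in the error message.
X_bus = np.zeros((489, 4))
y_mismatched = np.zeros(1890)  # labels from a different subset
y_matching = np.zeros(489)     # labels for the same 489 rows

print(same_number_of_samples(X_bus, y_mismatched))  # False
print(same_number_of_samples(X_bus, y_matching))    # True
```

If the check returns False, the features and labels most likely come from different parts of the split.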

Also note that scikit-learn will not accept a rank-1 array as the feature matrix. If you check the shape attribute of x (x.shape) and it returns something like (30,), where 30 is the number of rows, then x is one-dimensional. It needs to be two-dimensional, i.e. (30, 1).

So to make it work you can try using reshape as shown below:

x = dataset.iloc[:, 0]
# Current pandas Series have no reshape method, so convert to a NumPy array first
x = x.to_numpy().reshape(-1, 1)
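
To see what the reshape does on its own, here is a small NumPy-only sketch. The array contents are made up for illustration; only the shapes matter.

```python
import numpy as np

# Made-up rank-1 array with 30 values (assumption for illustration).
x = np.arange(30)

# -1 lets NumPy infer the number of rows; 1 is the single feature column.
x_2d = x.reshape(-1, 1)

print(x.shape)     # (30,)
print(x_2d.shape)  # (30, 1)
```

The two-dimensional version has one row per sample and one column per feature, which is the layout fit expects for its first argument.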
