Problem:

I am new to Python and want to do a simple nearest-neighbors classification, but I always get the error below when I run the following code.

ValueError: Found input variables with inconsistent numbers of samples: [489, 1890]
My code snippet is below:
myneigh = KNeighborsClassifier(n_neighbors=3)
myneigh.fit(X_bus, y_bus)

How can I fix this error?


1 Answer

Solution:

I have faced this error before. It means that X_bus and y_bus do not contain the same number of samples (here 489 versus 1890). My suggestion is to revisit your train/test split and make sure you are unpacking the result of train_test_split in the correct order. You can do it as shown below:

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# train_test_split returns X_train, X_test, y_train, y_test, in that order
X_bus, X_test, y_bus, y_test = train_test_split(X, y)
myneigh = KNeighborsClassifier(n_neighbors=3)
myneigh.fit(X_bus, y_bus)
Unpacking in the wrong order will again produce the same error: "ValueError: Found input variables with inconsistent numbers of samples"
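
As a rough check, you can compare the row counts before calling fit. The snippet below is only a sketch on made-up toy data (the random X and y are my assumption, not your dataset); it shows the correct unpacking order and how a wrong order ends up pairing arrays of different lengths.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: 40 samples, 2 features, 2 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.array([0, 1] * 20)

# Correct order: X_train, X_test, y_train, y_test
X_bus, X_test, y_bus, y_test = train_test_split(X, y, random_state=0)

# Features and labels must have the same number of rows before fit
assert X_bus.shape[0] == y_bus.shape[0]

myneigh = KNeighborsClassifier(n_neighbors=3)
myneigh.fit(X_bus, y_bus)

# Wrong order: the second name receives the test features, not the training labels
X_a, y_a, X_b, y_b = train_test_split(X, y, random_state=0)
mismatch = X_a.shape[0] != y_a.shape[0]  # lengths differ, so fit(X_a, y_a) would fail
```

If the two shapes differ on your real data, the problem is upstream of the classifier, usually in how X and y were built or split.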

Also note that scikit-learn will not accept a rank-1 (one-dimensional) array as the feature matrix. You can check this by looking at the shape attribute of x as below:

x.shape

If it returns something like (30,), where 30 is the number of rows, the array is one-dimensional; it needs to have the shape (30, 1).

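To make it work, you can reshape the underlying array as shown below (a pandas Series has no reshape method in current pandas versions, so go through .values first):

x = dataset.iloc[:,0]
x = x.values.reshape((len(x),1))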
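
Here is a small sketch of the reshape step on a made-up DataFrame (the dataset and its column names are hypothetical, only there so the snippet runs on its own). It shows two ways to get a two-dimensional feature array from a single column.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with 30 rows; column names are for illustration only
dataset = pd.DataFrame({"feature": np.arange(30, dtype=float),
                        "label": [0, 1] * 15})

x = dataset.iloc[:, 0]      # a Series, one-dimensional
print(x.shape)              # one-dimensional shape

# Option 1: convert to a NumPy array and reshape to a single column
x_2d = x.to_numpy().reshape(-1, 1)

# Option 2: select with a list of columns so pandas keeps two dimensions
x_df = dataset.iloc[:, [0]]

print(x_2d.shape, x_df.shape)
```

Either form can then be passed as the feature matrix; reshape(-1, 1) lets NumPy work out the number of rows for you.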
 
 