Problem :

I have a pandas DataFrame in which some of the predictors are categorical variables coded as 0 and 1 and the rest are numeric. When I fit it with a statsmodels model like the one below:

est = sm.OLS(y, X).fit()

It throws the following error:

Pandas data cast to numpy dtype of object. Check input data with np.asarray(data). 

I tried to convert all of the dtypes of the DataFrame using the code below:


After this, all the dtypes of the DataFrame variables appeared as int32 or int64. But the end of the listing still shows dtype: object, like below:

5516        int32
5523        int32
5525        int32
5531        int32
5533        int32
5542        int32
5562        int32
sex         int64
race        int64
dispstd     int64
age_days    int64
dtype: object

Here 5516, 5523 are variable labels.

Any clue? I just need to build a multiple regression model on several hundred variables. To do that, I concatenated 3 pandas DataFrames into the final DataFrame used to build the model.

1 Answer

Solution :

If X is your DataFrame, use the .astype method to convert it to float when fitting your model, as shown below:

est = sm.OLS(y, X.astype(float)).fit()
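
To see why the cast helps, you can run the check that the error message itself suggests. This is only a minimal sketch on a small made-up frame (the values, and the choice of which column is stored as object, are assumptions for illustration and not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical data: "dispstd" holds integers stored as generic Python objects.
X = pd.DataFrame({
    "sex": [0, 1, 1, 0],
    "age_days": [1200, 3400, 560, 7800],
    "dispstd": pd.Series([1, 0, 1, 1], dtype=object),
})

# A single object column is enough to make the whole array dtype object.
dtype_before = np.asarray(X).dtype

# After the cast every column is float64, so the array is purely numeric.
dtype_after = np.asarray(X.astype(float)).dtype

print(dtype_before, dtype_after)
```

If the cast itself raises an error, some column probably contains values that cannot be read as numbers, and that column is the one to inspect.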


If both y (the dependent variable) and X are taken from the DataFrame, cast both of them, as shown below:

est = sm.OLS(y.astype(float), X.astype(float)).fit()
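
Note that the trailing "dtype: object" in the dtypes listing from the question describes the Series that holds the dtypes, not the data, so it does not by itself point to a problem. A rough way to find which inputs are really stored as object is sketched below; the frame and the y values are hypothetical stand-ins:

```python
import pandas as pd

# Hypothetical stand-ins for the real X and y.
X = pd.DataFrame({
    "5516": pd.Series([0, 1, 0], dtype="int32"),
    "sex": pd.Series([1, 0, 1], dtype="int64"),
    "race": pd.Series([2, 1, 3], dtype=object),
})
y = pd.Series([10.5, 12.0, 9.75], dtype=object)

# X.dtypes is itself a Series of dtype objects, hence "dtype: object"
# at the bottom of its printout even when every column is numeric.
dtypes_listing_dtype = X.dtypes.dtype

# Columns whose own dtype is object are the ones that break the fit.
object_columns = [col for col in X.columns if X[col].dtype == object]

# The dependent variable can trigger the same error, so check it too.
y_is_object = y.dtype == object

print(dtypes_listing_dtype, object_columns, y_is_object)
```

Here the check flags the "race" column and y, which is the case where casting both X and y is needed.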