Problem:

I am getting the following error:

formula(formula, data = data) :

 invalid model formula in ExtractVars

I am using the below code:



# Change the path below from Windows style if you run this on a Linux box:

mydata <- read.csv(file="c:/Users/md79068/downloads/winequality-red.csv")

# To grow the tree

fit <- rpart(YouSweetMan ~ "residual sugar" + "citric acid", method = "class", data = mydata)

Please note that I have changed the delimiters in my CSV file to commas.

My guess is that R is unable to read the data correctly. I am very new to R and to programming in general.

Solution:

Have a look at names(mydata). When read.csv() builds the data frame (it calls read.table() internally), it converts "bad" column names into syntactically valid ones by default, so the spaces in your column names have been changed to periods. Also, a formula must not contain quoted strings; the variables have to be written as bare names. Try the following instead:

fit <- rpart(quality ~ residual.sugar + citric.acid, method = "class", data = mydata)

I don't know what "YouSweetMan" was supposed to be, so I changed it to "quality".

With the column names corrected and the quotes removed from the formula, the ExtractVars error should go away.
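
To see the renaming for yourself, here is a minimal sketch. It assumes a tiny inline CSV with made-up values standing in for winequality-red.csv; only the three column names are taken from the question. It also shows that a quoted string inside a formula triggers an error, and that backticks are the alternative if you would rather keep the original headers.

```r
# Tiny inline CSV standing in for winequality-red.csv (values are made up).
csv_text <- "residual sugar,citric acid,quality
1.9,0.00,5
2.6,0.04,5
2.3,0.56,6"

# Default behaviour: check.names = TRUE turns the spaces into periods.
fixed <- read.csv(text = csv_text)
print(names(fixed))

# A quoted string is not a variable name, so building the model terms fails.
msg <- tryCatch(
  {
    terms(quality ~ "residual sugar")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Alternative: keep the original headers and wrap them in backticks
# (not quotes) when they appear in a formula.
raw <- read.csv(text = csv_text, check.names = FALSE)
f <- quality ~ `residual sugar` + `citric acid`
print(all.vars(f))
```

Running names(mydata) on your own data frame is the quickest way to check which spelling of each column name R actually holds.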

