Recent questions tagged dataframe

0 votes
1 answer 26 views
Problem: I want to count the number of times a word is repeated in the review string. I am reading the CSV file and storing it in a pandas DataFrame using the line below: reviews = pd.read_csv("amazon_baby.csv") The code in the lines below works when I apply it to a single review: print reviews["review"][1] a = reviews["review"][1].split("disappointed") print a b = len(a) print b
asked Feb 23 Mashhoodch 9.9k points
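A minimal sketch of one way to count a word across every review at once, assuming the column is named "review" as in the question. The three rows below are hypothetical stand-ins for amazon_baby.csv. Note that len(text.split(word)) returns one more than the number of occurrences, so the per-row count is computed with Series.str.count instead.

```python
import pandas as pd

# Hypothetical sample rows standing in for the "review" column of amazon_baby.csv.
reviews = pd.DataFrame({
    "review": [
        "Great product, not disappointed at all.",
        "I was disappointed. Very disappointed indeed.",
        None,  # missing reviews are replaced with "" before counting
    ]
})

# str.count treats its argument as a regular expression and counts
# non-overlapping matches in each row.
reviews["disappointed_count"] = (
    reviews["review"].fillna("").str.count("disappointed")
)

# Total number of occurrences across all reviews.
total = int(reviews["disappointed_count"].sum())
```

Because the pattern is a regular expression, a word containing special characters would need re.escape first.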
0 votes
1 answer 17 views
Problem: AttributeError: 'DataFrame' object has no attribute 'sort'
asked Feb 18 charles mathews 3.8k points
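This error usually appears because DataFrame.sort was deprecated and later removed from pandas. A small sketch of the replacements, using a made-up frame (the column names are assumptions):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"name": ["b", "a", "c"], "score": [2, 3, 1]})

# sort_values orders rows by the values of one or more columns.
by_score = df.sort_values(by="score", ascending=False)

# sort_index orders rows by their index labels instead.
by_index = by_score.sort_index()
```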
0 votes
1 answer 9 views
Problem: How to add a column to a pandas DataFrame
asked Jan 24 waji 1.9k points
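Three common ways to add a column, sketched on a hypothetical frame (all column names and values are assumptions):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"price": [10.0, 20.0], "qty": [3, 1]})

# 1. Direct assignment adds (or overwrites) a column in place.
df["total"] = df["price"] * df["qty"]

# 2. assign returns a new DataFrame; a scalar is broadcast to every row.
df = df.assign(currency="USD")

# 3. insert places the new column at a given position, in place.
df.insert(0, "item_id", [101, 102])
```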
0 votes
1 answer 13 views
Looking for help: how do I rename a column in Python?
asked Jan 24 waji 1.9k points
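Assuming the question is about a pandas DataFrame, a minimal sketch with DataFrame.rename (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"old_name": [1, 2], "other": [3, 4]})

# rename takes a mapping of old label -> new label and returns a new
# DataFrame; labels missing from the mapping are left unchanged.
renamed = df.rename(columns={"old_name": "new_name"})
```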
0 votes
1 answer 12 views
Problem: I tried to combine two lists: Names and Age. But I wanted to do this by adding their index [i + 1] to a different list each time. So instead of ['John', '17', 'Mike', '21'], my goal was for each pair to have its own index and be a list item. ... the path in the attached code, I achieve what I am trying to do. Can anyone explain why this works? I couldn't catch it. Thank you in advance.
asked Dec 24, 2020 sasha 13.2k points
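The attached code is not shown in the excerpt, so this is only a sketch of one standard way to pair two lists element by element, using zip. The list names follow the question, and the values are the ones it mentions.

```python
names = ["John", "Mike"]
ages = ["17", "21"]

# zip walks both lists in parallel; each pair becomes its own inner list.
pairs = [[name, age] for name, age in zip(names, ages)]
```

zip stops at the shorter list, so lists of unequal length would silently drop the extra items.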
0 votes
1 answer 12 views
Problem: For every row in a pandas DataFrame, I need to get the cell or cells with the lowest value and then return their row and column labels. I also want to check whether this minimum is less than one. For example: NAMES, Oil, Fat, Salt Salad, 0.2, 0.1, 0.8 ... NAMES').apply(lambda row: [[row.name, l] for l in row[row == row.min()].index], axis=1).values.tolist() Please help me.
asked Dec 24, 2020 sasha 13.2k points
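A sketch of one way to collect every (row, column) pair that holds the row minimum, ties included. The Salad row comes from the question. The Soup row is a hypothetical addition to show a tie.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "NAMES": ["Salad", "Soup"],  # "Soup" is a made-up row with a tie
        "Oil": [0.2, 0.3],
        "Fat": [0.1, 0.3],
        "Salt": [0.8, 0.9],
    }
).set_index("NAMES")

# Minimum value of each row.
row_min = df.min(axis=1)

# Boolean mask: True where a cell equals its own row's minimum.
is_min = df.eq(row_min, axis=0)

# Flatten the mask into [row label, column label] pairs.
min_cells = [
    [name, col]
    for name, row in is_min.iterrows()
    for col in row[row].index
]

# Per-row check of whether the minimum is below one.
below_one = row_min < 1
```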
0 votes
1 answer 12 views
Problem: I want to find a way to change the name of a specific column in a layered dataframe. With this data: data = { ('A', '1', 'I'): [1, 2, 3, 4, 5], ('B', '2', 'II'): [1, 2, 3, 4, 5], ('C', '3', 'I'): [1, 2, 3, 4, 5], ('D', '4', 'II'): [1, 2, 3, 4, 5], ('E', '5', 'III'): ... : Z B C D E 100 2 3 4 5 Z II I II III 0 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 Is this a Pandas bug?
asked Dec 24, 2020 sasha 13.2k points
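The full code and output are truncated above, so the following is only a sketch under the assumption that the goal is to rename a label on one level of the column MultiIndex. Without the level argument, rename applies the mapping to every level, which can change more labels than intended. The two-column frame is a shortened, hypothetical version of the question's data.

```python
import pandas as pd

# Tuple keys become a three-level column MultiIndex.
data = {
    ("A", "1", "I"): [1, 2, 3],
    ("B", "2", "II"): [1, 2, 3],
}
df = pd.DataFrame(data)

# Restrict the mapping to the top level so other levels are untouched.
renamed = df.rename(columns={"A": "Z"}, level=0)
```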
0 votes
1 answer 20 views
Problem: I have the following dataframe: time X Y X_t0 X_tp0 X_t1 X_tp1 X_t2 X_tp2 0 0.002876 0 10 0 NaN NaN NaN NaN NaN 1 0.002986 0 10 0 NaN 0 NaN NaN NaN 2 0.037367 1 10 1 1.000000 0 NaN 0 NaN 3 0.037374 2 10 2 0.500000 1 1.000000 0 ... value too large for dtype('float32'). whenever I try to fit the regression model fit(X_train, y_train). How can we remove both NaN and -inf values at the same time?
asked Dec 24, 2020 sasha 13.2k points
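One common approach, sketched on a small hypothetical frame, is to convert infinities to NaN first and then drop every row that contains a NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data containing both NaN and -inf.
df = pd.DataFrame(
    {
        "X": [0.0, 1.0, 2.0, 3.0],
        "X_t0": [np.nan, 1.0, -np.inf, 0.5],
    }
)

# Turn +inf and -inf into NaN, then drop any row that has a NaN.
clean = df.replace([np.inf, -np.inf], np.nan).dropna()
```

Dropping rows discards data. Whether that or imputation is appropriate depends on the model being fitted.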
0 votes
1 answer 24 views
Problem: I have large json data that reads into a python dataframe and creates a list of dicts for each line. I need to convert it to a different data format. The data format is as follows: { "data": [{ "item": [{ "value": 0, "type": "a" }, { " ... ) df = df_formatted.fillna(0) The number of items in a list is often in the thousands. Are there pointers or examples on how to do this efficiently?
asked Dec 23, 2020 sasha 13.2k points
0 votes
1 answer 13 views
Problem: I have a pandas DataFrame whose column labels I want to replace with corrected ones. I would like to replace the column names in DataFrame A, where the original column names are ['$a', '$b', '$c', '$d', '$e'], with ['a', 'b', 'c', 'd', 'e']. I have the corrected column names saved in a list; please help me with it.
asked Dec 23, 2020 sasha 13.2k points
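Two ways this could be done, sketched with the column names from the question (the single row of values is made up): assign the saved list directly, or strip the leading "$" from the existing labels.

```python
import pandas as pd

old_names = ["$a", "$b", "$c", "$d", "$e"]
new_names = ["a", "b", "c", "d", "e"]

# Option 1: assign the list of corrected names; lengths must match.
A = pd.DataFrame([[1, 2, 3, 4, 5]], columns=old_names)
A.columns = new_names

# Option 2: derive the new names by stripping the leading "$".
B = pd.DataFrame([[1, 2, 3, 4, 5]], columns=old_names)
B.columns = B.columns.str.lstrip("$")
```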
0 votes
1 answer 23 views
Problem: I have the following indexed DataFrame with named columns and row labels that are not contiguous numbers: a b c d 2 0.671399 0.101208 -0.181532 0.241273 3 0.446172 -0.243316 0.051767 1.577318 5 0.614758 0.075793 -0.451460 -0.012493 I would like to ... different versions join, append, merge but I did not get the desired result, mostly only errors. How do I add a column e to the above example?
asked Dec 23, 2020 sasha 13.2k points
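A sketch using the frame from the question. The values for column e are hypothetical. The point illustrated is that a plain list is assigned by position, while a Series is aligned on index labels, so a Series with the default 0..n-1 index mostly produces NaN here.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [0.671399, 0.446172, 0.614758],
        "b": [0.101208, -0.243316, 0.075793],
        "c": [-0.181532, 0.051767, -0.451460],
        "d": [0.241273, 1.577318, -0.012493],
    },
    index=[2, 3, 5],
)

# A list (or NumPy array) is assigned positionally, ignoring index labels.
df["e"] = [10, 20, 30]  # hypothetical values

# A Series is aligned on labels: only label 2 exists in both indexes,
# so rows 3 and 5 receive NaN.
df["e_bad"] = pd.Series([10, 20, 30])

# Giving the Series the frame's own index avoids the misalignment.
df["e_aligned"] = pd.Series([10, 20, 30], index=df.index)
```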
0 votes
1 answer 37 views
0 votes
1 answer 428 views
Here's what I've got. I have two data frames. One is a set of financial data that already exists in the system and another set that has some that may or may not exist in the system. I need to find the difference and add the ... pandas\core\frame.py", line 3571, in _compare_frame raise ValueError('Can only compare identically-labeled ' ValueError: Can only compare identically-labeled DataFrame
asked Oct 28, 2020 psandprop 2.4k points
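The ValueError in the traceback is raised when two DataFrames with different labels are compared element-wise. A sketch of an alternative that does not need matching labels is a left merge with indicator=True, which marks the rows found only in the incoming set. The frames and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical "already in the system" and "incoming" data.
existing = pd.DataFrame({"account": [1, 2, 3], "amount": [100, 200, 300]})
incoming = pd.DataFrame({"account": [2, 3, 4], "amount": [200, 350, 400]})

# indicator=True adds a "_merge" column saying where each row was found.
merged = incoming.merge(
    existing, on=["account", "amount"], how="left", indicator=True
)

# Rows present only in the incoming data are the difference.
new_rows = merged.loc[merged["_merge"] == "left_only", ["account", "amount"]]

# Append the difference to the existing data.
updated = pd.concat([existing, new_rows], ignore_index=True)
```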
0 votes
1 answer 31 views
0 votes
1 answer 28 views
Problem: I am trying to update selected datetime64 values in a pandas data frame using the loc method to select rows satisfying a condition. However, instead of assigning the new date-time value, it results in NaT. Here is a simplification of my code that shows the problem: ... as the second element in the new_date column. Any ideas on how this should be done or why this is not working as intended?
asked Sep 15, 2020 Marivoke 530 points
0 votes
0 answers 48 views
I have a dataframe of almost 120000 records as follows. I also have a MongoDB collection that looks exactly the same as the dataframe below: ItemID ParentID ItemRating ItemPrice Qty A1 ItemA1 0 12 100 A2 ItemA2 0 15 200 B1 ItemB1 0 20 300 B2 ItemB2 0 25 400 B3 ItemB3 0 30 ... PyMongo update_many method by setting upsert=true, but I am not sure how to do that. How should I write my filter condition?
asked Sep 13, 2020 NguyenTram 1k points
0 votes
1 answer 114 views
0 votes
1 answer 10 views
The process cannot access the file 'file path' because it is being used by another process. What does this mean, and what can I do about it?
asked Aug 28, 2020 Aliza313 720 points
0 votes
1 answer 1.8K views
Problem: I have only fundamental knowledge of Python, pandas and dataframes. I have tried to write the code below: df = pd.DataFrame(np.random.rand(12,2), columns=['Apples', 'Oranges'] ) df['Categories'] = pd.Series(list('AAAABBBBCCCC')) pd.options.display. ... I would be more than glad to get more ideas on why the above error is occurring and how to fix it.
asked Aug 24, 2020 Raphael Pacheco 4.9k points