How do I identify what is causing error with .idxmax function in pandas? | Sololearn: Learn to code for FREE!
+ 2

How do I identify what is causing error with .idxmax function in pandas?

The error message suggests that non-numeric data is causing a problem when .idxmax is applied to the dataframe. Only one column is non-numeric (day of the week); the other 10 columns are numeric, and the index is non-numeric because it uses dates. I have used ChatGPT and Google search to reach the understanding above, but I can't get it to work. I know the function works, because I tested it on sample data to find the highest value in each column, identified by row index name: DataFrame.idxmax(axis=0). In theory it should work and my code looks correct, but I don't know how to identify where I went wrong in generating my dataframe. That is where I need help: identifying the fault in the data types of the dataframe. Any advice welcome.

3rd Apr 2023, 10:21 PM
Rob Newman
5 Answers
+ 3
Problem solved: I was overthinking the error message, and the answer was essentially a slice selection. Of the 11 columns in the dataframe, I only needed to select one column and call .idxmax() on it. The task was to retrieve the row index ID of the item with the highest value in the named column. I had been confused by an example that pulled back the max values for several columns of a dataframe at once, and the older version of the pandas library I was using wouldn't accept the numeric_only toggle. What I hadn't understood was that selecting a single column from an indexed dataframe returns a Series that always carries the index along with it. So the answer was as simple as: df['Avg change%'].idxmax() Note: 'Avg change%' is the data column of interest in the dataframe. Another, non-numeric data column caused the original error. Thank you for the advice, my understanding is improving!
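For anyone landing here later, a minimal sketch of that single-column approach. The data, the 'Day' column, and the dates are made up for illustration; only the column name 'Avg change%' comes from the post above.

```python
import pandas as pd

# Hypothetical sample data: one non-numeric column plus the column of interest,
# indexed by date (assumed here to be a DatetimeIndex).
df = pd.DataFrame(
    {
        "Day": ["Mon", "Tue", "Wed"],
        "Avg change%": [0.4, 1.7, -0.2],
    },
    index=pd.to_datetime(["2023-04-03", "2023-04-04", "2023-04-05"]),
)

# Selecting one column gives a Series that keeps the dataframe's index,
# so idxmax() returns the index label of the row holding the largest value.
best_day = df["Avg change%"].idxmax()
print(best_day)
```

Because the non-numeric 'Day' column is never passed to idxmax(), the dtype problem from the original question does not arise.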
4th Apr 2023, 1:52 PM
Rob Newman
+ 6
According to the latest pandas documentation, this function has a parameter to exclude non-numeric columns. Check which version of pandas you have installed to make sure it is available. You could also exclude the non-numeric column by applying the function to only a slice of the dataframe, e.g. df[['A', 'B']].

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.idxmax.html

numeric_only: bool, default False. Include only float, int or boolean data. New in version 1.5.0.

In any case, to diagnose "where you messed up in generating your dataframe", one would have to see your actual code. There are no miraculous psychics here who can read your mind.
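A rough sketch of the options mentioned above: checking the installed version, slicing, and the numeric_only parameter. The dataframe and its column names 'Day', 'A' and 'B' are invented for illustration, and select_dtypes is added here as an alternative that should also work on pandas versions older than 1.5.0.

```python
import pandas as pd

# Check the installed version; numeric_only for idxmax needs pandas 1.5.0 or newer.
print(pd.__version__)

# Hypothetical data: one non-numeric column and two numeric ones.
df = pd.DataFrame(
    {
        "Day": ["Mon", "Tue", "Wed"],
        "A": [1, 5, 3],
        "B": [9.0, 2.0, 4.0],
    },
    index=["d1", "d2", "d3"],
)

# Option 1: apply idxmax to an explicit slice of numeric columns.
by_slice = df[["A", "B"]].idxmax(axis=0)

# Option 2: let pandas pick the numeric columns by dtype.
by_dtype = df.select_dtypes(include="number").idxmax(axis=0)

# Option 3: the numeric_only parameter, falling back if this pandas is too old
# to recognise the keyword.
try:
    by_flag = df.idxmax(axis=0, numeric_only=True)
except TypeError:
    by_flag = by_dtype

print(by_slice)
```

All three should give the same index label per column; the slice is the most explicit, while select_dtypes avoids typing out column names when there are many of them.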
4th Apr 2023, 4:34 AM
Tibor Santa
+ 4
I'll try applying it to a slice that excludes the non-numeric column. Thanks, Tibor Santa.
4th Apr 2023, 6:26 AM
Rob Newman
+ 3
Thanks, Tibor Santa. I did try the numeric_only switch, but the error message said my version of pandas does not recognise it. I appreciate you're not psychic; I wondered whether there are any commands I can run to check key attributes of the dataframe and help identify what may be wrong with it. The code is in a Jupyter notebook, spread over various steps. I'll try to get it copied out together. It reads a CSV data file, renames some column headers, adds some columns, filters the dataframe, and makes a subset copy of the dataframe, on which it then runs the idxmax function. I couldn't see anything wrong in my steps, and the AI system confirmed that my code and answers were correct. I will ask the course tutor for advice today and share back what went wrong.
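On the question of which commands can inspect a dataframe's key attributes, here is one possible set of checks. The data is invented (including a 'Close' column that was read in as text because of a stray "n/a"), so treat it as a sketch rather than a diagnosis of the actual notebook.

```python
import pandas as pd

# Hypothetical dataframe resembling a CSV import where one value spoiled a column.
df = pd.DataFrame(
    {
        "Day": ["Mon", "Tue", "Wed"],
        "Open": [1.0, 2.5, 2.0],
        "Close": ["1.2", "n/a", "2.1"],  # stored as strings, not floats
    }
)

# 1. Show the dtype of every column; text columns stand out from float64/int64.
print(df.dtypes)

# 2. List the columns that pandas does not consider numeric.
non_numeric = df.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# 3. For a column that should be numeric, find the values that fail to convert.
converted = pd.to_numeric(df["Close"], errors="coerce")
bad_rows = df.loc[converted.isna() & df["Close"].notna(), "Close"]
print(bad_rows)
```

df.info() prints a similar per-column summary along with non-null counts, which can also reveal columns with missing values.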
4th Apr 2023, 6:24 AM
Rob Newman
0
Hello dear Rob. Would you mind checking your messages?
19th Sep 2023, 5:41 AM
Azade Asghari