COVID Data Analysis - from course data science (python) | Sololearn: Learn to code for FREE!

+1

COVID Data Analysis - from course data science (python)

Has anyone passed this project?
"You are working with the COVID dataset for California, which includes the number of cases and deaths for each day of 2020. Find the day when the deaths/cases ratio was largest. To do this, you need to first calculate the deaths/cases ratio and add it as a column to the DataFrame with the name 'ratio', then find the row that corresponds to the largest value."
Link to this project: https://www.sololearn.com/learning/eom-project/1161/1162
I used the code below:
print(df[df['ratio'] == df['ratio'].max()])
and got the following output:
          cases  deaths     ratio
date
10.03.20      7       1  0.142857
Unfortunately, test case 1 does not pass. Could anyone help me with this?

3/17/2021 4:46:10 PM

Thoi Nhan NGO

6 Answers


+2

Thanks, Derrickee, for your answer. Actually, your code is the same as mine, except for one redundant ' before the second df. I found out what was wrong with this project: do not change the link to the data source (the CSV file), or you will get the wrong answer. At first I thought the link was wrong, so I replaced it with the one used earlier in the course for the California COVID data. That let me test the result in a Python IDE, but it was in fact the cause of the whole problem.

+14

import pandas as pd
df = pd.read_csv("/usercode/files/ca-covid.csv")
df.drop('state', axis=1, inplace=True)
df.set_index('date', inplace=True)
df["ratio"] = df["deaths"] / df["cases"]
max_ratio = df.loc[df["ratio"] == df["ratio"].max()]
print(max_ratio)

+5

df['ratio'] = df['deaths'] / df['cases']
print (df['df['ratio'] == df['ratio'].max()])
Read the question carefully 😁

+3

""" sorry but why wouldn't this work? can someone explain? df['ratio'] = df['deaths'] / df['cases'] print (df['df['ratio'].max()]) """ Because df['ratio'].max() returns only max value of ratio so it looks like this: df[max_value_of_ratio] so it is mistake. When you want to get row from df you should equal it to the whole column. I did it with .loc method: df["ratio"] = df["deaths"] / df["cases"] max_ratio = df.loc[df["ratio"] == df["ratio"].max()] print(max_ratio)

0

Sorry, but why wouldn't this work? Can someone explain?
df['ratio'] = df['deaths'] / df['cases']
print (df['df['ratio'].max()])

0

Could anyone point out what the error is here?
import pandas as pd
df = pd.read_csv("/usercode/files/ca-covid.csv")
df.drop('state', axis=1, inplace=True)
df.set_index('date', inplace=True)
ratio=(deaths/cases)*100
df.insert('ratio')
max_ratio=df.groupby('ratio')['ratio'].max()
print(max_ratio)
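One possible reading of what this snippet is trying to do, sketched with a made-up DataFrame in place of /usercode/files/ca-covid.csv (the rows and the "California" values are illustrative assumptions, not the real data). It keeps the insert call but gives it the arguments it needs, and drops the multiplication by 100 because the task asks for a plain ratio.

```python
import pandas as pd

# Made-up rows standing in for the course CSV (illustrative values only).
df = pd.DataFrame(
    {
        "date": ["10.03.20", "11.03.20", "12.03.20"],
        "state": ["California"] * 3,
        "cases": [7, 20, 50],
        "deaths": [1, 1, 2],
    }
)
df.drop("state", axis=1, inplace=True)
df.set_index("date", inplace=True)

# Columns are reached through the DataFrame; bare `deaths` and `cases` are undefined names.
ratio = df["deaths"] / df["cases"]

# DataFrame.insert needs a position, a column name, and the values.
df.insert(len(df.columns), "ratio", ratio)

# No groupby is needed: compare the column with its maximum to select the row.
max_ratio = df.loc[df["ratio"] == df["ratio"].max()]
print(max_ratio)
```

Assigning with df["ratio"] = ratio, as in the other answers, would do the same job as the insert call.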