+6

# Python for Data Science - House Prices

You are given an array that represents house prices. Calculate and output the percentage of houses that are within one standard deviation from the mean. To calculate the percentage, divide the number of houses that satisfy the condition by the total number of houses, and multiply the result by 100. I'm stuck on this question, can anyone help out? This is my code: https://code.sololearn.com/cA255a72a21A

3/22/2021 1:18:49 PM

Lai Kai Yong

-1

You need to output the percentage of the data rather than the count: print(100*count/data.size)

+21

My code:

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    m = np.mean(data)
    s = np.std(data)
    low = m - s
    high = m + s
    print(len(data[(low < data) & (data < high)]) / len(data) * 100)

+7

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    mean = np.mean(data)
    std = np.std(data)
    low = mean - std
    high = mean + std

    count = 0
    for i in data:
        if low < i < high:
            count += 1

    result = (count / len(data)) * 100
    print(result)

+4

Hi! Here's my answer:

    m = np.mean(data)
    d = np.std(data)
    y1 = m - d
    y2 = m + d
    s = len(data[(data > y1) & (data < y2)])
    r = (s / len(data)) * 100
    print(r)

I hope it helped!

+4

One-line answer ✌️

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    print(len([i for i in data if i > (np.mean(data) - np.std(data)) and i < (np.mean(data) + np.std(data))]) / len(data) * 100)

+3

A short and simple answer for the problem:

    mean = np.mean(data)
    std = np.std(data)
    x = data[(data <= mean + std) & (data >= mean - std)]
    print(x.size / data.size * 100)

+1

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    mean = np.mean(data)
    std = np.std(data)
    result = data[np.logical_and(data <= mean + std, data >= mean - std)]
    print(result.size / data.size * 100)

+1

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    strd = np.std(data)
    means = np.mean(data)
    alt = means - strd
    ust = means + strd
    a = (data > alt) & (data < ust)
    print((len(data[a]) / len(data)) * 100)

0

You must compute the mean of your data, then compute the standard deviation, and finally count how many values fall in the range from mean - deviation to mean + deviation...
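The three steps described above can also be written without an explicit loop. This is a minimal sketch, assuming the same `data` array used elsewhere in this thread; the names `within_one_std` and `percentage` are illustrative only. It relies on the fact that the mean of a boolean array is the fraction of `True` entries, so no separate count is needed.

```python
import numpy as np

data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000,
                 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

mean = np.mean(data)
std = np.std(data)  # population standard deviation (NumPy's default, ddof=0)

# Boolean mask: True where a price lies strictly within one std of the mean
within_one_std = (data > mean - std) & (data < mean + std)

# The mean of a boolean array is the fraction of True entries
percentage = within_one_std.mean() * 100
print(percentage)  # 62.5
```

Printing `percentage` rather than the raw count is what turns the result into the value the exercise asks for.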

0

@visph, I'm still confused. I modified my code and it now produces an output, but the result is still wrong.

0

    mean = np.mean(data)
    std = np.std(data)
    low = mean - std
    high = mean + std

    count = 0
    for i in data:
        if low < i < high:
            count += 1

    print(count)

0

Lai Kai Yong, check my previous (edited) post again: np.std should get data as its argument ^^
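To illustrate the point about the argument: `np.std` is called with the array itself, and by default it computes the population standard deviation (`ddof=0`). The sketch below, using the thread's `data` array, also shows the sample version (`ddof=1`) for comparison. Which one the exercise expects is not stated in this thread; the answers here all use the default. The variable names are illustrative.

```python
import numpy as np

data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000,
                 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

# Pass the array as the first argument; ddof defaults to 0 (divide by n)
population_std = np.std(data)

# ddof=1 divides by n - 1 instead, giving the sample standard deviation
sample_std = np.std(data, ddof=1)

print(population_std, sample_std)
```

The sample value is always slightly larger than the population value, so the choice can matter when a data point sits close to one of the bounds.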

0

@visph, it does not work...

0

Bruno, it hasn't helped.

0

Here is the solution: https://code.sololearn.com/cG4jST4TYGm5/?ref=app

0

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    mean = np.mean(data)
    std = np.std(data)
    low = mean - std
    high = mean + std

    count = 0
    for i in data:
        if low < i < high:
            count += 1

    result = (count / len(data)) * 100
    print(result)

0

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    mean = np.mean(data)
    std = np.std(data)
    a = mean + std
    b = mean - std
    c = data[(data <= a) & (data >= b)]
    print(c.size / data.size * 100)

0

My way:

    import numpy as np

    data = np.array([150000, 125000, 320000, 540000, 200000, 120000, 160000, 230000, 280000, 290000, 300000, 500000, 420000, 100000, 150000, 280000])

    std = np.std(data)
    mean = np.mean(data)
    total = np.size(data)
    left = mean - std
    right = mean + std
    z = (data > left) & (data < right)
    y = data[z]
    k = np.size(y)
    res = (k / total) * 100
    print(res)

0

COVID Data Analysis

You are working with the COVID dataset for California, which includes the number of cases and deaths for each day of 2020. Find the day when the deaths/cases ratio was largest. To do this, you need to first calculate the deaths/cases ratio and add it as a column to the DataFrame with the name 'ratio', then find the row that corresponds to the largest value.
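A rough sketch of the two steps in that task, using pandas. The real dataset is not shown in this thread, so the three-row DataFrame, the column names `cases` and `deaths`, and the date index below are all made-up stand-ins; only the column name `ratio` comes from the task description.

```python
import pandas as pd

# Hypothetical stand-in for the California dataset (values invented for illustration)
df = pd.DataFrame(
    {"cases": [100, 200, 400], "deaths": [1, 10, 8]},
    index=pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"]),
)

# Step 1: add the deaths/cases ratio as a new column named 'ratio'
df["ratio"] = df["deaths"] / df["cases"]

# Step 2: keep the row(s) whose ratio equals the largest ratio
worst = df[df["ratio"] == df["ratio"].max()]
print(worst)
```

Filtering on equality with the maximum returns a DataFrame, so every row that ties for the largest ratio is kept.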