Dataframe group by a specific column, aggregate ratio of some other column?





I have a DataFrame with two columns, Year and Min Delay. Sample rows are as follows:


Year  Min Delay
2014  0
2014  2
2014  0
2014  4
2015  4
2015  4
2015  2
2015  2



I want to group this dataframe by year and find the delay ratio per year (i.e. number of non-zero entries that year divided by total number of entries for that year). So if we consider the data frame above, what I am trying to get is:


2014 0.5
2015 1



(There are 2 delays out of 4 entries in 2014, and 4 delays out of 4 entries in 2015. A delay is defined as Min Delay > 0.)



This is what I tried:


def find_ratio(df):
    ratio = 1 - (len(df[df == 0]) / len(df))
    return ratio


print(df.groupby(["Year"])["Min Delay"].transform(find_ratio).unique())



which prints: [0.5 1]





How can I get a data frame instead of an array?





I think the problem is with the .unique() call: it returns a NumPy array (ndarray).
– Bill Armstrong
Jul 1 at 16:21






1 Answer



First, I think unique is not a good idea here, because it drops the group labels, so the values it returns can no longer be assigned back to their years.
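As a rough illustration of that point, the sketch below reuses the sample data from the question and adds two hypothetical 2016 rows (an assumption made only for this example, not part of the original data) whose ratio happens to equal the 2014 one. After unique, there are fewer values than years and nothing says which value belongs to which year.

```python
import pandas as pd

# Sample data from the question, plus hypothetical 2016 rows
# (an assumption for illustration) with the same ratio as 2014.
df = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015, 2016, 2016],
    "Min Delay": [0, 2, 0, 4, 4, 4, 2, 2, 0, 3],
})

def find_ratio(s):
    # Share of non-zero entries in the group.
    return (s != 0).mean()

ratios = df.groupby("Year")["Min Delay"].transform(find_ratio).unique()

# Three years go in, but only two distinct ratios come out,
# and the array carries no year labels at all.
n_years = df["Year"].nunique()
n_ratios = len(ratios)
print(n_years, n_ratios)
```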





Also, transform is a good idea if you need a new column in the original DataFrame, not an aggregated DataFrame.





I think you need GroupBy.apply; the function can also be simplified by taking the mean of a boolean mask:




def find_ratio(df):
    ratio = (df != 0).mean()
    return ratio

print(df.groupby(["Year"])["Min Delay"].apply(find_ratio).reset_index(name='ratio'))

   Year  ratio
0  2014    0.5
1  2015    1.0
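For anyone who wants to run this end to end, here is a minimal self-contained sketch. It builds the sample frame from the question and, as an alternative to apply (not the approach shown above), creates the boolean mask once and averages it per year. The name ratio_df is just an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015],
    "Min Delay": [0, 2, 0, 4, 4, 4, 2, 2],
})

# Build the boolean mask once, then average it per year.
# The mean of a boolean Series is the fraction of True values.
ratio_df = (
    df["Min Delay"].ne(0)
    .groupby(df["Year"])
    .mean()
    .reset_index(name="ratio")
)
print(ratio_df)
```

If negative values could ever appear in Min Delay, replacing .ne(0) with .gt(0) would follow the question's definition (Min Delay > 0) more literally.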



Solution with lambda function:


print(df.groupby(["Year"])["Min Delay"]
        .apply(lambda x: (x != 0).mean())
        .reset_index(name='ratio'))

   Year  ratio
0  2014    0.5
1  2015    1.0



A solution with GroupBy.transform returns a new column instead:




df['ratio'] = df.groupby(["Year"])["Min Delay"].transform(find_ratio)
print(df)
   Year  Min Delay  ratio
0  2014          0    0.5
1  2014          2    0.5
2  2014          0    0.5
3  2014          4    0.5
4  2015          4    1.0
5  2015          4    1.0
6  2015          2    1.0
7  2015          2    1.0
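If you start from the transform version and still want one row per year, a possible follow-up is to keep the Year and ratio columns and drop the repeated rows. This is only a sketch on the question's sample data; per_year is a name picked for the example. Unlike unique, it keeps each ratio next to its year.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015],
    "Min Delay": [0, 2, 0, 4, 4, 4, 2, 2],
})

# transform broadcasts the per-year ratio back to every row,
# so the result has the same length as the original frame.
df["ratio"] = df.groupby("Year")["Min Delay"].transform(lambda s: (s != 0).mean())

# Collapse to one row per year while keeping the year labels.
per_year = df[["Year", "ratio"]].drop_duplicates().reset_index(drop=True)
print(per_year)
```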





Downvoter, if there's something wrong with my answer, please let me know, so I can correct it. Thanks.
– jezrael
Jul 1 at 16:14





