I want to populate values in a DataFrame based on matching values in another DataFrame
I am editing the question: I don't want to use groupby to group the values.
I would appreciate if someone could help with just the query to transform data in the following way:
I have one dataframe given as follows:
df1:
col1 col2
------------
VG 12
G 11
A 10
P 06
VP 0
I want a new dataframe like this:
df2:
VG G A P VP
---------------------
12 11 10 06 0
I tried to achieve this with an if condition and got the following error:
Code:
if df1.Score=='VG':
df2['VG']=df1.loc[df1['col1'] == 'VG', 'col2']
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
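The error comes from the `if` line rather than from the assignment: comparing a column to `'VG'` yields one boolean per row, and `if` cannot reduce that Series to a single True or False. A minimal reproduction, assuming the question's data and using `col1` in place of the `Score` column that the snippet refers to:

```python
import pandas as pd

# Assumed reconstruction of the question's df1 (col2 stored as integers).
df1 = pd.DataFrame({"col1": ["VG", "G", "A", "P", "VP"],
                    "col2": [12, 11, 10, 6, 0]})

mask = df1["col1"] == "VG"   # a boolean Series, one entry per row

message = None
try:
    if mask:                 # `if` needs one True/False, not a whole Series
        pass
except ValueError as exc:
    message = str(exc)

print(message)

# Use the mask to select rows instead of branching on it.
vg_values = df1.loc[mask, "col2"]
print(vg_values.tolist())
```

Selecting with the mask, as in the last two lines, avoids the `if` entirely.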
3 Answers
agg and transform would work :)
df.groupby('col1').agg(list).col2.transform(pd.Series).T.fillna(0)
      A     G    P    VG   VP
0  10.0  11.0  6.0  12.0  0.0
1  50.0   0.0  0.0  53.0  0.0
Thank you @user2285236 for improving the answer:
s = df1.groupby('col1').cumcount()
df = (df1.set_index(['col1', s])['col2']
.unstack(level=0, fill_value=0)
.rename_axis(None, axis=1))
print(df)
    A   G  P  VG  VP
0  10  11  6  12   0
1  50   0  0  53   0
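Explanation:
Create a MultiIndex with set_index, using the Series of counters from GroupBy.cumcount as the second level.
Reshape with unstack so that the col1 labels become columns.
Remove the leftover columns name with rename_axis.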
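The steps of this answer can be run one at a time to see what each contributes. This is only a sketch: the sample data is an assumption (the question's rows plus two repeated labels, picked to match the two-row output shown above), and the intermediate names `stacked` and `wide` are illustrative.

```python
import pandas as pd

# Assumed sample data: the question's rows plus a second VG and a second A.
df1 = pd.DataFrame({
    "col1": ["VG", "G", "A", "P", "VP", "VG", "A"],
    "col2": [12, 11, 10, 6, 0, 53, 50],
})

# Step 1: number each occurrence of a label (0 for the first, 1 for the next).
s = df1.groupby("col1").cumcount()

# Step 2: index col2 by (label, occurrence number), giving a MultiIndex Series.
stacked = df1.set_index(["col1", s])["col2"]

# Step 3: move the label level into the columns; missing cells are filled with 0.
wide = stacked.unstack(level=0, fill_value=0)

# Step 4: drop the leftover "col1" name on the columns axis.
wide = wide.rename_axis(None, axis=1)

print(wide)
```

The counter from step 1 is what lets repeated labels land on separate rows instead of colliding in one cell.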
I think 'VG' was just an example there. You can generalize this with cumcount instead (s = df1.groupby('col1').cumcount()). – user2285236
Jul 1 at 18:24
Here's the long way, if all else fails:
s = [pd.Series(g.loc[g['col2'] != 0, 'col2'].values, name=k)
     for k, g in df.groupby('col1')]
res = pd.concat(s, axis=1, ignore_index=False).fillna(0)
print(res)
    A     G    P  VG   VP
0  10  11.0  6.0  12  0.0
1  50   0.0  0.0  53  0.0
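For the single-row case in the question, where every label in col1 appears exactly once, a shorter route that avoids groupby is to set the labels as the index and transpose. This is a sketch under that uniqueness assumption, not a replacement for the approaches above; repeated labels would produce duplicate column names.

```python
import pandas as pd

# The question's data; col2 is assumed to be integer, so "06" is stored as 6.
df1 = pd.DataFrame({
    "col1": ["VG", "G", "A", "P", "VP"],
    "col2": [12, 11, 10, 6, 0],
})

# Make the labels the index, then transpose so they become the columns.
df2 = df1.set_index("col1").T

# Tidy up: drop the "col1" name on the columns and reset the row label to 0.
df2 = df2.rename_axis(None, axis=1).reset_index(drop=True)

print(df2)
```

Because no sorting is involved, the columns keep the order in which the labels appear in df1 (VG, G, A, P, VP).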
Thank you for your answer. I have to populate to a different dataframe for the operation.
– Marcelo BD
Jul 1 at 18:28