create two dataframes based on regex

Here is a small sample of a DataFrame that I would like to split into two separate DataFrames.
   No    Code         Name  Rem  Last Done   LACP     Chg  % Chg  Vol ('00)
0   1    0012       3A [S]    s      0.940  0.940       -      -         20
1   2    7054    AASIA [S]    s          -  0.205       -      -          -
2   3    5238      AAX [S]    s      0.345  0.340   0.005  +1.47     37,806
3   4  5238WA   AAX-WA [S]    s      0.135  0.135       -      -        590
4   5    7086  ABLEGRP [S]    s      0.095  0.100  -0.005  -5.00        300
I want to split the rows according to whether the value in the "Code" column matches the following Python regular expression:
"^[0-9]{1,5}$"
1 Answer
Use str.contains with boolean indexing; ~ inverts the boolean mask:
m = df['Code'].str.contains(r"^[0-9]{1,5}$")
df1 = df[m]
print(df1)
   No    Code         Name  Rem  Last Done   LACP     Chg  % Chg  Vol ('00)
0   1    0012       3A [S]    s      0.940  0.940       -      -         20
1   2    7054    AASIA [S]    s          -  0.205       -      -          -
2   3    5238      AAX [S]    s      0.345  0.340   0.005  +1.47     37,806
4   5    7086  ABLEGRP [S]    s      0.095  0.100  -0.005  -5.00        300
df2 = df[~m]
print(df2)
   No    Code         Name  Rem  Last Done   LACP     Chg  % Chg  Vol ('00)
3   4  5238WA   AAX-WA [S]    s      0.135  0.135       -      -        590
Detail:
print(m)
0     True
1     True
2     True
3    False
4     True
Name: Code, dtype: bool
print(~m)
0    False
1    False
2    False
3     True
4    False
Name: Code, dtype: bool
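For anyone who wants to run the split end to end, here is a minimal self-contained sketch. It assumes the "Code" column holds strings (so that leading zeros such as 0012 survive) and rebuilds only two of the question's columns by hand; the frame construction is an illustration, not the asker's actual loading code.

```python
import pandas as pd

# Sample rows modelled on the question; only two columns are rebuilt here.
# Code is stored as strings so values like "0012" keep their leading zeros.
df = pd.DataFrame({
    "Code": ["0012", "7054", "5238", "5238WA", "7086"],
    "Name": ["3A [S]", "AASIA [S]", "AAX [S]", "AAX-WA [S]", "ABLEGRP [S]"],
})

# Boolean mask: True where Code consists of one to five digits only.
m = df["Code"].str.contains(r"^[0-9]{1,5}$")

df1 = df[m]    # rows whose Code matches the pattern
df2 = df[~m]   # rows whose Code does not match

print(df1)
print(df2)
```

If the real column may contain missing values, passing na=False to str.contains is one way to keep the mask strictly boolean.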
Is it possible to use an "or" condition? Something like m = df['Code'].str.contains("^[0-9]{1,5}$" | "5235SS" )
– Timothy Lombard
2 days ago
@TimothyLombard - You can use
m = df['Code'].str.contains("^[0-9]{1,5}$|5235SS")
– jezrael
2 days ago
I was having trouble wrapping my head around how boolean indexing would work for this application. Thank you for the illustration.
– Timothy Lombard
Jul 2 at 5:35
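The "or" pattern from the comments can be checked on a small Series. This is a sketch with hypothetical codes: only "5235SS" comes from the comment, and the other values, including "X5235SS1", are made up to show the behaviour. Note that in "^[0-9]{1,5}$|5235SS" the anchors apply only to the first alternative, so "5235SS" matches anywhere in the value. The second pattern is one possible stricter variant, not something proposed in the thread.

```python
import pandas as pd

# Hypothetical codes; "X5235SS1" is invented to show a substring match.
codes = pd.Series(["0012", "5238WA", "5235SS", "X5235SS1"], name="Code")

# "|" is regex alternation inside one pattern string, not a Python operator
# applied between two strings.
m = codes.str.contains(r"^[0-9]{1,5}$|5235SS")
print(m.tolist())       # [True, False, True, True]

# Stricter variant: both alternatives sit inside the anchors, so the whole
# value must be either one to five digits or exactly "5235SS".
strict = codes.str.contains(r"^(?:[0-9]{1,5}|5235SS)$")
print(strict.tolist())  # [True, False, True, False]
```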