Fuzzy matching using Python Pandas




I have two sets of data:



A is the source data, containing the columns [company_name] and [company_id].



B is the test data, containing only the column [company_name].



I want to apply a fuzzy matching function (Soundex, Levenshtein distance, etc.) to compare each [company_name] in B against all [company_name] values in A, then aggregate the scores to find out which company names in B have a match in A, and output their correct [company_id] from A.
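To make the scoring step concrete, here is a minimal sketch of one possible score: a plain-Python Levenshtein distance, normalised to a value between 0 and 1. The function names `levenshtein` and `similarity` are hypothetical, and lower-casing and trimming the names before comparing is an assumption about what counts as "the same" company name. A dedicated library would likely be faster on large data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the distance to a 0..1 score, where 1.0 means identical."""
    # Assumption: case and surrounding whitespace should not count as differences.
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

A score like this could then be compared against a cut-off (for example 0.85, an arbitrary value that would need tuning) to decide whether a name in B counts as a match for a name in A.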



Because both the A and B data frames are quite large, I cannot do a full join of A and B.



Can anyone help me with how to write a loop (or any other approach) to achieve this?
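One way such a loop might look, as a rough sketch rather than a definitive answer: group the names from A by a cheap "blocking" key, so that each name from B is only compared against a small slice of A instead of all of it. Everything here is an assumption made for illustration: the helper names (`normalise`, `build_blocks`, `match_names`), the first character as the blocking key, the 0.85 threshold, and the use of the standard library's `difflib.SequenceMatcher` ratio in place of Soundex or Levenshtein. The inputs are plain Python sequences; with pandas they could be produced by something like `A[["company_name", "company_id"]].itertuples(index=False)` and `B["company_name"]`.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case the name and collapse repeated whitespace."""
    return " ".join(str(name).lower().split())


def build_blocks(source_rows):
    """Group (company_name, company_id) pairs from A by first character."""
    blocks = defaultdict(list)
    for name, company_id in source_rows:
        clean = normalise(name)
        if clean:
            blocks[clean[0]].append((clean, company_id))
    return blocks


def match_names(test_names, source_rows, threshold=0.85):
    """Return (name_from_B, best_company_id_or_None, best_score) per name in B."""
    blocks = build_blocks(source_rows)
    results = []
    for name in test_names:
        clean = normalise(name)
        best_id, best_score = None, 0.0
        # Only compare against names from A that share the blocking key.
        for candidate, company_id in blocks.get(clean[:1], []):
            score = SequenceMatcher(None, clean, candidate).ratio()
            if score > best_score:
                best_id, best_score = company_id, score
        if best_score < threshold:
            best_id = None  # best candidate is too weak to count as a match
        results.append((name, best_id, best_score))
    return results


# Hypothetical sample data standing in for A and B.
source = [("Acme Corporation", 1), ("Globex Inc", 2), ("Initech LLC", 3)]
tests = ["Acme Corporatoin", "Globex Inc.", "Unknown Co"]
matches = match_names(tests, source)
```

The trade-off of first-character blocking is that a typo in the first letter puts a name in the wrong block, so it will never be matched. A phonetic key such as Soundex, or several blocking keys combined, would be more forgiving at the cost of more comparisons.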








