Extracting data from a complex MongoDB database using PyMongo and converting it to a .csv file
I have a complex MongoDB database, consisting of documents nested up to 7 levels deep. I need to use PyMongo to extract the data, and then convert the extracted data to a .csv file.
What have you tried so far?
– Adam Smith
Jul 2 at 4:08
Can you use mongoexport?
– Astro
Jul 2 at 4:11
So far I am able to extract the entire database and store it as a Python object. I am then able to convert this object to a .csv file. However, the .csv file has thousands of columns. I need to know how to extract the data in a clean manner.
– pack24
Jul 2 at 4:11
@Astro I can use mongoexport, but the .csv file has thousands of columns. I need to extract the data in an organized manner. I'm OK with extracting the data into multiple .csv files and then combining them into one, but I'm not sure how to proceed with that; I just know how to extract the data as a whole.
– pack24
Jul 2 at 4:13
1 Answer
You can try using json_normalize.
It flattens the nested JSON and reads the data into a DataFrame, which can be written to a .csv file later.
For example:
from pandas.io.json import json_normalize

# db is an existing PyMongo database handle;
# mongo_value is your aggregation pipeline
mongo_aggregate = db.events.aggregate(mongo_value)
mongo_df = json_normalize(list(mongo_aggregate))
# print(mongo_df)

# Keep only the last segment of each flattened name, so
# 'properties.something.something.column_name' becomes 'column_name'
mongo_columns = list(mongo_df.columns.values)
for w in range(len(mongo_columns)):
    mongo_columns[w] = mongo_columns[w].split('.')[-1].lower()
mongo_df.columns = mongo_columns
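Once the columns are renamed, writing the flattened DataFrame out is a one-liner (the file name here is just an example):

# 'events_flat.csv' is a placeholder file name
mongo_df.to_csv('events_flat.csv', index=False)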
For reference, see https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.io.json.json_normalize.html
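Since you mentioned being open to splitting the output into multiple .csv files, here is a rough end-to-end sketch of one way to do that. The connection string, database name, collection name, and output file names are all placeholders, and it assumes pandas >= 1.0, where json_normalize is available as pandas.json_normalize:

from pymongo import MongoClient
import pandas as pd  # json_normalize is top-level in pandas >= 1.0

# Hypothetical connection details -- substitute your own
client = MongoClient('mongodb://localhost:27017')
db = client['mydb']
docs = list(db.events.find())

# Flatten all nesting levels; nested keys become dotted column names.
# Pass max_level=N instead if you want to cap the flattening depth.
full_df = pd.json_normalize(docs)

# Write one CSV per top-level field so no single file ends up with
# thousands of columns; each file keeps _id so they can be recombined.
prefixes = {col.split('.')[0] for col in full_df.columns}
for prefix in prefixes:
    if prefix == '_id':
        continue
    cols = ['_id'] + [c for c in full_df.columns
                      if c == prefix or c.startswith(prefix + '.')]
    full_df[cols].to_csv(prefix + '.csv', index=False)

The per-prefix split keeps each file's column count proportional to one subdocument, and the shared _id column lets you merge the files back together later (e.g. with pd.merge on '_id').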
Thank you for your help!
– pack24
Jul 2 at 12:28