How to add the MongoDB driver and connector to a Jupyter notebook?
I would like to add the MongoDB Spark Connector v2.2 to a Jupyter notebook so that I can pull data from a MongoDB database as DataFrames. However, I keep getting the following error, and I am not sure how to proceed from here.
An error occurred while calling o40.load. :
java.lang.NoClassDefFoundError: com/mongodb/ConnectionString
The Spark version I am using is 2.3.1
The Python version I am using is 3.6.5 |Anaconda, Inc.|
I am running the Jupyter notebook on Windows 10
I downloaded the .jar files from the following link, specifically Version: 2.2.3-s_2.11
How do I add the MongoDB driver and connector to the Jupyter notebook? I am not sure how to point the notebook to the correct location on Windows 10.
The code is as follows
import findspark
findspark.init()
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.mongodb.spark:mongo-spark-connector_2.11:2.2.3 pyspark-shell'
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
df = spark.read.format("com.mongodb.spark.sql.DefaultSource").option("uri",
"mongodb://127.0.0.1/test.db").load()
@mbuechmann updated
– Matthew
Jul 2 at 8:40
1 Answer
On Windows 10, I had to manually copy the jar files into the jars folder of the Apache Spark installation:
C:\apache-spark\spark-2.3.1-bin-hadoop2.7\jars
In this case, they were:
a) org.mongodb_mongo-java-driver-3.4.2.jar (the MongoDB Java driver, which provides the missing com.mongodb.ConnectionString class)
b) org.mongodb.spark_mongo-spark-connector_2.11-2.2.3.jar
After that, it worked like a charm.
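If it helps, here is a minimal sketch for checking that both jars actually ended up in the jars folder before retrying the notebook. The helper name missing_jars and the install path are assumptions for illustration; only the two jar file names come from the list above.

```python
from pathlib import Path

# Jar file names taken from the answer above.
REQUIRED_JARS = [
    "org.mongodb_mongo-java-driver-3.4.2.jar",
    "org.mongodb.spark_mongo-spark-connector_2.11-2.2.3.jar",
]


def missing_jars(jars_dir, required=REQUIRED_JARS):
    """Return the required jar names that are not present in jars_dir."""
    jars_dir = Path(jars_dir)
    return [name for name in required if not (jars_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed install location; adjust it to your own Spark folder.
    jars_dir = r"C:\apache-spark\spark-2.3.1-bin-hadoop2.7\jars"
    missing = missing_jars(jars_dir)
    if missing:
        print("Missing from", jars_dir, ":", missing)
    else:
        print("Both jars found.")
```

If the list comes back empty and the error persists, restarting the notebook kernel is probably worth a try, since an already running Spark session will not pick up jars added afterwards.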
Please do not link to screenshots of code or error messages. Post that text directly in your question. You should also provide an actual question.
– mbuechmann
Jul 2 at 8:31