Integrating Apache Spark with MongoDB

In this tutorial, we will learn how to integrate Apache Spark with a MongoDB database. We will use spark-shell to interact with MongoDB, reading data from it and writing data to it. Tools used: Apache Spark 2.1.0 and MongoDB 3.4.2. Prerequisites: I am assuming you have downloaded and extracted Apache Spark and […]


Understanding Sqoop Eval Command

In this blog post, we will look at Sqoop’s eval command. The eval command lets a user run DDL and DML queries against the database and preview the results in the console. We will walk through two evaluations: a select query eval and an insert query eval. Assumptions: I am assuming that you are using MySQL […]


HDFS copyFromLocal vs. put Command

“What’s the difference between the copyFromLocal and put commands in the HDFS CLI?” A very common interview question, isn’t it? Let’s try to figure out the notable difference between put and copyFromLocal. Both commands have a single objective: to load data into HDFS. Let’s demonstrate the functionality now. Variation 1: Loading data from local file system and storing the same […]
