MySQL and Hadoop Integration with Sqoop

Sqoop can be used to integrate MySQL and Hadoop by importing and exporting large datasets between the two systems using MapReduce jobs. It allows importing data from MySQL to HDFS or directly to Hive tables. Incremental imports with options like --check-column and --last-value can keep the systems in sync over time. Exporting from Hive to MySQL is also supported where the target table must pre-exist. The document provides examples of using Sqoop to import and export the MovieLens dataset between MySQL and Hadoop/Hive.

Uploaded by

Nouhaila

INTEGRATING MYSQL & HADOOP
Fun with Sqoop

What’s MySQL?

■ Popular, free relational database
■ Generally monolithic in nature
■ But it can be used for OLTP (online transaction processing), so exporting data from Hadoop into MySQL can be useful
■ You may already have data in MySQL that you want to import into Hadoop
Sqoop to the rescue
Sqoop can handle BIG data

■ Actually kicks off MapReduce jobs to handle importing or exporting your data!

[Diagram: MySQL / PostgreSQL / other database, connected through four parallel Mappers to HDFS]
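The parallel Mappers in the diagram each handle one slice of the table. As a rough illustration only (this is not Sqoop's actual code), the sketch below divides a numeric key range evenly among mappers, which is the general idea behind the -m (number of mappers) and --split-by (splitting column) options. The column name id and the minimum and maximum values are assumptions made up for the example.

```shell
# Simplified sketch: divide a numeric key range evenly among mappers.
# min_id, max_id and the column name "id" are hypothetical values.
min_id=0
max_id=1000
mappers=4

split_ranges() {
  step=$(( (max_id - min_id) / mappers ))
  i=0
  while [ "$i" -lt "$mappers" ]; do
    lo=$(( min_id + i * step ))
    if [ "$i" -eq $(( mappers - 1 )) ]; then
      # The last mapper takes everything up to and including the maximum.
      echo "mapper $i: id >= $lo AND id <= $max_id"
    else
      hi=$(( lo + step ))
      echo "mapper $i: id >= $lo AND id < $hi"
    fi
    i=$(( i + 1 ))
  done
}

split_ranges
```

Each printed line stands for the WHERE clause one mapper would use to read its share of the rows. With -m 1 there is no splitting at all, which is often the simplest choice for a small table.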
Sqoop: Import data from MySQL to HDFS
■ sqoop import --connect jdbc:mysql://localhost/movielens --driver com.mysql.jdbc.Driver --table movies
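A slightly fuller version of the import might look like the sketch below. It adds two standard Sqoop options, --target-dir to choose the HDFS output directory and -m 1 to use a single mapper. The directory /user/demo/movies is a hypothetical path, and the DRY_RUN wrapper is only a convenience of this sketch so the command can be inspected before it is run.

```shell
# Prints the command by default; run the script with DRY_RUN= (empty) to execute it.
DRY_RUN=${DRY_RUN-echo}

# /user/demo/movies is a hypothetical HDFS directory; adjust it to your cluster.
import_movies_to_hdfs() {
  $DRY_RUN sqoop import \
    --connect jdbc:mysql://localhost/movielens \
    --driver com.mysql.jdbc.Driver \
    --table movies \
    --target-dir /user/demo/movies \
    -m 1
}

import_movies_to_hdfs
```

After a real run, listing the target directory (for example with hdfs dfs -ls /user/demo/movies) should show the imported part file.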
Sqoop: Import data from MySQL directly into Hive!
■ sqoop import --connect jdbc:mysql://localhost/movielens --driver com.mysql.jdbc.Driver --table movies --hive-import
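If the import is repeated, or the Hive table should have an explicit name, the command can be extended as in the sketch below. --hive-table and --hive-overwrite are standard Sqoop options; choosing movies as the Hive table name and replacing existing data on each run are assumptions made for this example.

```shell
# Prints the command by default; run the script with DRY_RUN= (empty) to execute it.
DRY_RUN=${DRY_RUN-echo}

# Assumption: the Hive table is also called "movies" and may be overwritten.
import_movies_to_hive() {
  $DRY_RUN sqoop import \
    --connect jdbc:mysql://localhost/movielens \
    --driver com.mysql.jdbc.Driver \
    --table movies \
    --hive-import \
    --hive-table movies \
    --hive-overwrite \
    -m 1
}

import_movies_to_hive
```

Without --hive-overwrite, a second run would add the same rows to the Hive table again.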
Incremental imports

■ You can keep your relational database and Hadoop in sync
■ Use --incremental together with --check-column and --last-value to import only the rows added or changed since the last run
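As a sketch of what an incremental run could look like, the command below uses Sqoop's append mode, which imports only rows whose check column is greater than the given last value. The column name id and the starting value 0 are assumptions; the slides do not name the column.

```shell
# Prints the command by default; run the script with DRY_RUN= (empty) to execute it.
DRY_RUN=${DRY_RUN-echo}

# Highest id already imported; 0 is a hypothetical starting point.
LAST_VALUE=${LAST_VALUE:-0}

# Assumption: the movies table has an increasing numeric column named "id".
incremental_import() {
  $DRY_RUN sqoop import \
    --connect jdbc:mysql://localhost/movielens \
    --driver com.mysql.jdbc.Driver \
    --table movies \
    --incremental append \
    --check-column id \
    --last-value "$LAST_VALUE"
}

incremental_import
```

At the end of an incremental import, Sqoop prints the value to pass as --last-value next time, so each run can pick up where the previous one stopped.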
Sqoop: Export data from Hive to MySQL
■ sqoop export --connect jdbc:mysql://localhost/movielens -m 1 --driver com.mysql.jdbc.Driver --table exported_movies --export-dir /apps/hive/warehouse/movies --input-fields-terminated-by '\0001'

■ Target table must already exist in MySQL, with columns in the expected order
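Because the target table has to exist before the export, it could be created first with the mysql client, roughly as sketched below. The column names and types, and the root user, are assumptions about the movies data and the local setup, not something the slides specify; they need to match the column order of the Hive table.

```shell
# Prints the command by default; run the script with DRY_RUN= (empty) to execute it.
DRY_RUN=${DRY_RUN-echo}

# Assumption: three columns in this order; adjust names and types to your data.
CREATE_SQL='CREATE TABLE IF NOT EXISTS exported_movies (id INTEGER, title VARCHAR(255), release_date DATE);'

# -p makes the mysql client prompt for the password; "root" is a hypothetical user.
create_target_table() {
  $DRY_RUN mysql -u root -p movielens -e "$CREATE_SQL"
}

create_target_table
```

In the dry-run output the quotes around the SQL statement are not shown, but the statement is still passed to mysql as a single argument when the command is executed.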
Let’s play with MySQL and Sqoop

■ Import MovieLens data into a MySQL database


■ Import the movies to HDFS
■ Import the movies into Hive
■ Export the movies back into MySQL
