Use this project to join data from multiple CSV files. The project currently supports one-to-one and one-to-many joins. It also shows how to use a Kafka producer efficiently with Spark. Metadata for the whole joining process is defined in #datasource.json.
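To make the two join types concrete, here is a minimal sketch in plain Python. It is not the project's Spark code: the `join_rows` helper, the column names, and the sample CSV text are all hypothetical and only illustrate how a one-to-one join differs from a one-to-many join on a shared key.

```python
import csv
import io
from collections import defaultdict


def read_csv(text):
    """Parse CSV text (header row first) into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def join_rows(left, right, key):
    """Inner-join two lists of dict rows on `key`.

    If `key` is unique in `right`, each left row matches at most one
    right row (one-to-one). If several right rows share a key, the
    left row is repeated once per match (one-to-many).
    """
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# Hypothetical sample data standing in for the contents of CSV files.
customers = read_csv("id,name\n1,Ann\n2,Bob\n")
profiles = read_csv("id,city\n1,Pune\n2,Oslo\n")
orders = read_csv("id,item\n1,pen\n1,ink\n2,cup\n")

one_to_one = join_rows(customers, profiles, "id")   # one profile per customer
one_to_many = join_rows(customers, orders, "id")    # several orders per customer
```

In this sketch the one-to-one join yields as many rows as there are customers, while the one-to-many join yields one row per matching order.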
Command-line arguments:
- Boolean value (true to enable Kafka, otherwise false)
- Kafka topic name (required if the first argument is true)
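The argument rules above can be sketched as a small validation function. This is only an illustration under the assumption that the arguments are positional and given in the order listed; the `parse_args` name and the error messages are hypothetical, and the project's actual parsing may differ.

```python
def parse_args(argv):
    """Validate the positional arguments described above.

    argv[0]: "true" or "false" (whether Kafka is enabled).
    argv[1]: Kafka topic name, required only when argv[0] is "true".
    Returns a tuple (enable_kafka, topic); topic is None when Kafka is off.
    """
    if not argv:
        raise ValueError("expected: <true|false> [kafka-topic]")

    flag = argv[0].strip().lower()
    if flag not in ("true", "false"):
        raise ValueError("first argument must be true or false")

    enable_kafka = flag == "true"
    topic = None
    if enable_kafka:
        if len(argv) < 2:
            raise ValueError("a Kafka topic name is required when Kafka is enabled")
        topic = argv[1]
    return enable_kafka, topic


# Hypothetical invocations; "joined-records" is a made-up topic name.
with_kafka = parse_args(["true", "joined-records"])
without_kafka = parse_args(["false"])
```

Failing early on a missing topic name avoids starting a job that can only fail once it tries to publish.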
Platforms:
- Spark 2.1
- Kafka and ZooKeeper (if Kafka is enabled)