Driver Behavior Analysis and Prediction is a web service that uses the Apache Spark Java API for real-time big data analysis and visualization, and Spring Boot to provide RESTful web services. The requirement is to analyze a given driving behavior dataset covering 10 drivers over 10 consecutive days. Detailed requirements include:
- A Website for Driving Behavior Analysis
- Real Time
- Generate a summary to show the driving behavior of each driver.
- Monitor the driving speed of each driver in real time.
- Use Amazon Web Services (AWS) to develop the website.
- Analyze the driving behavior with Spark (Spark-SQL)
- The cumulative number of occurrences of each driving behavior for each driver
- Use a diagram to visualize the driving speed
- When the driver is speeding, a warning will be issued
- Updated automatically every 30 seconds
- The website runs without run-time errors
- Effective and user-friendly user interface
- Production Env - Operating system: CentOS 7.9 (Linux)
- Development Env - Operating system: macOS 14.4.1 (23E224)
- Programming language: Java 17.0.10
- Application Framework: Spring Boot 3
- Dependencies/Required packages in Java:
lombok, spring-boot-configuration-processor, jedis, spark-core_2.13, spark-sql_2.13, janino, fastjson2, mysql-connector-j, mybatis-spring-boot-starter, druid-spring-boot-starter
- Software for Building: Maven
- Other components in use:
MySQL, Redis, Nginx
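The speeding warning and the 30-second refresh from the requirements can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class `SpeedWatcher`, its method names, and the idea of feeding it a map from driver ID to the latest overspeed flag (0 or 1) are hypothetical, and the requirements do not say whether the refresh is driven by the backend or by the front end polling the API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of the project code.
class SpeedWatcher {
    // Refresh interval taken from the requirement list (30 seconds).
    static final long REFRESH_SECONDS = 30;

    // Builds one warning message per driver whose latest overspeed flag is 1.
    // Assumption: the flag is 0 or 1, like the isOverspeed field of a record.
    static List<String> warnings(Map<String, Integer> overspeedFlagByDriver) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : overspeedFlagByDriver.entrySet()) {
            if (e.getValue() == 1) {
                out.add("Driver " + e.getKey() + " is speeding");
            }
        }
        return out;
    }

    // Re-evaluates the latest flags immediately and then every REFRESH_SECONDS.
    // The caller is responsible for shutting the returned scheduler down.
    static ScheduledExecutorService start(Supplier<Map<String, Integer>> latestFlags) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> warnings(latestFlags.get()).forEach(System.out::println),
                0, REFRESH_SECONDS, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Keeping the warning rule in a pure method such as `warnings` makes it testable without waiting for the timer; the scheduler only decides when the rule runs.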
The data for this project was provided in advance as 10 files containing the driving behavior of 10 drivers over 10 consecutive days. The data is stored in CSV files without headers, with a total of 413,450 records. Each record consists of 19 features (columns), ranging from driver information to specific driving behavior indicators.
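Because the files have no header row, each column has to be addressed by position. The sketch below shows one way to do that in plain Java. The column count of 19 comes from the description above, but the column positions (`COL_DRIVER_ID`, `COL_SPEED`, `COL_TIME`) and the names `SpeedSample` and `HeaderlessCsv` are assumptions made for illustration; the actual column order is not given here.

```java
// Hypothetical view of three of the 19 columns.
record SpeedSample(String driverId, String time, int speed) {}

class HeaderlessCsv {
    // Number of columns per record, as stated in the data description.
    static final int EXPECTED_COLUMNS = 19;

    // Assumed positions: the real column order must be checked against the data.
    static final int COL_DRIVER_ID = 0;
    static final int COL_SPEED = 4;
    static final int COL_TIME = 7;

    // Parses one line; a plain split is used, so quoted commas are not supported.
    static SpeedSample parse(String line) {
        String[] cols = line.split(",", -1); // -1 keeps trailing empty columns
        if (cols.length != EXPECTED_COLUMNS) {
            throw new IllegalArgumentException(
                    "Expected " + EXPECTED_COLUMNS + " columns but found " + cols.length);
        }
        return new SpeedSample(
                cols[COL_DRIVER_ID].trim(),
                cols[COL_TIME].trim(),
                Integer.parseInt(cols[COL_SPEED].trim()));
    }
}
```

Rejecting lines with the wrong column count early keeps malformed records out of later aggregation steps.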
For a detailed description of the features and functionalities, illustrated with figures, please refer to Section 2 of the application report.
- Data Time Simulation
- Real-time Monitoring
- Data Visualization
- Data Query
- Adjustable Display
The program is written in Java, using the Spring Boot framework to build RESTful APIs and Apache Spark for data-related operations. The project adopts a multi-layer architecture that completely separates the front end from the backend; the backend is further divided into three layers.
Because the controllers are Spring Boot REST controllers, response data is automatically serialized to JSON. The backend consists of three layers: the Controller layer, the Service layer, and the Data Persistence layer.
The Controller layer manages input and output data and interacts only with the Service layer. Depending on the request path, the Controller passes the data to the corresponding service interface in the Service layer.
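import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Maps /api/driver/* requests to DriverService calls and wraps each response in Result.
@RestController
@RequestMapping("/api/driver")
public class DriverController {
    @Autowired
    private DriverService driverService;

    // GET /api/driver/list
    @GetMapping("/list")
    public Result list() {
        return Result.success(driverService.getDriverList());
    }

    // GET /api/driver/info?driverId=...
    @GetMapping("/info")
    public Result info(@RequestParam String driverId) {
        return Result.success(driverService.getDriverInfo(driverId));
    }

    // GET /api/driver/behavior?driverId=...
    @GetMapping("/behavior")
    public Result behavior(@RequestParam String driverId) {
        return Result.success(driverService.getDriverBehaviors(driverId));
    }

    // GET /api/driver/diagram?driverId=...
    @GetMapping("/diagram")
    public Result diagram(@RequestParam String driverId) {
        return Result.success(driverService.getDriverDiagram(driverId));
    }
}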
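The controller wraps every response in a `Result` object, whose definition is not shown. A minimal sketch of what such a wrapper could look like is given below. The field names (`code`, `message`, `data`), the status values, and the `error` factory are assumptions for illustration only; the project's actual `Result` class may differ.

```java
// Hypothetical response envelope; only success(...) is known from the controller code.
record Result<T>(int code, String message, T data) {

    // Wraps a payload in a successful response. The code 200 and the message
    // "success" are assumed values, not taken from the project.
    static <T> Result<T> success(T data) {
        return new Result<>(200, "success", data);
    }

    // Builds a failure response without a payload (assumed counterpart to success).
    static <T> Result<T> error(int code, String message) {
        return new Result<>(code, message, null);
    }
}
```

A uniform envelope like this lets the front end check one status field for every endpoint before reading the payload.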
The Service layer holds the service interfaces and implements the core application logic and functionality. This layer interacts with the Controller layer, the data, Spring beans, and the entity classes (Plain Old Java Objects, POJOs).
import java.util.Date;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

// Per-driver summary: identity, latest real-time values, and cumulative behavior counts.
@Data
@AllArgsConstructor
@NoArgsConstructor
public class Driver {
    // User Info
    private String driverID;
    private String carPlateNumber;
    // Update Time
    private Date updateTime;
    // Real-time data
    private double latitude;
    private double longitude;
    private int speed;
    private int direction;
    // History data
    private int rapidlySpeedupTimes;
    private int rapidlySlowdownTimes;
    private int neutralSlideTimes;
    private int overspeedTimes;
    private int fatigueDrivingTimes;
    private int hthrottleStopTimes;
    private int oilLeakTimes;
}

// Behavior flags of a single driver within one time slot.
@AllArgsConstructor
@NoArgsConstructor
@Data
public class DriverBehaviors {
    private String driverID;
    // Driving behavior in slot
    private String siteName;
    private int isRapidlySpeedup;
    private int isRapidlySlowdown;
    private int isNeutralSlide;
    private int isNeutralSlideFinished;
    private int neutralSlideTime;
    private int isOverspeed;
    private int isOverspeedFinished;
    private int overspeedTime;
    private int isFatigueDriving;
    private int isHthrottleStop;
    private int isOilLeak;
}
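The relationship between the per-record flags in `DriverBehaviors` and the cumulative `...Times` counters in `Driver` can be illustrated with a small aggregation sketch. It assumes that each flag is 0 or 1 and that every flagged record counts as one occurrence; the names `BehaviorFlags`, `BehaviorSummary`, and `BehaviorCounter` are hypothetical, and only four of the behaviors are shown to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, reduced stand-in for DriverBehaviors (four flags only).
record BehaviorFlags(String driverID, int isRapidlySpeedup, int isRapidlySlowdown,
                     int isOverspeed, int isFatigueDriving) {}

// Hypothetical, reduced stand-in for the history counters of Driver.
class BehaviorSummary {
    int rapidlySpeedupTimes;
    int rapidlySlowdownTimes;
    int overspeedTimes;
    int fatigueDrivingTimes;

    // Adds the 0/1 flags of one record to the running totals.
    void add(BehaviorFlags r) {
        rapidlySpeedupTimes += r.isRapidlySpeedup();
        rapidlySlowdownTimes += r.isRapidlySlowdown();
        overspeedTimes += r.isOverspeed();
        fatigueDrivingTimes += r.isFatigueDriving();
    }
}

class BehaviorCounter {
    // Groups records by driver ID and accumulates the flags per driver.
    static Map<String, BehaviorSummary> summarize(List<BehaviorFlags> records) {
        Map<String, BehaviorSummary> byDriver = new LinkedHashMap<>();
        for (BehaviorFlags r : records) {
            byDriver.computeIfAbsent(r.driverID(), id -> new BehaviorSummary()).add(r);
        }
        return byDriver;
    }
}
```

In a Spark SQL setting the same result would typically come from a `GROUP BY driverID` with `SUM` over each flag column; the loop above only makes the counting rule explicit.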
Lastly, the Data Persistence layer fetches data via Spark SQL and persists data in MySQL. With this multi-layer design, the project supports parallel development, improved maintainability, and seamless upgrades through CI/CD.
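import java.sql.Timestamp;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Create (or reuse) a Spark session
SparkSession spark = SparkSession.builder().appName("Java Spark SQL").getOrCreate();
// Average speed per time window for one driver between initTime and cutOffTime.
// Inputs are concatenated into the SQL text, so driverId should be validated before this point.
Dataset<Row> df = spark.sql(
        "SELECT window(time, '" + durationInSeconds + " seconds') as time_window, AVG(speed) as avg_speed " +
        "FROM driving " +
        "WHERE driverID = '" + driverId + "' " +
        "AND time >= '" + new Timestamp(initTime.getTime()) + "' " +
        "AND time <= '" + new Timestamp(cutOffTime.getTime()) + "' " +
        "GROUP BY time_window " +
        "ORDER BY time_window");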
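To make the effect of `window(time, 'N seconds')` combined with `AVG(speed)` concrete, the following plain-Java sketch computes the same kind of result without Spark: each sample is assigned to a fixed-size, non-overlapping window and the speeds in each window are averaged. The names `TimedSpeed` and `WindowedAverage` are hypothetical, timestamps are simplified to epoch seconds, and windows are aligned to the epoch, which matches Spark's default alignment for tumbling windows.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sample type: a timestamp in epoch seconds and a speed reading.
record TimedSpeed(long epochSeconds, double speed) {}

class WindowedAverage {
    // Returns the average speed per window, keyed by window start (epoch seconds)
    // and sorted by window start, like GROUP BY time_window ORDER BY time_window.
    static Map<Long, Double> averageByWindow(List<TimedSpeed> samples, long durationInSeconds) {
        Map<Long, double[]> sums = new TreeMap<>(); // value = {sum of speeds, sample count}
        for (TimedSpeed s : samples) {
            // Start of the window that contains this sample: [start, start + duration)
            long start = Math.floorDiv(s.epochSeconds(), durationInSeconds) * durationInSeconds;
            double[] acc = sums.computeIfAbsent(start, k -> new double[2]);
            acc[0] += s.speed();
            acc[1] += 1;
        }
        Map<Long, Double> averages = new TreeMap<>();
        sums.forEach((start, acc) -> averages.put(start, acc[0] / acc[1]));
        return averages;
    }
}
```

For example, with a 30-second window, samples at seconds 0 and 5 fall into the window starting at 0, while a sample at second 30 starts a new window.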
- AWS Amplify: Hosting Static Websites
- Apache Spark on Amazon EMR
- Amazon ElastiCache for Redis caching
- Amazon RDS for MySQL