This project is intended only for personal study and technical exchange and must not be used for commercial purposes.
This is a spider (web crawler) for 中国裁判文书网 (China Judgments Online).
- Supports IP proxies
- Supports multiple processes
- Supports full crawling
- Divides data by decision date and province
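As a rough illustration of the last point, the sketch below shows one way a crawl could be divided by decision date and province and then shared among several processes. It is not taken from spider.py: the province list and the function names `daily_range`, `build_tasks`, and `split_among_workers` are hypothetical and exist only for this example.

```python
from datetime import date, timedelta
from itertools import product

# Hypothetical sample of provinces; the real spider may use a different list.
PROVINCES = ["Beijing", "Shanghai", "Guangdong"]


def daily_range(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def build_tasks(start, end, provinces=PROVINCES):
    """Return one (decision date, province) pair per crawl task."""
    return list(product(daily_range(start, end), provinces))


def split_among_workers(tasks, num_processes):
    """Deal the tasks round-robin into num_processes chunks."""
    return [tasks[i::num_processes] for i in range(num_processes)]


tasks = build_tasks(date(2016, 1, 2), date(2016, 1, 3))
chunks = split_among_workers(tasks, 2)
```

Each chunk could then be handed to its own worker process, so that every process crawls a disjoint set of (date, province) pairs.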
python spider.py -num_processes 1 -start_time 2016-1-2 -end_time 2016-1-2
- raw data
- processed data
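For readers wondering how flags such as `-num_processes`, `-start_time`, and `-end_time` might be handled, here is a minimal sketch using the standard `argparse` module. It is an assumption about how such a command line could be parsed, not the actual code in spider.py; the helper names `parse_day` and `build_parser`, the default value, and the help strings are made up for illustration.

```python
import argparse
from datetime import datetime


def parse_day(text):
    """Parse a date such as 2016-1-2 (zero padding is optional)."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    # Flag names mirror the command shown in this README; everything else
    # (types, default, help text) is an assumption for this sketch.
    parser = argparse.ArgumentParser(prog="spider.py")
    parser.add_argument("-num_processes", type=int, default=1,
                        help="number of worker processes")
    parser.add_argument("-start_time", type=parse_day, required=True,
                        help="first decision date to crawl, e.g. 2016-1-2")
    parser.add_argument("-end_time", type=parse_day, required=True,
                        help="last decision date to crawl, e.g. 2016-1-2")
    return parser


# Parse the same arguments as the example command in this README.
args = build_parser().parse_args(
    ["-num_processes", "1", "-start_time", "2016-1-2", "-end_time", "2016-1-2"]
)
print(args.num_processes, args.start_time, args.end_time)
```

Parsing the dates up front means a malformed value is rejected before any crawling starts.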
If you have any questions, please open an issue.
Pull requests to improve this project are welcome!

