- SQLite database with vehicle information
- Web dashboard built with Flask and Bootstrap
- Web scraping scripts for RockAuto
- CLI for queries
- Complete project documentation

Includes:
- 12 vehicle brands
- 10,923 models
- 10,919 engine specifications
- 12,075 model-year-engine combinations

# Vehicle Database with RockAuto Data Integration

## Project Overview

This project combines two components:
1. A comprehensive vehicle database system
2. A data extraction system for RockAuto.com vehicle information

Due to anti-bot measures on RockAuto.com, a manual extraction approach is recommended for collecting vehicle data.

## System Components

### 1. Vehicle Database
- SQLite database with normalized schema
- Tables for brands, models, years, engines, and their relationships
- Python API for managing the database
- Interactive query interface

### 2. Data Extraction Tools
- Automated scraper (for sites without anti-bot measures)
- Manual extraction guide for RockAuto.com
- Data import functionality

## Database Schema

The database consists of five main tables:

- **brands**: Vehicle manufacturers (Toyota, Ford, etc.)
- **models**: Vehicle models (Camry, F-150, etc.)
- **engines**: Engine specifications (2JZ-GTE, EcoBoost, etc.)
- **years**: Calendar years for vehicle production
- **model_year_engine**: Junction table linking all entities with trim levels and specifications

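To make the relationships between the five tables concrete, here is a minimal sketch using Python's built-in `sqlite3` module. Only the table names come from this README; every column name, constraint, and helper function (`build_demo_db`, `engines_for`) is an assumption for illustration. The authoritative definitions are in `sql/schema.sql`.

```python
import sqlite3

# Hypothetical columns; the real definitions live in sql/schema.sql.
SCHEMA = """
CREATE TABLE brands  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE models  (id INTEGER PRIMARY KEY,
                      brand_id INTEGER NOT NULL REFERENCES brands(id),
                      name TEXT NOT NULL,
                      UNIQUE (brand_id, name));
CREATE TABLE engines (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE years   (id INTEGER PRIMARY KEY, year INTEGER NOT NULL UNIQUE);
CREATE TABLE model_year_engine (
    id         INTEGER PRIMARY KEY,
    model_id   INTEGER NOT NULL REFERENCES models(id),
    year_id    INTEGER NOT NULL REFERENCES years(id),
    engine_id  INTEGER NOT NULL REFERENCES engines(id),
    trim_level TEXT,
    UNIQUE (model_id, year_id, engine_id, trim_level)
);
"""


def build_demo_db():
    """Create an in-memory database holding one example vehicle."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO brands (id, name) VALUES (1, 'Toyota')")
    conn.execute("INSERT INTO models (id, brand_id, name) VALUES (1, 1, 'Camry')")
    conn.execute("INSERT INTO engines (id, name) VALUES (1, '2.5L 4-Cylinder')")
    conn.execute("INSERT INTO years (id, year) VALUES (1, 2020)")
    conn.execute(
        "INSERT INTO model_year_engine (model_id, year_id, engine_id, trim_level) "
        "VALUES (1, 1, 1, 'LE')"
    )
    conn.commit()
    return conn


def engines_for(conn, brand, model, year):
    """Return (engine name, trim level) pairs for one brand/model/year."""
    return conn.execute(
        """
        SELECT e.name, mye.trim_level
        FROM model_year_engine AS mye
        JOIN models  AS m ON m.id = mye.model_id
        JOIN brands  AS b ON b.id = m.brand_id
        JOIN years   AS y ON y.id = mye.year_id
        JOIN engines AS e ON e.id = mye.engine_id
        WHERE b.name = ? AND m.name = ? AND y.year = ?
        ORDER BY e.name
        """,
        (brand, model, year),
    ).fetchall()


conn = build_demo_db()
print(engines_for(conn, "Toyota", "Camry", 2020))
```

The junction table is what keeps the schema normalized: each brand, model, year, and engine is stored once, and a vehicle configuration is just a row of foreign keys.
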
## Using the System

### Initial Setup
```bash
cd vehicle_database
./setup.sh
```

### Querying the Database
```bash
python3 scripts/query_interface.py
```

### Adding More Data Manually
```python
import sys
from pathlib import Path

# Make the sibling vehicle_scraper/ directory importable.
# Assumes the script is run from inside vehicle_database/.
sys.path.insert(0, str(Path("..") / "vehicle_scraper"))

from manual_input import ManualDataInput

input_tool = ManualDataInput()

# Add a single vehicle
input_tool.add_vehicle_data("Toyota", "Corolla", 2021, "1.8L 4-Cylinder")

# Add multiple vehicles
vehicles = [
    {"make": "Nissan", "model": "Altima", "year": 2020, "engine": "2.5L 4-Cylinder"},
    {"make": "Hyundai", "model": "Elantra", "year": 2019, "engine": "2.0L 4-Cylinder"},
]
input_tool.add_multiple_vehicles(vehicles)
```

## Manual Data Extraction from RockAuto.com

Since RockAuto has anti-bot measures, follow this process:

1. Open your web browser and go to: https://www.rockauto.com
2. Click on the "Catalog" link in the navigation menu
3. You will see a list of vehicle manufacturers (makes)
4. For each manufacturer:
   - Click on the manufacturer name
   - You'll see a page with vehicle models organized by year
   - Note down the models and years you see
5. To find engine information:
   - Click on a specific model/year combination
   - You'll see parts categories for that vehicle
   - Look for the "Engine" or "Engine Mechanical" category
   - Note down the engine type/specifications
6. Use the `ManualDataInput` class to add the collected data to your database

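Notes taken this way tend to be grouped by model, with a span of years and a few engines per model. The sketch below shows one possible way to expand such a note into the per-vehicle dictionaries used in the `add_multiple_vehicles` example. The helper name `expand_notes` is hypothetical, and it assumes every noted engine was offered in every noted year, which you should verify against the catalog before importing.

```python
from itertools import product


def expand_notes(make, model, years, engines):
    """Return one record per (year, engine) pair noted for a model.

    Assumption: each engine applies to each year. Split the note into
    several calls when that is not true for the vehicle in question.
    """
    return [
        {"make": make, "model": model, "year": year, "engine": engine}
        for year, engine in product(years, engines)
    ]


# Example note: "Toyota Corolla, 2019 through 2021, 1.8L 4-Cylinder"
vehicles = expand_notes("Toyota", "Corolla", range(2019, 2022), ["1.8L 4-Cylinder"])
for vehicle in vehicles:
    print(vehicle)
```

The resulting list has the same shape as the `vehicles` list passed to `input_tool.add_multiple_vehicles(vehicles)`.
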
## File Structure
```
vehicle_database/                  # Main database system
├── sql/
│   └── schema.sql                 # Database schema
├── scripts/
│   ├── database_manager.py        # Main database manager
│   ├── query_interface.py         # Interactive query interface
│   └── csv_importer.py            # CSV import functionality
├── data/                          # Sample CSV data files
├── vehicle_database.db            # SQLite database file
├── setup.sh                       # Setup script
├── README.md                      # Project documentation
└── GETTING_STARTED.md             # Getting started guide

vehicle_scraper/                   # Data extraction tools
├── rockauto_scraper.py            # Automated scraper (for other sites)
├── rockauto_scraper_enhanced.py   # Enhanced scraper
├── manual_input.py                # Manual input tool
├── manual_input_simple.py         # Simplified manual input
└── requirements.txt               # Python dependencies
```

## Extending the Database

To add more vehicle data:
1. Collect data manually from RockAuto.com using the provided guide
2. Use the `ManualDataInput` class to add data to the database
3. Or prepare CSV files in the required format and use the CSV importer

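The required CSV format is not spelled out in this README, so the following is only a sketch of how a file could be checked before import. It assumes a header row with the columns `make`, `model`, `year`, and `engine`, mirroring the dictionary keys in the manual input example; the function name `read_vehicle_rows` is hypothetical and is not part of `csv_importer.py`. Check the sample files in `data/` for the actual layout.

```python
import csv
import io

REQUIRED = ("make", "model", "year", "engine")  # assumed column names


def read_vehicle_rows(handle):
    """Parse CSV rows into vehicle dicts, collecting problems instead of raising."""
    reader = csv.DictReader(handle)
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")

    records, errors = [], []
    # Line 1 is the header, so data rows start at line 2
    # (this assumes no field contains an embedded newline).
    for line_no, row in enumerate(reader, start=2):
        values = {col: (row[col] or "").strip() for col in REQUIRED}
        if not all(values.values()):
            errors.append((line_no, "empty field"))
            continue
        if not values["year"].isdigit():
            errors.append((line_no, "year is not a number"))
            continue
        values["year"] = int(values["year"])
        records.append(values)
    return records, errors


sample = io.StringIO(
    "make,model,year,engine\n"
    "Nissan,Altima,2020,2.5L 4-Cylinder\n"
    "Hyundai,Elantra,,2.0L 4-Cylinder\n"
)
records, errors = read_vehicle_rows(sample)
print(records)
print(errors)
```

Collecting errors by line number, rather than stopping at the first bad row, makes it easier to fix a hand-typed file in one pass.
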
## Future Enhancements

- Web scraping capabilities for other automotive parts sites
- Export functionality to share data
- Advanced search and filtering options
- Data validation and cleaning tools

## Troubleshooting

If you encounter issues:
1. Check that Python 3.x is installed
2. Ensure all required packages are installed (run `pip3 install -r requirements.txt` from the `vehicle_scraper/` directory)
3. Verify database file permissions
4. Check that the schema matches the expected structure

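Several of these checks can be automated. The sketch below is one possible way to do so using only the standard library; `check_environment` is a hypothetical helper, not something shipped with the project. It compares the database against the five table names listed under Database Schema and does not inspect columns.

```python
import contextlib
import os
import sqlite3
import sys

EXPECTED_TABLES = {"brands", "models", "engines", "years", "model_year_engine"}


def check_environment(db_path):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3,):
        problems.append("Python 3.x is required")
    if not os.path.isfile(db_path):
        problems.append(f"database file not found: {db_path}")
        return problems
    if not os.access(db_path, os.R_OK | os.W_OK):
        problems.append(f"database file is not readable and writable: {db_path}")
    try:
        with contextlib.closing(sqlite3.connect(db_path)) as conn:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            ).fetchall()
    except sqlite3.Error as exc:
        problems.append(f"cannot read database: {exc}")
        return problems
    missing = EXPECTED_TABLES - {name for (name,) in rows}
    if missing:
        problems.append("missing tables: " + ", ".join(sorted(missing)))
    return problems


if __name__ == "__main__":
    for problem in check_environment("vehicle_database.db"):
        print("-", problem)
```

Run it from inside `vehicle_database/` so the relative path resolves to the database file shown in the file structure.
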
The system is now ready to use. You can start by exploring the existing data through the query interface and then add more data as needed.