- SQLite database with vehicle information
- Web dashboard built with Flask and Bootstrap
- Web scraping scripts for RockAuto
- CLI for queries
- Complete project documentation

Includes:
- 12 vehicle brands
- 10,923 models
- 10,919 engine specifications
- 12,075 model-year-engine combinations
171 lines
6.2 KiB
Python
"""
Manual Data Extraction Guide for RockAuto.com

Since RockAuto has strong anti-bot measures, here's a manual approach to extract vehicle data:

1. Visit https://www.rockauto.com/
2. Click on "Catalog" in the navigation menu
3. You'll see a list of vehicle manufacturers (makes)
4. For each make, manually note down the models, years, and engines

This script provides a framework to input the manually collected data into your database.
"""

import sqlite3
from typing import List, Dict


class ManualDataInput:
    def __init__(self, db_path: str = "../vehicle_database/vehicle_database.db"):
        self.db_path = db_path

    def add_vehicle_data(self, make: str, model: str, year: int, engine: str = "Unknown"):
        """Add a single vehicle entry to the database"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        try:
            # Insert brand
            cursor.execute(
                "INSERT OR IGNORE INTO brands (name) VALUES (?)",
                (make,)
            )
            cursor.execute("SELECT id FROM brands WHERE name = ?", (make,))
            brand_id = cursor.fetchone()[0]

            # Insert year
            cursor.execute(
                "INSERT OR IGNORE INTO years (year) VALUES (?)",
                (year,)
            )
            cursor.execute("SELECT id FROM years WHERE year = ?", (year,))
            year_id = cursor.fetchone()[0]

            # Insert engine
            cursor.execute(
                "INSERT OR IGNORE INTO engines (name) VALUES (?)",
                (engine,)
            )
            cursor.execute("SELECT id FROM engines WHERE name = ?", (engine,))
            engine_id = cursor.fetchone()[0]

            # Insert model
            cursor.execute(
                "INSERT OR IGNORE INTO models (brand_id, name) VALUES (?, ?)",
                (brand_id, model)
            )
            cursor.execute("SELECT id FROM models WHERE brand_id = ? AND name = ?", (brand_id, model))
            model_id = cursor.fetchone()[0]

            # Link model, year, and engine
            cursor.execute(
                """INSERT OR IGNORE INTO model_year_engine
                (model_id, year_id, engine_id) VALUES (?, ?, ?)""",
                (model_id, year_id, engine_id)
            )

            conn.commit()
            print(f"Added: {year} {make} {model} with {engine}")

        except Exception as e:
            print(f"Error adding vehicle: {e}")
        finally:
            conn.close()

    def add_multiple_vehicles(self, vehicles: List[Dict]):
        """Add multiple vehicles at once"""
        for vehicle in vehicles:
            self.add_vehicle_data(
                make=vehicle.get('make', ''),
                model=vehicle.get('model', ''),
                year=vehicle.get('year', 0),
                engine=vehicle.get('engine', 'Unknown')
            )

    def show_extraction_guide(self):
        """Show the manual extraction guide"""
        guide = """
        ================================================
        Manual RockAuto Data Extraction Guide
        ================================================

        1. OPEN YOUR WEB BROWSER and go to: https://www.rockauto.com

        2. CLICK on the "Catalog" link in the navigation menu

        3. YOU WILL SEE a list of vehicle manufacturers (makes) like:
           - Acura
           - Audi
           - BMW
           - Chevrolet
           - Ford
           - Honda
           - Toyota
           - And many more...

        4. FOR EACH MANUFACTURER:
           a) Click on the manufacturer name
           b) You'll see a page with vehicle models organized by year
           c) Note down the models and years you see
           d) Example format: 2020 Honda Civic, 2019 Ford F-150, etc.

        5. TO FIND ENGINE INFORMATION:
           a) Click on a specific model/year combination
           b) You'll see parts categories for that vehicle
           c) Look for "Engine" or "Engine Mechanical" category
           d) Note down the engine type/specifications

        6. USE THE FOLLOWING COMMANDS to add data to your database:

           Example Python commands:
           >>> from manual_input import ManualDataInput
           >>> input_tool = ManualDataInput()
           >>> input_tool.add_vehicle_data("Toyota", "Camry", 2020, "2.5L 4-Cylinder")
           >>> input_tool.add_vehicle_data("Honda", "Civic", 2019, "1.5L Turbo")

           Or add multiple at once:
           >>> vehicles = [
           ...     {"make": "Ford", "model": "F-150", "year": 2021, "engine": "3.5L V6"},
           ...     {"make": "BMW", "model": "X3", "year": 2020, "engine": "2.0L 4-Cylinder Turbo"}
           ... ]
           >>> input_tool.add_multiple_vehicles(vehicles)

        7. TIPS FOR EFFICIENT DATA COLLECTION:
           - Focus on popular makes/models first
           - Record data in a spreadsheet as you go
           - Take screenshots of pages for reference
           - Be systematic - go alphabetically or by make popularity

        ================================================
        """
        print(guide)


def main():
    print("Manual RockAuto Data Extraction Tool")
    print("=====================================")

    input_tool = ManualDataInput()

    # Show the extraction guide
    input_tool.show_extraction_guide()

    # Add sample vehicles to database
    print("\nAdding sample vehicles to database:")
    sample_vehicles = [
        {"make": "Toyota", "model": "Camry", "year": 2020, "engine": "2.5L 4-Cylinder"},
        {"make": "Honda", "model": "Civic", "year": 2019, "engine": "1.5L Turbo"},
        {"make": "Ford", "model": "F-150", "year": 2021, "engine": "3.5L V6"},
        {"make": "BMW", "model": "X3", "year": 2020, "engine": "2.0L 4-Cylinder Turbo"},
        {"make": "Chevrolet", "model": "Silverado", "year": 2022, "engine": "5.3L V8"}
    ]

    input_tool.add_multiple_vehicles(sample_vehicles)
    print("\nSample vehicles added to database!")

    print("\nYou can now use the ManualDataInput class to add more vehicles manually.")
    print("Import it in Python with: from manual_input import ManualDataInput")


if __name__ == "__main__":
    main()
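
The guide's tip to record data in a spreadsheet suggests one possible workflow: export the sheet as CSV and turn its rows into the list of dicts that `add_multiple_vehicles` expects. The sketch below is not part of the original script. The helper name `parse_vehicle_csv` and the column headers `make`, `model`, `year`, and `engine` are assumptions chosen to match the dict keys used above.

```python
import csv
import io
from typing import Dict, List


def parse_vehicle_csv(csv_text: str) -> List[Dict]:
    """Hypothetical helper: parse CSV text into vehicle dicts.

    Assumes a header row with make, model, year and an optional engine column.
    """
    vehicles = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        make = (row.get("make") or "").strip()
        model = (row.get("model") or "").strip()
        year_text = (row.get("year") or "").strip()
        # Skip incomplete rows instead of storing placeholder values.
        if not make or not model or not year_text.isdigit():
            continue
        # Fall back to the same default the script uses for a missing engine.
        engine = (row.get("engine") or "").strip() or "Unknown"
        vehicles.append(
            {"make": make, "model": model, "year": int(year_text), "engine": engine}
        )
    return vehicles


# Made-up sample rows; the third one has no make and is skipped.
sample_csv = """make,model,year,engine
Toyota,Camry,2020,2.5L 4-Cylinder
Honda,Civic,2019,
,Unknown Model,2021,3.5L V6
"""

vehicles = parse_vehicle_csv(sample_csv)
print(vehicles)
```

The resulting list could then be passed to `ManualDataInput().add_multiple_vehicles(vehicles)`. Skipping incomplete rows is a design choice of this sketch. The original method would instead insert an empty make or a year of 0 for missing keys.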