This project is a 2D self-driving car simulation developed in Python using Pygame. It features a Q-learning agent that learns to navigate a circuit by interacting with its environment and optimizing its actions through a reward system.
- Reinforcement Learning: Implements Q-learning to train an AI agent to navigate a circuit.
- Sensor System: The vehicle is equipped with sensors that provide information about its surroundings, allowing for informed decision-making.
- Visual Feedback: Real-time visualization of the vehicle's performance, including speed, scores, and sensor values.
- Logging: Tracks the performance of the agent across episodes and stores it for further analysis.
- Dual Mode Operation: Supports both training and simulation modes through the LEARNING_MODE configuration.
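Under the hood, the agent's learning step is the standard tabular Q-learning update. A minimal sketch (the ALPHA, GAMMA, and three-action state encoding here are illustrative assumptions, not the project's actual config values):

```python
from collections import defaultdict

# Illustrative tabular Q-learning update; ALPHA, GAMMA, and the 3-action
# space are assumptions for this sketch, not the project's config values.
ALPHA = 0.1   # learning rate
GAMMA = 0.95  # discount factor

q_table = defaultdict(lambda: [0.0] * 3)  # state -> one Q-value per action

def q_update(state, action, reward, next_state):
    """Move Q(s, a) toward reward + GAMMA * max_a' Q(s', a')."""
    td_target = reward + GAMMA * max(q_table[next_state])
    q_table[state][action] += ALPHA * (td_target - q_table[state][action])

# One transition: e.g. crossing a checkpoint yields a positive reward
q_update(state=(2, 1, 0), action=1, reward=1.0, next_state=(2, 2, 0))
```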
To run this project, you'll need Python 3.x along with the required libraries.

1. Clone the repository:

   ```shell
   git clone https://github.com/matiascarabella/self-driving-ai.git
   cd self-driving-ai
   ```

2. Install the dependencies:

   ```shell
   pip install -r requirements.txt
   ```

3. Run the simulation:

   ```shell
   python main.py
   ```

For much faster training without rendering (perfect for overnight sessions), use:

```shell
python train_fast.py
```
Override config settings from the command line:

```shell
python main.py --circuit circuit_1 --episodes 100 --headless
python main.py --eval    # Evaluation mode (no training)
python main.py --manual  # Manual control with arrow keys
```

To watch your trained agent perform without any learning or logging, run:

```shell
python watch_agent.py
```

The agent uses its learned knowledge deterministically.
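Flags like these are typically wired up with argparse. A hedged sketch that mirrors the flags shown above (this is illustrative, not the project's actual main.py):

```python
import argparse

# Illustrative parser; the real main.py may define these flags differently.
parser = argparse.ArgumentParser(description="2D self-driving car simulation")
parser.add_argument("--circuit", default="circuit_2", help="circuit to load")
parser.add_argument("--episodes", type=int, default=50, help="number of episodes")
parser.add_argument("--headless", action="store_true", help="run without rendering")
parser.add_argument("--eval", action="store_true", help="evaluation mode (no training)")
parser.add_argument("--manual", action="store_true", help="manual control with arrow keys")

# Parse an explicit argv list so the sketch runs anywhere:
args = parser.parse_args(["--circuit", "circuit_1", "--episodes", "100", "--headless"])
print(args.circuit, args.episodes, args.headless)  # circuit_1 100 True
```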
If your agent gets stuck or isn't learning well:
- Increase exploration: Lower `MIN_EXPLORATION_RATE` in config.py
- Train longer: Use `--episodes 10000` or more
- Try different circuits: Each circuit teaches different skills
- Adjust rewards: Tune `REWARD_CONFIG` values in config.py
```
self-driving-ai/
├── assets/
│   └── images/
│       ├── circuit_1.png            # Horizontal circuit (1893x493)
│       └── circuit_2.png            # Square circuit (801x601)
├── logs/
│   ├── q_learning/
│   │   ├── circuit1_v1.txt          # Episode scores for circuit 1
│   │   ├── circuit1_v1_metrics.json
│   │   ├── circuit2_v1.txt          # Episode scores for circuit 2
│   │   └── circuit2_v1_metrics.json
│   └── logger.py                    # Logging utilities
├── machine_learning/
│   └── q_learning/
│       ├── q_tables/
│       │   ├── circuit1_v1.pkl      # Learned Q-table for circuit 1
│       │   └── circuit2_v1.pkl      # Learned Q-table for circuit 2
│       └── agent.py                 # Q-learning agent implementation
├── models/
│   ├── checkpoint.py                # Checkpoint detection system
│   ├── environment.py               # Game environment and rendering
│   ├── sensor.py                    # Vehicle sensor system
│   └── vehicle.py                   # Vehicle physics and state
├── visualization/
│   └── plot_training.py             # Training progress visualization
├── .gitignore
├── config.py                        # All configuration parameters
├── LICENSE
├── main.py                          # Main entry point with CLI support
├── README.md
├── requirements.txt                 # Python dependencies (pinned versions)
├── train_fast.py                    # Headless training wrapper
└── watch_agent.py                   # Evaluation mode wrapper
```
The project includes a `config.py` file where you can adjust various parameters:

```python
SESSION_CONFIG = {
    "TRAINING_MODE": True,    # Toggle between training and evaluation modes
    "NUM_EPISODES": 50,       # Number of episodes to run
    "EPISODE_DURATION": 20,   # Duration of each episode in seconds
    "MANUAL_CONTROL": False,  # Enable manual control with arrow keys
    "HEADLESS": False,        # Run without rendering (5-10x faster)
    "FRAME_SKIP": 1,          # Render every Nth frame (higher = faster)
    "CIRCUIT": "circuit_2"    # Which circuit to use: "circuit_1" or "circuit_2"
}

CIRCUIT_CONFIG = {
    "circuit_1": {
        "window_size": (1200, 400),
        "start_angle": 0,    # Point right
        "q_table": "circuit1_v1.pkl"
    },
    "circuit_2": {
        "window_size": (800, 600),
        "start_angle": 180,  # Point left
        "q_table": "circuit2_v1.pkl"
    }
}
```

- Training Mode (`TRAINING_MODE = True`):
  - Used for training the agent
  - Agent explores new actions using an epsilon-greedy strategy
  - Updates the Q-table based on experiences
  - Behavior varies between runs due to exploration
- Evaluation Mode (`TRAINING_MODE = False`):
  - Used for testing or demonstrating learned behavior
  - Agent uses learned knowledge deterministically
  - No Q-table updates or exploration
  - Consistent behavior between runs
You can also adjust:

- Vehicle settings (dimensions, speed, acceleration)
- Q-learning parameters (learning rate, discount factor, exploration rate)
- Window and display settings
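The training/evaluation distinction above comes down to epsilon-greedy action selection. A minimal sketch (the function and variable names are illustrative assumptions, not the project's actual agent code):

```python
import random

def choose_action(q_values, epsilon, training):
    """Epsilon-greedy in training mode; pure greedy (deterministic) in evaluation."""
    if training and random.random() < epsilon:
        return random.randrange(len(q_values))  # explore: random action
    return q_values.index(max(q_values))        # exploit: best-known action

# Evaluation mode ignores epsilon entirely, so behavior is repeatable:
q = [0.2, 0.9, -0.1]
print(choose_action(q, epsilon=0.3, training=False))  # always prints 1
```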
The training results are logged in the `logs/q_learning/` folder:

- `circuit1_v1.txt` / `circuit2_v1.txt`: record the final score for each episode
- `circuit1_v1_metrics.json` / `circuit2_v1_metrics.json`: detailed metrics including:
  - Episode number
  - Score and distance traveled
  - Exploration rate (epsilon)
  - Collision status
  - Finish line reached
These logs can be used for performance analysis and progress visualization. Each circuit maintains separate logs.
Visualize your agent's training progress:
```shell
python visualization/plot_training.py
```

This shows distance traveled over episodes - the primary metric for learning progress. The visualization:
- Displays raw distance data with moving average
- Shows max distance achieved with reference line
- Marks new records with star (★) indicators
- Includes key statistics (episodes, peak, avg last 100)
- Automatically merges multiple training sessions into a continuous timeline
The plot automatically uses the metrics file for the current circuit in config.py.
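The moving average in the plot can be reproduced with a simple trailing window. A minimal sketch (the window size of 100 matches the "avg last 100" statistic above, but is an assumption here):

```python
def moving_average(values, window=100):
    """Trailing moving average; early entries average over what's available so far."""
    out, total = [], 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the value that left the window
        out.append(total / min(i + 1, window))
    return out

print(moving_average([10, 20, 30, 40], window=2))  # [10.0, 15.0, 25.0, 35.0]
```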
This project is licensed under the MIT License.
- OpenAI for inspiring the use of AI and reinforcement learning concepts.
- Pygame for the graphics library used in this project.

