Dataset Information
We performed segmentation on data collected from the 10 towns that ship with CARLA by default, under three environmental conditions: sunny, rainy, and dusty.
The images have a 1:2 aspect ratio, and resizing them to sizes up to a maximum of (216 x 512) let us test the generalization performance of the segmentation model.
Class Definition
We remapped the 28 classes listed in the official CARLA documentation to 12 classes.
original_class = {
    0: "None", 1: "Roads", 2: "Sidewalks", 3: "Buildings", 4: "Other",
    5: "Other", 6: "Poles", 7: "TrafficLight", 8: "TrafficSigns", 9: "Vegetation",
    10: "Roads", 11: "None", 12: "Pedestrians", 13: "Vehicles", 14: "Vehicles",
    15: "Vehicles", 16: "Vehicles", 17: "Vehicles", 18: "Vehicles", 19: "Vehicles",
    20: "Other", 21: "Other", 22: "Other", 23: "Other", 24: "RoadLines",
    25: "Sidewalks", 26: "Other", 27: "Other", 28: "Other",
}
remap_class = {
    0: "None", 1: "Roads", 2: "Sidewalks", 3: "Buildings", 4: "Other",
    5: "Poles", 6: "TrafficLight", 7: "TrafficSigns", 8: "Vegetation",
    9: "Pedestrians", 10: "Vehicles", 11: "RoadLines",
}
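The two tables above can be combined into an old-ID to new-ID lookup by matching class names. The following is a minimal sketch, not necessarily how this project implements the remapping: `name_to_new_id`, `old_to_new`, and `remap_mask` are hypothetical names, and the mask is assumed to be a plain 2D list of integer IDs. For full-size images a NumPy lookup array would likely be faster.

```python
# Class tables copied from the section above.
original_class = {
    0: "None", 1: "Roads", 2: "Sidewalks", 3: "Buildings", 4: "Other",
    5: "Other", 6: "Poles", 7: "TrafficLight", 8: "TrafficSigns", 9: "Vegetation",
    10: "Roads", 11: "None", 12: "Pedestrians", 13: "Vehicles", 14: "Vehicles",
    15: "Vehicles", 16: "Vehicles", 17: "Vehicles", 18: "Vehicles", 19: "Vehicles",
    20: "Other", 21: "Other", 22: "Other", 23: "Other", 24: "RoadLines",
    25: "Sidewalks", 26: "Other", 27: "Other", 28: "Other",
}
remap_class = {
    0: "None", 1: "Roads", 2: "Sidewalks", 3: "Buildings", 4: "Other",
    5: "Poles", 6: "TrafficLight", 7: "TrafficSigns", 8: "Vegetation",
    9: "Pedestrians", 10: "Vehicles", 11: "RoadLines",
}

# Invert the 12-class table so each class name points at its new ID.
name_to_new_id = {name: new_id for new_id, name in remap_class.items()}

# Map every original ID to its new ID by matching on the class name.
old_to_new = {
    old_id: name_to_new_id[name] for old_id, name in original_class.items()
}


def remap_mask(mask):
    """Return a copy of a 2D mask (list of rows) with original IDs replaced by new IDs."""
    return [[old_to_new[pixel] for pixel in row] for row in mask]


# Tiny hypothetical mask: Roads, Roads (ID 10), RoadLines / Vehicles, Vehicles, Other.
example_mask = [[1, 10, 24], [14, 19, 28]]
print(remap_mask(example_mask))  # [[1, 1, 11], [10, 10, 4]]
```

Matching on names rather than hard-coding ID pairs keeps the lookup in sync if either table is edited later.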
You can view the video of the original class and the remapped mask below.
Several properties of the images influence how well a segmentation model trains:
- Image Resolution: High-resolution images provide finer detail but increase training time and memory use. Reducing the resolution lowers the computational cost but may lose detail.
- Image Quality: Poor-quality images (e.g., noisy or low-contrast) can reduce segmentation accuracy.
- Image Augmentation: Augmentation techniques (such as rotation, scaling, flipping, and brightness adjustment) help the model generalize across varied scenarios. Over-augmenting, however, can distort the data so much that the model learns unrealistic patterns and performs worse.
- Class Imbalance: If pixels of some classes vastly outnumber those of others, the model tends to favor the frequent classes, which can degrade segmentation accuracy for the rare ones.
- Annotation Quality: The quality of the ground-truth segmentation masks strongly affects the outcome of training. Inaccurate masks lower the accuracy the model can reach.
- Channel Information: Multi-channel images (e.g., RGB, infrared, depth) can provide additional information, potentially improving segmentation accuracy.
- Variability and Diversity: A diverse training set (varying lighting, angles, backgrounds, object sizes, etc.) helps the model generalize to real-world scenarios.
- Contextual Information: Context in an image can help predict the position of specific objects or structures, which is especially important in larger images.
- Spatial Dependencies: Accounting for the spatial dependencies between pixels can lead to more accurate segmentation.
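To make the class imbalance point concrete, one common mitigation is to weight each class in the loss inversely to its pixel frequency. This project does not state that it does so; the sketch below only illustrates the idea. The function name `inverse_frequency_weights` is hypothetical, masks are assumed to be 2D lists of class IDs, and the formula `total / (num_classes * count)` is the usual "balanced" heuristic.

```python
from collections import Counter


def inverse_frequency_weights(masks, num_classes):
    """Compute per-class weights that are inversely proportional to pixel counts.

    masks: iterable of 2D masks (lists of rows of integer class IDs).
    Returns a list of length num_classes; classes with no pixels get weight 0.0.
    """
    counts = Counter()
    for mask in masks:
        for row in mask:
            counts.update(row)

    total = sum(counts.values())
    weights = []
    for class_id in range(num_classes):
        n = counts.get(class_id, 0)
        # Skip absent classes to avoid dividing by zero.
        weights.append(total / (num_classes * n) if n else 0.0)
    return weights


# Hypothetical mask: five "Roads" pixels (ID 1) and one "Vehicles" pixel (ID 10).
masks = [[[1, 1, 1], [1, 1, 10]]]
weights = inverse_frequency_weights(masks, num_classes=12)
print(weights[1], weights[10])  # the rare class gets the larger weight
```

The resulting list could then be passed as the per-class weight of a weighted cross-entropy loss, so that errors on rare classes such as pedestrians count for more than errors on roads.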
Conclusion