This is a general problem when trying to compare OSM data with aerial imagery. I...

This is a general problem when trying to compare OSM data with aerial imagery. I've worked a lot with orthos from Open Aerial Map, whose stated goal is to provide high quality imagery that's licensed for mapping. If you try and take OSM labels from the bounding boxes of those images and use them for segmentation labels, they're often misaligned or not detailed enough. In theory those images ought to have the best corresponding data, but OAM allows people to upload open imagery generally and not all of it is mapped.

I've spent a lot of time building models for tree mapping. In theory you could use that as a pipeline with OAM to generate forest regions for OSM and it would probably be better than human labels which tend to be very coarse. I wouldn't discount AI labeling entirely, but it does need oversight and you probably want a high confidence threshold. One other thought is you could compare overlap between predicted polygons and human polygons and use that as a prompt to review for refinement. This would be helpful for things like individual buildings which tend to not be mapped particularly well (i.e. tight to the structure), but a modern segmentation model can probably provide very tight polygons.