If the training data contains a lot of copyrighted material, the AI could end up trained to favor results that include parts that resemble copyrighted material.
How the copyrighted material was acquired could matter. If it was scraped from social media through a valid API, odds are the social media company claims a license to redistribute any content uploaded to it. But what if a lot of that content was uploaded illegally to begin with? The AI company could be unwittingly paying for stolen art. Or what if the AI company buys a curated collection of training data that a different group put together? Odds are it includes copyrighted work, and the company selling it is unlikely to be licensed to do so…
A third issue is the intent of copyright. There is nothing natural or real about copyright. It is a concept we invented to aid creators so they can profit from their work and continue to produce more content, which in turn benefits society. If AI art threatens that system, a court might decide, in order to preserve that intent, that protected art cannot be included in training data.
Did you see that ludicrous display last night?