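Growing restrictions on web data are affecting the training of AI models such as ChatGPT. A study from MIT finds that about 25% of the highest-quality data is now restricted. Smaller firms, unable to afford licensing fees, are turning to synthetic data, while licensing deals have been struck with publishers such as the AP and News Corp. Researchers warn of an emerging consent crisis, because compliance with robots.txt is voluntary and varies from crawler to crawler. Perplexity AI is among the companies that have addressed the issue.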
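For readers unfamiliar with how robots.txt expresses these restrictions, the following is a minimal sketch of how a well-behaved crawler might check a site's rules before fetching a page. It uses Python's standard urllib.robotparser module. The robots.txt content, the crawler names (ExampleAIBot, OtherBot), and the example.com URLs are all hypothetical, invented for illustration. The check is purely advisory: nothing in the protocol prevents a crawler from ignoring the answer, which is why compliance varies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for illustration only.
# It blocks one named AI crawler entirely and restricts everyone
# else from a single directory.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "OtherBot"):
        for url in ("https://example.com/articles/1",
                    "https://example.com/private/x"):
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

A real crawler would download the site's live robots.txt (for example with the parser's set_url() and read() methods) rather than use an inline string; the inline version is used here only so the sketch runs offline.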