Congress Wants Tech Companies to Pay Up for AI Training Data

stopthatgirl7@kbin.social · 6 months ago

Grimy@lemmy.world · edit-2 6 months ago

“What would that even look like?” asks Sarah Kreps, who directs the Tech Policy Institute at Cornell University. “Requiring licensing data will be impractical, favor the big firms like OpenAI and Microsoft that have the resources to pay for these licenses, and create enormous costs for startup AI firms that could diversify the marketplace and guard against hegemonic domination and potential antitrust behavior of the big firms.”

As our economy becomes more and more driven by AI, legislation like this will guarantee Microsoft and Google get to own it.

TORFdot0@lemmy.world · 6 months ago

And what about the authors whose works were ingested without compensation? What should we do for them? I don’t think that these commercial AI models should get to infringe on their copyrights for nothing. If I pay for a ChatGPT subscription and ask it to tell me about the war in the Middle East, and it basically regurgitates and plagiarizes information it learned from a journalist, then ChatGPT has essentially stolen the copyrighted work from that journalist, along with the revenue that my click would have earned them.

I don’t see a problem with using publicly posted copyrighted data for non-commercial training of local language models, but I don’t think it’s fair to allow copyright infringement for commercial use.

Grimy@lemmy.world · edit-2 6 months ago

I think it’s better to be pragmatic than to give everything to the big corporations.

OpenAI isn’t going to take its tool offline, so the loss of revenue isn’t going away. Payments won’t end up in the pockets of any individual journalist. The money the few journalistic sites receive will be used to pay the subscription fee for the next big model while they cut staff, since that will net them more money.

If this goes through, Google and Microsoft will spend the next few years buying data or the companies that have it. The walls will be raised and we will be fucked; legislation will only help them.

And there is simply not enough public domain data to build a competitive product. Better to tax and redistribute through UBI while keeping the field competitive and avoiding monopolies.

be_excellent_to_each_other@kbin.social · 6 months ago

Well fuck all those artists and writers who made the original works then I guess. Licensing is impractical.