Marey by Moonvalley: AI video generation “trained on licensed footage. No scraped content.”

That tagline stopped me in my tracks.

I’ve been following the “where did your training data come from” debate. But this one hit closer to home: if I were an artist, would I be okay with someone training on my work?

How about ripping off my code? We’ve all copy/pasted some code from Stack Overflow (don’t look at me like that!), but my actual proprietary work?

I’ve managed people who got pissed seeing their slides show up in other people’s decks. Me? I was flattered that they needed to rip us off to get ahead.

This is philosophical - not about the law (that’s clearly playing out), but about the value people place on ethical solutions.

Two corners of the same ring

In this corner: the “Ethical” Training Approach

  • Anthropic - partnering with publishers for licensed training data and offering revenue sharing
  • Adobe Firefly - exclusively trained on Adobe Stock, openly licensed content, and public domain works

In the other corner: the “Public Internet = Fair Game” Approach

  • OpenAI - facing lawsuits from the NY Times, authors, and artists over unauthorized training data usage
  • Meta’s models - trained on massive web scrapes

If it’s out there for a human to see, read, and use, then why can’t the AI do the same? (So goes my simplified version of the thinking.)

Beyond ethics: the commercial question

Here’s where I think the question goes beyond simple ethics.

The fair-game approach gets you more data at a lower cost. A higher-quality solution plus better business economics equals a potential competitive advantage.

I’d argue that what people are after is that competitive advantage.

So shouldn’t we ask if being ethical can be a competitive advantage?

To draw an analogy, the licensed approach is like ESG investing. Over the past generation, we’ve seen a rise in people who let their values drive their decisions. All else being equal, that makes perfect sense - but if the “ethical” solution is sub-par, will the buyer compromise their values?

The entrepreneur’s calculus

You’re an AI entrepreneur. There’s lots of opportunity… but a crowded field. You want to stand out. You’re literally tossing piles of money into a fire on compute costs - never mind sky-high salaries for talent.

I bet 9 out of 10 don’t consider the source of their training data a key product criterion.

I bet fewer than 1 out of 10 believe that buyers will choose a product based on how its training content was licensed.

My honest take: I believe creators deserve fair attribution and compensation for their work. Content isn’t free just because it’s accessible.

Yet I’ll admit - I turn a blind eye to whether my AI toolkit is leveraging “unlicensed” content for its training. I don’t ask. Do you?

Moonvalley is betting that transparency and ethical training will become competitive advantages. The question is whether consumers and businesses will eventually vote with their wallets for ethically-trained AI.