
Comparing the 13B model here https://huggingface.co/cerebras/Cerebras-GPT-13B to LLaMA-13B https://github.com/facebookresearch/llama/blob/main/MODEL_CA... you can see that Cerebras-GPT lags behind on all of the reasoning tasks. Is there any reason to use Cerebras instead of LLaMA? It doesn't seem like it.
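
If you want to poke at this yourself: here is a minimal sketch of how these reasoning benchmarks are typically scored, by ranking each answer choice by its log-likelihood under the model. The example prompt and choices are made up, and the tokenization shortcut at the context/choice boundary is approximate; real harnesses handle it more carefully.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Zero-shot multiple-choice scoring: rank each choice by its
    # log-likelihood under the model, as eval harnesses typically do.
    model_id = "cerebras/Cerebras-GPT-13B"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    def choice_logprob(context: str, choice: str) -> float:
        # Assumes tokenizing context+choice splits cleanly at the boundary.
        ids = tok(context + choice, return_tensors="pt").input_ids.to(model.device)
        ctx_len = len(tok(context).input_ids)
        with torch.no_grad():
            logits = model(ids).logits
        logprobs = torch.log_softmax(logits[0, :-1], dim=-1)  # predicts tokens 1..L-1
        targets = ids[0, 1:]
        # Sum log-probs over the choice tokens only.
        span = logprobs[ctx_len - 1:].gather(-1, targets[ctx_len - 1:, None])
        return span.sum().item()

    context = "The man grabbed his umbrella because "
    choices = ["it was raining.", "he was hungry."]
    print(max(choices, key=lambda c: choice_logprob(context, c)))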


Can the LLaMA weights be used for commercial products?


There are two aspects to it.

The first one is whether they would actually sue; the optics would be terrible. A similar situation occurred in the 1990s when the RC4 cipher's code was leaked. Everyone used the leaked code while pretending it was a new cipher called ARC4 ("Alleged RC4"), even though people who had licensed the cipher confirmed that its output was identical. Nobody was sued, and RSA Security never acknowledged it.

The second one is related to the terms. The LLaMA weights themselves are licensed under terms that exclude commercial use:[0]

> You will not […] use […] the Software Products (or any derivative works thereof, works incorporating the Software Products, or any data produced by the Software), […] for […] any commercial or production purposes.

But the definition of derivative works is a gray area. AFAIK, if LLaMA is distilled, there is an unsettled argument to be had that the end result is not a LLaMA derivative and cannot be considered copyright or license infringement, similar to how models trained on blog articles and tweets do not infringe those authors' copyrights or licenses. The people who make the new model may be in breach of the license if they agreed to it, but the people who merely use that new model may not be. Otherwise, ad absurdum, any model trained on the Internet after Feb 2023 would have LLaMA-generated content in its training set, and would therefore break the license.
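
To make the distillation point concrete, here is a minimal sketch of a single knowledge-distillation training step, assuming generic teacher/student causal LMs (`teacher`, `student`, and `batch` are hypothetical stand-ins): the student only ever sees the teacher's output distribution, never its weights, which is what fuels the "not a derivative work" argument.

    import torch
    import torch.nn.functional as F

    # One knowledge-distillation step: train the student to match the
    # teacher's output distribution. `teacher`, `student`, and `batch`
    # (a tensor of token ids) are hypothetical stand-ins.
    def distill_step(teacher, student, batch, optimizer, temperature=2.0):
        with torch.no_grad():
            teacher_logits = teacher(batch).logits  # teacher is frozen
        student_logits = student(batch).logits
        loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2  # standard distillation-loss scaling
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()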

IANAL, but ultimately, Meta wins more by benefiting from what the community contributes on top of their work (similar to what happened with React) than by suing developers who use derivatives of their open models.

[0]: https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z...


Unclear, and likely jurisdiction-dependent; almost certainly not if you need to operate worldwide.


LLaMA is licensed for non-commercial use only.


It lags behind because, according to their blog post, it was trained on <300B tokens. The LLaMA models, as far as I know, were trained on more than a trillion.


The LLaMA paper says 1 trillion tokens for the smaller models (7B, 13B) and 1.4 trillion for the larger models (33B, 65B).



