Question about models and weights. When an organization says they release the weights, how is that different from when an organization releases a model, say Whisper from OpenAI? In what way are the model releases different in these cases?
OpenAI also released the weights for Whisper. Some model releases, like LLaMa from Meta, contain only the training code and the data. You can train the weights yourself, but for LLaMa that would take weeks on very expensive hardware.
(Meta did release the LLaMa weights to researchers, and they later leaked online, but the weights are not open-source.)
When a company releases a model including the weights, you can download their pre-trained weights and run inference with them, without having to do any training yourself.
When you download a trained model for use from Python, I'm assuming the file contains both the architecture (a neural net, or even a boosted tree) and the weights / tree structure that make the model actually usable for inference. When organizations release a trained model, are the weights necessary to make use of it? If not, are they not really releasing the model, but just the architecture and training data?
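The architecture-vs-weights split can be sketched in pure Python (file names and values here are hypothetical, not any real checkpoint format): the architecture is the code that defines the computation, and the weights are just the numbers that parameterize it. Shipping only the code means the user must produce those numbers via training; shipping both means they can run inference immediately.

```python
import json
import os
import tempfile

# "Architecture": code defining the computation -- here a single linear
# layer y = w*x + b. Without weights it cannot predict anything useful.
def predict(x, weights):
    return weights["w"] * x + weights["b"]

# A released checkpoint is essentially a file of numbers like this
# (real formats include .pt, .safetensors, .ckpt). Values are made up.
checkpoint = {"w": 2.0, "b": 1.0}

# "Releasing the weights" = shipping this file alongside the architecture code.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
with open(path, "w") as f:
    json.dump(checkpoint, f)

# A user downloads the checkpoint and runs inference right away -- no training.
with open(path) as f:
    weights = json.load(f)

print(predict(3.0, weights))  # 2.0 * 3.0 + 1.0 = 7.0
```

Real frameworks do the same thing at scale: for example, PyTorch's `state_dict` is a dictionary mapping parameter names to tensors, saved and loaded separately from the model code.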
For example, with Lucene: the model is the Java library, the data is text (like Wikipedia), and the weights are the Lucene index. If you have all three, you can start searching right away. If you have the model + data, you first have to generate the index, which can take a lot of time; training/indexing takes much longer than searching (using the model). If you have just the model, you need to get your own data and run the training/indexing on it.
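The Lucene analogy can be sketched with a toy inverted index in Python (illustrative only, not Lucene's actual API): building the index is the expensive one-time "training" step, and searching it is the cheap "inference" step.

```python
from collections import defaultdict

# "Data": raw documents (made-up Wikipedia-style snippets).
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick thinking saves the dog",
}

# "Training" / indexing: the expensive step that turns raw data into a
# reusable artifact -- the analogue of released weights.
def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[token].add(doc_id)
    return index

# "Inference" / search: cheap lookups against the prebuilt index.
def search(index, term):
    return sorted(index.get(term, set()))

index = build_index(docs)
print(search(index, "quick"))  # ids of docs containing "quick"
print(search(index, "dog"))
```

With the prebuilt `index` in hand you search instantly; without it, you must re-run `build_index` over all your data first, which is exactly the model-without-weights situation.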