Join the conversation

I explored DeepSeek-R1 on Ollama; it is available in multiple sizes:
- 1.5 billion
- 7 billion
- 8 billion
- 14 billion
- 32 billion
- 70 billion
- 671 billion parameters
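A rough back-of-envelope sketch of what those parameter counts mean on disk: size ≈ parameter count × bytes per parameter. The precision levels below (16-bit floats, 4-bit quantization) are common assumptions, not Ollama's exact packaging for each tag.

```python
# Back-of-envelope model size: parameters × bytes per parameter.
# The quantization levels here are illustrative assumptions.
def approx_size_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate on-disk size in gigabytes (1 GB = 1e9 bytes)."""
    return params_billions * bytes_per_param

for size_b in [1.5, 7, 8, 14, 32, 70, 671]:
    fp16 = approx_size_gb(size_b, 2.0)   # 16-bit floats, 2 bytes/param
    q4 = approx_size_gb(size_b, 0.5)     # 4-bit quantization, 0.5 bytes/param
    print(f"{size_b:>6}B params ~ {fp16:>7.1f} GB fp16, {q4:>6.1f} GB 4-bit")
```

This is why the 671B model is impractical on consumer hardware while the smaller distilled variants fit on a laptop.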

Q: In a traditional application we develop software that gets data from a database. In new versions of the application we update the application logic, and sometimes the database and data as well. Now consider the training of LLMs. There is a model (a software program) that we train with the data we give it. As a result we get parameters (weights and biases) that are saved with the model, and we call it a trained model. On new data the trained model makes predictions based on those parameters.

Q: Why is one LLM different from another? Is it based on the model?

Q: We give the model data as input and get parameters (weights and biases) as output. Do they become part of the model, and is that what we call a trained model?

Q: Does the data become part of the trained model? Why do trained models become so large in size?
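The training-produces-parameters idea in the question can be sketched at toy scale. This is a minimal illustration, not how an LLM is actually trained: fitting y = w*x + b by gradient descent on synthetic data. The point is that the data is consumed during training but not stored; only the learned numbers (w, b) are kept.

```python
import numpy as np

# A minimal sketch of "training produces parameters": fit y = w*x + b
# by gradient descent. The training data is used to compute updates
# but is NOT saved in the trained model -- only w and b are.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 3.0 * x + 0.5 + rng.normal(0, 0.01, 100)   # synthetic training data

w, b = 0.0, 0.0                                 # untrained parameters
for _ in range(2000):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)        # d(MSE)/dw
    grad_b = 2 * np.mean(pred - y)              # d(MSE)/db
    w -= 0.1 * grad_w
    b -= 0.1 * grad_b

# The "trained model" is just the architecture (w*x + b) plus these numbers.
print(f"w = {w:.2f}, b = {b:.2f}")
# An LLM is the same idea at scale: billions of such numbers, which is
# why checkpoint files are huge even though no training text is saved.
```

So to the size question: a trained model is large because of how many parameters it holds, not because it contains the training data.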

deepseek

openai