Language Model Types

Beimingwu supports large language models that take text as input and perform text-generation tasks. These models are categorized by how their capabilities are transferred, as follows:

  1. Base Models: Base models are trained on large-scale, diverse datasets using self-supervised learning. After this extensive pretraining, they can be fine-tuned for a variety of downstream tasks to adapt to specific requirements.[1]

    Representative models: GPT-3, LLaMA, BERT.

  2. Fully Fine-Tuned Models: Fully fine-tuned models are those in which all parameters of the base model are adjusted through fine-tuning to suit a specific task. Compared to the base model, fully fine-tuned models generally exhibit stronger adaptation to that task.

    Representative models: T5 (fine-tuned version), fine-tuned GPT models.

  3. Parameter-Efficient Fine-Tuned Models: Parameter-efficient fine-tuned models start from a base model whose parameters are mostly frozen; only a small number of additional or task-specific parameters are adjusted to adapt the model to a specific task. This approach is far cheaper than full fine-tuning and suits scenarios with limited computational resources.

    Representative models: LoRA-fine-tuned GPT, adapter-tuned models.
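To make the efficiency contrast between categories 2 and 3 concrete, the sketch below (plain NumPy with hypothetical layer dimensions, not Beimingwu code) applies a LoRA-style low-rank update W' = W + A·B to a single frozen weight matrix and compares the number of trainable parameters against full fine-tuning of that matrix:

```python
import numpy as np

d, k, r = 1024, 1024, 8   # hypothetical layer dimensions and LoRA rank

# Frozen base weight, standing in for one layer of a pretrained model.
W = np.random.randn(d, k).astype(np.float32)

# Low-rank adapter: only A and B would be trained.
A = np.random.randn(d, r).astype(np.float32) * 0.01
B = np.zeros((r, k), dtype=np.float32)  # zero init, so W' == W before training

def adapted_forward(x):
    """Forward pass with the LoRA update W' = W + A @ B applied."""
    return x @ (W + A @ B).T

full_params = W.size           # parameters updated by full fine-tuning
lora_params = A.size + B.size  # parameters updated by LoRA

print(f"full fine-tuning: {full_params:,} trainable parameters")
print(f"LoRA (rank {r}):  {lora_params:,} trainable parameters")
print(f"reduction: {full_params / lora_params:.0f}x")
```

With these assumed dimensions, LoRA trains r·(d + k) parameters instead of d·k, a 64x reduction for rank 8; the rank r is the knob trading adaptation capacity against cost.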

[1] Bommasani, Rishi, et al. "On the opportunities and risks of foundation models." arXiv preprint arXiv:2108.07258 (2021).