---
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- de
pipeline_tag: text-generation
library_name: coremltools
license: other
tags:
- coreml
- tinyllama
- german-language-model
---

# LLäMmlein 1B CoreML

This repository contains the CoreML version of [LLäMmlein 1B](https://huggingface.co/LSX-UniWue/LLaMmlein_1B), a German language model trained from scratch using the [TinyLlama](https://github.com/jzhang38/TinyLlama) codebase on the German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2).

## Model Details

- **Model Type:** German language model based on the TinyLlama architecture
- **Language:** German
- **Framework:** CoreML
- **Original Model:** [LSX-UniWue/LLaMmlein_1B](https://huggingface.co/LSX-UniWue/LLaMmlein_1B)
- **Size:** 1B parameters
- **Format:** CoreML (.mlpackage)
- **Minimum Deployment Target:** iOS 16
- **Compute Units:** ALL (CPU, GPU, and Neural Engine)
- **Input Sequence Length:** 512 tokens

## Conversion Process

The model was converted from PyTorch to CoreML using the following steps:

```python
import torch
import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer
import coremltools as ct

# Load the original PyTorch model and tokenizer
model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/LLaMmlein_1B")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/LLaMmlein_1B")

# Set model to eval mode
model.eval()

# Create example input
text = "Ein Beispieltext"
inputs = tokenizer(text, return_tensors="pt")

# Create a wrapper class for tracing that returns only the logits tensor
class ModelWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids):
        return self.model(input_ids).logits

# Wrap and trace model (this produces the TorchScript module)
wrapped_model = ModelWrapper(model)
traced_model = torch.jit.trace(wrapped_model, inputs.input_ids)

# Convert to CoreML
model_mlpackage = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(
            name="input_ids",
            shape=inputs.input_ids.shape,
            dtype=np.int32
        )
    ],
    source="pytorch",
    minimum_deployment_target=ct.target.iOS16,
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
    compute_units=ct.ComputeUnit.ALL,
)

model_mlpackage.save("LLaMmlein_1B.mlpackage")
```

## Usage

To use this model on Apple devices:

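```swift
import CoreML

// Load the model (Xcode generates the LLaMmlein_1B class from the .mlpackage)
let config = MLModelConfiguration()
let model = try LLaMmlein_1B(configuration: config)

// Prepare input: copy your tokenized input ([Int32]) into an MLMultiArray.
// The shape must match the input shape the model was converted with.
let tokenIds: [Int32] = [] // Your tokenized input
let inputIds = try MLMultiArray(shape: [1, NSNumber(value: tokenIds.count)], dataType: .int32)
for (index, tokenId) in tokenIds.enumerated() {
    inputIds[index] = NSNumber(value: tokenId)
}

// Make prediction
let prediction = try model.prediction(input_ids: inputIds)
```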
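
The prediction holds the model's logits (the wrapper in the conversion script returns `.logits`), that is, one row of vocabulary scores per input position. Picking the next token greedily means taking the index of the largest score at the last prompt position. The following is a minimal sketch of that selection, written in Python for brevity and assuming the logits have already been copied into a nested list of shape `[sequence][vocabulary]`; `greedy_next_token` is a hypothetical helper name, not part of any library.

```python
def greedy_next_token(logits, last_position):
    """Return the ID of the highest-scoring token at last_position.

    logits: nested list of shape [sequence][vocabulary] (assumed layout).
    last_position: index of the final prompt token.
    """
    scores = logits[last_position]
    # Index of the maximum score; ties resolve to the lowest token ID.
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example with 2 positions and a 4-token vocabulary.
toy_logits = [
    [0.1, 0.3, 0.2, 0.0],
    [0.0, -1.0, 2.5, 0.7],
]
next_id = greedy_next_token(toy_logits, last_position=1)  # -> 2
```

The same argmax can be computed in Swift by iterating over the corresponding slice of the output `MLMultiArray`.
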

## Performance Considerations

- The model was converted with FLOAT16 compute precision and `ComputeUnit.ALL`, so it can run on the Apple Neural Engine where available
- Requires iOS 16 or later (the minimum deployment target)
- Best performance is achieved with a batch size of 1
- The maximum sequence length is set to 512 tokens
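
Given a batch size of 1 and a sequence length of 512 tokens, prompts may need to be truncated or padded before prediction. The following is a minimal sketch of that step in plain Python. It assumes right-padding and a placeholder `pad_id` of 0 (both are assumptions; take the real padding ID from the tokenizer), and `fit_to_length` is a hypothetical helper name.

```python
MAX_LENGTH = 512  # sequence length stated in this model card


def fit_to_length(token_ids, max_length=MAX_LENGTH, pad_id=0):
    """Truncate or right-pad token_ids to exactly max_length entries.

    Returns the fixed-length list and the number of real (non-padding) tokens.
    pad_id=0 is a placeholder assumption, not a value taken from the tokenizer.
    """
    # Keep the most recent tokens when the prompt is too long.
    kept = list(token_ids)[-max_length:]
    real_length = len(kept)
    padded = kept + [pad_id] * (max_length - real_length)
    return padded, real_length


# Batch size of 1: wrap the single sequence in an outer list.
ids, real_length = fit_to_length([5, 17, 42])
batch = [ids]
```

With right-padding and a causal model, the scores for the next token are read at position `real_length - 1` rather than at the final padded position.
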

## Original Model Information

The original model was trained on the German portion of RedPajama V2. For more details about the base model:
- Visit the [project page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/)
- Read the [research paper](https://arxiv.org/abs/2411.11171)
- Check the [SuperGLEBer benchmark](https://lsx-uniwue.github.io/SuperGLEBer-site/) for evaluation results

## License

This model inherits its license from the original LLäMmlein 1B model.

## Citation

If you use this model, please cite the original work:

```bibtex
@misc{llammlein2024,
  title={LLäMmlein: A German Language Model},
  author={LSX-UniWue},
  year={2024},
  publisher={Hugging Face},
  journal={Hugging Face Hub},
  howpublished={\url{https://huggingface.co/LSX-UniWue/LLaMmlein_1B}},
}
```

For the original model description and evaluation results, see the [original model card](https://huggingface.co/LSX-UniWue/LLaMmlein_1B).