---
license: mit
base_model:
- unsloth/DeepSeek-R1-BF16
---

## Model Details

This model card describes an MXFP4 quantization of [unsloth/DeepSeek-R1-BF16](https://huggingface.co/unsloth/DeepSeek-R1-BF16), produced with [intel/auto-round](https://github.com/intel/auto-round) and saved in the llm-compressor format.
Please follow the license of the original model.

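To give an intuition for the format: MXFP4 stores each block of 32 weights as one shared power-of-two scale plus a 4-bit (E2M1) value per element. The sketch below is illustrative only, assuming the OCP Microscaling scaling rule (shared exponent chosen so the block maximum fits under 6.0, the largest FP4 magnitude); it is not the actual packed kernel used by auto-round or llm-compressor.

```python
import math

# Non-negative magnitudes representable by FP4 E2M1, the element format of MXFP4
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def _nearest_fp4(x):
    # Round a scaled value to the nearest FP4 grid point, keeping the sign.
    mag = min(FP4_GRID, key=lambda v: abs(v - abs(x)))
    return math.copysign(mag, x)

def mxfp4_quantize_block(block):
    """Quantize one block (MXFP4 uses 32 elements per block) to a shared
    power-of-two scale plus per-element FP4 values. Illustrative sketch."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Shared scale: a power of two chosen so the block maximum maps at or
    # below 6.0, the largest FP4 magnitude (values beyond it saturate).
    scale = 2.0 ** (math.floor(math.log2(amax)) - 2)
    codes = [_nearest_fp4(v / scale) for v in block]
    return scale, codes

def mxfp4_dequantize_block(scale, codes):
    # Reconstruction is simply the shared scale times each FP4 value.
    return [scale * c for c in codes]
```

Because the scale is a pure power of two, dequantization is an exponent shift rather than a full multiply, which is what makes the format cheap to decode at inference time.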
## Run Inference

Requirements:
```
compressed-tensors 0.14.0.1
transformers 4.57.6
torch 2.10.0
vllm 0.19.0
```
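Assuming the pinned versions above are published on PyPI, the environment can be set up with:

```shell
# Pin the exact versions listed above (adjust if your index differs)
pip install "compressed-tensors==0.14.0.1" "transformers==4.57.6" \
    "torch==2.10.0" "vllm==0.19.0"
```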
```
model_name=INCModel/DeepSeek-R1-MXFP4-LLMC
python vllm/examples/basic/offline_inference/generate.py \
  --model $model_name \
  --max-model-len 2048 \
  -tp 8 \
  --trust-remote-code \
  --enforce-eager \
  --gpu-memory-utilization 0.8
```
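For online serving, the model can also be launched with vLLM's OpenAI-compatible server; the flags below mirror the offline example and should be adjusted to your hardware:

```shell
# Serve the quantized checkpoint over an OpenAI-compatible HTTP API
vllm serve INCModel/DeepSeek-R1-MXFP4-LLMC \
  --max-model-len 2048 \
  -tp 8 \
  --trust-remote-code \
  --gpu-memory-utilization 0.8
```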
## Ethical Considerations and Limitations

The model can produce factually incorrect output and should not be relied on for factually accurate information.
Because of the limitations of the pretrained model and the finetuning datasets, this model may generate lewd, biased, or otherwise offensive outputs.

Therefore, before deploying any applications of the model, developers should perform safety testing.

## Caveats and Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

Here are a couple of useful links to learn more about Intel's AI software:

- [Intel Neural Compressor](https://github.com/intel/neural-compressor)
- [AutoRound](https://github.com/intel/auto-round)

## Disclaimer

The license on this model does not constitute legal advice.
We are not responsible for the actions of third parties who use this model.
Please consult an attorney before using this model for commercial purposes.