Itay Levy (itlevy)
AI & ML interests: None yet
Community activity

Llama-3_3-Nemotron-Super-49B-v1_5-NVFP4/llama_nemotron_toolcall_parser_no_streaming.py missing (#1, 1 reply, opened 3 months ago by SuperbEmphasis)
Update README and toolcall_parser (#5, opened 2 months ago by itlevy)
_prepare_generation_config bugfix (failed due to version update in transformers) (#14, opened 7 months ago by ishahaf)
_prepare_generation_config bugfix (failed due to version update in transformers) (#25, opened 7 months ago by ishahaf)
_prepare_generation_config bugfix (failed due to version update in transformers) (#2, opened 7 months ago by ishahaf)
Nemotron 253B? (#10, 2 replies, opened 10 months ago by BoshiAI)
How come this pruned model has 162 layers (#3, 5 replies, opened 10 months ago by ymcki)
add model card (#1, opened 10 months ago by itlevy)
Patching hf bug that creates wrong cache length if only inputs_embeds are passed to the model (#19, opened over 1 year ago by tomer-nv)
DeciLMForCausalLM(DeciLMPreTrainedModel, GenerationMixin) for v4.50 (#16, opened over 1 year ago by itlevy; see the sketch after this list)
add batch_size attribute to VariableCache (#15, opened over 1 year ago by itlevy)
nvidia-open-model-license (#14, opened over 1 year ago by itlevy)
nvidia-open-model-license (#13, opened over 1 year ago by itlevy)
nvidia-open-model-license (#12, opened over 1 year ago by itlevy)
v4.46 support (#7, opened over 1 year ago by itlevy)
loading as llama model (#4, 1 reply, opened over 1 year ago by KnutJaegersberg)
v4.45 support (#6, opened over 1 year ago by itlevy)
fixed flash_attention backward_compat (#3, opened over 1 year ago by itlevy)
flash_attention_utils_backward_compat (#2, opened over 1 year ago by itlevy)
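
The PR titled "DeciLMForCausalLM(DeciLMPreTrainedModel, GenerationMixin) for v4.50" (#16 above) tracks a known transformers change: from v4.50, PreTrainedModel no longer inherits from GenerationMixin, so models shipped as custom code must list the mixin in their class bases explicitly to keep .generate() working. Below is a minimal sketch of that change; only the class names on the final inheritance line come from the PR title, and the config and class bodies are hypothetical placeholders rather than the repo's actual modeling code.

```python
# Minimal sketch, assuming transformers >= 4.45 is installed. Only the class
# names in the final inheritance line come from PR #16's title; the config
# and the empty bodies are hypothetical placeholders.
from transformers import GenerationMixin, PretrainedConfig, PreTrainedModel


class DeciLMConfig(PretrainedConfig):  # hypothetical stand-in config
    model_type = "deci_lm"


class DeciLMPreTrainedModel(PreTrainedModel):
    config_class = DeciLMConfig
    base_model_prefix = "model"


# Before transformers v4.50, generate() support came implicitly through
# PreTrainedModel; from v4.50 on, GenerationMixin must appear in the bases.
class DeciLMForCausalLM(DeciLMPreTrainedModel, GenerationMixin):
    pass
```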