Buckets:
| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Models","local":"models","sections":[{"title":"Creating a Transformer (Transformer 생성하기)","local":"creating-a-transformer-transformer-생성하기","sections":[{"title":"Different loading methods (다른 로딩 방법)","local":"different-loading-methods-다른-로딩-방법","sections":[],"depth":3},{"title":"Saving methods (저장 방법)","local":"saving-methods-저장-방법","sections":[],"depth":3}],"depth":2},{"title":"Using a Transformer model for inference (Transformer 모델을 추론에 사용하기)","local":"using-a-transformer-model-for-inference-transformer-모델을-추론에-사용하기","sections":[{"title":"Using the tensors as inputs to the model (텐서를 모델의 입력으로 사용하기)","local":"using-the-tensors-as-inputs-to-the-model-텐서를-모델의-입력으로-사용하기","sections":[],"depth":3}],"depth":2}],"depth":1}"> | |
| <link href="/docs/course/pr_1069/ko/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/entry/start.667ba29b.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/scheduler.37c15a92.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/singletons.8de3d8e2.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/index.18351ede.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/paths.6047fff5.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/entry/app.65e35200.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/index.2bf4358c.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/nodes/0.9169bc8c.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/nodes/15.b336729e.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/Youtube.1e50a667.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/CodeBlock.4e987730.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/DocNotebookDropdown.efc1fb7c.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/FrameworkSwitchCourse.8d4d4ab6.js"> | |
| <link rel="modulepreload" href="/docs/course/pr_1069/ko/_app/immutable/chunks/getInferenceSnippets.24b50994.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Models","local":"models","sections":[{"title":"Creating a Transformer (Transformer 생성하기)","local":"creating-a-transformer-transformer-생성하기","sections":[{"title":"Different loading methods (다른 로딩 방법)","local":"different-loading-methods-다른-로딩-방법","sections":[],"depth":3},{"title":"Saving methods (저장 방법)","local":"saving-methods-저장-방법","sections":[],"depth":3}],"depth":2},{"title":"Using a Transformer model for inference (Transformer 모델을 추론에 사용하기)","local":"using-a-transformer-model-for-inference-transformer-모델을-추론에-사용하기","sections":[{"title":"Using the tensors as inputs to the model (텐서를 모델의 입력으로 사용하기)","local":"using-the-tensors-as-inputs-to-the-model-텐서를-모델의-입력으로-사용하기","sections":[],"depth":3}],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <div class="bg-white leading-none border border-gray-100 rounded-lg flex p-0.5 w-56 text-sm mb-4"><a class="flex justify-center flex-1 py-1.5 px-2.5 focus:outline-none !no-underline rounded-l bg-red-50 dark:bg-transparent text-red-600" href="?fw=pt"><svg class="mr-1.5" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><defs><clipPath id="a"><rect x="3.05" y="0.5" width="25.73" height="31" fill="none"></rect></clipPath></defs><g clip-path="url(#a)"><path d="M24.94,9.51a12.81,12.81,0,0,1,0,18.16,12.68,12.68,0,0,1-18,0,12.81,12.81,0,0,1,0-18.16l9-9V5l-.84.83-6,6a9.58,9.58,0,1,0,13.55,0ZM20.44,9a1.68,1.68,0,1,1,1.67-1.67A1.68,1.68,0,0,1,20.44,9Z" fill="#ee4c2c"></path></g></svg> Pytorch </a><a class="flex justify-center flex-1 py-1.5 px-2.5 focus:outline-none !no-underline rounded-r text-gray-500 filter grayscale" href="?fw=tf"><svg class="mr-1.5" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" focusable="false" role="img" width="0.94em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 274"><path d="M145.726 42.065v42.07l72.861 42.07v-42.07l-72.86-42.07zM0 84.135v42.07l36.43 21.03V105.17L0 84.135zm109.291 21.035l-36.43 21.034v126.2l36.43 21.035v-84.135l36.435 21.035v-42.07l-36.435-21.034V105.17z" fill="#E55B2D"></path><path d="M145.726 42.065L36.43 105.17v42.065l72.861-42.065v42.065l36.435-21.03v-84.14zM255.022 63.1l-36.435 21.035v42.07l36.435-21.035V63.1zm-72.865 84.135l-36.43 21.035v42.07l36.43-21.036v-42.07zm-36.43 63.104l-36.436-21.035v84.135l36.435-21.035V210.34z" fill="#ED8E24"></path><path d="M145.726 0L0 84.135l36.43 21.035l109.296-63.105l72.861 42.07L255.022 63.1L145.726 0zm0 126.204l-36.435 21.03l36.435 21.036l36.43-21.035l-36.43-21.03z" fill="#F8BF3C"></path></svg> TensorFlow </a></div> <h1 class="relative group"><a id="models" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#models"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Models</span></h1> <div class="flex space-x-1 absolute z-10 right-0 top-0"> <a href="https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter2/section3_pt.ipynb" target="_blank"><img alt="Open In Colab" class="!m-0" src="https://colab.research.google.com/assets/colab-badge.svg"></a> <a href="https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter2/section3_pt.ipynb" target="_blank"><img alt="Open In Studio Lab" class="!m-0" src="https://studiolab.sagemaker.aws/studiolab.svg"></a></div> <iframe class="w-full xl:w-4/6 h-80" src="https://www.youtube-nocookie.com/embed/AhChOFRegn4" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> <p data-svelte-h="svelte-21bfoo">이번 섹션에서는 모델을 생성하고 사용하는 방법에 대해 자세히 알아보겠습니다. 체크포인트에서 모델을 인스턴스화하는 데 유용한 <code>AutoModel</code> 클래스를 사용할 것입니다.</p> <p data-svelte-h="svelte-17klgbd">이 <code>AutoModel</code> 클래스와 관련 클래스들은 실제로 라이브러리에 있는 다양한 모델들을 감싸고 있는 간단한 래퍼입니다. 이 래퍼는 체크포인트에 적합한 모델 아키텍처를 자동으로 추측하고, 이 아키텍처를 가진 모델을 인스턴스화하는 것도 똑똑하게 처리합니다.</p> <p data-svelte-h="svelte-rfqpyg">하지만, 만약 모델의 아키텍처를 직접 정의하고 싶다면, 해당 모델의 클래스를 사용할 수 있습니다. BERT 모델을 예로 들어보겠습니다.</p> <h2 class="relative group"><a id="creating-a-transformer-transformer-생성하기" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#creating-a-transformer-transformer-생성하기"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Creating a Transformer (Transformer 생성하기)</span></h2> <p data-svelte-h="svelte-1sy9k0m">BERT 모델을 초기화 하기 위해서는 먼저 모델의 환경설정을 로드해야 합나다.</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> BertConfig, BertModel | |
| <span class="hljs-comment"># Building the config</span> | |
| config = BertConfig() | |
| <span class="hljs-comment"># Building the model from the config</span> | |
| model = BertModel(config)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-4918vz">이 환경설정은 모델을 구축하는데 사용되는 많은 속성들을 포함하고 있습니다:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-built_in">print</span>(config)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->BertConfig { | |
| [...] | |
| <span class="hljs-string">"hidden_size"</span>: <span class="hljs-number">768</span>, | |
| <span class="hljs-string">"intermediate_size"</span>: <span class="hljs-number">3072</span>, | |
| <span class="hljs-string">"max_position_embeddings"</span>: <span class="hljs-number">512</span>, | |
| <span class="hljs-string">"num_attention_heads"</span>: <span class="hljs-number">12</span>, | |
| <span class="hljs-string">"num_hidden_layers"</span>: <span class="hljs-number">12</span>, | |
| [...] | |
| }<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1e8xh3r">아직 이 속성들이 무엇을 의미하는지는 모르겠지만, 몇몇은 익숙할 것입니다: <code>hidden_size</code> 속성은 <code>hidden_states</code> 벡터의 크기를 정의하고, <code>num_hidden_layers</code>는 Transformer 모델이 가지고 있는 레이어의 수를 정의합니다.</p> <h3 class="relative group"><a id="different-loading-methods-다른-로딩-방법" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#different-loading-methods-다른-로딩-방법"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Different loading methods (다른 로딩 방법)</span></h3> <p data-svelte-h="svelte-13ikagc">무작위 값을 통한 기본 환경설정으로 모델 생성</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> BertConfig, BertModel | |
| config = BertConfig() | |
| model = BertModel(config) | |
| <span class="hljs-comment"># Model is randomly initialized!</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1yw7n7u">이 모델은 무작위로 초기화되어 있기 때문에, 아직은 아무런 유용한 정보를 포함하고 있지 않습니다. 이 모델을 훈련시키기 위해서는, 먼저 훈련 데이터를 준비해야 합니다. 우리가 바닥부터 학습을 할 수 있지만, 이 과정은 <a href="/course/chapter1">Chapter 1</a>에서 확인 했듯이, 많은 시간과 많은 데이터가 필요하며, 학습 환경에도 큰 영향을 미칩니다.</p> <p data-svelte-h="svelte-44rdi0">이러한 불필요한 중복된 노력을 피하기 위해서, 이미 훈련된 모델을 공유하고 재사용할 수 있어야 합니다.</p> <p data-svelte-h="svelte-10dy6w6">훈련된 Transformer 모델을 불러오는 것은 매우 간단합니다 - <code>from_pretrained</code> 메소드를 사용하면 됩니다.</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> BertModel | |
| model = BertModel.from_pretrained(<span class="hljs-string">"bert-base-cased"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1fbu2yk">이전에 봤듯이, <code>BertModel</code> 대신 <code>AutoModel</code> 클래스를 사용할 수도 있습니다. 이제부터는 체크포인트에 독립적인 코드를 생성하기 위해 <code>AutoModel</code>를 사용하겠습니다. 만약 코드가 한 체크포인트에서 잘 동작한다면, 다른 체크포인트에서도 잘 동작해야 합니다. 이는 체크포인트의 아키텍처가 다르더라도, 비슷한 작업(예를 들어, 감성 분석 작업)을 위해 훈련된 경우에도 적용됩니다.</p> <p data-svelte-h="svelte-cvzi5b">이 코드 샘플에서는 <code>BertConfig</code>를 사용하지 않았고, 대신 <code>bert-base-cased</code> 식별자를 통해 사전 훈련된 모델을 불러왔습니다. 이는 BERT의 저자들이 직접 훈련시킨 체크포인트입니다. 자세한 내용은 <a href="https://huggingface.co/bert-base-cased" rel="nofollow">모델 카드</a>에서 확인할 수 있습니다.</p> <p data-svelte-h="svelte-17jdgt">이 모델은 체크포인트의 모든 가중치로 초기화되었습니다. 이 모델은 체크포인트에서 훈련된 작업에 대해 직접 추론에 사용할 수 있으며, 새로운 작업에 대해 미세 조정할 수도 있습니다. 사전 훈련된 가중치로부터 학습을 진행하면, 빈 상태에서 훈련을 시작하는 것보다 빠르게 좋은 결과를 얻을 수 있습니다.</p> <p data-svelte-h="svelte-14igyn2">모델을 불러오는 또 다른 방법은 <code>from_pretrained()</code> 메서드를 사용하는 것입니다. 이 메서드는 체크포인트를 다운로드하고, 캐시에 저장합니다(이후 <code>from_pretrained()</code> 메서드를 호출할 때 다시 다운로드하지 않습니다). 캐시 폴더는 기본적으로 <em>~/.cache/huggingface/transformers</em>에 저장됩니다. 캐시 폴더를 사용자 정의하려면 <code>HF_HOME</code> 환경 변수를 설정하면 됩니다.</p> <p data-svelte-h="svelte-9vriyb">모델을 불러오는 식별자는 BERT 아키텍처와 호환되는 경우 모델 허브의 모든 모델의 식별자가 될 수 있습니다. BERT 체크포인트의 전체 목록은 <a href="https://huggingface.co/models?filter=bert" rel="nofollow">여기</a>에서 확인할 수 있습니다.</p> <h3 class="relative group"><a id="saving-methods-저장-방법" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#saving-methods-저장-방법"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Saving methods (저장 방법)</span></h3> <p data-svelte-h="svelte-pdew2h">모델을 저장하는 방법은 불러오는 방법처럼 쉽습니다. <code>save_pretrained()</code> 메서드를 사용하면 됩니다. 이 메서드는 <code>from_pretrained()</code> 메서드와 유사합니다.</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->model.save_pretrained(<span class="hljs-string">"directory_on_my_computer"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1lqi2rw">이는 2가지 파일을 저장하게 됩니다:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->ls <span class="hljs-keyword">directory_on_my_computer | |
| </span> | |
| <span class="hljs-built_in">config</span>.<span class="hljs-keyword">json </span>model.safetensors<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-155d6gs"><em>config.json</em> 파일은 모델 아키텍처를 구축하는 데 필요한 속성을 알려줍니다. 이 파일에는 체크포인트가 어디에서 생성되었는지, 마지막으로 체크포인트를 저장할 때 사용한 🤗 Transformers 버전 등의 메타데이터도 포함되어 있습니다.</p> <p data-svelte-h="svelte-15ps4e">The <em>model.safetensors</em> file is known as the <em>state dictionary</em>; it contains all your model’s weights. The two files go hand in hand; the configuration is necessary to know your model’s architecture, while the model weights are your model’s parameters.</p> <h2 class="relative group"><a id="using-a-transformer-model-for-inference-transformer-모델을-추론에-사용하기" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-a-transformer-model-for-inference-transformer-모델을-추론에-사용하기"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using a Transformer model for inference (Transformer 모델을 추론에 사용하기)</span></h2> <p data-svelte-h="svelte-6brd07">Now that you know how to load and save a model, let’s try using it to make some predictions. Transformer models can only process numbers — numbers that the tokenizer generates. But before we discuss tokenizers, let’s explore what inputs the model accepts.</p> <p data-svelte-h="svelte-19s153b">이제 모델을 불러오고 저장하는 방법을 알았으니, 모델을 사용하여 예측을 만들어 보겠습니다. Transformer 모델은 토크나이저가 생성하는 숫자만 처리할 수 있습니다. 그러나 토크나이저에 대해 논의하기 전에 모델이 받는 입력에 대해 알아보겠습니다.</p> <p data-svelte-h="svelte-ijlg93">토크나이저는 입력을 적절한 프레임워크의 텐서로 변환할 수 있지만, 이해도를 높이기 위해 모델에 입력을 보내기 전 무엇을 반드시 해야 하는지 간단히 살펴보겠습니다.</p> <p data-svelte-h="svelte-5ur23s">우리가 여러 시퀀스들이 있다고 가정해 봅시다:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->sequences = [<span class="hljs-string">"Hello!"</span>, <span class="hljs-string">"Cool."</span>, <span class="hljs-string">"Nice!"</span>]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-j2rddg">토크나이저는 이를 단어 인덱스로 변환합니다. 이를 <em>input IDs</em>라고 합니다. 각 시퀀스는 이제 숫자 목록입니다! 결과는 다음과 같습니다:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->encoded_sequences = [ | |
| [<span class="hljs-number">101</span>, <span class="hljs-number">7592</span>, <span class="hljs-number">999</span>, <span class="hljs-number">102</span>], | |
| [<span class="hljs-number">101</span>, <span class="hljs-number">4658</span>, <span class="hljs-number">1012</span>, <span class="hljs-number">102</span>], | |
| [<span class="hljs-number">101</span>, <span class="hljs-number">3835</span>, <span class="hljs-number">999</span>, <span class="hljs-number">102</span>], | |
| ]<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1pakz9q">이는 인코딩된 시퀀스의 목록입니다. 텐서는 정사각형 모양만 받을 수 있습니다 (행렬을 생각해 보세요). 이 “배열”은 이미 정사각형 모양이므로 텐서로 변환하는 것은 쉽습니다:</p> <p data-svelte-h="svelte-y3q2fn">This is a list of encoded sequences: a list of lists. Tensors only accept rectangular shapes (think matrices). This “array” is already of rectangular shape, so converting it to a tensor is easy:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> torch | |
| model_inputs = torch.tensor(encoded_sequences)<!-- HTML_TAG_END --></pre></div> <h3 class="relative group"><a id="using-the-tensors-as-inputs-to-the-model-텐서를-모델의-입력으로-사용하기" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-the-tensors-as-inputs-to-the-model-텐서를-모델의-입력으로-사용하기"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using the tensors as inputs to the model (텐서를 모델의 입력으로 사용하기)</span></h3> <p data-svelte-h="svelte-m9n65x">모델의 텐서를 사용하는 것은 매우 간단합니다. 모델에 입력을 넣기만 하면 됩니다:</p> <div class="code-block relative "><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->output = model(model_inputs)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1ff0t23">모델이 다양한 어규먼트를 받는 중에, 입력은 input IDs 만 필요합니다. 나머지 어규먼트들은 언제 필요한지, 어떤 역할을 하는지는 나중에 설명하겠습니다. 먼저 토크나이저에 대해 좀 더 자세히 알아보겠습니다.</p> <a class="!text-gray-400 !no-underline text-sm flex items-center not-prose mt-4" href="https://github.com/huggingface/course/blob/main/chapters/ko/chapter2/3.mdx" target="_blank"><span data-svelte-h="svelte-1kd6by1"><</span> <span data-svelte-h="svelte-x0xyl0">></span> <span data-svelte-h="svelte-1dajgef"><span class="underline ml-1.5">Update</span> on GitHub</span></a> <p></p> | |
| <script> | |
| { | |
| __sveltekit_5pcgyl = { | |
| assets: "/docs/course/pr_1069/ko", | |
| base: "/docs/course/pr_1069/ko", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/course/pr_1069/ko/_app/immutable/entry/start.667ba29b.js"), | |
| import("/docs/course/pr_1069/ko/_app/immutable/entry/app.65e35200.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 15], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |
Xet Storage Details
- Size:
- 39.1 kB
- Xet hash:
- d26688702652472175c0a782c09c03605495b424e912cf2b8591e3ba035a8b48
·
Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.