PranavGuhan committed on
Commit 07ac7e7 · verified · 1 Parent(s): 6e783df

Update README.md

Files changed (1)
  1. README.md +46 -1
README.md CHANGED
@@ -10,4 +10,49 @@ metrics:
  pipeline_tag: text-generation
  tags:
  - code
- ---
+ - gpt2
+ - pytorch
+ - causal-lm
+ ---
+
+ # python-ds-accelerate (GPT-2 124M)
+
+ This model is a GPT-2 (124M-parameter) causal language model trained from scratch for **Python code completion** in data science contexts.
+
+ ## Model Details
+
+ ### Model Description
+
+ This model implements the GPT-2 architecture and is trained to generate functional Python code snippets. Its custom training pipeline uses a **keytoken weighted loss** function that prioritizes key data science tokens (such as `plt`, `pd`, `fit`, `predict`), with the aim of making it more effective at suggesting data-science-related code.
+
+ - **Developed by:** [Pranav Guhan R](https://github.com/PranavGuhanR)
+ - **Model type:** Transformer-based causal language model
+ - **Language(s):** Python (English comments)
+ - **License:** Apache 2.0
+ - **Finetuned from model:** None (trained from scratch)
+
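The model description names a keytoken weighted loss but does not spell out its formula. As a rough, framework-free sketch of one common formulation (an assumption, not necessarily what this repository implements): compute the mean next-token cross-entropy per sequence, then scale each sequence's loss by `alpha * (1 + number of key tokens it contains)`. The function names, the `alpha` parameter, and the toy token ids below are all hypothetical.

```python
import math


def token_cross_entropy(logits, target):
    """Cross-entropy of a single next-token prediction from raw logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


def keytoken_weighted_loss(batch_ids, batch_logits, keytoken_ids, alpha=1.0):
    """Batch mean of per-sequence losses, up-weighted by key-token counts.

    Assumes every sequence has at least two tokens and that
    batch_logits[i][t] holds the logits predicted at position t.
    """
    keytoken_ids = set(keytoken_ids)
    weighted = []
    for ids, logits in zip(batch_ids, batch_logits):
        # Position t predicts token t + 1, so the last logits row is unused.
        losses = [
            token_cross_entropy(logits[t], ids[t + 1])
            for t in range(len(ids) - 1)
        ]
        seq_loss = sum(losses) / len(losses)
        # Count how many tokens in this sequence are key tokens.
        n_key = sum(1 for tok in ids if tok in keytoken_ids)
        weighted.append(alpha * (1.0 + n_key) * seq_loss)
    return sum(weighted) / len(weighted)


# Toy batch: vocabulary of 4 tokens, uniform logits, token id 1 is "key".
vocab_size = 4
uniform = [[0.0] * vocab_size for _ in range(3)]
batch_ids = [[0, 1, 2], [0, 2, 3]]
batch_logits = [uniform, uniform]
loss = keytoken_weighted_loss(batch_ids, batch_logits, keytoken_ids={1})
print(loss)
```

In the toy batch both sequences have the same unweighted loss (ln 4), but the first contains one key token and so counts double, giving a batch loss of 1.5 × ln 4. In an actual training loop the same arithmetic would typically be done on tensors with an unreduced cross-entropy.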
+ ### Model Sources
+
+ - **Repository:** [GPT-2-124M-pretraining-for-code-completion](https://github.com/PranavGuhanR/GPT-2-124M-pretraining-for-code-completion)
+
+ ## Uses
+
+ ### Direct Use
+ The model is intended for code completion, specifically completing Python scripts that use libraries such as `pandas`, `matplotlib`, and `scikit-learn`.
+
+ ### Out-of-Scope Use
+ The model is not suitable for general-purpose natural-language conversation or for generating code in languages other than Python.
+
+ ## How to Get Started with the Model
+
+ You can use the model directly with a Hugging Face pipeline:
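+
+ ```python
+ from transformers import pipeline
+
+ # Load the model from the Hugging Face Hub into a text-generation pipeline.
+ pipe = pipeline("text-generation", model="PranavGuhan/python-ds-accelerate")
+
+ # Prompt: a comment plus the first line of code for the model to continue.
+ txt = """# create dataframe from x and y
+ df = pd.DataFrame({'x':x, 'y':y})
+ """
+ # Generate one completion and print the prompt together with its continuation.
+ print(pipe(txt, num_return_sequences=1)[0]["generated_text"])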