dystrio committed · Commit c5be449 · verified · 1 Parent(s): 6459e10

Update index.html

Files changed (1)
  1. index.html +50 -1
index.html CHANGED
@@ -18,6 +18,54 @@
18
  <p class="subtitle">GPU Placement Advisor for PyTorch/NCCL Workloads</p>
19
  </header>
20
 
21
  <!-- Step 1: Authentication -->
22
  <section class="card">
23
  <div class="card-title">
@@ -141,4 +189,5 @@
141
 
142
  <script src="app.js"></script>
143
  </body>
144
- </html>
 
 
18
  <p class="subtitle">GPU Placement Advisor for PyTorch/NCCL Workloads</p>
19
  </header>
20
 
21
+ <!-- How It Works (collapsible) -->
22
+ <details class="help-section">
23
+ <summary class="help-toggle">📖 How It Works</summary>
24
+ <div class="help-content">
25
+ <div class="help-block">
26
+ <h3>What is Dystrio?</h3>
27
+ <p>Dystrio analyzes the communication patterns of your distributed PyTorch training job and generates
28
+ Kubernetes pod affinity rules that co-locate the pods whose GPUs communicate the most.</p>
29
+ </div>
30
+
31
+ <div class="help-block">
32
+ <h3>How do I get a PyTorch trace?</h3>
33
+ <p>Add this to your training script:</p>
34
+ <pre><code>from torch.profiler import profile, ProfilerActivity
35
+
36
+ with profile(
37
+     activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
38
+     record_shapes=True,
39
+     with_stack=True
40
+ ) as prof:
41
+     # Your training step here
42
+     model(inputs)
43
+
44
+ prof.export_chrome_trace("trace.json")</code></pre>
45
+ <p>Upload the resulting <code>trace.json</code> file.</p>
46
+ </div>
47
+
48
+ <div class="help-block">
49
+ <h3>What is Session ID / Multi-Run?</h3>
50
+ <p><strong>Single run:</strong> Leave Session ID empty. You'll get recommendations based on one trace.</p>
51
+ <p><strong>Multi-run (recommended):</strong> Use the same Session ID across multiple uploads.
52
+ Dystrio tracks which communication patterns are <em>stable</em> vs <em>noisy</em>,
53
+ giving you higher-confidence recommendations.</p>
54
+ <p>Example: Upload 3 traces from different training runs with Session ID "llama-70b-training"
55
+ → Dystrio identifies consistent patterns and escalates confidence from LOW → HIGH.</p>
56
+ </div>
57
+
58
+ <div class="help-block">
59
+ <h3>How do I use the output?</h3>
60
+ <ol>
61
+ <li>Copy the generated Kubernetes YAML</li>
62
+ <li>Add the <code>affinity:</code> block to your Pod spec</li>
63
+ <li>Deploy – Kubernetes will schedule communicating pods together</li>
64
+ </ol>
65
+ </div>
66
+ </div>
67
+ </details>
68
+
69
  <!-- Step 1: Authentication -->
70
  <section class="card">
71
  <div class="card-title">
 
189
 
190
  <script src="app.js"></script>
191
  </body>
192
+ </html>
193
+
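Note on the "How do I use the output?" steps added in this commit: step 2 refers to an `affinity:` block, but the commit does not show one. The following is a minimal sketch of the general shape such a block takes in a Kubernetes Pod spec, built as a Python dict and printed as JSON (which is also valid YAML). The helper name `build_pod_affinity`, the label key `example.com/comm-group`, the group value, and the weight are all hypothetical and chosen for illustration; Dystrio's actual generated output may differ.

```python
import json


def build_pod_affinity(group, label_key="example.com/comm-group", weight=100):
    """Return a Pod-spec fragment that asks the scheduler to prefer nodes
    already running pods labelled `label_key=group`.

    `label_key` and `group` are hypothetical placeholders, not Dystrio output.
    """
    return {
        "affinity": {
            "podAffinity": {
                # "preferred..." is a soft rule: the scheduler tries to honor it
                # but still places the pod if no matching node is available.
                "preferredDuringSchedulingIgnoredDuringExecution": [
                    {
                        "weight": weight,  # Kubernetes accepts 1 to 100
                        "podAffinityTerm": {
                            "labelSelector": {
                                "matchExpressions": [
                                    {
                                        "key": label_key,
                                        "operator": "In",
                                        "values": [group],
                                    }
                                ]
                            },
                            # Co-locate at node granularity.
                            "topologyKey": "kubernetes.io/hostname",
                        },
                    }
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_pod_affinity("group-0"), indent=2))
```

The printed fragment belongs under `spec:` of a Pod (or under `spec.template.spec` of a Job or Deployment). A soft `preferred...` rule is used here so that a pod is never left unschedulable; swapping in `requiredDuringSchedulingIgnoredDuringExecution` would make co-location mandatory, and that field takes a list of affinity terms directly, without the `weight` wrapper.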