"""Schema indexer: introspect, chunk, embed, retrieve.

Stage 3 of the v2 pipeline (per docs/02_architecture_v2.md §4):
- two Chroma collections (`schema_chunks`, `fewshot_qsql`),
- one chunk per table, carrying sample values, null percentages, and the FK list,
- FK graph kept in memory as a dict-of-sets, **not** as a Chroma collection.

Public surface:
- `introspect(engine)` → `list[TableInfo]`  (introspector)
- `to_chunks(tables, db_id)` → `list[SchemaChunk]`  (chunker)
- `SchemaIndex`: wraps a Chroma persistent client and an embedding provider  (indexer)
- `retrieve_context(...)` → `ContextBundle`  (retriever)
"""

from __future__ import annotations

from nl_sql.schema_index.chunker import SchemaChunk, to_chunks
from nl_sql.schema_index.indexer import (
    FEWSHOT_COLLECTION,
    SCHEMA_COLLECTION,
    FewShotExample,
    FewShotHit,
    SchemaIndex,
    SchemaQueryHit,
)
from nl_sql.schema_index.introspector import (
    ColumnInfo,
    ForeignKeyInfo,
    TableInfo,
    introspect,
)
from nl_sql.schema_index.retriever import ContextBundle, retrieve_context

__all__ = [
    "FEWSHOT_COLLECTION",
    "SCHEMA_COLLECTION",
    "ColumnInfo",
    "ContextBundle",
    "FewShotExample",
    "FewShotHit",
    "ForeignKeyInfo",
    "SchemaChunk",
    "SchemaIndex",
    "SchemaQueryHit",
    "TableInfo",
    "introspect",
    "retrieve_context",
    "to_chunks",
]