NLProxy Utilities Module Reference

This document covers shared utility modules in utils/.

Purpose

Utility modules centralize constants, pricing data, and logging configuration used across the NLProxy codebase.

Files

utils/constants.py

Purpose

Provides canonical pricing data, aggressiveness presets, semantic firewall defaults, and other system constants.

Pricing

  • MODEL_PRICING contains per-1,000-token pricing for supported models.
  • Pricing can be overridden using environment variables of the form:
    • NLPROXY_PRICE_{MODEL_NAME}_INPUT
    • NLPROXY_PRICE_{MODEL_NAME}_OUTPUT
  • _normalize_pricing_env_name() sanitizes the model name into a valid env var segment.
  • PROVIDER_PRICING is the runtime pricing table after env overrides.

Aggressiveness Presets

  • AGGRESSIVENESS_MAP
    • legal: 0.25
    • finance: 0.30
    • code: 0.45
    • general: 0.40
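A preset lookup might look like the sketch below. The map values are the ones listed above and `DEFAULT_AGGRESSIVENESS` is the documented system default; the helper `resolve_aggressiveness`, its case-insensitive matching, and the choice to fall back to the system default for unknown domains are assumptions.

```python
from typing import Dict

AGGRESSIVENESS_MAP: Dict[str, float] = {
    "legal": 0.25,
    "finance": 0.30,
    "code": 0.45,
    "general": 0.40,
}
DEFAULT_AGGRESSIVENESS = 0.2


def resolve_aggressiveness(domain: str) -> float:
    """Hypothetical helper: return the preset for a domain, else the default."""
    return AGGRESSIVENESS_MAP.get(domain.strip().lower(), DEFAULT_AGGRESSIVENESS)
```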

Semantic Firewall Defaults

  • SEMANTIC_FIREWALL_CONFIG controls optional semantic attack detection.
  • Default device preference is cpu.
  • Similarity threshold is 0.85 by default.
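To make the threshold concrete, here is a minimal sketch of a similarity check. It assumes the firewall compares a prompt embedding against known attack embeddings using cosine similarity, which this document does not confirm; the config key names and the function `is_suspicious` are also hypothetical.

```python
import math
from typing import Iterable, Sequence

# Key names are assumptions; the values mirror the documented defaults.
SEMANTIC_FIREWALL_CONFIG = {
    "enabled": False,            # detection is optional
    "device": "cpu",
    "similarity_threshold": 0.85,
}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_suspicious(
    prompt_vec: Sequence[float],
    attack_vecs: Iterable[Sequence[float]],
    threshold: float = SEMANTIC_FIREWALL_CONFIG["similarity_threshold"],
) -> bool:
    """Flag the prompt if it is at least `threshold`-similar to any attack vector."""
    return any(cosine_similarity(prompt_vec, v) >= threshold for v in attack_vecs)
```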

Semantic Stopwords

  • SEMANTIC_STOPWORDS contains curated low-value phrases used during prompt reconstruction.
  • Includes English and Spanish tokens for bilingual prompt handling.

System Defaults

  • DEFAULT_AGGRESSIVENESS: 0.2
  • DEFAULT_CONFIDENCE_THRESHOLD: 0.6
  • DEFAULT_MAX_TOKENS: 512
  • DEFAULT_TIMEOUT_SECONDS: 30.0
  • DEFAULT_BATCH_SIZE: 32
  • DEFAULT_EMBEDDING_DIM: 384
  • DEFAULT_SIMILARITY_THRESHOLD: 0.92

API Constants

  • API_VERSION: v1
  • CHAT_ENDPOINT: /v1/chat/completions
  • HEALTH_ENDPOINT: /health
  • METRICS_ENDPOINT: /metrics
  • DOCS_ENDPOINT: /docs

Security Constants

  • PLACEHOLDER_PREFIX: __PROT_
  • MAX_PROMPT_LENGTH: 100_000
  • MAX_RESPONSE_LENGTH: 50_000
  • ALLOWED_ROLES: {"system", "user", "assistant"}
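These limits could be enforced roughly as shown below. The constant values come from the list above; the function `validate_messages`, the message shape, and the decision to measure length in characters summed across all messages are assumptions made for illustration.

```python
from typing import Dict, List

PLACEHOLDER_PREFIX = "__PROT_"
MAX_PROMPT_LENGTH = 100_000
MAX_RESPONSE_LENGTH = 50_000
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: List[Dict[str, str]]) -> None:
    """Raise ValueError on a disallowed role or an over-long prompt."""
    total = 0
    for msg in messages:
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        total += len(msg.get("content", ""))
    if total > MAX_PROMPT_LENGTH:
        raise ValueError(f"prompt length {total} exceeds {MAX_PROMPT_LENGTH}")
```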

Utility Functions

  • get_pricing(model_name: str) -> Dict[str, float]
    • Returns pricing data for a given model name.
    • Falls back to default pricing if unknown.
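Since prices are quoted per 1,000 tokens, a cost estimate divides each token count by 1,000 before multiplying. The sketch below illustrates the fallback and the arithmetic; `DEFAULT_PRICING`, `estimate_cost`, the `input`/`output` keys, and all prices are hypothetical.

```python
from typing import Dict

# Placeholder tables; names other than get_pricing are illustrative.
PROVIDER_PRICING: Dict[str, Dict[str, float]] = {
    "example-model": {"input": 0.0005, "output": 0.0015},
}
DEFAULT_PRICING: Dict[str, float] = {"input": 0.001, "output": 0.002}


def get_pricing(model_name: str) -> Dict[str, float]:
    """Return pricing for a model, falling back to the default when unknown."""
    return dict(PROVIDER_PRICING.get(model_name, DEFAULT_PRICING))


def estimate_cost(model_name: str, input_tokens: int, output_tokens: int) -> float:
    """Cost = (tokens / 1000) * per-1,000-token price, summed over both directions."""
    price = get_pricing(model_name)
    return (input_tokens / 1000) * price["input"] + (output_tokens / 1000) * price["output"]
```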

utils/logger.py

Purpose

Standardizes logging configuration and request-context logging across the project.

Key Components

  • ContextFilter

    • Propagates request_id, user_id, and custom fields through thread-local context.
    • Methods: set_context(), get_context(), clear_context().
  • JSONFormatter

    • Emits structured JSON logs for production observability.
    • Includes timestamp, level, module, function, line, and additional context.
  • PrettyFormatter

    • Emits ANSI-colored human-readable logs for development.
  • setup_logging(level, format_type, log_dir, max_bytes, backup_count, disable_existing)

    • Configures root logger and optional rotating file output.
    • Detects environment via NLPROXY_ENV.
  • get_request_logger(name)

    • Returns a LoggerAdapter bound to current context.
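The way a context filter and a JSON formatter fit together can be sketched with the standard `logging` module. This is an assumption-laden illustration rather than the project's code: the use of class methods, the `context` attribute on the record, and the exact JSON field names are all guesses based on the descriptions above.

```python
import json
import logging
import threading
from typing import Any, Dict


class ContextFilter(logging.Filter):
    """Attach per-thread request context to every log record."""

    _local = threading.local()

    @classmethod
    def get_context(cls) -> Dict[str, Any]:
        if not hasattr(cls._local, "ctx"):
            cls._local.ctx = {}
        return cls._local.ctx

    @classmethod
    def set_context(cls, **fields: Any) -> None:
        cls.get_context().update(fields)

    @classmethod
    def clear_context(cls) -> None:
        cls._local.ctx = {}

    def filter(self, record: logging.LogRecord) -> bool:
        # An empty dict is attached when no context has been set.
        record.context = dict(self.get_context())
        return True


class JSONFormatter(logging.Formatter):
    """Serialize a record, plus any attached context, as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
            "message": record.getMessage(),
        }
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, default=str)
```

In this sketch, calling `ContextFilter.set_context(request_id="r1")` at the start of a request causes every subsequent record on that thread to carry `request_id` until `clear_context()` is called.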

Performance Considerations

  • The log format is selected by environment: human-readable output for development, machine-parseable JSON for production.
  • File rotation prevents unbounded disk growth.
  • Output from noisy third-party loggers is suppressed to keep runtime logs stable.

Edge Cases

  • setup_logging() is idempotent: calls after the first initialization are no-ops.
  • ContextFilter handles missing context data gracefully.
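One way to get idempotent setup with environment-based format selection is a module-level guard, sketched below. The guard variable, the boolean return value, the helper `select_format_type`, the `production` value for `NLPROXY_ENV`, the default sizes, and the file name `nlproxy.log` are assumptions; the real signature also accepts `disable_existing`, which is omitted here.

```python
import logging
import logging.handlers
import os
from typing import Mapping, Optional

_configured = False  # assumed guard; the real module may track this differently


def select_format_type(explicit: Optional[str], environ: Mapping[str, str]) -> str:
    """An explicit choice wins; otherwise derive the format from NLPROXY_ENV."""
    if explicit is not None:
        return explicit
    env = environ.get("NLPROXY_ENV", "development").lower()
    return "json" if env == "production" else "pretty"


def setup_logging(
    level: int = logging.INFO,
    format_type: Optional[str] = None,
    log_dir: Optional[str] = None,
    max_bytes: int = 10_000_000,  # assumed default
    backup_count: int = 5,        # assumed default
) -> bool:
    """Configure the root logger once. Returns False when already configured."""
    global _configured
    if _configured:
        return False
    chosen = select_format_type(format_type, os.environ)
    # A real JSON or pretty formatter would go here; a format string keeps this short.
    formatter = logging.Formatter(
        f"%(asctime)s %(levelname)s %(name)s [{chosen}] %(message)s"
    )
    handlers = [logging.StreamHandler()]
    if log_dir is not None:
        os.makedirs(log_dir, exist_ok=True)
        handlers.append(
            logging.handlers.RotatingFileHandler(
                os.path.join(log_dir, "nlproxy.log"),
                maxBytes=max_bytes,
                backupCount=backup_count,
            )
        )
    root = logging.getLogger()
    root.setLevel(level)
    for handler in handlers:
        handler.setFormatter(formatter)
        root.addHandler(handler)
    _configured = True
    return True
```

Because the guard short-circuits later calls, importing several modules that each call `setup_logging()` adds handlers only once.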

Implementation Notes

  • Utility modules are intentionally low-dependency and safe to import in startup initialization.
  • Logging format selection is automatic unless overridden explicitly.