MDAA TS Docs

    SageMaker-hosted LLM model configuration with auto-scaling for GAIA chatbot backends. Supports Falcon, Mistral, and Llama2 models with configurable instance types and scaling.

    Use cases: SageMaker LLM deployment; Auto-scaling chatbot backends; Custom instance sizing; Production LLM hosting

    AWS: Amazon SageMaker real-time inference endpoints with auto-scaling

    Validation: Required model field; instance counts must satisfy min <= initial <= max when provided

    interface SagemakerLlmModelConfig {
        initialInstanceCount?: number;
        instanceType?: string;
        maximumInstanceCount?: number;
        minimumInstanceCount?: number;
        model: SupportedSageMakerModels;
    }
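A minimal sketch of a config satisfying the documented constraints. The enum members mirror the values listed on this page (FalconLite, Llama2_13b_Chat, Mistral7b_Instruct2); the local enum and interface declarations stand in for the real module imports, whose paths are not shown here.

```typescript
// Stand-in declarations for illustration; the real definitions live in the MDAA modules.
enum SupportedSageMakerModels {
  FalconLite = "FalconLite",
  Llama2_13b_Chat = "Llama2_13b_Chat",
  Mistral7b_Instruct2 = "Mistral7b_Instruct2",
}

interface SagemakerLlmModelConfig {
  initialInstanceCount?: number;
  instanceType?: string;
  maximumInstanceCount?: number;
  minimumInstanceCount?: number;
  model: SupportedSageMakerModels; // the only required field
}

// Example config: counts satisfy min <= initial <= max (1 <= 2 <= 4).
const config: SagemakerLlmModelConfig = {
  model: SupportedSageMakerModels.Mistral7b_Instruct2,
  instanceType: "ml.g5.2xlarge",
  minimumInstanceCount: 1,
  initialInstanceCount: 2,
  maximumInstanceCount: 4,
};

console.log(config.model); // "Mistral7b_Instruct2"
```

Only `model` is required; a config consisting solely of the `model` field is also valid, leaving instance sizing to defaults.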
    Index

    Properties

    initialInstanceCount?: number

    Initial number of instances for the LLM endpoint.

    Use cases: Initial capacity planning; Baseline availability; Deployment sizing

    AWS: SageMaker endpoint initial instance count

    Validation: Optional; Positive integer; Must be between the minimum and maximum counts when those are provided

    instanceType?: string

    SageMaker instance type for LLM hosting.

    Use cases: Compute resource sizing; Performance optimization; Cost management

    AWS: Amazon SageMaker endpoint instance type

    Validation: Optional; Must be valid SageMaker instance type (e.g., ml.g5.2xlarge)

    maximumInstanceCount?: number

    Maximum instance count for LLM endpoint auto-scaling.

    Use cases: Peak capacity control; Cost limits; Auto-scaling upper bound

    AWS: SageMaker endpoint auto-scaling maximum

    Validation: Optional; Positive integer; Must be >= the initial and minimum counts when those are provided

    minimumInstanceCount?: number

    Minimum instance count for LLM endpoint auto-scaling.

    Use cases: Cost optimization; Minimum availability guarantee; Auto-scaling lower bound

    AWS: SageMaker endpoint auto-scaling minimum

    Validation: Optional; Positive integer; Must be <= the initial and maximum counts when those are provided
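The count constraints above (positive integers, with min <= initial <= max for whichever fields are present) can be sketched as a standalone check. This is an illustrative helper, not the library's actual validator; the function name is hypothetical.

```typescript
// Hypothetical validator for the documented count constraints.
// Returns a list of violation messages; an empty list means the counts are consistent.
function validateInstanceCounts(cfg: {
  initialInstanceCount?: number;
  minimumInstanceCount?: number;
  maximumInstanceCount?: number;
}): string[] {
  const errors: string[] = [];
  const { minimumInstanceCount: min, initialInstanceCount: init, maximumInstanceCount: max } = cfg;

  // Each count, when provided, must be a positive integer.
  for (const [name, value] of Object.entries({ min, init, max })) {
    if (value !== undefined && (!Number.isInteger(value) || value < 1)) {
      errors.push(`${name} must be a positive integer`);
    }
  }

  // Pairwise ordering checks, applied only when both sides are provided.
  if (min !== undefined && init !== undefined && min > init)
    errors.push("minimumInstanceCount must be <= initialInstanceCount");
  if (init !== undefined && max !== undefined && init > max)
    errors.push("initialInstanceCount must be <= maximumInstanceCount");
  if (min !== undefined && max !== undefined && min > max)
    errors.push("minimumInstanceCount must be <= maximumInstanceCount");

  return errors;
}
```

For example, `{ minimumInstanceCount: 1, initialInstanceCount: 2, maximumInstanceCount: 4 }` passes, while `{ minimumInstanceCount: 3, maximumInstanceCount: 2 }` fails the min <= max check.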

    model: SupportedSageMakerModels

    SageMaker LLM model to deploy for conversational AI.

    Use cases: LLM model selection; Chatbot backend model; Text generation endpoint

    AWS: Amazon SageMaker hosted LLM model

    Validation: Required; Must be valid SupportedSageMakerModels enum value (FalconLite, Llama2_13b_Chat, Mistral7b_Instruct2)