Module Values_1.CreateInferenceComponentInput

Creates an inference component, which is a SageMaker AI hosting object that you can use to deploy a model to an endpoint. In the inference component settings, you specify the model, the endpoint, and how the model utilizes the resources that the endpoint hosts. You can optimize resource utilization by tailoring how the required CPU cores, accelerators, and memory are allocated. You can deploy multiple inference components to an endpoint, where each inference component contains one model and the resource utilization needs for that individual model. After you deploy an inference component, you can directly invoke the associated model when you use the InvokeEndpoint API action.

type nonrec t = {
  inferenceComponentName : InferenceComponentName.t;
    (* A unique name to assign to the inference component. *)
  endpointName : Values_0.EndpointName.t;
    (* The name of an existing endpoint where you host the inference component. *)
  variantName : Values_0.VariantName.t option;
    (* The name of an existing production variant where you host the inference component. *)
  specification : InferenceComponentSpecification.t option;
    (* Details about the resources to deploy with this inference component, including the model, container, and compute resources. *)
  specifications : InferenceComponentSpecificationList.t option;
    (* A list of specification objects for the inference component, one per instance type. Use this parameter when you want to deploy a different model or resource configuration for the inference component on each instance type. You can use either this parameter or the singular Specification parameter, but not both. *)
  runtimeConfig : InferenceComponentRuntimeConfig.t option;
    (* Runtime settings for a model that is deployed with an inference component. *)
  tags : Values_0.TagList.t option;
    (* A list of key-value pairs associated with the model. For more information, see Tagging Amazon Web Services resources in the Amazon Web Services General Reference. *)
}
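The record type lets both specification and specifications be set at once, although the field documentation says to use one or the other. The following is a minimal sketch of a client-side check for that rule. It uses a simplified stand-in record with string payloads rather than the real InferenceComponentSpecification.t and InferenceComponentSpecificationList.t types, and the names input and check_specs are hypothetical, not part of this module.

```ocaml
(* Simplified stand-in for the two mutually exclusive fields. The real
   record uses InferenceComponentSpecification.t and
   InferenceComponentSpecificationList.t instead of strings. *)
type input = {
  specification : string option;
  specifications : string list option;
}

(* Hypothetical helper: reject inputs that set both fields. *)
let check_specs (i : input) : (input, string) result =
  match i.specification, i.specifications with
  | Some _, Some _ ->
    Error "set either specification or specifications, not both"
  | _ -> Ok i

let () =
  let ok = { specification = Some "spec-a"; specifications = None } in
  let bad =
    { specification = Some "spec-a"; specifications = Some [ "spec-b" ] }
  in
  assert (check_specs ok = Ok ok);
  assert (Result.is_error (check_specs bad))
```

Running a check like this before building the request surfaces the conflict locally instead of waiting for the service to reject it.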
val context_ : string
val make : ?variantName:??? -> ?specification:??? -> ?specifications:??? -> ?runtimeConfig:??? -> ?tags:??? -> inferenceComponentName:InferenceComponentName.t -> endpointName:Values_0.EndpointName.t -> unit -> t
val to_value : t -> [> `Structure of (string * [> `List of [> `Structure of (string * [> `Enum of string | `String of Values_0.ModelName.t | `Structure of (string * [> `Boolean of EnableCaching.t | `Enum of string | `Float of NumberOfCpuCores.t | `Integer of Values_0.ProductionVariantModelDataDownloadTimeoutInSeconds.t | `Map of ([> `String of string ] * [> `String of string ]) list | `String of Values_0.ContainerImage.t | `Structure of (string * [> `Enum of string | `Integer of Values_0.AvailabilityZoneBalanceMaxImbalance.t ]) list ]) list ]) list ] list | `String of InferenceComponentName.t | `Structure of (string * [> `Enum of string | `Integer of InferenceComponentCopyCount.t | `String of Values_0.ModelName.t | `Structure of (string * [> `Boolean of EnableCaching.t | `Enum of string | `Float of NumberOfCpuCores.t | `Integer of Values_0.ProductionVariantModelDataDownloadTimeoutInSeconds.t | `Map of ([> `String of string ] * [> `String of string ]) list | `String of Values_0.ContainerImage.t | `Structure of (string * [> `Enum of string | `Integer of Values_0.AvailabilityZoneBalanceMaxImbalance.t ]) list ]) list ]) list ]) list ]
val to_query : t -> Awso.Client.Query.t
val of_xml : Awso.Xml.t -> t
val of_string : string -> t
val of_json : Yojson.Safe.t -> t
val to_json : t -> Yojson.Safe.t
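The make signature follows the usual OCaml constructor shape: optional labeled arguments, then required labeled arguments, then a final unit. The sketch below shows how a function of that shape is called. It is only an illustration, assuming a cut-down stand-in record with string fields; the real make takes the Values_0 and Values_1 types listed above, and the names "my-component", "my-endpoint", and "AllTraffic" are made-up values.

```ocaml
(* Stand-in record with string payloads; the real field types live in
   Values_0 and Values_1. Only three of the seven fields are modeled. *)
type t = {
  inferenceComponentName : string;
  endpointName : string;
  variantName : string option;
}

(* Same argument shape as the documented make: optional arguments,
   required labeled arguments, and a trailing unit so that omitted
   optional arguments default to None. *)
let make ?variantName ~inferenceComponentName ~endpointName () =
  { inferenceComponentName; endpointName; variantName }

let () =
  (* Only the required arguments. *)
  let a =
    make ~inferenceComponentName:"my-component" ~endpointName:"my-endpoint" ()
  in
  (* With one optional argument supplied. *)
  let b =
    make ~variantName:"AllTraffic" ~inferenceComponentName:"my-component"
      ~endpointName:"my-endpoint" ()
  in
  assert (a.variantName = None);
  assert (b.variantName = Some "AllTraffic")
```

The trailing unit argument is what lets the compiler decide that omitted optional arguments are absent rather than still pending.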