Values.SemanticChunkingConfiguration

Settings for semantic document chunking for a data source. Semantic chunking splits a document into smaller documents based on groups of similar content derived from the text with natural language processing.

type nonrec t = {
  maxTokens : SemanticChunkingConfigurationMaxTokensInteger.t;
      (** The maximum number of tokens that a chunk can contain. *)
  bufferSize : SemanticChunkingConfigurationBufferSizeInteger.t;
      (** The buffer size. *)
  breakpointPercentileThreshold : SemanticChunkingConfigurationBreakpointPercentileThresholdInteger.t;
      (** The dissimilarity threshold for splitting chunks. *)
}

val make :
  maxTokens:SemanticChunkingConfigurationMaxTokensInteger.t ->
  bufferSize:SemanticChunkingConfigurationBufferSizeInteger.t ->
  breakpointPercentileThreshold:
    SemanticChunkingConfigurationBreakpointPercentileThresholdInteger.t ->
  unit ->
  t

val to_value :
  t ->
  [> `Structure of
     (string
     * [> `Integer of SemanticChunkingConfigurationMaxTokensInteger.t ])
     list ]
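
The signatures above suggest the usual pattern: build a value with the labelled constructor make, then serialize it with to_value. The following is a minimal sketch of that pattern, not the library's implementation. It defines a local stand-in module so that it compiles on its own. It assumes that the three ...Integer.t types are plain int values and that the keys produced by to_value mirror the record field names; neither is confirmed by this page. The values 300, 0 and 95 are illustrative only.

```ocaml
(* Local stand-in for Values.SemanticChunkingConfiguration so that this
   sketch compiles without the real library.
   Assumption: the three ...Integer.t types are plain ints. *)
module SemanticChunkingConfiguration = struct
  type nonrec t = {
    maxTokens : int;
    bufferSize : int;
    breakpointPercentileThreshold : int;
  }

  (* Labelled arguments plus a final unit, matching the shape of [make]. *)
  let make ~maxTokens ~bufferSize ~breakpointPercentileThreshold () =
    { maxTokens; bufferSize; breakpointPercentileThreshold }

  (* Assumption: each key is the name of the corresponding record field. *)
  let to_value t =
    `Structure
      [
        ("maxTokens", `Integer t.maxTokens);
        ("bufferSize", `Integer t.bufferSize);
        ( "breakpointPercentileThreshold",
          `Integer t.breakpointPercentileThreshold );
      ]
end

(* Illustrative values only. *)
let config =
  SemanticChunkingConfiguration.make ~maxTokens:300 ~bufferSize:0
    ~breakpointPercentileThreshold:95 ()

(* Walk the serialized structure and print one "key = value" line per field. *)
let () =
  match SemanticChunkingConfiguration.to_value config with
  | `Structure fields ->
      List.iter
        (fun (name, `Integer n) -> Printf.printf "%s = %d\n" name n)
        fields
```

The trailing unit argument in make is what lets the compiler know that all labelled arguments have been supplied, which is why the call ends with ().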