Cost-efficient custom text-to-SQL using Amazon Nova Micro and Amazon Bedrock on-demand inference
Text-to-SQL generation remains a persistent challenge in enterprise AI applications, particularly when working with custom SQL dialects or domain-specific database schemas. While foundation models (FMs) demonstrate strong performance on standard SQL, achieving production-grade accuracy for specialized dialects requires fine-tuning. Fine-tuning, however, introduces an operational trade-off: hosting custom models on persistent infrastructure incurs continuous costs, even during periods of zero utilization.

On-demand inference in Amazon Bedrock with fine-tuned Amazon Nova Micro models offers an alternative. By combining the efficiency of Low-Rank Adaptation (LoRA) fine-tuning with serverless, pay-per-token inference, organizations can achieve custom text-to-SQL capabilities without the overhead cost of persistent model hosting. Despite the additional inference-time overhead of applying LoRA adapters, testing demonstrated latency suitable for interactive text-to-SQL applications, with costs scaling with usage rather than provisioned capacity.

In this post, we demonstrate two approaches to fine-tuning Amazon Nova Micro for custom SQL dialect generation that deliver both cost efficiency and production-ready performance. Our example workload cost $0.80 per month at a sample traffic of 22,000 queries per month, resulting in cost savings compared to persistently hosted model infrastructure.

Prerequisites

To deploy these solutions, you will need the following:

- An AWS account with billing enabled
- Standard IAM permissions and a role configured to access:
  - The Amazon Bedrock Nova Micro model
  - Amazon SageMaker AI
  - Amazon Bedrock model customization
- Quota for an ml.g5.48xlarge instance for Amazon SageMaker AI training
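The per-query cost implied by the workload figures above can be sanity-checked with simple arithmetic:

```python
# Figures from the example workload described above.
monthly_cost_usd = 0.80
queries_per_month = 22_000

cost_per_query = monthly_cost_usd / queries_per_month
print(f"${cost_per_query:.7f} per query")  # roughly $0.0000364 per query
```

At well under a hundredth of a cent per query, the dominant cost driver is actual token usage rather than idle capacity.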
Solution overview

The solution consists of the following high-level steps:

- Prepare your custom SQL training dataset with input/output pairs specific to your organization's SQL dialect and business requirements.
- Start the fine-tuning process on the Amazon Nova Micro model using your prepared dataset and selected fine-tuning approach:
  - Amazon Bedrock model customization for streamlined deployment
  - Amazon SageMaker AI for fine-grained training customization and control
- Deploy the custom model on Amazon Bedrock to use on-demand inference, removing infrastructure management while paying only for token usage.
- Validate model performance with test queries specific to your custom SQL dialect and business use cases.

To demonstrate this approach in practice, we provide two complete implementation paths that address different organizational needs. The first uses the managed model customization capability of Amazon Bedrock, for teams prioritizing simplicity and rapid deployment. The second uses Amazon SageMaker AI training jobs, for organizations requiring more granular control over hyperparameters and training infrastructure. Both implementations share the same data preparation pipeline and deploy to Amazon Bedrock for on-demand inference. The following are links to each GitHub code sample:

The following architecture diagram illustrates the end-to-end workflow, which encompasses data preparation, both fine-tuning approaches, and the Amazon Bedrock deployment path that enables serverless inference.

1. Dataset preparation

Our demonstration uses the sql-create-context dataset, a curated combination of the WikiSQL and Spider datasets containing over 78,000 examples of natural language questions paired with SQL queries across diverse database schemas. This dataset provides an ideal foundation for…
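Records in sql-create-context carry `question`, `context` (the `CREATE TABLE` statement), and `answer` fields. A minimal sketch of converting them into JSONL training pairs follows; the prompt template and the `prompt`/`completion` field names are illustrative assumptions, so check the exact format your chosen fine-tuning path expects:

```python
import json

def to_training_record(example: dict) -> dict:
    """Turn one sql-create-context row into a prompt/completion pair.

    The prompt wording and output keys here are assumptions for
    illustration, not the post's exact training format.
    """
    prompt = (
        "Given the schema:\n"
        f"{example['context']}\n"
        f"Write a SQL query to answer: {example['question']}"
    )
    return {"prompt": prompt, "completion": example["answer"]}

def write_jsonl(examples: list, path: str) -> None:
    # One JSON object per line, the shape typically expected by
    # fine-tuning jobs that consume JSONL training data.
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(to_training_record(ex)) + "\n")

# Example record shaped like a sql-create-context row:
sample = {
    "question": "How many departments are there?",
    "context": "CREATE TABLE department (id INTEGER, name TEXT)",
    "answer": "SELECT COUNT(*) FROM department",
}
print(json.dumps(to_training_record(sample), indent=2))
```

The same conversion feeds both implementation paths, since they share the data preparation pipeline.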

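Once the custom model is deployed for on-demand inference, it can be queried like any other Bedrock model. A sketch using the Bedrock Runtime Converse API, where the deployment ARN and prompt template are placeholders rather than values from this post:

```python
def build_prompt(question: str, schema: str) -> str:
    # Illustrative prompt template -- match whatever template
    # the model was fine-tuned with.
    return (
        f"Given the schema:\n{schema}\n"
        f"Write a SQL query to answer: {question}"
    )

def generate_sql(question: str, schema: str, deployment_arn: str) -> str:
    """Call a custom model deployment through Bedrock on-demand inference.

    deployment_arn is the ARN of your own custom model deployment
    (placeholder here). boto3 is imported lazily so the prompt helper
    stays usable without AWS credentials.
    """
    import boto3

    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=deployment_arn,
        messages=[
            {"role": "user", "content": [{"text": build_prompt(question, schema)}]}
        ],
        inferenceConfig={"maxTokens": 512, "temperature": 0.0},
    )
    return response["output"]["message"]["content"][0]["text"]
```

Because billing is per token, this call path incurs cost only when queries are actually made, which is what keeps low-traffic months inexpensive.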
