SageMaker Data Agent speeds SQL development in Query Editor

Amazon Web Services published an AWS Big Data Blog post and user documentation describing the SageMaker Data Agent in Query Editor, a feature of SageMaker Unified Studio that turns natural-language prompts into SQL. Per the AWS blog, the agent generates queries for Amazon Redshift and Amazon Athena, reads real table and column metadata from the AWS Glue Data Catalog so generated SQL references actual tables, and retains context across a session. AWS describes four core capabilities: catalog-aware SQL generation, querybook and session context, step-by-step planning that users review before execution, and one-click error recovery via Fix with AI. The blog walks through an example using a public California schools dataset. AWS states the agent operates within existing IAM and AWS Lake Formation permissions and keeps data inside the customer's AWS environment.
What happened
Amazon Web Services published an AWS Big Data Blog post and user documentation describing the SageMaker Data Agent in the Query Editor of SageMaker Unified Studio. Per the AWS blog, the agent converts natural-language prompts into executable SQL, references real table and column metadata from the AWS Glue Data Catalog, and preserves context across a session. The blog lists four core capabilities: catalog-aware SQL generation, querybook and session context, step-by-step planning, and Fix with AI for failed queries.
How it works
According to AWS, the agent targets SQL development against Amazon Redshift and Amazon Athena. For complex questions it proposes a structured plan (which data to retrieve, how to aggregate it, and what filters to apply) that the user reviews and approves step by step before any SQL is generated. Generated SQL is added to querybook cells, and the agent uses the active connection, the selected cell, and previous results as context. When a query fails, Fix with AI reads the error in context and returns corrected SQL. AWS notes that a related Data Agent in notebooks covers Python, SQL, and PySpark for broader analytics and ML workloads.
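To make the plan-then-generate flow concrete, here is a minimal Python sketch of a plan that must be approved before SQL is rendered from it. It is an illustration only, not the agent's internals or any AWS API: `QueryPlan`, `render_sql`, and the `schools` table with its `county`, `enrollment`, and `status` columns are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class QueryPlan:
    """Hypothetical stand-in for a reviewed plan; not an AWS data structure."""
    table: str                                                 # which data to retrieve
    select: list[str]                                          # plain columns to return
    aggregates: dict[str, str] = field(default_factory=dict)   # alias -> aggregate expression
    filters: list[str] = field(default_factory=list)           # WHERE conditions, ANDed together
    group_by: list[str] = field(default_factory=list)
    approved: bool = False                                     # set by the user after review


def render_sql(plan: QueryPlan) -> str:
    """Turn an approved plan into a SQL string; refuse unapproved plans."""
    if not plan.approved:
        raise ValueError("plan must be approved before SQL is generated")
    columns = list(plan.select) + [
        f"{expr} AS {alias}" for alias, expr in plan.aggregates.items()
    ]
    sql = f"SELECT {', '.join(columns)}\nFROM {plan.table}"
    if plan.filters:
        sql += "\nWHERE " + " AND ".join(plan.filters)
    if plan.group_by:
        sql += "\nGROUP BY " + ", ".join(plan.group_by)
    return sql


# Example: a hypothetical "enrollment by county" question.
plan = QueryPlan(
    table="schools",
    select=["county"],
    aggregates={"total_enrollment": "SUM(enrollment)"},
    filters=["status = 'Active'"],
    group_by=["county"],
)
plan.approved = True  # the review step happens before rendering
print(render_sql(plan))
```

Keeping the plan as data separate from the SQL text is what makes a review step possible: the user can inspect or change the filters and aggregates before any query exists.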
Security and governance
Per AWS, the agent operates only on data that a user's IAM policies and AWS Lake Formation permissions allow, keeps data within the customer's AWS environment, and does not store querybook context or catalog metadata. AWS also describes content filtering that restricts the agent to AWS-related topics.
Editorial analysis - technical context
Conversational, catalog-aware SQL tooling reduces friction for analysts who spend time locating tables, writing joins, and debugging queries, particularly across large data estates. As a general industry pattern, the usefulness of such agents depends heavily on catalog hygiene (table and column descriptions, accurate relationships) and on access controls that bound what generated SQL can touch.
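One rough way to gauge catalog hygiene is to measure how many columns carry a description. The Python sketch below assumes table metadata shaped like the `TableList` entries returned by the AWS Glue `GetTables` API (for example via boto3's `glue.get_tables(DatabaseName=...)`), where each column has a `Name`, a `Type`, and an optional `Comment`. The function name `description_coverage` and the sample tables are hypothetical, and the metric is an illustration rather than anything AWS prescribes.

```python
def description_coverage(tables: list[dict]) -> dict[str, float]:
    """Return, per table name, the fraction of columns with a non-empty Comment."""
    coverage = {}
    for table in tables:
        columns = table.get("StorageDescriptor", {}).get("Columns", [])
        if not columns:
            # No column metadata at all counts as zero coverage.
            coverage[table["Name"]] = 0.0
            continue
        documented = sum(
            1 for col in columns if (col.get("Comment") or "").strip()
        )
        coverage[table["Name"]] = documented / len(columns)
    return coverage


# Hand-written sample in the Glue table shape; table and column names are made up.
sample_tables = [
    {
        "Name": "schools",
        "StorageDescriptor": {
            "Columns": [
                {"Name": "county", "Type": "string", "Comment": "County name"},
                {"Name": "enrollment", "Type": "int"},
            ]
        },
    },
    {"Name": "staging_tmp", "StorageDescriptor": {"Columns": []}},
]

for name, share in description_coverage(sample_tables).items():
    print(f"{name}: {share:.0%} of columns documented")
```

Tables that score low here are reasonable candidates for metadata cleanup before relying on generated SQL against them.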
What to watch
Indicators to follow include how the agent handles complex joins and performance tuning, whether generated queries respect organizational naming and governance conventions, and how adoption compares with similar natural-language-to-SQL features from other data platforms.
Key Points
- AWS documented the SageMaker Data Agent in Query Editor, enabling natural-language-to-SQL generation that is aware of AWS Glue Data Catalog metadata and prior session context.
- Catalog-aware generation, step-by-step query plans, and one-click Fix with AI target the time analysts spend finding tables, writing joins, and debugging across Amazon Redshift and Amazon Athena.
- Editorial analysis: schema-aware SQL agents typically shift effort toward catalog quality, metadata descriptions, and access governance, which materially affect output accuracy.
Scoring Rationale
This is a useful but vendor-specific tool update: a natural-language-to-SQL agent for SageMaker Unified Studio's Query Editor that matters to data analysts and engineers working with Amazon Redshift, Athena, and Glue. It streamlines real workflows but is not a frontier-model release or a broad platform shift, placing it in the solid-but-niche range.