AWS Announces Four Zero-ETL Integrations to Make Data Access and Analysis Faster and Easier Across Data Stores
At AWS re:Invent, Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), today announced new integrations that enable customers to quickly and easily connect and analyze data without building and managing complex extract, transform, and load (ETL) data pipelines. New Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon Relational Database Service (Amazon RDS) for MySQL integrations with Amazon Redshift make it easier to connect and analyze transactional data from multiple relational and non-relational databases in Amazon Redshift. Customers can also now use Amazon OpenSearch Service to perform full-text and vector search on DynamoDB data in near real time. By making it easier to connect to and act on their data, no matter where it lives, these zero-ETL integrations help customers leverage the breadth and depth of AWS’s leading database and analytics services to discover new insights, innovate faster, and make better data-driven decisions. To learn more about unlocking the value of data using AWS, visit aws.amazon.com/data.
“To help customers fuel innovation with data, AWS offers the industry’s broadest and deepest set of data services for storing and querying any type of data at scale,” said Dr. Swami Sivasubramanian, vice president of Data and Artificial Intelligence at AWS. “In addition to having the right tool for the job, customers need to be able to integrate the data that is spread across their organizations to unlock more value for their business and innovate faster. That is why we are investing in a zero-ETL future, where data integration is no longer a tedious, manual effort, and customers can easily get their data where they need it. The new integrations announced today move customers toward this zero-ETL future, and we are continuing to invest in this vision to make it easy for customers to integrate data from across their entire system, so they can focus on driving new insights.”
Data is any organization’s differentiator. However, organizations have different types of data coming from different origins at varying scales and speeds, and the uses for this data are just as varied. For organizations to make the most of their data, they need a comprehensive set of tools that accounts for all of these variables, along with the ability to integrate and combine data spread across multiple sources. For example, a company may store transactional data in a relational database that it wants to analyze in a data warehouse, but use another analytics tool to perform a vector search on data from a non-relational database. Historically, moving data has required customers to architect their own ETL pipelines, which can be challenging and costly to build, complex to manage, and prone to intermittent errors that delay access to time-sensitive insights. To help customers derive value from their data, AWS offers a comprehensive set of data services, so customers always have the right tool for the job. But to put data at the center of their businesses, customers need to be able to connect all of their data, regardless of where it lives. That is why AWS has invested in zero-ETL capabilities that remove the burden of manually moving data. This includes federated query capabilities in Amazon Redshift and Amazon Athena (which enable customers to directly query data stored in operational databases, data warehouses, and data lakes) and Amazon Connect analytics data lake (which makes it easier for customers to access contact center data for analytics and machine learning). It also includes new zero-ETL integrations between Salesforce Data Cloud and AWS storage, data, and analytics services to enable customers to easily and seamlessly unify their data across Salesforce and AWS for better, faster insights.
The integrations announced today build on AWS’s zero-ETL foundation to remove the burden of building and maintaining data pipelines, so customers can quickly and easily connect all of their data, no matter where it lives.
- New Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon RDS for MySQL zero-ETL integrations with Amazon Redshift make it easier to analyze transactional data without building and maintaining data pipelines: To maximize the value they get out of their data, many organizations want to move their transactional data from multiple high-performance databases—including relational databases such as Aurora and Amazon RDS, and non-relational databases such as DynamoDB—into a data warehouse such as Amazon Redshift to run high-performance data warehousing and analytics workloads on petabytes of data. However, this data movement requires customers to create ETL pipelines for every data source. To make it easier to analyze Aurora data with Amazon Redshift, AWS announced the general availability of Aurora MySQL zero-ETL integration with Amazon Redshift earlier this year. This integration processes more than 1 million transactions per minute and makes data available in Amazon Redshift within seconds of being written in Aurora MySQL. To further extend the benefits of zero-ETL, AWS is announcing the preview of new zero-ETL integrations for Aurora PostgreSQL, DynamoDB, and Amazon RDS for MySQL with Amazon Redshift. These integrations help customers quickly and easily access data from popular relational and non-relational databases in Amazon Redshift for comprehensive analysis. Customers simply select the data tables containing the data they want within their databases, and it is automatically replicated to Amazon Redshift. By bringing together data from disparate sources into a single data warehouse, customers can gain a consolidated view of their business and take advantage of advanced Amazon Redshift features, including data sharing, materialized views, and Amazon Redshift ML, to get holistic and predictive insights.
- Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service enables full-text and vector search on transactional data in near real time: To optimize business operations and create a more engaging experience for users, many customers use OpenSearch Service to perform advanced search functions (e.g., full-text and vector search, relevancy ranking, and autocomplete suggestions) on their transactional data in DynamoDB. For example, an ecommerce company might replicate data from DynamoDB to OpenSearch Service to use a vector search to automatically determine if transactions are fraudulent by comparing them with data on similar transactions. Now generally available, the new DynamoDB zero-ETL integration with OpenSearch Service makes it easier for customers to run powerful full-text and vector search queries on their DynamoDB data in near real time. Customers simply choose the DynamoDB tables containing the data they want to analyze, and the data is replicated into OpenSearch Service within seconds of being written in DynamoDB. Customers can synchronize data from multiple DynamoDB tables into one OpenSearch Service managed cluster or serverless collection to gain holistic insights across multiple applications and consolidate their search assets, reducing their costs while increasing operational efficiency.
Grabyo is a live video production platform that makes it faster, simpler, and more efficient to produce, distribute, and monetize professional video output. “Today, we have to build and manage labor intensive and expensive ETL pipelines that create a significant operational load on our data engineering team,” said Mun Wai Kong, chief technology officer at Grabyo. “We use DynamoDB and Amazon Redshift to power our platform and are impressed with the simplicity and scale they provide. We are excited about the new Amazon DynamoDB zero-ETL integration with Amazon Redshift, which will enable our applications to go from transactions to insights within minutes. This will help us focus our data engineering resources on unlocking value for our business and users, instead of building and managing data pipelines.”
iCIMS is a provider of talent acquisition technology that enables organizations everywhere to hire great people. “We are always looking to modernize our applications with AWS through an effective, modern data strategy that utilizes purpose-built data stores, as well as an efficient and scalable data sharing and ingestion process that requires little to no maintenance,” said Ben Barresi, vice president of Cloud Hosting and Engineering at iCIMS. “We look forward to using Amazon Aurora PostgreSQL zero-ETL integration with Amazon Redshift, which will remove the burden of data pipeline management and maintenance for our engineering team. Because the data will be available in Amazon Redshift seconds after it is written in Aurora, we will be able to improve the speed at which we can analyze our data compared to the nightly one-time batch ingestion we currently run.”
Kaplan, Inc. is a global educational services company that supports individuals and institutions in achieving their goals, particularly in a dynamic and evolving environment. “The Kaplan Data Engineering team manages hundreds of data pipelines, which are expensive to maintain and prone to frequent connection errors,” said Naveen Kambhoji, senior manager at Kaplan, Inc. “The new RDS for MySQL zero-ETL integration with Amazon Redshift will allow us to seamlessly transfer transactional data to Amazon Redshift to perform analyses that help us better understand various aspects of our business, including student activities, exam completion rates, and student churn rates. We expect this new offering will help us improve reliability, scalability, and cost efficiency for our data infrastructure, while also freeing engineers to focus on more strategic projects.”
Muzz is a leading global dating app for Muslim communities around the world. “We use Amazon DynamoDB to manage and store profile data and user-created content. Previously, our engineering team used AWS Lambda stream processors to load data from DynamoDB Streams into Amazon OpenSearch Service for our search and analytics workloads,” said Alex Bilbie, head of Engineering at Muzz. “It took weeks to design and build the initial data pipelines, and then we had to devote ongoing engineering resources to maintain and update the custom ETL code whenever we needed to make changes. DynamoDB zero-ETL integration with Amazon OpenSearch Service makes it significantly easier to use our DynamoDB data in OpenSearch Service, helping reduce operational overhead and freeing the team to work on user-facing features that further enhance the Muzz experience.”
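For readers unfamiliar with the kind of custom ETL code described above, the following is a minimal, illustrative sketch of a stream processor that turns DynamoDB Streams records into OpenSearch bulk-request lines. It is not Muzz's code and not how the zero-ETL integration is implemented. The index name `profiles`, the function names, and the choice to join key values into a document ID are all assumptions made for illustration. The sketch also assumes the stream is configured to include new item images, and it stops short of sending the request body to a cluster.

```python
import json
from typing import Any, Dict, Iterable, List

INDEX_NAME = "profiles"  # hypothetical index name, chosen for illustration


def unwrap(attr: Dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed attribute value into a plain Python value."""
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        # DynamoDB transmits numbers as strings.
        try:
            return int(value)
        except ValueError:
            return float(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [unwrap(item) for item in value]
    if type_tag == "M":
        return {name: unwrap(item) for name, item in value.items()}
    if type_tag == "SS":
        return sorted(value)
    # Binary and number-set types are left out to keep the sketch short.
    raise ValueError(f"unsupported attribute type: {type_tag}")


def to_bulk_lines(records: Iterable[Dict[str, Any]], index: str) -> List[str]:
    """Translate stream records into bulk action lines (one JSON object per line)."""
    lines: List[str] = []
    for record in records:
        keys = record["dynamodb"]["Keys"]
        # Assumption: the document ID is the key values joined in key-name order.
        doc_id = "|".join(str(unwrap(keys[name])) for name in sorted(keys))
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
        else:
            # INSERT and MODIFY both overwrite the whole document.
            # Assumes the stream view type includes the new image.
            document = unwrap({"M": record["dynamodb"]["NewImage"]})
            lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
            lines.append(json.dumps(document))
    return lines


def handler(event: Dict[str, Any], context: Any = None) -> str:
    """Lambda-style entry point that returns the newline-delimited bulk body.

    A real processor would send this body to the cluster's bulk endpoint with
    authentication, retries, and partial-failure handling, all omitted here.
    """
    return "\n".join(to_bulk_lines(event["Records"], INDEX_NAME)) + "\n"
```

Even this reduced version has to handle type conversion, deletes, and ID construction, and a production pipeline would add error handling, batching, and schema changes on top, which is the ongoing maintenance work the quote refers to.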
Orion Advisor Solutions provides a tech-enabled fiduciary process that transforms the advisor-client relationship by enabling financial advisors with a single, connected, and technology-driven experience. “Data accessibility, delivery, and transparency are critical to our business, which is why we use Amazon Redshift to power our data processing solutions and share data with our advisor clients in real time to power decision making,” said Brian McLaughlin, president of Orion Advisor Technology. “Amazon DynamoDB zero-ETL integration with Amazon Redshift will allow us to quickly and easily analyze additional data sources to provide financial advisors with holistic insights and improve our data-sharing solutions for our clients and partners. We are excited to use this new zero-ETL integration, which will eliminate the need to build and manage custom data pipeline solutions.”
United Airlines operates a large domestic and international route network, spanning cities large and small across the U.S. “Today, our team uses AWS technologies like Amazon RDS and Amazon Redshift to collect and analyze data from our applications, customers, and operations. However, manually managing the data pipeline to connect all of this data at global scale is costly and labor intensive,” said Sanjay Nair, managing director at United Airlines. “With Amazon RDS for MySQL zero-ETL integration with Amazon Redshift, we plan to build our own self-healing data pipelines to automate disaster recovery and pipeline- and data-quality functions while reducing the burden on our data engineers. As a result, we will also be able to capitalize on powerful Amazon Redshift features such as cross-Region data sharing, Amazon Redshift Serverless, and Amazon Redshift Spectrum to unlock insights on our RDS for MySQL data.”
About Amazon Web Services
Since 2006, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud. AWS has been continually expanding its services to support virtually any workload, and it now has more than 240 fully featured services for compute, storage, databases, networking, analytics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, virtual and augmented reality (VR and AR), media, and application development, deployment, and management from 102 Availability Zones within 32 geographic regions, with announced plans for 15 more Availability Zones and five more AWS Regions in Canada, Germany, Malaysia, New Zealand, and Thailand. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—trust AWS to power their infrastructure, become more agile, and lower costs. To learn more about AWS, visit aws.amazon.com.
About Amazon
Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Amazon strives to be Earth’s Most Customer-Centric Company, Earth’s Best Employer, and Earth’s Safest Place to Work. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Career Choice, Fire tablets, Fire TV, Amazon Echo, Alexa, Just Walk Out technology, Amazon Studios, and The Climate Pledge are some of the things pioneered by Amazon. For more information, visit www.amazon.com/about and follow @AmazonNews.