ETL/ELT and Big Data

By Beth Adams | December 22, 2016
in iWay, Omni, Omni-Gen, big data, data integration, data lakes, Data Quality, Hadoop, Data Access, Data Cleansing, Data Governance, Data Integrity, Data Profiling, ETL

freeimages.com/Svilen Milev

What’s the difference between ETL and ELT and what works best with big data?

ETL stands for “extract, transform, and load”. It’s the traditional set of functions that lets organizations extract data from numerous databases, applications, and systems, transform it as appropriate, and load it into another database, a data mart, or a data warehouse for analysis, or send it along to another operational system to support a business process.
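To make the order of operations concrete, here is a minimal ETL sketch in Python. It is only an illustration under stated assumptions: the in-memory `SOURCE_ROWS` list stands in for a real source system, SQLite stands in for the warehouse, and the function and table names are hypothetical. The point to notice is that the data is cleaned before it ever reaches the target.

```python
import sqlite3

# Hypothetical source rows, standing in for data pulled from an operational system.
SOURCE_ROWS = [
    {"id": 1, "name": "  Alice ", "amount": "10.50"},
    {"id": 2, "name": "BOB", "amount": "3"},
]


def extract():
    """Extract: read rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: tidy names and convert amounts before anything is loaded."""
    return [
        (r["id"], r["name"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the already-transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three steps in ETL order: extract, then transform, then load."""
    load(conn, transform(extract()))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # SQLite is an assumed stand-in target
    run_etl(conn)
    print(conn.execute("SELECT * FROM sales ORDER BY id").fetchall())
```

In this arrangement the target only ever sees clean rows, and all transformation work happens in the pipeline itself, outside the target.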

As you may have suspected, ELT stands for “extract, load, and transform”. Here the data is loaded into the target first and transformed afterward, an approach used primarily with big data and data lakes. Data lakes are storage repositories that hold large quantities of raw data in its native format. This can include structured or unstructured information, or virtually any kind of data.
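The ELT ordering can be sketched the same way. This is a hedged illustration, not a recipe for any particular platform: SQLite is an assumed stand-in for the data lake or big data store, and `RAW_RECORDS`, `raw_sales`, and `sales` are hypothetical names. The raw data lands untouched, and the transformation then runs as a query inside the target.

```python
import sqlite3

# Hypothetical raw records, kept exactly as the source produced them.
RAW_RECORDS = [
    (1, "  Alice ", "10.50"),
    (2, "BOB", "3"),
]


def extract():
    """Extract: read rows from the (hypothetical) source."""
    return list(RAW_RECORDS)


def load_raw(conn, rows):
    """Load: copy the data into the target as-is, with no cleanup."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_sales (id INTEGER, name TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", rows)
    conn.commit()


def transform_in_target(conn):
    """Transform: let the target's own engine reshape the raw data."""
    conn.execute("DROP TABLE IF EXISTS sales")
    conn.execute(
        """
        CREATE TABLE sales AS
        SELECT id,
               LOWER(TRIM(name)) AS name,
               CAST(amount AS REAL) AS amount
        FROM raw_sales
        """
    )
    conn.commit()


def run_elt(conn):
    """Run the three steps in ELT order: extract, then load, then transform."""
    load_raw(conn, extract())
    transform_in_target(conn)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # SQLite is an assumed stand-in target
    run_elt(conn)
    print(conn.execute("SELECT * FROM sales ORDER BY id").fetchall())
```

Because `raw_sales` is kept, the transformation can be rerun or changed later without going back to the source system.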

For big data scenarios, the ELT process lets you create copies of the source data and move them into Hadoop as-is. This is less resource-intensive than ETL, where transporting and transforming the data before it is loaded can be cumbersome. In ELT, the transformation runs inside Hadoop and takes advantage of large-scale parallel processing, which puts less stress on source systems and shortens the time frame for transformation.

Few solutions can do all of this. Rather than piecing together separate tools for extracting, loading, and transforming, look for a comprehensive big data integration platform that can execute all three efficiently. More information on how to make your big data work with your existing applications and processes can be found here.