ETL/ELT and Big Data
What’s the difference between ETL and ELT, and which works best with big data?
ETL stands for “extract, transform, and load”. It’s the traditional set of functions that lets organizations extract data from numerous databases, applications, and systems, transform it as appropriate, and load it into another database, a data mart, or a data warehouse for analysis, or send it along to another operational system to support a business process.
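To make the order of operations concrete, here is a minimal sketch of an ETL flow in Python. It is an illustration only: the sample rows, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse. The point is that the data is cleaned before it ever reaches the target.

```python
import sqlite3

# Hypothetical source rows, standing in for data pulled from an operational system.
SOURCE_ROWS = [
    {"customer": " Alice ", "amount": "19.99"},
    {"customer": "bob", "amount": "5.00"},
    {"customer": "", "amount": "12.50"},  # invalid: customer name is missing
]


def extract():
    """Extract: read rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: clean and type the rows before they reach the target."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:
            continue  # drop rows that fail validation
        cleaned.append((name, float(row["amount"])))
    return cleaned


def load(conn, rows):
    """Load: write only the already-transformed rows into the target table."""
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for a warehouse
    load(conn, transform(extract()))
    return conn


conn = run_etl()
etl_result = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(etl_result)
```

In this sketch the target only ever sees the two valid, cleaned rows; the raw input is not kept anywhere in the warehouse.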
As you may have suspected, ELT stands for “extract, load, and transform”. In this process, the data is loaded into the target system first and transformed afterward. It is used primarily for big data and for working with data lakes. Data lakes are storage repositories that hold large quantities of raw data in its native format, whether structured, unstructured, or virtually any other kind of data.
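The load-first ordering can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the sample rows and the `raw_sales` and `sales` table names are hypothetical, and an in-memory SQLite database stands in for the data lake or target platform. The raw data lands untouched, and the transformation then runs inside the target using its own SQL engine.

```python
import sqlite3

# Hypothetical raw rows, kept exactly as they arrived from the source.
RAW_ROWS = [(" Alice ", "19.99"), ("bob", "5.00"), ("", "12.50")]


def run_elt():
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the target

    # Load: store the raw data as-is, with every column kept as text.
    conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", RAW_ROWS)

    # Transform: runs inside the target, after loading.
    conn.execute(
        """
        CREATE TABLE sales AS
        SELECT TRIM(customer) AS customer,
               CAST(amount AS REAL) AS amount
        FROM raw_sales
        WHERE TRIM(customer) <> ''
        """
    )
    conn.commit()
    return conn


conn = run_elt()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_sales").fetchone()[0]
sales = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(raw_count, sales)
```

Because the raw table is kept, a different transformation can be applied later without going back to the source system.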
For big data scenarios, the ELT process lets you copy the source data into Hadoop as-is. This is not as resource-intensive as ETL, where transforming the data while transporting it can be cumbersome. In ELT, the transformation runs inside Hadoop and takes advantage of large-scale parallel processing, so there is less stress on source systems and the time frame for transformation is shorter.
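The idea of transforming already-loaded data in parallel can be sketched as follows. This is not Hadoop code: a thread pool on a single machine stands in for a cluster, and the partitions and function names are hypothetical. It only shows the shape of the work, where each partition is transformed independently and the partial results are then combined.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions of raw data that has already been copied into the cluster.
PARTITIONS = [
    ["19.99", "5.00"],
    ["12.50", "7.25"],
    ["3.00"],
]


def transform_partition(rows):
    """Convert one partition's raw strings to cents and sum them."""
    return sum(round(float(value) * 100) for value in rows)


def transform_all(partitions, workers=3):
    """Transform each partition independently, then combine the partial results."""
    # Assumption: threads stand in for the worker nodes of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(transform_partition, partitions))
    return partials, sum(partials)


partials, total_cents = transform_all(PARTITIONS)
print(partials, total_cents)
```

None of this work touches the source system; once the copy has been made, all of the processing happens on the target side.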
Few solutions can do all of this. Rather than piecing together separate tools for extracting, loading, and transforming, look for a comprehensive big data integration platform that can execute all three steps efficiently. More information on how to make your big data work with your existing applications and processes can be found here.