Introduction
This study describes a project to design a data stream processing pipeline for one of our clients in the moving and storage industry. The objective was to extract, transform, and load data into data warehouses in real time, replacing inefficient batch processing jobs that relied heavily on outdated legacy systems. The legacy systems had been in place for 25 years and were deemed inefficient because of their limitations in speed, scalability, and dependency. By implementing a modern data stream processing pipeline, the aim was to gain the benefits of robustness, fault tolerance, and distributed processing.