An event-driven serverless ETL pipeline on AWS

Pogiatzis, Antreas (ORCID: https://orcid.org/0000-0001-8887-0139) and Samakovitis, Georgios (ORCID: https://orcid.org/0000-0002-0076-8082) (2020) An event-driven serverless ETL pipeline on AWS. Applied Sciences, 11 (1):191. ISSN 2076-3417 (Online) (doi:10.3390/app11010191)

PDF (Open Access Article)
30902 SAMAKOVITIS_An_Event-driven_Serverless_ETL_Pipeline_On_AWS_(OA)_2020.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

This work presents a serverless architecture for an event-driven Extract, Transform, and Load (ETL) pipeline and evaluates its performance over a range of dataflow tasks of varying frequency, velocity, and payload size. We design an experiment using generated tabular data across varying data volumes, event frequencies, and processing power in order to measure: (i) the consistency of pipeline executions; (ii) the reliability of data delivery; (iii) the maximum payload size per pipeline; and (iv) economic scalability (the cost of chargeable tasks). We run 92 parameterised experiments on a simple AWS architecture, avoiding any AWS-enhanced platform features, to allow for an unbiased assessment of our model’s performance. Our results indicate that our reference architecture can achieve time-consistent data processing of event payloads of more than 100 MB, with a throughput of 750 KB/s across four event frequencies. We also observe that, although using an SQS queue for data transfer enables easy concurrency control and data slicing, the queue becomes a bottleneck for large event payloads. Finally, we develop and discuss a candidate pricing model for usage of our reference architecture.

Item Type: Article
Uncontrolled Keywords: serverless, FaaS, event-driven, distributed, AWS, ETL, architecture
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Faculty / School / Research Centre / Research Group: Faculty of Engineering & Science
Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS)
Last Modified: 23 May 2022 10:59
URI: http://gala.gre.ac.uk/id/eprint/30902
