Project Title:
UMG Consumer Analysis.
Environment:
Java, MapReduce, Hive, Hue, Spark, AWS EMR, S3, EC2, Step Functions, Lambda, Spring Boot, Data Pipeline, Python, Redshift, Qubole, BigQuery, Google Cloud Storage, Google Cloud Dataflow, Apache Beam, Cloud Composer (Airflow), Cloud Data Transfer API, Cloud SQL, and Kubernetes.
Project Précis:
UMG's Enterprise Reporting Service group receives consumer data daily from its digital sales partners, such as Spotify, iTunes, Amazon, and Google, for business intelligence and analytics purposes. The daily volume is approximately 60 GB (compressed), with approximately 300 million rows, and it is growing at approximately 10% month over month. This data is currently stored and processed within the UMG data center using traditional tools and systems. We currently use the cloud-based AWS platform (S3, EMR, and Redshift) to store the daily consumer analytics data from digital sales partners, which is in turn consumed by users through the MicroStrategy reporting tool.
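
As a rough illustration of what the stated growth rate implies for capacity planning, the sketch below compounds the approximately 60 GB/day figure at 10% month over month. It assumes the rate stays constant, which the summary does not claim; the function name and the 6, 12, and 24 month horizons are hypothetical choices made for this example.

```python
def project_daily_volume_gb(start_gb: float, monthly_growth: float, months: int) -> float:
    """Compound a daily volume forward by a fixed month-over-month growth rate."""
    return start_gb * (1.0 + monthly_growth) ** months


if __name__ == "__main__":
    # 60 GB/day and 10% growth come from the summary above;
    # the horizons are illustrative assumptions.
    for months in (6, 12, 24):
        volume = project_daily_volume_gb(60.0, 0.10, months)
        print(f"after {months:2d} months: {volume:.1f} GB/day (compressed)")
```

Under these assumptions the daily volume roughly triples within a year, which is the kind of estimate that motivates elastic cloud storage and compute over fixed data center capacity.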