Leveraging AWS services to build a scalable, reliable and cost-effective reporting solution for NVD
Client Profile
Customer Name: NVD
Established: 1980
Customer Since: 2023
Sector: Logistics
Overview of the customer
NVD is a family-owned business, established in 1980 to assist vehicle manufacturers in the timely and incident-free distribution of their products. They’ve established themselves as a one-stop shop providing the three main pillars of outbound logistics: transporting, storing and enhancing their customers’ vehicles prior to delivery to their end location.
Since 1980, NVD has been at the forefront of digitising the finished vehicle logistics (FVL) industry. As early adopters of emerging technology, they have been able to introduce significant efficiencies, saving time and money for their customers.
Background
NVD have been part of the Version 1 managed service since 2023. Since taking on support, we have delivered comprehensive infrastructure support, including proactive monitoring, hosting services, centralised management, robust security, and adherence to SLAs. This ensures optimal performance, reliability, and compliance with industry standards. The solution described in this case study is now embedded into support.
NVD rely on our team to oversee, maintain, and enhance their cloud infrastructure.
The Challenge
NVD were utilising a manual data preparation process, involving Excel and Google Sheets, to generate multiple reports from their Inform SaaS platform. This platform, critical to their operations, manages vehicle planning, activity, and operations. NVD required a data platform capable of automating data extraction, providing point-in-time data views, flexible reporting and dashboarding, robust data preparation and quality control, and delivering reliable, automated insights.
The Solution
After reviewing the challenges faced, it was decided that the overall approach should be to build a data pipeline in AWS, with INFORM as the data source and Amazon QuickSight as the BI platform for analysis and reports:
- Data Collection via HTTP Calls:
- Use AWS Lambda to create serverless functions that make HTTP calls to obtain the source data from a public external HTTP endpoint. Leverage AWS Step Functions for interactive control and management.
- Store the retrieved data in Amazon S3 buckets.
- Data Processing and Validation:
- Use AWS Glue for data transformation and validation. A Glue job picks up the retrieved data from the S3 bucket and saves the results to Amazon Redshift and/or S3. Glue can clean and validate the data using Python or Spark scripts.
- Data Storage:
- Store the processed and validated data in Amazon Redshift, a fully managed data warehouse suitable for BI and reporting purposes. Using a Redshift database helps to address data joining challenges. It is also beneficial in the long run: as the amount of collected data builds up, it provides drill-down capabilities and historical lookup across previous years.
- Aggregation:
- Utilise SQL queries or data transformation scripts in AWS Glue to aggregate the data and perform necessary calculations for stored values.
- Amazon Redshift is optimised for quick and interactive query performance on large-scale datasets and its powerful SQL capabilities can handle aggregations.
- Visualisation:
- Use Amazon QuickSight, a cloud-native business intelligence (BI) service, for visualisation. QuickSight can directly connect to Redshift for real-time data visualisation.
- Create interactive dashboards, graphs, and charts in QuickSight to present the aggregated data in a visually appealing manner.
- Automation and Orchestration:
- Automate the entire process using AWS Step Functions to schedule and orchestrate the data pipeline from data collection to visualisation.
- Monitor the pipeline’s health and performance using Amazon CloudWatch for logging and monitoring.
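As an illustration of the orchestration step, the sketch below builds a two-state Amazon States Language definition in Python: a Lambda task that collects the data, followed by a Glue job that transforms and loads it. It is a minimal sketch under stated assumptions, not the definition deployed for NVD. The names `inform-collect` and `inform-transform`, the state names, and the retry values are all hypothetical.

```python
import json

# Hypothetical resource names; the real function and job names are not given
# in this case study.
COLLECT_FUNCTION = "inform-collect"
GLUE_JOB = "inform-transform"

# Assumed retry policy: retry any error up to 3 times with exponential backoff.
RETRY = [
    {
        "ErrorEquals": ["States.ALL"],
        "IntervalSeconds": 30,
        "MaxAttempts": 3,
        "BackoffRate": 2.0,
    }
]


def build_pipeline_definition(collect_function, glue_job):
    """Return a state machine definition: collect data, then transform and load."""
    return {
        "Comment": "Sketch: collect source data, then transform and load it",
        "StartAt": "CollectData",
        "States": {
            "CollectData": {
                "Type": "Task",
                # Step Functions service integration that invokes a Lambda function.
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": collect_function,
                    "Payload.$": "$",
                },
                "Retry": RETRY,
                "Next": "TransformAndLoad",
            },
            "TransformAndLoad": {
                "Type": "Task",
                # The .sync suffix makes the state wait until the Glue job finishes.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job},
                "Retry": RETRY,
                "End": True,
            },
        },
    }


definition = build_pipeline_definition(COLLECT_FUNCTION, GLUE_JOB)
print(json.dumps(definition, indent=2))
```

The printed JSON could be supplied as the definition when creating a state machine. Keeping each stage as its own state means a failed run shows which stage broke, and retries apply per stage rather than to the whole pipeline. Triggering the state machine on a timetable would typically be done by a separate scheduled rule, which is not shown here.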
By leveraging these AWS services, we were able to build a scalable, reliable, and cost-effective reporting solution that meets NVD’s requirements.
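To make the validation step more concrete, the sketch below shows the kind of record-level check a Python-based Glue job might apply before loading data into the warehouse. It is a hedged illustration only: the field names (`vehicle_id`, `status`, `updated_at`) and the rules are assumptions, since the structure of the source data is not described in this case study.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# Hypothetical required fields; the real source schema is not documented here.
REQUIRED_FIELDS = ("vehicle_id", "status", "updated_at")


def validate_record(record: Dict) -> List[str]:
    """Return the problems found in one record (an empty list means valid)."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    timestamp = record.get("updated_at")
    if timestamp:
        try:
            datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            problems.append("updated_at is not an ISO 8601 timestamp")
    return problems


def split_valid_invalid(records: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate loadable records from rejected ones, keeping the reasons."""
    valid, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            rejected.append({"record": record, "problems": problems})
        else:
            valid.append(record)
    return valid, rejected


# Made-up sample data, for illustration only.
sample = [
    {"vehicle_id": "V1", "status": "in_transit", "updated_at": "2024-05-01T08:30:00"},
    {"vehicle_id": "", "status": "stored", "updated_at": "2024-05-01T09:00:00"},
    {"vehicle_id": "V3", "status": "delivered", "updated_at": "yesterday"},
]
valid, rejected = split_valid_invalid(sample)
print(len(valid), "valid,", len(rejected), "rejected")
```

Keeping rejected records alongside the reasons for rejection, rather than silently dropping them, gives the quality-control step something to report on and makes recurring source-data issues visible.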
Results and benefits
Addressing the customer’s current data preparation challenges yielded significant benefits, including:
Efficiency and productivity:
- Time savings: Automation of data extraction, preparation, and reporting processes would drastically reduce the time spent on manual tasks
- Error reduction: Automated processes would minimise human error, ensuring data accuracy and reliability
- Increased productivity: Business users could focus on strategic tasks rather than time-consuming data preparation
Improved Data Quality:
- Data consistency: Centralised data management would ensure consistency across different reports and analyses
- Data accuracy: Automated quality control measures would help identify and correct errors
- Enhanced data integrity: Robust data governance practices would protect data privacy and security
Enhanced Decision Making:
- Timely insights: Point-in-time data views would provide up-to-date information for informed decision-making
- Flexible reporting: Customised reports and dashboards would enable users to analyse data in a way that best suits their needs
- Data-driven decision-making: A reliable data platform would support evidence-based decision-making
Cost Savings:
- Reduced labour costs: Automation would reduce the need for manual labour, leading to cost savings
- Improved resource allocation: By streamlining data processes, organisations can allocate resources more effectively
- Optimised IT infrastructure: A well-designed data platform can help optimise IT infrastructure costs