Hello, I'm Salayhin

Data Architect & Engineering Leader shaping the modern data stack

Data Architect and Data Engineering Leader with 14+ years of experience building scalable data platforms on GCP and AWS. With a strong foundation as a Backend Engineer, I specialize in lakehouse architectures, dimensional modeling, and FinOps — transforming complex data into business value.

Md Sirajus Salayhin

About Me

My Journey in Data Engineering

I am a Data Architect, Backend Engineer, and Data Engineering Leader with over 14 years of experience building robust data platforms, implementing lakehouse architectures, and delivering scalable solutions on GCP and AWS. My expertise lies in transforming complex datasets into valuable business insights through innovative engineering and architectural practices.

With a strong background in data warehousing, cloud platforms, backend systems, and big data technologies, I specialize in designing and implementing data solutions that drive measurable business growth. I am proficient in modern data stack technologies and passionate about applying FinOps and data governance strategies to optimize performance and efficiency.

My approach combines technical excellence with business acumen, enabling me to deliver solutions that are both technically sound and commercially impactful. I have led cross-functional initiatives that improved query performance, streamlined data operations, and significantly reduced costs, while mentoring teams to adopt best practices in data engineering and architecture.

My Resume

Data Architect & Backend Engineer

Experience

Staff Data Engineer

Zeals Co., Ltd

Tokyo, Japan Nov 2023 - Present
Directed the design and implementation of a modern data lakehouse and analytics platform, increasing data accessibility and accelerating decision-making
Orchestrated organization-wide adoption of dbt Core, establishing modeling/testing standards and CI/CD integration that improved data quality 30% and reduced pipeline maintenance 40%
Built BigQuery cost monitoring (system tables + Billing Export) to track spend by query/team and flag hotspots; used those insights to prioritize query optimization, cutting monthly costs 40%
Engineered the gold layer with dimensional modeling (star schema), transforming raw events into optimized dim_ and fact_ tables, improving query performance 70% and reducing scan volume 80%
Delivered business-specific mart_ tables for self-service analytics, cutting BigQuery costs 90% via strategic partitioning, clustering, and materialization
Optimized high-cost queries (restructuring joins, pruning scans, refactoring UDFs), reducing runtime 90%, bytes scanned 80%, and per-query costs up to 95%
Instituted data lineage and a centralized data dictionary via dbt documentation/metadata, enabling clear traceability from source to marts and boosting stakeholder trust and self-service adoption
Spearheaded organization-wide data governance and quality practices (contracts, tests, SLAs), increasing data integrity and accuracy 25%
Built a Cloud Asset Inventory–based utilization monitor across GCP components with automated alerts and rightsizing policies
Partnered with ML teams to build automated pipelines for feature engineering and batch inference using BigQuery, GCS, Dataproc, and Dataflow—cutting model iteration time 30%
Mentored junior engineers, driving a 20% uplift in technical proficiency and fostering a culture of continuous learning
Improved cross-functional collaboration (engineering, data science, product), increasing project delivery efficiency 15%
Implemented GDPR-aligned security controls (access policies, encryption, retention), reducing breach risk 20% and ensuring regulatory compliance
BigQuery dbt Core Python Airflow

Senior Big Data Engineer

Zeals Co., Ltd

Tokyo, Japan Nov 2022 - Oct 2023
Implemented data governance using Google Dataplex, improving data management efficiency by 20% and ensuring compliance with internal and external standards
Deployed CI/CD pipelines with Terraform, automating data workflows, reducing manual intervention by 30%, and increasing infrastructure reliability
Designed and rolled out a FinOps pipeline using GCP billing exports, reducing infrastructure costs by 15% through proactive cost monitoring
Built a semantic analytics layer for a customer-facing dashboard, improving data access speed by 20% and enhancing overall user experience
Integrated BI tools (e.g., Tableau) with the Data Lakehouse Gold layer, enabling real-time insights and improving decision-making for business stakeholders by 25%
Dataplex Terraform GCP FinOps

Senior Big Data Engineer (Remote)

Zeals Co., Ltd

Tokyo, Japan May 2021 - Oct 2022
Developed a robust data lakehouse architecture on GCP using BigQuery and Cloud Storage, improving data scalability and accessibility for internal stakeholders
Partnered with R&D and analytics teams to support ML model development and BI dashboarding, driving a 20% improvement in data-driven decision-making
Audited and remediated data storage systems in collaboration with the security team, strengthening data security protocols
BigQuery Cloud Storage GCP ML Pipelines

Senior Data Engineer Level 2

Pathao Ltd

Dhaka, Bangladesh Jun 2018 - May 2021
Established a scalable analytics ecosystem using Google Dataproc, Google Dataflow, BigQuery, and Data Studio, increasing analytics capabilities by 40%
Spearheaded the development of a data lake and warehouse in Google Cloud Storage and BigQuery, reducing data retrieval times by 50%
Enhanced the data pipeline framework to handle over 1,000 attributes, boosting data processing efficiency by 35%
Designed and implemented a scheduled reporting framework on the data warehouse, generating 10+ financial and operational reports, improving team productivity by 70%
Dataproc BigQuery Dataflow Python

Senior Data Engineer

Augmedix Bangladesh Ltd.

Dhaka, Bangladesh Jun 2015 - Mar 2018
Served as a core contributor in designing and building a data warehouse using SQL Server 2016, improving business decision-making speed by 25%
Integrated a data pipeline framework to ingest data from third-party APIs, Google Sheets, and MySQL, processing 100+ attributes and improving data efficiency by 30%
Developed a reporting framework for the data warehouse, generating 20+ operational reports and increasing stakeholder access to decision-ready data by 35%
Built backend services to serve millions of records from the warehouse, reducing data retrieval latency by 20%
SQL Server MySQL Data Pipeline APIs

Software Engineer

Nascenia Ltd

Dhaka, Bangladesh Jun 2013 - May 2015
Deployed a B2B marketplace platform for farmers using Ruby on Rails and MySQL, increasing seller participation by 30% and boosting transactions by 25%
Built a web-based tool for social media analytics and post scheduling with Ruby on Rails and PostgreSQL, improving engagement by 40%
Enhanced a school management system by extending core features using Ruby on Rails and PostgreSQL, improving system efficiency and reducing administrative workload by 35%
Ruby on Rails MySQL PostgreSQL Web Development

Software Engineer

Right Brain Solution Ltd

Dhaka, Bangladesh Nov 2011 - May 2013
Developed and deployed an API for social media mobile apps in collaboration with the dev team, increasing user engagement by 20%
Co-developed a high-traffic automotive portal for the Toronto Star, boosting monthly user visits by 35% and improving customer engagement
Built a multilingual B2C ski rental shop, increasing rentals by 20% and expanding the international customer base by 15%
Optimized an OMR reader for university online exams, reducing manual data entry time by 95% through improved input handling
API Development Web Applications Mobile Apps OMR Systems

Projects

Modern Data Lakehouse & Analytics Platform

Staff Data Engineer at Zeals Co., Ltd

Designed and implemented a cloud-native lakehouse, improving data accessibility, governance, and decision-making across the organization.

GCP Cloud Composer Dataproc Pub/Sub BigQuery Apache Iceberg Apache Spark dbt Python Scala GitHub Actions

Dimensional Modeling & Gold Layer Engineering

Staff Data Engineer at Zeals Co., Ltd

Architected the Gold Layer with star-schema modeling, delivering business-ready fact and dimension tables that cut scan volume 80% and boosted query performance 70%.

BigQuery Cloud Composer dbt Core Dimensional Modeling Data Warehouse GitHub Actions

FinOps Cost Optimization Pipeline

Staff Data Engineer at Zeals Co., Ltd

Built a BigQuery and GCP Billing–based cost monitoring system, reducing monthly query costs by 40% and per-query costs by up to 95%.

GCP Billing BigQuery Cloud Monitoring

Data Governance & Quality Framework

Staff Data Engineer at Zeals Co., Ltd

Established dbt-driven data lineage, testing, and governance practices, improving data accuracy by 25% and boosting stakeholder trust.

dbt GCP Data Catalog Data Lineage Data Quality Data Contracts Cloud Asset Management Great Expectations

Skills & Technologies

Proficient
Intermediate
Exploring
Familiar

Programming & Scripting

Python SQL Scala Go Ruby PHP Bash

Data Processing & Management

Apache Spark Apache Flink Apache Kafka Presto Apache Iceberg dbt Core

Database Technologies

PostgreSQL SQL Server MySQL DuckDB MongoDB

Cloud Platforms

BigQuery Dataproc Dataflow Cloud Composer Pub/Sub Cloud Storage Athena EMR DynamoDB S3 Lambda

Infrastructure & Automation

Docker Kubernetes Terraform GitHub Actions CI/CD

BI & Analytics

Tableau Google Data Studio Redash

Data Governance & Quality

Data Stewardship Data Cataloging dbt Lineage & Tests Great Expectations GDPR/HIPAA

FinOps & Cost Optimization

Query Tuning Partitioning & Clustering GCP Billing & Monitoring Anomaly Detection Cost Optimization

Certifications

Data Engineering with Google Cloud Specialization

Coursera

Google Cloud Training

2023

Comprehensive specialization covering data engineering fundamentals on Google Cloud Platform, including BigQuery, Dataflow, Cloud Storage, and data pipeline design patterns.


Data Engineering, Big Data, and Machine Learning on GCP Specialization

Coursera

Google Cloud Training

2023

Advanced specialization in big data and machine learning on Google Cloud Platform, covering Dataproc, Dataflow, BigQuery ML, and scalable data processing architectures.


Fundamentals of Agents

Hugging Face 🤗

Hugging Face Instructors

February 27, 2025

Successfully completed Unit 1: Foundations of Agents in the Hugging Face Agents Course, covering fundamental concepts of AI agents and their implementation.


Implement Multimodal Vector Search with BigQuery

Google Cloud Provider

Machine Learning & AI

April 20, 2025

Demonstrates skills in implementing multimodal vector search using BigQuery for advanced AI and machine learning applications with vector embeddings.


Get In Touch

Let's Work Together

I'm always interested in discussing new opportunities, challenging projects, and innovative data solutions. Feel free to reach out if you'd like to collaborate or just want to connect!

salayhin.lab@gmail.com
+81-80-6918-4753
Tokyo, Japan
