Job Title: Big Data Engineer
Location: Remote
Employment Type: [Full-Time/Contract]
Department: Data Engineering / Analytics

About the Role:
We are looking for a highly skilled and experienced Big Data Engineer to join our growing data team.
As a Big Data Engineer, you will be responsible for designing, developing, and optimizing scalable data pipelines and architectures that enable data-driven decision-making across the organization.
You'll work closely with data scientists, analysts, and software engineers to ensure reliable, efficient, and secure data infrastructure.

Key Responsibilities:
- Design, develop, and maintain robust and scalable data pipelines for batch and real-time processing.
- Build and optimize data architectures to support advanced analytics and machine learning workloads.
- Ingest data from various structured and unstructured sources using tools like Apache Kafka, Apache NiFi, or custom connectors.
- Develop ETL/ELT processes using tools such as Spark, Hive, Flink, Airflow, or dbt.
- Work with big data technologies such as Hadoop, Spark, HDFS, Hive, and Presto.
- Implement data quality checks, validation processes, and monitoring systems.
- Collaborate with data scientists and analysts to ensure data is accessible, accurate, and clean.
- Manage and optimize data storage solutions, including cloud-based data lakes (AWS S3, Azure Data Lake, Google Cloud Storage).
- Implement and ensure compliance with data governance, privacy, and security best practices.
- Evaluate and integrate new data tools and technologies to enhance platform capabilities.

Required Skills and Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, Information Systems, or a related field.
- 3+ years of experience in data engineering or software engineering roles with a focus on big data.
- Strong programming skills in Python, Scala, or Java.
- Proficiency with big data processing frameworks such as Apache Spark, Hadoop, or Flink.
- Experience with SQL and NoSQL databases (e.g., PostgreSQL, Cassandra, MongoDB, HBase).
- Hands-on experience with data pipeline orchestration tools like Apache Airflow, Luigi, or similar.
- Familiarity with cloud data services (AWS, GCP, or Azure), particularly services such as EMR, Databricks, BigQuery, and Glue.
- Solid understanding of data modeling, data warehousing, and performance optimization.
- Experience with CI/CD for data pipelines and infrastructure-as-code tools like Terraform or CloudFormation is a plus.

Preferred Qualifications:
- Experience working in agile development environments.
- Familiarity with containerization tools like Docker and orchestration platforms like Kubernetes.
- Knowledge of data privacy and regulatory compliance standards (e.g., GDPR, HIPAA).
- Experience with real-time data processing and streaming technologies (e.g., Kafka Streams, Spark Streaming).

Why Join Us:
- Work with a modern data stack and cutting-edge technologies.
- Be part of a data-driven culture in a fast-paced, innovative environment.
- Collaborate with talented professionals from diverse backgrounds.