Hadoop Developer

Bright Vision Technologies

United StatesRemoteFull Time

Is this role a fit for you?

Upload your resume and see this job scored against your skills, experience, pay, and preferences — no account needed.

We are seeking an experienced Hadoop Developer to design, build, and operate large-scale data processing pipelines and analytics platforms on Hadoop and related big-data ecosystems. We are seeking an experienced Hadoop Developer to design, build, and operate large-scale data processing pipelines and analytics platforms on Hadoop and related big-data ecosystems. Requirements Design, develop, and operate end-to-end big-data pipelines on Hadoop, ingesting data from a diverse mix of relational, file-based, streaming, and API-driven sources. Build robust ETL/ELT workflows using Apache Spark, Hive, Pig, and Sqoop, with strong attention to data quality, idempotency, error handling, and recoverability. Develop high-throughput streaming data pipelines using Kafka, Spark Streaming, or Flink, and integrate them with downstream analytical and operational systems. Optimize Spark and MapReduce jobs through careful tuning of partitioning, memory, serialization, and skew handling to meet demanding SLAs at minimal cost. Design and maintain data models and storage layouts on HDFS, Hive, HBase, and modern lakehouse formats (Parquet, ORC, Delta, Iceberg, Hudi) to balance flexibility and performance. Implement data governance, lineage, and quality controls in collaboration with data governance and security teams. Build robust monitoring, alerting, and logging strategies for big-data pipelines, including job-level SLAs and proactive failure detection. Partner with data scientists and analysts to deliver curated, reliable, and well-documented datasets that accelerate their work. Automate pipeline orchestration using Airflow, Oozie, or similar workflow engines, with clean dependency management and clear ownership boundaries. Continuously evaluate and adopt new technologies in the big-data and cloud ecosystem (EMR, Databricks, Snowflake, BigQuery) where they offer meaningful improvements. Lead performance reviews and architecture audits of existing pipelines, proposing concrete refactoring and optimization initiatives. Document data architectures, schemas, pipeline behaviors, and operational runbooks in a way that makes the platform supportable as the team scales. Mentor junior engineers and contribute to the team’s engineering standards and best practices. Benefits Competitive base salary commensurate with experience, plus benefits. Originally posted on Himalayas

Similar roles