Job Description
We are seeking a highly skilled and experienced Spark and Scala Developer to join our team. In this role, you will design, develop, and maintain large-scale data processing applications using Apache Spark and Scala. Your expertise in Spark and Scala will be crucial to implementing efficient, scalable solutions for big data analytics.
Roles and Responsibilities:
- Design and develop Spark-based data processing applications using Scala as the primary programming language.
- Collaborate with data scientists and business stakeholders to understand their requirements and translate them into technical solutions.
- Optimize and tune Spark applications for performance and scalability, ensuring efficient data processing across large datasets.
- Implement data ingestion processes, including data extraction, transformation, and loading (ETL) from various sources into Spark.
- Develop and maintain data pipelines to process, cleanse, and transform structured and unstructured data.
- Perform data analysis and provide insights to support business decision-making.
- Work with the infrastructure team to deploy Spark applications on distributed clusters and monitor their performance.
- Conduct code reviews, identify areas for improvement, and suggest best practices to ensure high-quality code development.
- Troubleshoot and debug issues in Spark applications and provide timely resolutions.
- Stay updated with the latest trends and advancements in Spark and Scala technologies and proactively suggest innovative solutions.
Required Skills:
- Minimum of 3 years of hands-on experience in software development, with a focus on Spark and Scala.
- Strong proficiency in Scala programming language, including functional programming concepts.
- Extensive experience in Apache Spark and its ecosystem (Spark SQL, Spark Streaming, Spark MLlib, etc.).
- Proficiency in working with distributed computing frameworks and cluster computing technologies.
- Solid understanding of data processing concepts and techniques, including batch and real-time processing.
- Experience with data serialization formats such as Avro, Parquet, or ORC.
- Knowledge of SQL and experience in working with relational databases and data warehouses.
- Familiarity with version control systems (e.g., Git) and continuous integration/continuous delivery (CI/CD) pipelines.
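As a flavor of the functional Scala style this role calls for, here is a minimal, stdlib-only sketch of an immutable aggregation; the `Event` case class and its fields are hypothetical examples, not part of the posting, and the same shape of transformation is what Spark performs at cluster scale:

```scala
// Illustrative only: hypothetical event data used to show the functional style
// (immutable values, case classes, fold) expected in day-to-day Spark work.
case class Event(user: String, bytes: Long)

val events = List(
  Event("a", 100L),
  Event("b", 250L),
  Event("a", 50L)
)

// Aggregate bytes per user with a fold over an immutable Map -- the same
// shape of computation Spark's reduceByKey/groupBy performs on large datasets.
val bytesPerUser: Map[String, Long] =
  events.foldLeft(Map.empty[String, Long]) { (acc, e) =>
    acc.updated(e.user, acc.getOrElse(e.user, 0L) + e.bytes)
  }
```

In a real Spark application the same logic would run over a distributed `Dataset[Event]` rather than a local `List`, but the functional idioms carry over directly.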
Notes: If you are interested in this job, please click the [Apply the job @Company’s site] button below to go directly to the company’s site.
Job Features
Job Category: Information Technology
Date: Jul. 21, 2023
Job ID: 50702141