Senior Data Engineer IRC173106
Job: IRC173106
Location: Poland
Designation: Senior Software Engineer
Experience: 3-5 years
Function: Engineering
Skills: Apache Spark, AWS, ETL, Java, Pipeline Execution/ETL, SQL
Remote: Yes
Description:
We are cooperating with a company developing scientifically validated software for performing rapid analyses to generate real-world evidence at scale. Their solutions are used worldwide to make meaning from real-world data and to produce meaningful results for life sciences companies.
You will work closely with our Product and Science teams to develop custom transformation logic for longitudinal data, written in Java, Python, Scala, and/or R and executed over a Spark cluster. You will also be integral to developing and enhancing our platform and its connections to Spark and other Big Data infrastructure.
Requirements:
- 4+ years of experience, or equivalent, in the position offered or a related position, including 2 years of experience designing, developing, and maintaining large-scale data ETL pipelines using Java/Scala in AWS, Hadoop, and Spark, and using Databricks to manage Apache Spark infrastructure
- Bachelor’s degree or equivalent in Computer Science, Computer Engineering, Information Systems, or a related field
- Experience working with programming languages like Java, Python, SQL, and Scala
- Experience or knowledge of building and optimizing ETL pipelines
- Experience building systems with large data sets
- Experience or working knowledge of distributed systems
- Experience translating requirements from the product and DevOps teams to technology solutions using SDLC
- Communicative English
Job Responsibilities:
- Develop transformation logic to convert disparate datasets into the Client’s proprietary format
- Work with the Science team to develop transformations in Spark SQL and UDFs executed over a Spark cluster
- Assess, develop, troubleshoot, and enhance our measurement system, which uses a combination of Java, Scala, and Python
- Work on a full-stack rapid-cycle analytic application
- Develop highly effective, performant, and scalable components capable of handling large amounts of data for over 100 million patients
- Work with the Science and Product teams to understand and assess client needs, and to ensure optimal system efficiency
- Take ownership of software development from prototyping through implementation
- Build proprietary cloud-based big data analytics for healthcare and improve core back-end & cloud-based data services
What We Offer
Exciting Projects: With clients across all industries and sectors, we offer an opportunity to work on market-defining products using the latest technologies.
Collaborative Environment: You can expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment, or even abroad in one of our global centers or client facilities!
Work-Life Balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules.
Professional Development: We develop paths suited to your individual talents through international knowledge exchanges and professional certification opportunities.
Excellent Benefits: We provide our employees with private medical care, sports facilities cards, group life insurance, travel insurance, a relocation package, food subsidies, and cultural activities.
Fun Perks: We want you to feel comfortable in your work, which is why we create a good working environment with relaxation zones, host social and team-building activities, and stock our kitchen with delicious teas and coffees!