We can help you build an exceptional career.
SRE/Production Support Engineer IRC217092
Job: IRC217092
Location: Argentina - Buenos Aires
Designation: Senior Developer
Experience: 5-10 years
Function: Engineering
Skills: AWS, PostgreSQL, Python, REST, Software Development, Technical Documentation
Description:
We are looking for a talented and experienced SRE engineer to join a project developing a brand-new order management system for a global media and entertainment company whose portfolio spans television networks, film studios, theme parks and digital platforms.
The engineer will be part of a strong R&D group building a cutting-edge solution designed to streamline and optimize business operations. The system provides seamless order processing, inventory management and real-time tracking to ensure efficient and accurate order fulfillment.
Requirements:
- Excellent communication skills and cross-team collaboration
- Proactive approach and initiative in problem solving
- Solid knowledge of SOA and integration of distributed systems
- Solid knowledge of and experience with the SDLC
- Technical background in software development is highly desirable (Java, Python, SQL, Amazon Cloud, JS, Swift, Kotlin)
- Experience with Splunk/Datadog
Job Responsibilities:
- Provide incident management support to all teams
- Support engineering teams with triage and troubleshooting
- Act as point person for production issues, incidents, or questions
- Create runbooks and action plans for the various types of incidents and production requests
- Work with engineering teams to create processes around production requests
- Provide reporting and data on incidents and requests
- Use Jira and Confluence to track and document incidents and requests
- Take initiative and communicate well with people at all levels
- Ensure high availability and lead disaster recovery when needed
- Troubleshoot issues, provide root cause analysis and build a knowledge database for known issues and fixes
- Participate in planning and executing Maintenance, Outage Management, and Problem Management
- Continuously improve SRE tools, processes, and procedures
- Have enough technical skill to help troubleshoot and to learn and understand our systems over time
- Help create diagrams and documents (data dictionary, application flow, etc.) of the resources used for a particular business function
- Gather/recommend metrics from various monitoring tools
- Build alerts/alarms and integrate SR tooling for alert consumption and validation
- Implement a framework for root cause analysis, incident notification and runbook creation to facilitate the end-to-end production support process
- Implement the Site Reliability lifecycle of an alert through its key steps: Monitor, Alert, Analyze, Collect, Action
We Offer
Exciting Projects: Come take your place at the forefront of digital transformation! With clients across all industries and sectors, we offer an opportunity to work on market-defining products using the latest technologies.
Collaborative Environment: Expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment, or even abroad in one of our global centers or client facilities!
Work-Life Balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules. We offer you the best quality of work life so that you can exceed the expectations of our clients while achieving your professional and personal ambitions.
Professional Development: Our dedicated Learning & Development team regularly organizes English classes, professional certifications, and technical and soft-skills training. We also offer the chance to travel internationally.
Excellent Benefits: We provide our employees with competitive salaries, family medical insurance, extended paternity leave, annual performance bonuses, and referral bonuses.