We are looking for a Big Data Site Reliability Engineer with hands-on experience in Hadoop or Kafka. This role requires a strong development background combined with platform exposure to ensure the stability, performance, and scalability of our big data systems.
Responsibilities:
Maintain and optimize Hadoop/Kafka platforms to ensure high availability and reliability.
Work on platform engineering tasks, including deployment, scaling, and performance tuning.
Collaborate with development teams to enhance platform efficiency and resolve issues.
Automate operational tasks to improve system reliability and reduce manual effort.
Monitor system health and performance, implementing proactive solutions to minimize downtime.
Requirements:
Hands-on experience in Hadoop or Kafka with a strong development background.
Experience managing big data platforms with exposure to infrastructure and operations.
Strong troubleshooting and problem-solving skills in large-scale distributed systems.
Knowledge of automation and monitoring tools to improve platform reliability.