The Google Cloud team helps companies, schools, and governments seamlessly make the switch to Google products and supports them along the way. You listen to the customer and swiftly problem-solve technical issues to show how our products can make businesses more productive, collaborative, and innovative. You work closely with a cross-functional team of web developers and systems administrators, not to mention a variety of both regional and international customers. Your relationships with customers are crucial in helping Google grow its Cloud business and helping companies around the world innovate.
As a Technical Solutions Engineer for AI Infrastructure, you will own complex customer issues and provide specialized support to other teams.
In this role, you will be part of a global team that provides 24x7 support to ensure customers can seamlessly deploy their AI and Machine Learning (ML) workloads on AI Infrastructure products. When customers encounter technical issues, you will ensure we have the expertise, tools, and processes to resolve them. You will troubleshoot technical problems with a mix of hardware and software debugging, networking, Linux system administration, coding/scripting, and updating documentation. You will contribute to our customers’ success in the AI/ML space by improving the product, internal tools, processes, and documentation. You’ll help drive business growth by recognizing and advocating for our customers’ issues related to AI deployments.
Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities:
Manage customers’ problems through diagnosis, resolution, or implementation of new investigation tools to increase productivity in resolving customer issues on AI/ML infrastructure.
Develop an understanding of AI/ML workloads and underlying hardware architectures by troubleshooting, reproducing, and determining the root cause of customer-reported issues, and by building tools for diagnosis.
Act as a consultant and subject matter expert for internal stakeholders in engineering, business, and customer organizations to resolve deployment and operational obstacles in AI infrastructure environments.
Work with multiple product and engineering teams to find ways to improve the product, and interact with our Site Reliability Engineering (SRE) teams to drive production.
Be available for non-standard work hours or shifts, which may include weekends, as needed.
Expected requirements:
Bachelor’s degree in Science, Technology, Engineering, Mathematics, or equivalent practical experience.
5 years of experience writing code in one or more general purpose programming languages (e.g., C++, Java, Python, Go).
Experience with Linux/Unix systems and with debugging issues across the hardware/software boundary on enterprise-grade server infrastructure.
Experience in troubleshooting for customer needs, and triaging technical issues across the stack (e.g., hardware faults, networking, virtualization, kernel drivers, firmware, performance).
Benefits:
sharing the costs of sports activities
private medical care
sharing the costs of foreign language classes
sharing the costs of professional training & courses
life insurance
remote work opportunities
fruits
corporate products and services at discounted prices