Chief AI Infrastructure Specialist

Capital One, Crewe, VA
April 13, 2024

Job Summary:

If you are interested in building the foundation for our AI capabilities, Capital One is looking for a Lead Generative AI Infrastructure Engineer to join our team. In this role, you will work on several initiatives, from building large-scale distributed training clusters to deploying large language models (LLMs) on GPU instances that serve real-time applications. You will also contribute to the rollout of advanced AI research and development by working with our cloud and container infrastructure teams and AI researchers.

Job Duties and Responsibilities:

  • Deploy distributed training clusters in the public cloud, with storage and networking stacks optimized for efficient large-scale computation.
  • Develop infrastructure that distributes computation across multiple nodes and restarts only the failed nodes, leaving the rest of the process intact, to maximize uptime during training.
  • Implement and refine real-time infrastructure that delivers optimal performance in the public cloud.
  • Create a strong framework for deploying search indexes and embeddings in vector databases, allowing for streamlined search processes and increased productivity.
  • Take part in the design and execution of crucial features by partnering with cloud and container infrastructure teams as well as AI researchers.

Qualifications and Experience:

  • A Bachelor's degree in a technical field such as Computer Science or Computer Engineering.
  • At least six years of experience building data-intensive solutions that use distributed computing.
  • At least six years of coding experience in Python, Go, Scala, or Java.
  • At least one year of experience with High-Performance Computing (HPC), vector embedding, or semantic search technologies.
  • At least one year of experience building, scaling, and optimizing training or inference systems for deep neural networks.

Preferred Qualifications:

  • A Master's or Doctoral degree in Computer Science, Electrical Engineering, Computer Engineering, Mathematics, or a comparable field.
  • Proficiency in machine learning, with extensive experience training and deploying deep neural networks and transformer models in large-scale production environments.
  • Hands-on experience with modern machine learning frameworks such as TensorFlow, PyTorch, Lightning, or Mosaic ML.
  • Ability to excel in a dynamic environment with ambiguity and competing priorities and deadlines.
  • Experience at startups or at companies that prioritize their products and technology.
  • Experience deploying advanced neural networks in challenging production scenarios.
  • Experience building GPU clusters in the public cloud with tightly coupled storage and networking.

Benefits of the Position:

  • Performance-based incentive compensation, such as cash bonuses and long-term incentives, that rewards achieving clearly defined goals.
  • A comprehensive benefits package, covering health, financial, and other benefits, that supports employees' total well-being.
  • A variety of career paths that allow for personal and professional growth.
  • An inclusive, supportive workplace that nurtures personal and professional development.

About Company:

Capital One is focused on developing AI systems that are built with integrity, behave consistently, and remain under human oversight, all while transforming the banking industry. Our objective is to use machine learning and AI to deliver intelligent, automated customer interactions in real time. We invite you to join us in shaping how we serve our customers and businesses and in preparing for the future of banking.

