Staff Software Engineer, Platform App GenAI

ServiceNow

2024-11-08 02:42:26

Job location: Santa Clara, California, United States

Job type: Full-time

Job industry: I.T. & Communications

Job description

Key Responsibilities:

Develop and maintain APIs using NVIDIA Triton Inference Server for scalable deployment of large language models (LLMs).

Implement and optimize pre-processing and post-processing pipelines tailored for LLMs to improve accuracy and efficiency.

Work with Retrieval-Augmented Generation (RAG) frameworks to enhance the model's response generation capabilities.

Collaborate with data scientists, software engineers, and product teams to integrate and deploy ML solutions into production.

Troubleshoot and resolve issues related to model inference, performance, and scalability.
