Opportunity for MLOps Engineer: Serve a wav2vec Speech Recognition Model through Triton Server
Budget: $250-750 USD
We are looking for a talented MLOps engineer to work on a challenging speech recognition project. The project has a tight deadline of 5 days. The tasks involved are:
Convert the wav2vec2 model from the Hugging Face repository lgris/wav2vec2-large-xlsr-open-brazilian-portuguese-v2 to ONNX and then to TensorRT
Evaluate the WER of the TensorRT model against the original Hugging Face model
Create a Dockerfile with the Triton server configured with an endpoint to consume the model in TensorRT
Create a Dockerfile with a Python server using [login to view URL] to send audio to the Triton server for inference
Create a Docker Compose file with the three services communicating with each other and ready for testing
Compare the inference times of the PyTorch model served directly from Python, the TensorRT model served directly from Python, and the TensorRT model served through the Triton server
Evaluate the latency of the communication between the Python server and the Triton server
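For the WER evaluation step above, a minimal sketch of the metric itself may help bidders scope the work. This is a plain word-level Levenshtein distance; in practice a package such as jiwer provides the same computation, and the transcripts would come from the Hugging Face and TensorRT pipelines respectively.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

The same function applied to both models' transcripts on a held-out Brazilian Portuguese test set gives the comparison asked for above.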
The goal is to run inference on audio captured from the user's microphone in the browser, sent over [login to view URL] to the Python server and relayed from there to the Triton server, so that the system can handle multiple concurrent requests from different users.
The Python server must maintain a session for each user, so that the transcription of the streamed audio is returned to the user who sent it.
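The per-user session requirement can be sketched as follows. The SessionRouter class and its method names are hypothetical, not part of any library; the real server would wrap something like this around a websocket handler and a Triton client, tagging each request with the session id so results route back to the right user.

```python
import uuid
from collections import deque

class SessionRouter:
    """Maps each connected user to a private result queue so that
    transcriptions are delivered only to the user who sent the audio."""

    def __init__(self):
        self._queues = {}

    def open_session(self) -> str:
        # Called when a websocket connection is established.
        sid = uuid.uuid4().hex
        self._queues[sid] = deque()
        return sid

    def deliver(self, sid: str, transcription: str) -> None:
        # Called when Triton returns a result tagged with this session id.
        self._queues[sid].append(transcription)

    def next_result(self, sid: str):
        # Polled by the websocket handler to push results to the client.
        q = self._queues[sid]
        return q.popleft() if q else None

    def close_session(self, sid: str) -> None:
        self._queues.pop(sid, None)
```

The design keeps one queue per session so concurrent users never see each other's transcriptions, which is exactly the isolation the brief asks for.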
If you have the skills and experience to tackle this project, we would love to hear from you. Please apply with your portfolio and relevant experience. Time is of the essence, so apply as soon as possible.
14 freelancers have bid an average of $650 for this job
Hello. As a professional NLP engineer, I have strong knowledge and rich experience with Python, PyTorch, TensorFlow, NLP, chatbots, OpenAI ChatGPT, fine-tuning OpenAI API models, and ASR (Automatic Speech Recognition) using …