
Closed
Posted
Paid on delivery
**This needs to be an extremely fast turnaround.**

We're building a screenless voice communication device for kids, built on ESP32. We have a working prototype that can make and receive calls. We're looking for an embedded/firmware developer to help push the prototype further.

What's working
- ESP32-based device making real calls via SIP
- Basic audio in/out
- Core call flow functional

What we need help with
- Latency: end-to-end delay is too high. Get it under 150ms. Start with OPUS codec and jitter buffer tuning.
- Echo: calls have echo. Implement AEC (look at esp-sr, Espressif's audio front-end component for ESP-IDF).
- Contact storage: need to store a small list of names and numbers on-device via a config file (SPIFFS or NVS), with an on-device UI to scroll and select.
- ESP32-to-ESP32 calling: alongside calling real numbers, add support for device-to-device calls over SIP across different WiFi networks. This means setting up a SIP proxy (Kamailio) on a VPS and registering each device to it.

Stack
- ESP32 (ESP-IDF)
- SIP / VoIP

Ideal candidate
- Strong ESP32 / ESP-IDF experience
- Comfortable with SIP/VoIP
- Has done audio work on embedded hardware before
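To make the 150ms latency target concrete, the end-to-end delay can be treated as a sum of per-stage contributions. The sketch below is only an illustration: the stage names, the `latency_budget_t` type, and both function names are hypothetical, and any numbers plugged in are assumptions rather than measurements from the prototype.

```c
#include <assert.h>
#include <stddef.h>

/* One-way mouth-to-ear delay, split into stages (all values in ms).
 * The stage list is an assumption; a real pipeline may have more stages. */
typedef struct {
    double capture_buffer_ms;   /* mic-side I2S/DMA buffering */
    double codec_frame_ms;      /* codec frame duration, e.g. 10 or 20 */
    double codec_lookahead_ms;  /* encoder look-ahead */
    double network_one_way_ms;  /* device -> proxy/relay -> device transit */
    double jitter_buffer_ms;    /* receiver-side jitter buffer depth */
    double playback_buffer_ms;  /* speaker-side I2S/DMA buffering */
} latency_budget_t;

/* Sum all stages into a single end-to-end estimate. */
double latency_total_ms(const latency_budget_t *b)
{
    assert(b != NULL);
    return b->capture_buffer_ms + b->codec_frame_ms + b->codec_lookahead_ms +
           b->network_one_way_ms + b->jitter_buffer_ms + b->playback_buffer_ms;
}

/* Returns 1 if the estimate fits within target_ms, otherwise 0. */
int latency_within_target(const latency_budget_t *b, double target_ms)
{
    return latency_total_ms(b) <= target_ms ? 1 : 0;
}
```

With assumed values of 10ms capture, a 20ms frame, 6.5ms look-ahead, 40ms network transit, a 40ms jitter buffer, and 10ms playback, the total is 126.5ms, which fits the target. Doubling the jitter buffer to 80ms pushes the same budget to 166.5ms, which is why the jitter buffer is a reasonable first thing to tune.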
Project ID: 40397475
59 proposals
Remote project
Active 24 hours ago
59 freelancers are bidding on average $545 USD for this job

With 10+ years of experience under our belt, ZAWN Tech is well-versed in the realm of embedded systems and VoIP development, making us your one-stop solution for this project. Our expertise aligns perfectly with the needs you've highlighted: strong knowledge of ESP32/ESP-IDF, comfort working with SIP/VoIP, and extensive experience in audio implementations on embedded hardware. We understand the urgency for a fast turnaround, and I commit to delivering efficient results without compromising quality. What sets us apart is our ability to combine technical know-how with real-world problem-solving skills that ensure fast, secure and scalable solutions. We'll prioritize tackling the latency issue for an improved end-user experience; by tweaking the OPUS codec and jitter buffer, we'll get that end-to-end delay under your target of 150ms. Managing contact storage effectively is also crucial for the device's usability. I propose utilizing SPIFFS or NVS for storing name and number lists on-device along with a user-friendly UI to scroll and select contacts effortlessly.
$750 USD in 7 days
8.9

Hi there, I will optimize your ESP-IDF audio pipeline with OPUS and esp-sr AEC to deliver sub-150ms echo-free SIP calls for your device.

--Why me?--
- Expertise: I specialize in low-latency embedded audio processing and deploying Kamailio SIP proxies for custom ESP32 VoIP networks.
- Hardware & Compliance: I already have the required development hardware on hand to start immediately and guarantee EMI certification-ready design architectures.
- Availability & Support: I align with your timezone, respond in under 5 minutes, and provide 2 weeks of free post-project support to ensure seamless integration.
- Documentation: You will receive complete manufacturing files, an optimized LCSC BoM, and clean source code ready for immediate production.

--How I will handle your task--
- Milestone 1: Kamailio SIP proxy deployment and NVS contact storage setup
- Milestone 2: OPUS codec integration and jitter buffer latency tuning
- Milestone 3: Acoustic echo cancellation implementation and cross-network testing

Regards, Majeed
$500 USD in 7 days
6.4

I am an embedded systems engineer with over 16 years of experience. I have worked with VoIP on embedded platforms and have implemented a similar system, though not on the ESP32. I can develop the firmware for you and a full working prototype as well. Please contact me to discuss details.
$1,000 USD in 14 days
5.7

Hi, I'm excited to propose my help for your project to build a screenless voice communication device for kids using an ESP32. With over a decade of experience in the field, I have the skills and expertise needed to assist with the enhancements you require. The current prototype is functioning well, making and receiving calls via SIP, but we need to address two critical issues: latency and echo. My plan is to start by tuning the OPUS codec and jitter buffer to reduce end-to-end delay below 150ms. I will then implement AEC (echo cancellation) using esp-sr from the ESP-IDF library to eliminate any echoes during calls. For contact storage, I'll develop a feature that allows storing a small list of names and numbers on-device via SPIFFS or NVS. Additionally, I’ll create an on-device UI for easy scrolling and selection of contacts. To support peer-to-peer calling over different WiFi networks, I will set up a SIP proxy (Kamailio) on a VPS and register each device to it. This will enable ESP32-to-ESP32 calls alongside traditional call functionalities. I have extensive experience with ESP32/ESP-IDF, SIP/VoIP, and embedded audio work, ensuring that I can deliver the project efficiently within your tight timeframe. For more details on my previous projects, you can check out [my portfolio here](https://www.freelancer.com/u/reedsystems). Thank you for considering my proposal.
$550 USD in 10 days
5.9

⭐⭐⭐⭐⭐ ✅Hi there, hope you are doing well! I recently worked on an ESP32-based voice communication device where I optimized SIP call latency and audio processing, achieving seamless real-time calls with low delay. From my experience, tuning the audio codec and jitter buffer is crucial to minimizing latency and ensuring call quality. ⭕ I will start by profiling and tuning the OPUS codec and jitter buffer settings, implement echo cancellation using ESP-IDF's esp-sr library, develop an efficient on-device contact storage solution using SPIFFS or NVS, create a simple UI for contact navigation, and set up a Kamailio SIP proxy for device-to-device SIP calls across WiFi networks. ❓ What is the current average latency measured from end-to-end? ❓ Could you share more about the existing contact UI implementation? ❓ Are there any specific latency threshold requirements besides sub-150ms? I am confident my embedded audio processing and SIP expertise will push your prototype swiftly towards a robust final product. Best regards, Nam
$550 USD in 5 days
5.4

Hi! Your ESP32 architecture is sound. SIP voice calls on that platform are absolutely achievable, and the fact that you have a working prototype validates the core approach. The remaining challenges aren't feasibility problems, they're optimization problems, and the hardest one isn't on your radar yet: Acoustic Echo Cancellation is tied directly to the physical geometry of your enclosure. The moment the housing changes, the speaker-to-mic coupling changes, and the AEC model's assumptions break down. On a kids' device especially, where children hold things unpredictably and use them in reverberant spaces like bathrooms and cars, getting echo cancellation to actually work (not just compile) requires acoustic characterization of your final enclosure design. That's the piece that separates a good demo from a product that works in someone's home for six months.

I've worked through this exact class of problem on an ESP32-based intercom system where I2S pipeline management, SIP re-registration state machines, and jitter buffer tuning all had to coexist on a constrained dual-core platform without starving the WiFi stack.

Two questions worth aligning on before we start: Is your current enclosure design finalized, or is the housing still in flux? And when you say under 150ms end-to-end, have you measured where your current latency is coming from: codec delay, jitter buffer depth, or network propagation to your VPS?

We can discuss it in more detail in chat. Thanks
$500 USD in 5 days
5.5

Hi, how are you doing? I went through your project description and can help with your project; its requirements match my expertise well. We are a team of electrical and electronics engineers and have successfully completed 1000+ projects for regular clients from Oman, the UK, the USA, Australia, Canada, France, Germany, Lebanon and many other countries.

We provide services in the following areas:
- Antenna design (CST, HFSS)
- Embedded C programming
- VHDL/Verilog, Quartus/Vivado
- LabVIEW, Multisim, PSPICE, VLSI
- MATLAB/Simulink
- Network simulators NS2/NS3
- Microcontrollers and boards such as Arduino, Raspberry Pi, FPGA, AVR, PIC, STM32 and ESP32
- IDEs such as Keil MDK V5, Atmel Studio and MPLAB XC8
- PLCs / SCADA
- PCB design in Proteus, Eagle, KiCad and Altium
- IoT technologies such as Ethernet and GSM/GPRS
- HTTP RESTful API connections for IoT communications

We also have a good command of report writing, and I can show you many samples of our previous reports. Kindly consider us for your project and message me so that we can discuss your project's main goals and requirements in detail.
$500 USD in 7 days
5.5

Toriqul Global Solutions is a trusted web design and development company specializing in modern, high-performance, and user-friendly digital solutions. Founded by Engineer Md. Toriqul Islam, a Computer Science & Engineering graduate from RUET, we bring over 10+ years of industry experience in creating scalable, visually stunning, and business-focused websites. Our Expertise We provide complete full-stack web and mobile app development services with modern technologies, including: HTML5, CSS3, Bootstrap, JavaScript, jQuery, React JS, Angular JS, Node JS, PHP, Laravel, WordPress, .NET, Python, Ruby on Rails, MySQL, MongoDB, React Native, and more. Why Choose Us? ✔ Modern, clean, conversion-focused designs ✔ Fully responsive across all devices ✔ Scalable, secure, and optimized development ✔ Clean and maintainable code structure ✔ On-time delivery with strong commitment ✔ Professional communication & support ✔ 100% Client Satisfaction Priority We have successfully delivered projects for clients across multiple industries with excellent feedback and long-term relationships. Let’s build something exceptional together. Contact us today to turn your ideas into reality. Best Regards Toriqul Global Solutions
$250 USD in 7 days
5.2

With over a decade of experience as a full stack developer, particularly specialising in areas like embedded systems and firmware, I seem to be an ideal choice for resolving the challenges faced in your project. With my structured end-to-end approach towards software solutions, noteworthy experience with ESP32 and ESP-IDF as well as expertise in VoIP/SIP building applications, I believe that I am capable enough to tackle the specific roadblocks you have mentioned. Having worked on audio enhancement projects on embedded hardware, I can effectively implement the AEC (esp-sr) to tackle echo concerns. Furthermore, with an understanding of the significance of efficient latency rates and clean code for a smooth voice communication device, I will zealously work towards reducing end-to-end delay to millisecond scales. Additionally, my familiarity with storing config files via SPIFFS or NVS will facilitate hassle-free integration of contact storage system while my proficiency in setting up Kamailio SIP proxy on a VPS will ensure that calls/devices communicate smoothly across networks.
$250 USD in 7 days
5.2

Hi, I’m an embedded firmware engineer with 8+ years of experience on ESP32/ESP-IDF and VoIP systems, having delivered 20+ production devices with real-time audio pipelines and reducing end-to-end latency by up to 60% in prior SIP-based projects. I’ve implemented OPUS pipelines, AEC/NS/AGC chains, and optimized jitter buffers on constrained hardware, achieving sub-120ms latency and stable full-duplex audio on ESP32-class MCUs.

Approach
✅ I will profile the current audio pipeline (I2S, codec, network stack) and reduce latency via OPUS parameter tuning (bitrate/frame size), adaptive jitter buffer, and FreeRTOS task prioritization.
✅ I will integrate AEC using ESP-SR with proper mic/speaker calibration, echo path modeling, and double-talk detection to eliminate feedback.
✅ I will implement contact storage using NVS/SPIFFS with a lightweight indexed structure and design a minimal button-driven UI state machine for navigation and selection.
✅ I will configure SIP device-to-device calling via a Kamailio proxy on VPS, including NAT traversal (STUN/TURN), registration handling, and secure session setup.

Questions
✅ I need details on current latency and audio pipeline setup.
✅ I want to confirm OPUS configuration (bitrate, frame size).
✅ I need info on available hardware controls for UI.

Best, Yaroslav
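The button-driven scroll/select logic mentioned in this proposal could look roughly like the sketch below. It is not the bidder's design: the three-button layout, the type names, and `contact_menu_handle` are all hypothetical, and the contact table is assumed to be already loaded from NVS or SPIFFS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical button events; the real hardware controls are unknown. */
typedef enum { BTN_UP, BTN_DOWN, BTN_SELECT } button_t;

typedef struct {
    const char *name;
    const char *number;
} contact_t;

typedef struct {
    const contact_t *contacts; /* table loaded elsewhere (NVS/SPIFFS) */
    size_t count;              /* number of entries in the table */
    size_t cursor;             /* index of the highlighted entry */
} contact_menu_t;

/* Advance the menu by one button event.
 * Returns the chosen contact on BTN_SELECT, otherwise NULL.
 * Scrolling wraps around at both ends of the list. */
const contact_t *contact_menu_handle(contact_menu_t *m, button_t btn)
{
    assert(m != NULL);
    if (m->count == 0) {
        return NULL; /* nothing to scroll or select */
    }
    switch (btn) {
    case BTN_UP:
        m->cursor = (m->cursor + m->count - 1) % m->count;
        break;
    case BTN_DOWN:
        m->cursor = (m->cursor + 1) % m->count;
        break;
    case BTN_SELECT:
        return &m->contacts[m->cursor];
    }
    return NULL;
}
```

Because the device is screenless, the caller would presumably announce `contacts[cursor].name` with an audio prompt after each scroll event and start the SIP call when a contact is returned.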
$650 USD in 7 days
5.0

With over 9 years of experience in mobile app development, I assure you that my team and I are the best fit for this firmware development job. Although we specialize in Android and iOS, my team and I have significant hands-on experience with ESP32 and ESP-IDF, which are integral to your project. We have a track record of turning ideas into reality, and we have worked on many projects like yours to improve latency and audio on embedded hardware. Moreover, we are comfortable working with SIP/VoIP, as well as setting up SIP proxies on a VPS. This experience matches your requirement to enable device-to-device calls over SIP across different WiFi networks. Apart from these technical skills, our cost-effective approach, combined with free support for the first three months, makes us even more attractive for this project. In conclusion, choosing us means partnering with seasoned professionals who have a successful track record in both embedded systems and mobile app development. Rest assured that by choosing us, you're getting not only an IT service but also a solution partner who will take your voice communication device for kids from prototype to a top-notch product. Thank you for considering my profile!
$500 USD in 7 days
5.2

⭐⭐⭐⭐⭐ I’ve worked on ESP32 audio pipelines and SIP stacks where latency and echo had to be pushed down to usable levels; this is exactly the stage where careful tuning makes a huge difference.

How I’d tackle it:

1) Latency (<150 ms target)
• Switch to OPUS low-delay config (10–20 ms frames)
• Reduce jitter buffer (adaptive, tight bounds)
• Optimize I2S + DMA buffering (avoid over-queuing)
• Profile end-to-end path (mic → encode → RTP → decode → speaker)

2) Echo (critical for usability)
• Integrate AEC from ESP-SR (tuned for your mic/speaker layout)
• Control gain + add basic AGC to prevent feedback loops
• Validate with real acoustic conditions (not just bench)

3) Contact storage + UI
• NVS or SPIFFS JSON config (lightweight, editable)
• Simple scroll/select state machine (button-driven UI)

4) ESP32 ↔ ESP32 calling (SIP)
• Set up Kamailio on VPS (registration + routing)
• Each device registers as SIP client
• Direct RTP between devices (minimize relay latency)

Timeline: Fast. I can start improving latency/echo within 1–2 days of access.

Quick questions:
• Which ESP32 variant? (S3, WROOM, etc.)
• Current audio path (I2S mic + DAC or codec IC?)
• Frame size / codec settings you’re using now?

If you want this to feel “instant” and kid-friendly, getting latency + echo right will make all the difference, and I can help you push it there quickly.
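One way to read "adaptive, tight bounds" for the jitter buffer is to track interarrival jitter with the running estimator from RFC 3550 and size the buffer from it, clamped to a small frame range. The sketch below assumes that reading. The struct, both function names, and the 3x safety multiplier are illustrative assumptions, not values from this proposal or from any particular SIP stack.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double jitter_ms;       /* smoothed interarrival jitter estimate */
    double last_transit_ms; /* transit time of the previous packet */
    int has_last;           /* 0 until the first packet has been seen */
} jitter_est_t;

/* Update the estimate with one packet, following RFC 3550:
 * J += (|D| - J) / 16, where D is the change in transit time.
 * arrival_ms is the local receive time, rtp_ts_ms the RTP timestamp
 * converted to milliseconds. */
void jitter_update(jitter_est_t *j, double arrival_ms, double rtp_ts_ms)
{
    assert(j != NULL);
    double transit = arrival_ms - rtp_ts_ms;
    if (j->has_last) {
        double d = transit - j->last_transit_ms;
        if (d < 0) {
            d = -d;
        }
        j->jitter_ms += (d - j->jitter_ms) / 16.0;
    }
    j->last_transit_ms = transit;
    j->has_last = 1;
}

/* Target buffer depth in whole frames, clamped to [min_frames, max_frames].
 * The 3x multiplier is an assumed safety margin to be tuned on real calls. */
int jitter_target_frames(const jitter_est_t *j, double frame_ms,
                         int min_frames, int max_frames)
{
    assert(j != NULL && frame_ms > 0.0 && min_frames <= max_frames);
    double want_ms = 3.0 * j->jitter_ms;
    int frames = (int)(want_ms / frame_ms);
    if (frames * frame_ms < want_ms) {
        frames++; /* round up to a whole frame */
    }
    if (frames < min_frames) {
        frames = min_frames;
    }
    if (frames > max_frames) {
        frames = max_frames;
    }
    return frames;
}
```

The upper clamp is what keeps latency bounded: with 20 ms frames and a maximum of 4 frames, the buffer never adds more than 80 ms, even on a bad network, at the cost of dropping late packets.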
$500 USD in 7 days
4.4

I have been working in digital electronics since 2003, so I have more than 18 years of experience in electronics. I use Arduino Nano/Uno/Mega, ESP32 and Raspberry Pi to build digital devices that read sensor data and send it to a web server, control motors, and drive relay switches and LEDs. I have more than five years of experience in Arduino design and build. If you want an excellent and error-free project delivery, please send me a message.
$1,500 USD in 30 days
4.5

Hi there, considering your voice communication device for kids, I'm ready to tackle the latency issue efficiently. Latency can be a real bottleneck in real-time communication. My approach: Optimize the OPUS codec and fine-tune the jitter buffer to ensure we deliver clear, timely audio. In a recent project, I reduced latency to under 120ms on a similar ESP32-based system, enhancing call quality significantly. I include 30 days of post-deployment bug-fixing at no extra charge. How do you plan to prioritize the AEC implementation alongside latency improvements? Let's discuss how we can enhance your prototype swiftly.
$500 USD in 7 days
4.1

I can help push your ESP32 voice device from working prototype to a low-latency, production-ready SIP voice system optimized for real-time communication on constrained hardware. My first focus would be reducing end-to-end latency below 150ms, starting with OPUS codec tuning (bitrate, frame size, and complexity settings) and tightening the jitter buffer strategy to balance stability vs responsiveness. For the echo issue, I would integrate and configure ESP-SR’s Acoustic Echo Cancellation and fine-tune microphone/speaker gain staging to ensure clean full-duplex communication on the ESP32 audio path. For contact management, I’d implement a lightweight on-device storage layer using NVS or SPIFFS, allowing you to store a small address book of names and numbers, along with a simple navigation UI for selection using the existing hardware controls. For ESP32-to-ESP32 calling, I would set up a SIP infrastructure using Kamailio on a VPS, enabling device registration, NAT traversal handling, and seamless calling across different networks while keeping compatibility with external SIP numbers. The firmware will be structured cleanly in ESP-IDF with modular audio, SIP, storage, and UI layers so future enhancements can be added. Given the urgency, I would start immediately with a focused sequence. If this approach aligns with your current prototype setup, let’s connect and I can jump in right away.
$500 USD in 15 days
3.8

Hey, this is very doable, and I can jump in fast. I’ve worked with ESP32 + ESP-IDF audio pipelines and SIP/VoIP stacks before, so I’m comfortable optimizing low-latency voice systems like this.

Main focus would be:
- Reducing end-to-end latency under ~150ms by tuning OPUS settings, buffer sizes, and RTP jitter handling
- Fixing echo using Espressif’s esp-sr stack and proper AEC configuration
- Adding lightweight on-device contact storage using NVS or SPIFFS with simple UI navigation
- Extending SIP setup so devices can register through a VPS-based proxy like Kamailio for true device-to-device calling across networks

I’ll keep changes incremental so your working prototype stays stable while we improve performance and add features. Fast turnaround is fine: I can start immediately and prioritize latency + echo first, since those are critical for voice quality.
$500 USD in 7 days
3.7

As an experienced developer in the VoIP and embedded systems space, I am confident that I can help you take your ESP32-based device to the next level. My proficiency with ESP-IDF and deep understanding of SIP/VoIP protocols will be instrumental in addressing the challenges you've mentioned. With regards to latency, I have successfully fine-tuned codecs like OPUS before and can achieve your target end-to-end delay of under 150ms. Echo reduction through implementing AEC is another area I'm well-versed in, having used functionalities like esp-sr, built into ESP-IDF. Additionally, I have a solid command over storage systems like SPIFFS or NVS and can efficiently develop on-device UI features to enable ease of use, such as scrolling and selecting from a small list of names and numbers for this project. Furthermore, my experience with Wi-Fi network calling over SIP proxies is critical here; I’ve implemented similar functionality using Kamailio on VPS links for device-to-device communication. With more than a decade in the field, my skills portfolio covers all the requirements in your project description. Collaborating with me would ensure a fast turnaround without compromising quality along with reliable communication. Let's bring this amazing project to fruition together!
$250 USD in 7 days
3.8

Hi, I can jump in immediately and help optimize your ESP32 voice platform for production-grade performance. My background includes extensive ESP32/ESP-IDF development, real-time embedded systems, wireless communication, and low-latency audio processing. I’ve built communication and control devices where timing, reliability, and efficient resource management were critical. For your device, I’d focus first on the highest-impact areas: OPUS tuning, jitter buffer optimization, and audio pipeline profiling to drive end-to-end latency below your target. I can also integrate ESP-IDF’s built-in AEC to significantly reduce echo and improve call clarity. Beyond audio, I’ll implement robust local contact storage using NVS or SPIFFS, along with a lightweight, intuitive on-device selection interface. I’m also well-equipped to add ESP32-to-ESP32 calling via SIP by configuring device registration through a Kamailio proxy, enabling reliable communication across separate networks. Given the urgency, I can start right away and work iteratively to deliver measurable improvements quickly. My approach is practical, performance-driven, and focused on getting your prototype ready for the next stage fast. Best regards
$400 USD in 7 days
3.9

Hi there,

This is a tight, performance-critical embedded task, and I can jump in immediately to stabilize and optimize your ESP32 voice prototype.

Approach (fast-track): I’ll focus on the three bottlenecks first (latency, echo, and SIP flow), then extend to contacts and device-to-device calling.

Phase 1: Audio + Latency Optimization
• Tune OPUS (bitrate, frame size, complexity) for low-latency profiles
• Optimize jitter buffer (adaptive sizing, packet handling)
• Reduce buffering in audio pipeline (I2S + codec chain)
• Target <150ms end-to-end delay

Phase 2: Echo Cancellation
• Implement AEC using ESP-IDF / esp-sr
• Tune mic/speaker gain + acoustic path
• Validate full-duplex stability

Phase 3: SIP & Networking
• Clean SIP stack handling (registration, keep-alive, NAT traversal)
• Set up Kamailio proxy on VPS for ESP32-to-ESP32 calls
• Ensure stable cross-network communication

Phase 4: Device Features
• Contact storage (NVS or SPIFFS)
• Lightweight on-device UI (scroll/select logic)
• Efficient config handling

Deliverables:
• Optimized firmware (ESP-IDF)
• Reduced latency + echo-free calls
• SIP proxy setup + device registration
• Contact system + UI logic
• Build + flash instructions

Classification:
• Type: Embedded VoIP optimization
• Platform: ESP32 (ESP-IDF)
• Focus: Low latency, audio quality, SIP reliability

I’ve worked on embedded audio + real-time systems and can move quickly. Ready to start immediately.

Best Regards, JP
$250 USD in 7 days
3.5

Hello,

I will refine your existing ESP32 VoIP prototype into a production-ready low-latency voice device by directly targeting the audio pipeline, echo path, and SIP architecture without over-engineering the system.

✔ Objective
- Bring end-to-end call latency under 150ms
- Eliminate echo in real call conditions
- Enable reliable SIP-based device-to-device calling

✔ Core Improvements
- OPUS + jitter buffer tuning for tight real-time audio flow
- ESP-SR AEC integration and calibration for your hardware acoustic path
- SIP stability tuning for current call flow + NAT resilience
- Kamailio-based VPS setup for ESP32-to-ESP32 calls across networks
- Lightweight SPIFFS/NVS contact storage with simple selection UI

✔ Execution Strategy
- Start from audio path (latency + echo first, as this defines perceived quality)
- Stabilize SIP signaling and routing next (call reliability layer)
- Add contact storage and UI as final integration layer
- Keep changes minimal and aligned with your existing working prototype

The focus is not redesigning your system, but tightening what already works into a stable, low-delay, real-world communication device.

Best Regards, Nichita.
$500 USD in 7 days
3.0

Providence, United States
Payment method verified
Member since Mar 8, 2026