About BeGig
BeGig is the leading tech freelancing marketplace. We empower innovative, early-stage, non-tech founders to bring their visions to life by connecting them with top-tier freelance talent. By joining BeGig, you’re not just taking on one role—you’re signing up for a platform that will continuously match you with high-impact opportunities tailored to your expertise.
Your Opportunity
Join our network as a Model Compression Engineer and help startups and AI-driven companies scale their machine learning solutions by making models smaller, faster, and more resource-efficient. Your work will be crucial for deploying AI on edge devices, reducing cloud costs, and improving performance across products.
Enjoy the flexibility of remote work, with both hourly and project-based options available.
Role Overview
As a Model Compression Engineer, you will:
- Optimize AI Models: Apply techniques like quantization, pruning, distillation, and weight sharing to compress large-scale neural networks with minimal loss of accuracy.
- Deploy to Edge & Cloud: Package and deploy compressed models to various environments, including mobile, embedded, edge, and cloud.
- Toolchain Integration: Work with frameworks like TensorFlow Lite, ONNX, PyTorch Mobile, and TensorRT to automate and streamline model conversion and optimization.
- Benchmark & Evaluate: Measure and report on performance metrics such as inference speed, memory usage, and accuracy loss post-compression.
- Collaborate on System Design: Work with ML, backend, and hardware teams to ensure end-to-end system compatibility and optimal integration.
- Continuous Improvement: Research emerging compression methods and bring new approaches into production.
Technical Requirements & Skills
- Experience: 2+ years in machine learning, deep learning, or AI model optimization.
- Compression Techniques: Proficiency in quantization, pruning, distillation, low-rank approximation, or similar methods.
- Frameworks & Tools: Hands-on experience with TensorFlow Lite, ONNX, PyTorch Mobile, TensorRT, or related toolchains.
- Programming: Strong Python skills; C++ experience is a plus for custom ops and kernel optimization.
- Benchmarking: Ability to run performance tests and interpret results for both cloud and on-device deployments.
- System Integration: Familiarity with deploying models in production and troubleshooting integration issues.
What We’re Looking For
- An engineer who loves making AI models lean, fast, and cost-effective.
- A freelancer with practical experience turning research-grade models into production-ready, optimized solutions.
- A collaborator who enjoys working at the intersection of ML, engineering, and deployment.
Why Join Us?
- Immediate Impact: Power startups and scaleups by enabling AI to run where and how it’s needed most.
- Remote & Flexible: Choose how you work—hourly or per project, from anywhere in the world.
- Future Opportunities: Get matched with roles in edge AI, mobile AI, and large-scale ML deployment.
- Growth & Recognition: Join a community that values efficiency, ingenuity, and next-gen AI delivery.