mbr partners company
Our client is a fabless chip design company specialising in the design and development of cutting-edge, customised server hardware optimised for artificial intelligence and machine learning applications.
Their mission is to empower businesses and researchers to accelerate their AI initiatives by providing them with high-performance, scalable, and energy-efficient hardware infrastructure. As a rapidly growing company at the forefront of AI hardware innovation, they are constantly seeking talented and motivated individuals to join their team. They offer a dynamic and challenging work environment, with opportunities to make a significant impact on the future of AI technology.
Your objectives
Build the low-level firmware foundation that brings their next-generation AI training ASIC to life, from secure boot to runtime management and from PCIe bring-up to multi-die coordination. Your code will be the critical bridge between their custom silicon and the AI workloads that will reshape industries. This is where you can make an impact and where your expertise directly accelerates AI breakthroughs for our client's customers worldwide.
What you will manage:
Boot & Security: Implement a secure boot chain (BootROM → PBL/SBL → RTOS), device identity, attestation, anti-rollback, and field updates that protect our customers' valuable AI models and data.
Hardware: Initialize clocks, power, HBM, PCIe, NoC, and SerDes PHYs, often before our silicon arrives (on emulation/FPGA), ensuring first-time-right deployment.
High-Speed Networking: Bring up 400G Ethernet interfaces, configure MAC/PCS/FEC layers, and coordinate with the network stack on the RDMA offload engines essential for distributed AI training.
Runtime Excellence: Power/thermal control, DVFS, watchdogs, RAS (ECC, crash dumps), and error recovery that deliver 99.99% uptime.
Host Communication: Design mailbox protocols, MSI/MSI-X, DMA coordination, and health reporting with our Linux driver team to enable seamless integration.
Telemetry & Observability: Unified event tracing with synchronised timestamps (PTP/PHC), giving our customers unprecedented visibility into their AI infrastructure.
Multi-Die Scaling: Discovery, link training, topology management, and load-balancing hooks across chiplets, because AI models keep growing.
Validation & Documentation: Own bring-up guides, scripted tests, HIL/emulation validation, and HAL/config schemas that enable our global support team and customers.
Nice-to-Haves
This role is based in Dubai, where the client is building their team. Relocation is supported, including flights and family assistance.
Please let me know your family details and answer the questions to check suitability for the role.