Tooling Engineer
Job Description
We're building the company which will de-risk the largest infrastructure build-out in history.
When people finance GPU clusters, the datacenters housing them, and the infrastructure powering them, they need "offtake" - meaning someone has signed a contract to lease the cluster for a period of time before it's even built.
Financing a GPU cluster is inherently risky, since margins are thin and volumes are huge. Lenders don't want to take on the risk that cluster developers can't repay their loan, and cluster developers really don't want to risk not selling their cluster. As a result, risk is offloaded to the customer using fixed-price long-term contracts.
If you don't mitigate this customer risk, there's a bubble. This isn't SaaS anymore - application-layer companies sign multi-year contracts for compute and inference, but sell to customers on monthly subscriptions. If you mess up a purchase, it's game over: a minor shift in your revenue growth rate might mean the difference between profit and bankruptcy. But what if companies could exit their contract by selling it back to the market?
Otherwise, as AI scales, compute only becomes available to folks who can effectively take on that risk. A 2-person startup in a San Francisco Victorian can't realistically sign a 5-year take-or-pay contract on $100M supercomputers. But they may be able to buy the month of liquidity that someone else sold back.
So that's what we make: a liquid market for GPU offtake.
About the Tooling Team
We are a small team focused on making SFCompute engineering faster, more observable, and more reliable. Our work spans data infrastructure, developer experience, pre-production environments, and AI tooling. The common thread is not any single domain. It is that we find the problems nobody else owns and turn them into solved problems.
We act as internal field engineers. Our job is to maximize the speed and effectiveness of everyone else at the company. The team is kept deliberately small and independent so it can respond directly to its internal customers without waiting on outside approval. We own a graph of metrics and we are driven by goals, not by a ticket queue.
Everyone here wears many hats. You will work across the stack, collaborate with every part of engineering, and regularly take on problems that do not fit neatly into a job description. If you want a narrow scope and a clear ticket queue, this team is not it. If you want a large, legible impact on a small team building serious infrastructure, read on.
About SFCompute
The San Francisco Compute Company runs large-scale GPU clusters (H100s, H200s, B300s) on contracts you can exit. Need 256 H100s for three days? Buy them at market price, cancel what you don't use. We operate the stack from UEFI up, so you are never paying a reseller markup or waiting on a support ticket. Customers include NVIDIA, MIT, Liquid AI, and Roboflow. We are a small team that has managed over $1B of hardware and is building what we think will be the defining infrastructure marketplace for the AI era.
The Role
We are looking for a generalist Tooling Engineer to own the systems that sit underneath every engineer's daily work. You will embed with the people you serve, watch how they actually work, and fix what slows them down. Some weeks that means build pipelines and infrastructure as code. Other weeks it means a staging environment, a per-engineer developer sandbox, an internal data pipeline, or better tooling for AI coding agents. You decide what matters most by talking to your internal customers, not by waiting for a spec.
This is not a "build dashboards and wait for requests" role. The team owns the software development lifecycle, and the gaps in it are yours to close. You will need to scope your own work, ship it, and then stand up in front of the company every two weeks and show what improved.
What You'll Do
Embed with engineers across the company, learn their workflows, and find the bottlenecks nobody owns
Build and operate the systems beneath daily engineering work: build pipelines, infrastructure as code, internal services, and the production platform
Pick up problems across our focus areas as priorities shift: pre-production and staging environments, isolated developer and agent sandboxes, internal data and observability pipelines, and AI coding tooling
Drive ambiguous problems to clear outcomes, deciding what to build, not just how to build it
Demo your work to the whole company on a regular cadence and take candid feedback well
Track the metrics that show whether engineering is getting faster and more reliable (deployment frequency, lead time for changes, change failure rate, time to restore)
What We're Looking For
Ability to scope your own work and operate without a spec. The first job is figuring out what the problem actually is
Strong communication. You can explain your work clearly, demo it, and write it down
Genuine curiosity about how other people work and a habit of walking up to anyone to investigate their workflow
Comfort with ambiguity and a small ego, in the sense of caring more about the outcome than about owning it
Solid engineering fundamentals and the intellectual honesty to say when you do not know something
Nice to have: experience with CI/CD, infrastructure as code (Terraform, Helm, or similar), Kubernetes, ETL and analytical data stores, or AI coding tools such as Claude Code; familiarity with marketplace or infrastructure business models
Why This Role
The tooling team is small, trusted, and independent. You will have direct access to the engineers you serve and to leadership, and the backing to fix things the right way rather than just document them. The systems you build sit underneath everyone's daily work, so when they get better, the whole company feels it. The work you produce is a real artifact, not a presentation deck, and on a team this size your impact is immediate and legible.
Benefits
Generous equity grant
Team members are offered a competitive salary along with equity in the company
Visa sponsorship
Yes, we sponsor visas and work permits
Retirement matching
We match 401(k) plans up to 4%
Medical, dental & vision
We offer competitive medical, dental, and vision insurance for employees and dependents and cover 100% of premiums
Time off
We offer unlimited paid time off as well as 10+ observed holidays
Parental leave
We offer biological, adoptive, and foster parents paid time off to spend quality time with family
Daily lunch
We cover lunch daily for employees
Unlimited office book budget
You can buy as many books for the office as you want
The San Francisco Compute Company is committed to maintaining a workplace free from discrimination and harassment.
We make employment decisions based on business needs, job requirements, and individual qualifications, without regard to race, color, religion, belief, national origin, social or ethnic origin, age, physical, mental, or sensory disability, sexual orientation, gender identity or expression, marital status, civil union or domestic partnership status, past or present military service, HIV status, family medical history or genetic information, family or parental status including pregnancy, or any other status protected by law.
We welcome the opportunity to consider qualified applicants with prior arrest or conviction records. Our commitment to diversity includes hiring talented individuals regardless of their criminal history, in accordance with local, state, and federal laws, including San Francisco’s Fair Chance Ordinance and California’s ban-the-box laws.