Sr. Manufacturing Site Reliability Engineer, Windows Platform

Tesla Motors, Inc.
paid holidays, flex time, 401(k)
United States, Texas, Austin
May 12, 2026
What to Expect

Tesla's Manufacturing SRE team owns the underlying platforms that keep production lines moving across Fremont, Sparks, Austin, Berlin, and Shanghai. The fleet spans tens of thousands of Linux and Windows hosts driving Station Controllers, Industrial PCs (IPCs), camera systems, robot controllers, and label printers across every shop on every line.

The Controls organization is shifting more workloads back to Windows, and the Optimus program is on track to run more Windows computers than Linux ones in production. Today, one engineer carries the Windows specialty for the entire team. This role exists to change that: to scale the depth of Windows expertise on MFGSRE so the platform can absorb the Optimus rollout, IPC fleet growth, and a steady increase in Windows software complexity per host without losing reliability.

You partner with Controls Engineering, IT Manufacturing Operations, and the Optimus team to make sure Windows hosts are booted, imaged, monitored, recovered, and patched the same way Linux hosts already are, and you contribute back to the broader MFGSRE platform (TFI imaging, WINFinder inventory, TFO observability) so the wins compound.


What You'll Do
  • Own the Windows production fleet end to end: Industrial PCs, Tangents, MTE benches, Optimus dyno PCs, GA station HMIs, and the long tail of factory Windows hosts across all sites
  • Drive Windows imaging through the TFI PXE pipeline (Ansible, Jenkins, Artifactory) so a new IPC can boot, join the domain, and report green telemetry without manual intervention
  • Extend WINFinder, the Windows host inventory and management service, to cover new platforms and new sites; co-maintain Windows LAPS rotation, AD lifecycle, and the ITMFGAgent runtime
  • Push the Windows fleet onto the same observability surface as Linux: Grafana Alloy and the Tesla Metrics Agent (TMA) collecting metrics, structured logs into Splunk MFGSRE indexes, alerting in Opsgenie or JSM
  • Build automation for the Windows-specific operational pain that does not exist on Linux: GPO drift, driver and firmware management, Windows Update windows, NTFS permission audits, time zone enforcement across sites, certificate rotation for OPC-UA and ACR
  • Roll out and sustain the SentinelOne agent across factory Windows endpoints, partnering with Infosec on detections and exclusions tuned for production hardware
  • Own the deployment story for Windows-resident Tesla applications (PrintApp configurator, ZCP, NX Witness, station controllers) when they touch Windows boxes
  • Carry the production on-call rotation for Windows incidents: triage P1 line-down events, write the runbook, file the Jira, drive the post-mortem, and turn the fix into automation
  • Contribute to the cross-platform tools (TFI, TFD, TFC, TFO, ITMFG/itmfg-windows, ITMFG/print-app) that MFGSRE owns; submit PRs in Go, Python, PowerShell, or Ansible as the work demands

What You'll Bring
  • 5+ years operating Windows in production at scale, including Active Directory, GPO, Windows Server, and Windows 10 or 11 LTSC on industrial hardware
  • Strong PowerShell skills: scripts that hit the Win32 API, parse event logs, drive WMI, and integrate with REST endpoints, not one-liners
  • Hands-on experience with at least one configuration management or imaging platform: Ansible, Intune, SCCM, MDT, Puppet, or equivalent custom PXE work
  • SRE practice fundamentals: SLO design, alert hygiene, runbook discipline, blameless post-mortem authorship, error-budget thinking
  • Working knowledge of Linux as a peer platform; you do not need to be a kernel hacker, but you can read a systemd unit, write a bash one-liner, and submit a clean Ansible PR
  • Comfort writing application code in at least one of Python, Go, or C#, enough to ship a small service or extend an existing one
  • Production experience with observability tooling: Prometheus or Grafana, Splunk or equivalent log platform, OpenTelemetry concepts
  • A bias for automating away repeated work, even at the cost of more upfront engineering effort
  • Direct, low-ego communication style; comfort working asynchronously across Sparks, Fremont, Austin, and Berlin

Compensation and Benefits
Benefits

Along with competitive pay, as a full-time Tesla employee, you are eligible for the following benefits from day 1 of hire:

  • Medical plans, including plan options with a $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental (including orthodontic coverage) and vision plans, both with options for a $0 paycheck contribution
  • Company-paid Health Savings Account (HSA) contribution when enrolled in the High-Deductible medical plan with HSA
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA)
  • 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits
  • Company-paid Basic Life and AD&D insurance
  • Short-term and long-term disability insurance (90 day waiting period)
  • Employee Assistance Program
  • Sick and vacation time (flex time for salaried positions, accrued hours for hourly positions), and paid holidays
  • Back-up childcare and parenting support resources
  • Voluntary benefits, including critical illness, hospital indemnity, accident insurance, theft and legal services, and pet insurance
  • Weight Loss and Tobacco Cessation Programs
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program