DevOps Engineer / Site Reliability Engineer

Remote (U.S.) — Full-Time

The Cari Network is building a blockchain-based payment system powering real-time, 24/7 movement of digital money across the traditional finance and digital asset ecosystems. A pre-launch startup, we're creating an entirely new category of financial infrastructure that brings cash on chain through tokenization while powering the full spectrum of payments for our network—banks, digital asset exchanges, and, most importantly, the customers they serve.

About the Role

You'll own the full infrastructure lifecycle for Cari's production environment: containerized service deployments, async pipeline infrastructure, observability stack design, and CI/CD pipeline maturation. You'll work alongside engineers across our backend services, smart contracts, and API gateway to ensure every layer deploys safely, observably, and with clear rollback paths.

You'll also be the engineering team's interface for operational discipline—defining upgrade governance, incident response runbooks, and the monitoring posture appropriate for a financial-grade system processing tokenized bank deposits.

What You'll Do

  • Design and operate the production deployment pipeline for all Cari services, replacing manual deployments with a governed, automated CD pipeline that supports environment promotion and rollback controls

  • Build and maintain the async transaction pipeline infrastructure

  • Establish and own the observability stack: application-layer monitoring layered on cloud infrastructure metrics—structured logging, distributed tracing, error budgets, and alerting calibrated for financial transaction systems

  • Define and enforce infrastructure-as-code discipline covering networking, compute, message queue clusters, databases, caching layers, and secrets management

  • Implement smart contract upgrade governance controls: upgrade timelocks, CI-enforced contract versioning across repos, and deployment record standards

  • Own secrets management and environment configuration across development, testnet, and production environments, including custody platform credential rotation

  • Design and run incident response processes: on-call rotation, severity classification, post-mortems, and SLA tracking appropriate for bank partner commitments

  • Partner with engineering to establish and enforce service reliability standards: health checks, graceful shutdown, crash recovery patterns, and at-least-once delivery guarantees for the transaction pipeline

Qualifications

  • 6+ years of DevOps or SRE experience, with at least 3 years owning production infrastructure for transactional or financial systems

  • Deep AWS expertise: ECS (or EKS), MSK, RDS, ElastiCache, CloudWatch, IAM, VPC, and Secrets Manager

  • Production experience operating Kafka or equivalent message queue infrastructure, including consumer lag monitoring, partition management, and operational failure modes

  • Strong infrastructure-as-code skills with Terraform, Pulumi, or CloudFormation; able to take an undocumented environment and systematically codify it

  • Experience building CD pipelines (GitHub Actions or equivalent) with environment promotion, approval gates, and automated rollback

  • Hands-on observability experience—structured logging pipelines, distributed tracing (OpenTelemetry or equivalent), and SLO/error budget design in Datadog or similar platforms

  • Security-first operational mindset: secrets rotation, least-privilege IAM, audit logging, and network segmentation in regulated environments

  • Comfortable operating with high autonomy in a small team where significant infrastructure work remains to be defined and built

Nice to Haves

  • Experience operating infrastructure for blockchain or Web3 systems—RPC node management, transaction monitoring, or custody platform integrations

  • Familiarity with Docker multi-stage builds and Node.js 24 Alpine runtime environments

  • Experience with GitHub Actions monorepo pipelines (pnpm workspaces, selective builds)

  • Prior work at a fintech, payments company, or bank technology vendor where reliability commitments were contractual obligations

  • Familiarity with Foundry-based Solidity CI pipelines (forge fmt, forge build, forge test)

What We Offer

  • Competitive compensation: salary and equity

  • Full medical, vision, and dental benefits

  • 401(k) with matching program

  • Flexible vacation policy (PTO) and remote-first work environment

Interested in this role? Send us your resume.