Systems Software Engineer
Company: LatchBio, Inc.
Location: San Francisco
Posted on: October 19, 2024
Job Description:
About Us
LatchBio is building a cloud platform for biology that
includes data storage, analysis, and visualization. Our end-to-end
bioinformatics solution gives biologists direct access to
computational tools without the need to deal with data
infrastructure, command line interfaces, or code. Computational
staff also benefit as they no longer manually execute analyses or
set up and maintain infrastructure. Teams can aim higher and easily
set up user-friendly interfaces and access infinitely scalable
compute resources, something that would previously require an ad hoc
solution, a large time investment, and likely hiring new expensive
software engineering staff.

On a higher level, our centralized,
easy-to-adopt, openly accessible platform solves the need to
reinvent similar infrastructure at each biology company. By
dedicating a team of product and software experts to the problem,
we make software tooling a top priority, something not possible at
companies making science their first priority. We bring the entire
industry forward to the level of quality expected of modern
software and beyond what even the biggest players can achieve on
their own.

TLDR: the convergence of laboratory automation,
high-throughput assays, and machine learning is moving the medium
of biological discovery to silicon. At LatchBio, we are building
the ubiquitous cloud platform to store, visualize and analyze data
from biological experiments.
- Data in biopharma will grow 20x in 7 years: $200m to $4b, 54% CAGR
- The increasing size of file data and compute needed for standard
biological analysis will demand software innovation
- We are positioned to meet these software infrastructure needs

About The Team
Our engineering team is small,
ambitious, and fast. We place extremely high emphasis on code
quality and professional development. Most of our engineers have
joined before graduating college and were immediately able to
perform at the equivalent of Senior level. We have experience
investing in early-career hires and helping them become stellar
engineers.

Engineering Practices
- Ubiquitous in-depth code review
- Static typing wherever possible
- Very few meetings: daily 15-minute stand-up + -2 hours of
company-wide sync on Saturdays
- High trust environment & zero blame culture
- High ownership & responsibility for all members of the team
regardless of level
- No code owners. Every engineer works on every product
- End-to-end development. No separation into
frontend/backend/devops/DBA
- Close collaboration with the design, product, business, and
customer success teams
- Emphasis on learning through weekly reading groups and occasional
research projects
- Engineers are responsible for Quality Assurance/testing on their
own features
- Aiming for code that is self-documenting and simple+robust without
the need for excessive commenting or automated testing

About The Role
This role will require
both the understanding of low-level system concepts and the ability
to ship high-level frontend/scripting code at high velocity. The
following is a list of technologies and major concepts and their
role within Latch. You do not need any existing experience with any
one of these topics but will end up working with all of them at
some point. The ideal candidate will have interest in each topic
and in-depth familiarity with at least a few.
- PostgreSQL: main database, used also as a queue system, message
broker, permissioning engine, etc. We do not use any ORMs
  Concepts: transaction isolation, explicit locking, stored
  procedures, Postgres database administration, row-level security
- TypeScript, Python: main programming languages on the frontend and
backend respectively
  Concepts: static typing, asyncio
- React: UI framework
  Concepts: writing hooks
- Distributed systems: most of our code is designed to be
horizontally scaled. We maintain a distributed file system on AWS
  Concepts: FUSE, distributed systems, state machines
- Cloud environment
  Concepts: Terraform, cloud administration, cloud cost optimization
- Kubernetes: cluster software
  Concepts: k8s administration, kustomize, cluster networking
- Linux, Docker, systemd: server system software
  Concepts: containerization, cgroups, writing systemd units

Requirements
- Demonstrable in-depth understanding of some complex technology. See
the Interview Process section for example topics
- Ability and interest to work with technology regardless of its
complexity or role
- Motivation to learn and improve continuously
- Work well with minimal management
- Desire to create things and make them your own, contributing at
every stage including requirement discovery, design,
implementation, testing, and maintenance

Salary and Equity
- 160k to 260k
- Equity: roughly > 0.1%

Benefits
- O-1/H-1B visa
sponsorship
- Best vision, dental, health insurance available to companies of
our size via Gusto
- 12+ meals delivered from restaurants each week
- $3,000 annual office technology stipend
- Biannual company-sponsored conference trips, textbooks, and other
professional development
- Internal reading groups
- Company-wide 2-week Christmas vacation
- Annual company retreat
- 401(k) via Guideline
- Unlimited PTO
- Competitive cash compensation
- Large equity grants

Interview Process
Latch engineers typically benefit
from knowledge, interest, and experience in engineering complex
systems technology. At the bottom of this document is a list of
example topics and sample problems that we find interesting and
indicative of a candidate's level.

To judge your fit we follow a 3-step process:
1: Introductory call. Be prepared to walk through your resume and
talk in detail about past experience, going in-depth on your
technical contributions.
2: Technical interview. This is a conversation on a technical topic
to judge the depth of your knowledge. Be prepared to discuss a
technology of your choice from your past experience similar in
complexity to one of the example topics. Be prepared to discuss
example topics of your choosing in case the original discussion
ends up too short.
3: 1-week paid on-site contracting period. This is the time for us
to meet each other and directly experience what it would be like to
work together. Be prepared for a short onboarding followed by a
real-world production task off our backlog (though we try to pick
an interesting one). At the end of the week you will be required to
give a short 1-hour presentation/explanation of your work and
answer questions.

Example Topics and Problems

Databases
- Transaction
isolation levels. SQL Repeatable Read and Serializable
transactions. Implementing transaction isolation for all levels and
Serializable transactions in particular.
- Locking. Using explicit locking to solve concurrency issues with
Read Committed isolation.
- **Specific problem:** a User may create and delete "Teams". These
are generic entities that are "owned" by the User that created
them. How do we reliably limit the total number of Teams each User
owns? Specify the tables, constraints, triggers, stored procedures,
etc. The User is adversarial and can send any valid SQL requests
directly to the database and manipulate the order in which
statements in transactions complete (i.e. exploit race conditions
or any other concurrency bugs).
- Please use PostgreSQL for any implementation specifics or examples

JavaScript
- React. Rules of
hooks. What is the purpose of `useCallback` and `useMemo`? Why
should arrow functions and `.bind` be avoided in component props?
- **Specific problem:** `eslint-react-hooks` warns against using
`async` functions in `useEffect`. Why? What similar problems can
arise when using `async` in `useCallback`? Design a `useAsync` hook
that works around these issues. There is a technical side to this
question (things that React doesn't like) and a semantic side
(things that behave weirdly in certain edge cases).
- TypeScript.
Difference between `any` and `unknown`. Covariant and contravariant
types.
- **Specific problems:** What is the most specific type for
`(xs, cls) => (x => x isinstance cls)`? (Note that TypeScript
will not accept `filter` here, but an explicit for-loop would
type-check). Why are function types contravariant in the parameter
type?
- Immutable data structures (e.g. ). Benefits and drawbacks vs
mutable ones. What is the basic idea behind how these are
implemented?

Python
- Threading. GIL. Multiprocessing using ``.
- **Specific problem:** `tqdm` is a popular library for creating CLI
progress bars. Design the simplest possible
multiprocessing-compatible wrapper around a `tqdm` progress bar
using the standard library. It should allow a number of processes
to all add progress to a single progress bar. Hint: the intended
solution simply uses a standard library class.
- `asyncio`. How to
run multiple tasks at the same time.
- **Specific problem:** `aiohttp` is an async HTTP client library.
It benefits greatly from reusing a single connection pool for each
request. Design a way of using a `` to share a connection pool
between multiple threads using the standard library. The consumer
threads do not run `asyncio` loops.
- Static typing. How to properly add types to a
function decorator.

GraphQL
- N+1 problem. How GraphQL solves it.
- Subscriptions.
- Apollo cache. Data normalization. Avoiding additional requests in
child components.
- **Specific problem:** A UI displays a set of delivery orders
fetched using a GraphQL API. `` is the main page React component
which contains `` children. How should the queries be set up to
make only one request? How should information be passed to the ``
children to take advantage of `` components (i.e. avoid
re-rendering each child when the parent renders)? How should
subscriptions be set up to reliably update component state?

Containers
- Application isolation techniques.
Traditional (file permissions, non-root users, kernel capabilities)
vs `chroot` vs containers vs VMs vs AppArmor+SecComp. Limitations,
pros and cons.
- Image building. Purpose and implementation of layers. Purpose and
implementation of multi-stage builds. Reducing image size.
- Process supervisors/init processes. Purpose and different options
(e.g. `tini`, `s6`, `supervisord`, `systemd`).
- Logging/monitoring solutions. Pros and cons of different
approaches. Stdout/stderr, log files, fluentd/fluentbit, syslog (+
various backends e.g. rsyslog, journald), application-level logging
instrumentation (e.g. in-app log file rotation/upload), distributed
tracing.
- Solutions to Docker-in-Docker or running system-level software in
container-like environments. gVisor, Sysbox, Firecracker.
- Extra: modern Dockerfile frontend
- Extra: purpose and implementation of VM hypervisors.
Virtualization vs para-virtualization. `virtio`.

Kubernetes
- Basic internals. `etcd`,
k8s resource definitions, controllers.
- Basic built-in resources. Nodes, Pods, Deployments, DaemonSets,
Services, Ingresses, ConfigMaps, Secrets, Jobs, CronJobs,
Horizontal Pod Autoscalers, Persistent Volume Claims, Storage
Classes.
- Purpose of init containers, sidecar containers, ephemeral
containers.
- Networking. In-cluster DNS. Network Policies. Purpose of service
meshes. Load balancer setup and configuration. Reverse proxies.
- Autoscaling and node assignment. Node selectors and affinity,
preemption, QoS, priorities. Scaling to and from 0 (any special
configuration of cloud resources etc.).
- Extra: purpose and implementation of "node shells".

Database Implementation
- Basic data structures. B-Trees,
Log-Structured Merge Trees. Tradeoffs.
- Crash recovery. Write-ahead log. Undo vs redo logging vs both.
- Replication. Unidirectional vs bidirectional. Physical vs logical.
Single master vs multi-master. Role in system availability,
upgrades, and backups.

Distributed Systems
- Consistency (C in ACID) guarantees. "Eventual" vs "strong"
consistency (+ why these terms are imprecise). Distributed system
vs single host database guarantees (e.g. linearizability vs
serializability). Impact on Availability. Byzantine failures.
- Consensus protocols. 2-phase commit, BFT, Paxos + notable
variants, Raft.

UNIX-like Operating Systems
- Process state: current working directory, environment variables,
parent process, kernel capabilities, signal masks, etc. Setting up
and creating subprocesses using `fork(2)`.
- Inter-process communication. Stdio. Pipes. Shared memory. Sockets.
System buses (e.g. dbus).

$160,000 - $260,000 a year