Full-Stack Engineering: Back-End and Databases
This is Part 3 of a three-part series on full-stack engineering.
- Part 1: Full-Stack Engineering: Introduction
- Part 2: Full-Stack Engineering: Front-End Development Focus
- Part 3 (this post): Back-End and Databases
The back end is where a system commits to being correct and durable. Front-end work decides how an experience feels; back-end work decides what is actually true after a request returns. For full-stack engineers, the goal in Part 3 is not to master every database engine or framework, but to understand the trade-offs well enough to pair the right back-end shape with the front-end choices made in Part 2.

Table of Contents
- Back-End in the Full-Stack Context
- APIs and Protocols
- Database Choices
- Caching and Performance
- Deployment Patterns
- Conclusion: Closing the Loop
Back-End in the Full-Stack Context
A useful lens for the back end is to ask what guarantees it provides. A REST endpoint promises a response shape and a status code. A database promises some level of durability and consistency. A queue promises at-least-once or exactly-once delivery. The full-stack engineer's job is to understand which of those guarantees the front end is implicitly relying on, and to make sure the back end actually provides them.
"Everything fails, all the time." - Werner Vogels, CTO, Amazon
Front-end choices push specific back-end requirements. Server-side rendering and React Server Components, covered in Part 2, mean the back end needs low-latency data access in the same region as the rendering process. Static generation pushes work to build time but introduces the question of when to rebuild. Real-time interfaces require streaming protocols and back-pressure handling rather than just request/response. The back end is rarely free to be designed in isolation.
At a high level, full-stack engineers are expected to be comfortable with:
- Designing API surfaces that are stable enough to evolve without breaking clients.
- Picking database engines that match read/write patterns and consistency needs.
- Layering caches without making correctness depend on cache state.
- Choosing deployment patterns that match team size and operational maturity.
APIs and Protocols
Choosing a Style: REST, GraphQL, gRPC, tRPC
Most modern back ends end up exposing more than one API style: a public API for partners, a typed contract for the company's own front end, and an internal protocol for service-to-service traffic. Each style optimizes for a different audience.
| Style | Strengths | Trade-offs / Caveats |
|---|---|---|
| REST | Universally understood, cacheable at HTTP layer, simple tooling | Over- and under-fetching, weak typing across the wire, versioning is awkward |
| GraphQL | Client-driven queries, single endpoint, strong types via schema | Caching is harder, easier to write expensive queries, server complexity goes up |
| gRPC | Compact binary format, strict schemas via Protobuf, great for service-to-service traffic | Browser support is limited (needs gRPC-Web/proxies), heavier tooling to operate |
| tRPC | End-to-end type safety in TypeScript monorepos, no schema duplication | TypeScript-only by design, couples client and server tightly |
A common pattern in mid-sized systems is to expose REST for public consumers, use gRPC between internal services, and pick GraphQL or tRPC for the company's own front-end clients where the schema and the UI evolve together. The choice is rarely "which is best" and usually "which is best for this specific consumer."
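To make the tRPC row concrete, here is a minimal sketch of the end-to-end type-safety idea, in which the router's TypeScript type is the contract. The procedure and field names are invented for illustration, and the API shown assumes tRPC v10+:

```ts
import { initTRPC } from '@trpc/server';
import { z } from 'zod';

const t = initTRPC.create();

export const appRouter = t.router({
  // One procedure: input validated by zod, output typed by inference.
  productById: t.procedure
    .input(z.object({ id: z.string() }))
    .query(({ input }) => {
      // Stubbed lookup; a real handler would query the database.
      return { id: input.id, name: 'Example product', priceCents: 1999 };
    }),
});

// The front end imports only this type, so changing the router's shape
// breaks the client's build rather than its production behavior.
export type AppRouter = typeof appRouter;
```

No schema file exists separately from the code, which is exactly the "no schema duplication" strength in the table, and exactly the TypeScript-only coupling in the caveats column.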
Cross-Cutting Concerns: Versioning, Idempotency, Contracts
Once an API is in use, the interesting work shifts from picking a style to keeping it stable while it evolves. Three concerns repeat across every style:
- Versioning: How clients survive a change. Options range from URL-based versions (/v1/, /v2/) to additive-only schema evolution (the GraphQL convention) to feature flags carried in headers. Whichever approach is chosen, the rule is the same: published contracts cannot silently change.
- Idempotency: How the back end behaves under retries. Network failures and client retries mean any non-trivial write should accept an idempotency key, so a duplicate request produces the same effect as the original rather than a second charge or a duplicate row (a minimal sketch follows this list).
- Contract testing: How services confirm they still speak the same language. Tools like Pact and OpenAPI-driven test generation let consumers and providers verify compatibility without spinning up the entire system.
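To make the idempotency point concrete, here is a minimal sketch assuming Express, with an in-memory map standing in for what would be a durable, TTL-bound store in production. The /charges route and field names are invented:

```ts
import express from 'express';
import { randomUUID } from 'node:crypto';

const app = express();
app.use(express.json());

// Responses already produced, keyed by the client-supplied idempotency key.
// Production systems persist this in a durable store with an expiry.
const completed = new Map<string, { status: number; body: unknown }>();

app.post('/charges', (req, res) => {
  const key = req.header('Idempotency-Key');
  if (!key) {
    return res.status(400).json({ error: 'Idempotency-Key header is required' });
  }

  // A retry with the same key replays the original response: no second charge.
  const prior = completed.get(key);
  if (prior) {
    return res.status(prior.status).json(prior.body);
  }

  const body = { chargeId: randomUUID(), amountCents: req.body.amountCents };
  completed.set(key, { status: 201, body });
  return res.status(201).json(body);
});
```

A real implementation also has to handle concurrent duplicates (two requests with the same key in flight at once), which is another reason the key store belongs in the database rather than in process memory.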
"The hard part of microservices isn't the code; it's the boundaries you draw and the contracts you commit to maintaining." - paraphrase of Sam Newman, Building Microservices
Database Choices
Engines and When They Fit
There is no neutral default database; every choice trades off something. The most common families and the workloads they suit are summarized below.
| Family | Examples | Fits well | Watch out for |
|---|---|---|---|
| Relational | PostgreSQL, MySQL | Transactional data with relationships, reporting, strong consistency | Sharding is operationally heavy; schema changes need migration discipline |
| Document | MongoDB, Couchbase | Hierarchical or semi-structured data, schema-flexible products | Joins are awkward; "schema-less" is rarely truly schema-less in practice |
| Key-Value / Wide-Column | Redis, DynamoDB, Cassandra | High-throughput lookups, sessions, caches, time-series-shaped writes | Range queries and secondary indexes are second-class citizens |
| Search | OpenSearch, Elasticsearch | Full-text and faceted search, log analytics | Not a system of record; index drift and reindexing costs add up |
| Vector | pgvector, Pinecone, Weaviate | Similarity search for embeddings, retrieval for AI features | Tuning is opaque; results depend heavily on embedding model and chunking |
"Most data systems are built on a small set of primitives - logs, indexes, replication, partitioning. Once you understand the primitives, the products are easier to compare." - paraphrase of Martin Kleppmann, Designing Data-Intensive Applications
Polyglot Persistence and Its Costs
Mixing database engines - for example, Postgres for transactional data, Redis for sessions, OpenSearch for search, and a vector store for retrieval - is increasingly common. This is sometimes framed as "polyglot persistence," and it lets each workload run on the engine that fits best.
The cost is rarely about the technology and almost always about the operational and consistency burden:
- Operations multiply: Every engine has its own backup, upgrade, monitoring, and security model.
- Consistency gets harder: Data that lives in two places will drift; teams need to decide which store is authoritative and how the others get updated (one common mitigation is sketched after this list).
- On-call gets harder: New engines mean new failure modes engineers have to learn.
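When a second engine is genuinely needed, one common mitigation is an outbox: the business write and an event row commit in the same Postgres transaction, and a separate worker propagates events to the secondary store. A minimal sketch via node-postgres, with the products and outbox tables and the topic name invented:

```ts
import { Client } from 'pg';

const db = new Client({ connectionString: process.env.DATABASE_URL });
await db.connect();

// Postgres is the system of record; the search index follows asynchronously.
async function renameProduct(id: string, name: string): Promise<void> {
  await db.query('BEGIN');
  try {
    await db.query('UPDATE products SET name = $2 WHERE id = $1', [id, name]);
    // Same transaction: the event is durable exactly when the write is.
    await db.query(
      "INSERT INTO outbox (topic, payload) VALUES ('product.updated', $1)",
      [JSON.stringify({ id, name })],
    );
    await db.query('COMMIT');
  } catch (err) {
    await db.query('ROLLBACK');
    throw err;
  }
}

// A separate worker polls `outbox`, pushes each row to OpenSearch, and
// deletes (or marks) the row once the index acknowledges the update.
```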
Most teams are best served by starting with a single relational store, and only adopting additional engines when the workload genuinely outgrows what a relational database can do well.
Caching and Performance
Where Caches Live
Caches sit at every layer of a modern stack. A read for a single product page might pass through, in order, a CDN edge cache, an API gateway cache, an application-level cache, and a database query cache before reaching disk. Each layer has different invalidation semantics and different blast radii when something goes wrong.
A useful mental map:
- CDN / edge cache: Cheapest reads, hardest to invalidate quickly. Best for content that is the same for many users.
- API gateway cache: Coarse-grained, request-shaped. Useful for expensive read endpoints with stable query parameters.
- Application cache: In-process or shared (Redis, Memcached). Holds derived values, computed views, session data.
- Database cache: Buffer pool and query cache inside the engine. Mostly automatic, but worth understanding when tuning.
Strategies and the Hard Part: Invalidation
For application-level caches, three strategies cover most situations:
| Strategy | How it works | When it fits |
|---|---|---|
| Read-through | Application reads from cache; on a miss, cache loads from the source and stores it | Read-heavy workloads where staleness within a short window is acceptable |
| Write-through | Writes go to the cache and the underlying store synchronously | When the cache must always reflect the latest write, at the cost of latency |
| Write-behind | Writes go to the cache first and are flushed to the store asynchronously | Very write-heavy workloads where a small risk of data loss is tolerable |
The famous remark that "there are only two hard things in computer science: cache invalidation and naming things" still holds. Two practical rules tend to work well:
- Prefer time-based expiration (TTLs) for anything that doesn't strictly need to be fresh. It makes failure modes self-healing.
- Use explicit invalidation sparingly, and keep it as close to the write path as possible so a stale read after a write is rare and short-lived.
When neither approach is enough, the right answer is usually to redesign the read pattern (denormalize, precompute, or move work to write time) rather than to add another cache layer.
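Putting the first rule together with the read-through row from the table, here is a sketch assuming node-redis v4; the Product shape and the database accessor are invented:

```ts
import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

interface Product {
  id: string;
  name: string;
}

// Stub standing in for the system of record.
async function loadProductFromDb(id: string): Promise<Product> {
  return { id, name: `Product ${id}` };
}

// Read-through with a TTL: a miss loads from the source and caches the
// result for 60 seconds, so stale entries expire without explicit invalidation.
async function getProduct(id: string): Promise<Product> {
  const key = `product:${id}`;
  const cached = await redis.get(key);
  if (cached !== null) return JSON.parse(cached) as Product;

  const fresh = await loadProductFromDb(id);
  await redis.set(key, JSON.stringify(fresh), { EX: 60 });
  return fresh;
}
```

The failure mode is bounded by construction: if an invalidation is missed, the worst case is a 60-second-stale read, not a permanently wrong one.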
Deployment Patterns
The Spectrum: Monolith to Serverless
Deployment patterns are best thought of as a spectrum rather than a set of competing camps. The same product can move along it as the team and the workload change.
| Pattern | What it is | Fits well |
|---|---|---|
| Monolith | One deployable unit; one database; one repo | Small teams, early products, anything where the boundaries are still moving |
| Modular monolith | One deployable, but internal modules with strict boundaries and isolated data | Teams outgrowing a single shared codebase but not ready for distributed ops |
| Microservices | Many independently deployable services, each owning its data | Larger orgs where team boundaries justify deployment boundaries |
| Serverless / functions | Stateless functions invoked on demand, billed per request, scaled by the platform | Spiky workloads, glue code, workloads where idle cost matters most |
"Almost all the successful microservice stories I've seen started with a monolith that got too big." - paraphrase of Martin Fowler, "MonolithFirst"
The honest framing is that microservices buy isolation and independent deployability at the cost of operational complexity. They are worth that cost when team size makes a single deployment a coordination problem, and rarely before. Serverless is a similar trade: it removes most operational concerns but introduces cold starts, vendor coupling, and a different mental model for state.
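For a feel of the serverless end of the spectrum, a stateless handler might look like the sketch below, assuming AWS Lambda behind API Gateway; the route and response shape are invented:

```ts
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

// Stateless by construction: anything worth keeping must live in a database
// or cache, because the platform may freeze or discard this instance.
export const handler = async (
  event: APIGatewayProxyEvent,
): Promise<APIGatewayProxyResult> => {
  const id = event.pathParameters?.id ?? 'unknown';
  return {
    statusCode: 200,
    body: JSON.stringify({ id, servedAt: new Date().toISOString() }),
  };
};
```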
Tying Back to Front-End
Front-end choices from Part 2 push specific deployment requirements:
- SSR and React Server Components want low-latency data access close to the rendering region. That favors regional deployments with read replicas, or edge runtimes with carefully scoped data fetching.
- Static generation and ISR (incremental static regeneration) push freshness concerns onto the build and revalidation pipeline rather than the request path. The back end's job becomes "be fast enough at build time and at the revalidation tick" (a minimal example follows below).
- Real-time interfaces (dashboards, chat, trading) require persistent connections, which sit awkwardly on classic serverless and tend to push back toward long-running processes or specialized platforms.
The pattern across all three is that the back-end deployment shape and the front-end rendering shape have to be designed together. A monolithic back end behind a globally distributed Next.js app will become the bottleneck; a microservices back end fronted by a single-region SPA will pay for complexity it never uses.
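To anchor the static-generation bullet above: in the Next.js App Router, ISR is expressed as a revalidation window on a route segment. A minimal sketch, with the page path and data accessor invented:

```tsx
// app/pricing/page.tsx (hypothetical): statically generated, then
// regenerated in the background at most once every 60 seconds.
export const revalidate = 60;

// Stub standing in for a real back-end or CMS call.
async function getPlans(): Promise<{ name: string }[]> {
  return [{ name: 'Starter' }, { name: 'Pro' }];
}

export default async function PricingPage() {
  const plans = await getPlans();
  return (
    <ul>
      {plans.map((plan) => (
        <li key={plan.name}>{plan.name}</li>
      ))}
    </ul>
  );
}
```

The back-end requirement hiding in those few lines is that getPlans must be fast and reliable at every revalidation tick, not just at the original build.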
A Word on Observability
Any back end that matters needs observability - structured logs, metrics, traces, and the ability to ask new questions about production after the fact. This series doesn't go deep on observability, but the rule of thumb worth carrying is that observability is a back-end concern that the front end depends on. Most user-visible incidents are diagnosed faster when the back end exposes good signal, and most front-end performance work eventually leads back to a slow back-end call.
Conclusion: Closing the Loop
Across this series, the same theme has repeated. Front-end work, covered in Part 2, decides how an application feels and how much JavaScript users are asked to download. Back-end work decides what is actually true after a request, how that truth is stored, and how it survives load and failure. Full-stack engineering is the discipline of designing both sides together so that one set of decisions doesn't quietly invalidate the other.
A few directions worth continuing to invest in beyond this series:
- Data systems: A deeper understanding of replication, partitioning, and consistency models pays back across every back-end choice.
- Distributed systems fundamentals: Once a system grows past a single process, the interesting failure modes are distributed-systems failure modes.
- AI integration: Retrieval, vector stores, and model-serving infrastructure are increasingly part of the back-end conversation.
- Operational maturity: Observability, on-call practices, and incident response shape what is actually buildable, not just what is theoretically possible.
Thanks for following along with the series.
Part 1: Full-Stack Engineering: Introduction
Part 2: Full-Stack Engineering: Front-End Development Focus