Shared Packages in Monorepos: The Problem

The Hidden Cost of Copy-Paste in Microservices

The 3 AM Production Bug

It’s 3:42 AM. My phone buzzes with a PagerDuty alert:

CRITICAL: Kafka message delivery failure - 87% error rate

I grab my laptop, still half-asleep, and SSH into our production cluster. After 15 minutes of investigation, I find the culprit: a bug in our Kafka event publishing logic. A simple off-by-one error in the retry mechanism.

“Easy fix,” I think. I write the patch in 5 minutes.

Then reality hits me.

This bug exists in five different microservices. Each service has its own copy of the Kafka client code, copy-pasted months ago when we were moving fast and “breaking things.”

My “5-minute fix” now requires applying the same one-line patch in five services: five branches, five pull requests, five code reviews, five CI runs, five deployments.

Total time: 2.5 days.

Time to actually fix the bug: 5 minutes.

This is the hidden tax of code duplication in microservices. And we’re all paying it.


How We Got Here

Six months ago, our team started building RadarKit, a distributed content monitoring platform. We were excited about microservices: independent deployments, smaller codebases, each team moving at its own pace.

We split our monolith into services:

radarkit/
└── services/
    ├── auth-service/
    ├── sources-service/
    ├── scraper-service/
    ├── alerts-service/
    └── notification-service/

Everything was great. Until it wasn’t.


The Copy-Paste Disease

It started innocently. Our first two services needed to log events. Simple, right?

// auth-service/src/utils/logger.ts
export class Logger {
  log(level: string, message: string, meta?: any) {
    console.log(JSON.stringify({
      timestamp: new Date().toISOString(),
      level,
      message,
      service: 'auth-service',
      ...meta
    }));
  }
}

Works perfectly. Then sources-service needed logging. Rather than setting up a shared package (which seemed like “premature optimization”), we just… copied the file.

// sources-service/src/utils/logger.ts
export class Logger {
  log(level: string, message: string, meta?: any) {
    console.log(JSON.stringify({
      timestamp: new Date().toISOString(),
      level,
      message,
      service: 'sources-service',  // Just changed the service name
      ...meta
    }));
  }
}

Two weeks later, we needed Kafka integration. Same pattern:

// auth-service/src/kafka/client.ts
export class KafkaClient {
  async publish(topic: string, message: any) {
    // 150 lines of Kafka connection, retry logic, error handling
  }
}

// sources-service/src/kafka/client.ts
export class KafkaClient {
  async publish(topic: string, message: any) {
    // Same 150 lines, copy-pasted
  }
}

// scraper-service/src/kafka/client.ts
export class KafkaClient {
  async publish(topic: string, message: any) {
    // You get the idea...
  }
}

By month three, we had five copies of the logger, five copies of the Kafka client, and a growing pile of duplicated helpers, each copy quietly drifting away from the others.


When Reality Caught Up

Problem 1: The Version Drift

Three months in, I noticed something odd. Each service was logging differently:

// auth-service logs:
{"timestamp":"2025-01-15T10:00:00Z","level":"info","message":"User logged in"}

// sources-service logs:
{"time":"2025-01-15T10:00:00Z","severity":"INFO","msg":"Source created"}

// scraper-service logs:
{"@timestamp":"2025-01-15T10:00:00Z","log.level":"info","event.original":"Scraping started"}

Why? Because each team had “improved” their copy independently. We now had three different logging formats, making centralized log analysis impossible.
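To get any aggregate view at all, our log pipeline ended up needing a shim. The sketch below is hypothetical (the normalizeLogLine helper is mine; the field names come from the examples above), but it captures the tax: every “improvement” to a copy meant another branch here.

// Hypothetical normalizer: three copies of "the same" logger, three shapes to reconcile
interface NormalizedLog {
  timestamp: string;
  level: string;
  message: string;
}

function normalizeLogLine(raw: string): NormalizedLog {
  const entry = JSON.parse(raw);

  if ('timestamp' in entry) {
    // auth-service format
    return { timestamp: entry.timestamp, level: entry.level, message: entry.message };
  }
  if ('time' in entry) {
    // sources-service format
    return { timestamp: entry.time, level: entry.severity.toLowerCase(), message: entry.msg };
  }
  // scraper-service format
  return {
    timestamp: entry['@timestamp'],
    level: entry['log.level'],
    message: entry['event.original']
  };
}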

Problem 2: The Bug Multiplication

Remember that 3 AM bug? Here’s what the original code looked like:

// Buggy retry logic (in 5 services)
async publishWithRetry(topic: string, message: any) {
  const maxRetries = 3;
  let attempt = 0;
  
  while (attempt < maxRetries) {  // 🐛 Bug: should be <=
    try {
      await this.kafka.send({ topic, messages: [message] });
      return;
    } catch (error) {
      attempt++;
      await this.sleep(1000 * attempt);
    }
  }
  
  throw new Error('Max retries exceeded');
}

The bug was subtle: since attempt starts at 0, attempt < maxRetries capped us at three sends total, the initial attempt plus only two retries, not the three retries the name promised.
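The patch really was five minutes of work, a one-character change to the comparison. Here’s the corrected method, same shape as the buggy version above:

// Fixed retry logic: <= gives the initial attempt plus maxRetries retries
async publishWithRetry(topic: string, message: any) {
  const maxRetries = 3;
  let attempt = 0;

  while (attempt <= maxRetries) {  // ✅ one initial attempt + 3 retries
    try {
      await this.kafka.send({ topic, messages: [message] });
      return;
    } catch (error) {
      attempt++;
      await this.sleep(1000 * attempt);  // same linear backoff as before
    }
  }

  throw new Error('Max retries exceeded');
}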

When I discovered this in auth-service, I had to search for the same pattern across all services:

$ git grep -n "attempt < maxRetries" services/
services/auth-service/src/kafka/client.ts:45:  while (attempt < maxRetries) {
services/sources-service/src/kafka/client.ts:52:  while (attempt < maxRetries) {
services/scraper-service/src/kafka/publisher.ts:38:  while (attempt < maxRetries) {
services/alerts-service/src/events/publisher.ts:41:  while (attempt < maxRetries) {
services/notification-service/src/kafka/producer.ts:29:  while (attempt < maxRetries) {

Five files. Five PRs. Five reviews. Five deploys.

Problem 3: The Feature Inconsistency

We needed to add distributed tracing. The conversation in Slack:

Me: “Adding trace IDs to Kafka events for distributed tracing”

Sarah (Backend Dev): “Oh, I already did that in sources-service last week!”

Me: “…it’s not in auth-service”

Tom (Backend Dev): “I added it to scraper-service yesterday, different implementation though”

We had three different tracing implementations, none compatible with each other. Our tracing tool showed broken traces because services used different correlation ID formats.
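To give a flavor of the incompatibility (this is illustrative, not our actual code; the kafkajs import and function names are assumptions): one copy attached the trace ID as a message header, another buried it in the payload under a different key.

import { Producer } from 'kafkajs';

// Illustrative only: two conventions for the "same" feature.

// One service attached the trace ID as a Kafka message header...
async function publishWithHeaderTrace(producer: Producer, topic: string, event: object, traceId: string) {
  await producer.send({
    topic,
    messages: [{ value: JSON.stringify(event), headers: { 'x-trace-id': traceId } }]
  });
}

// ...another embedded it in the payload under a different name.
async function publishWithEmbeddedTrace(producer: Producer, topic: string, event: object, traceId: string) {
  await producer.send({
    topic,
    messages: [{ value: JSON.stringify({ ...event, correlationId: traceId }) }]
  });
}

A trace that starts in one convention and continues in the other simply shows up broken in the tracing UI.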

Problem 4: The Onboarding Nightmare

New developer joins the team:

New Dev: “Where’s the standard way to publish Kafka events?”

Me: “Well… check auth-service, but actually sources-service has a better implementation, oh wait, scraper-service has the latest retry logic…”

New Dev: “So… there’s no standard?”

Me: “…not exactly.”


The Real Cost

Let’s put numbers on this:

Time Waste

Over 3 months, keeping five copies of the same code in sync (duplicate fixes, duplicate reviews, duplicate deploys) added up fast.

Total: 100 hours = 2.5 weeks of engineering time

Bug Multiplication

When we audited our codebase, the pattern repeated everywhere: the same bug fixed in one copy was still alive in the others.

Technical Debt

# Kafka client implementations, lines of code per service (counted with cloc)
Auth Service:     312 lines
Sources Service:  298 lines
Scraper Service:  334 lines
Alerts Service:   301 lines
Notification:     289 lines
-----------------------------------
Total:          1,534 lines

# 80% similar code = ~1,200 lines of pure duplication

1,200 lines of code that should have been 300 lines in one place.


The Copy-Paste Cascade

The worst part? It gets worse over time:

Month 1: "Let's just copy this logger, it's only 50 lines"
           ↓
Month 2: "Let's just copy the Kafka client, it's already copied"
           ↓
Month 3: "Let's just copy the database helpers, everything else is copied"
           ↓
Month 4: "Let's just copy..." 

It becomes the norm. Each copy makes the next copy feel justified. The technical debt compounds.


The Breaking Point

The final straw came during a security audit. We discovered a critical vulnerability in our JWT token validation:

// Vulnerable code (in 3 services)
export const verifyToken = (token: string) => {
  return jwt.verify(token, process.env.JWT_SECRET);
  // 🚨 No algorithms allow-list: the accepted algorithms aren't pinned, leaving room for algorithm-confusion attacks
}

The fix was simple:

export const verifyToken = (token: string) => {
  return jwt.verify(token, process.env.JWT_SECRET, {
    algorithms: ['HS256']  // Explicitly specify allowed algorithms
  });
}
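Wherever this fix lands, it’s worth pinning with a regression test. A minimal sketch, assuming the jsonwebtoken library and a Jest-style runner (the import path and test names are placeholders):

import jwt from 'jsonwebtoken';
import { verifyToken } from './verifyToken';  // hypothetical module path

describe('verifyToken', () => {
  const secret = 'test-secret';

  beforeAll(() => {
    process.env.JWT_SECRET = secret;
  });

  it('accepts tokens signed with HS256', () => {
    const token = jwt.sign({ sub: 'user-1' }, secret, { algorithm: 'HS256' });
    expect(verifyToken(token)).toMatchObject({ sub: 'user-1' });
  });

  it('rejects tokens signed with a different algorithm', () => {
    const token = jwt.sign({ sub: 'user-1' }, secret, { algorithm: 'HS512' });
    expect(() => verifyToken(token)).toThrow(/invalid algorithm/);
  });
});

One fix, one test, one deploy. If only.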

But we had to:

  1. Fix it in auth-service
  2. Fix it in sources-service
  3. Fix it in alerts-service
  4. Deploy all three within a tight window
  5. Hope we didn’t miss it in another service

A one-line security fix became a multi-day, high-stakes deployment.

That’s when we knew we had to change something fundamental.


The Hidden Questions

After this incident, I started asking myself: Why does a one-line fix cost us days? How do companies with hundreds of services cope? Is there a way to share code without gluing our services back into a monolith?

The answer surprised me. Yes, there is a better way.

Companies like Google don’t have this problem because they use shared packages. Not shared libraries that couple services together. Not going back to a monolith. But properly designed, versioned, shared packages in a monorepo.
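As a preview of where this series is going, here’s the shape of it, purely as a sketch (the packages/ layout and the @radarkit/* names are placeholders, not our final setup): the utilities move into shared packages, and every service imports the same code instead of carrying its own copy.

radarkit/
├── packages/
│   ├── logger/          # one Logger, written once, versioned
│   └── kafka-client/    # one KafkaClient, one retry policy
└── services/
    ├── auth-service/
    ├── sources-service/
    ├── scraper-service/
    ├── alerts-service/
    └── notification-service/

// inside any service, e.g. auth-service (hypothetical package names)
import { Logger } from '@radarkit/logger';
import { KafkaClient } from '@radarkit/kafka-client';

Fix the retry logic once in packages/kafka-client, and every service picks it up on its next release.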


What’s Coming Next

In the next post, I’ll show you how we actually made that move: how we structured the monorepo, how we version the shared packages, and how to build shared packages that don’t suck.

We went from five drifting copies of our logging, Kafka, and auth code to a single source of truth for each.

All while keeping our microservices completely independent in production.


Your Turn

Have you experienced the copy-paste cascade in your microservices? How many copies of your logging code exist right now?

Count them:

# Try this in your codebase
git grep -l "class Logger" services/

Drop the number in the comments. I bet it’s more than you think.

Next post: “The Solution - How to Build Shared Packages That Don’t Suck”


This is Part 1 of a 3-part series on building maintainable microservices. I’m María, Senior Backend Developer building RadarKit, a distributed content monitoring platform. Follow along as we solve real problems with real solutions.

Series:

Part 1: The Problem (this post)
Part 2: The Solution - How to Build Shared Packages That Don’t Suck (coming next)
Part 3: Coming soon


Found this helpful? Follow me on [Twitter/LinkedIn] for more posts on distributed systems, microservices, and lessons learned building production systems.