I Stopped Hitting Claude's Usage Limits — 10 Things I Changed

April 6, 2025

Most people blame Claude for strict limits. I blamed Claude too.

Then I realized something that changed everything: Claude doesn't count the number of messages. It counts tokens. The trick is to use tokens wisely, but many people don't know how and end up wasting a ton of tokens and money as a result.

I got really into this and put together a list of the best habits that will save you a fortune. Let's get into it.

1. Edit Your Prompt — Don't Send a Follow-Up

When Claude doesn't get your thoughts right, you might feel tempted to send:

  • "No, I meant [your message]"
  • "Ugh, that's not what I wanted [your message]"

Don't do that!

Every subsequent message is added to the conversation history. Claude re-reads ALL of it every turn — burning tokens on context that didn't even help.

Here's the brutal math:

Token cost per message = all previous messages + your new one.
Total = S × N(N+1) / 2, where S = average tokens per exchange and N = message count.

At ~500 tokens per exchange, the costs explode fast:

Messages   Total Tokens Burned
5          7,500
10         27,500
20         105,000
30         232,500

Message 30 costs 30x as much as message 1. Let that sink in.
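If you want to check the table yourself, here's a minimal Python sketch of the formula. It assumes every exchange is the same size (500 tokens by default), which real chats never are, and the function names are mine, not anything from Claude.

```python
def tokens_for_message(n: int, tokens_per_exchange: int = 500) -> int:
    """Tokens processed on turn n: the n-1 earlier exchanges plus the new one."""
    return n * tokens_per_exchange


def total_tokens(message_count: int, tokens_per_exchange: int = 500) -> int:
    """Cumulative tokens after message_count turns: S * N * (N + 1) / 2."""
    return tokens_per_exchange * message_count * (message_count + 1) // 2


# Reproduce the rows of the table above.
for n in (5, 10, 20, 30):
    print(f"{n:>3} messages: {total_tokens(n):,} tokens")
```

Swap in your own average exchange size to see how quickly your chats grow.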

Instead: click Edit on your original message → fix it → regenerate. The old exchange gets replaced, not stacked.

Fix the prompt, don't feed the history.

2. Start a Fresh Chat Every 15–20 Messages

In the previous section, I showed how token costs grow with every message. Ideally, you should start a new chat every 15–20 messages.

Now imagine a chat with 100+ messages. At ~500 tokens per exchange, that's over 2.5 million tokens burned — most of it just re-reading old history.

One developer tracked his usage and found that 98.5% of tokens were spent re-reading the history. Only 1.5% went toward actually outputting the result. That's insane.

When a chat gets long → ask Claude to summarize everything → copy it → new chat → paste as first message.
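To get a feel for the savings, here's a rough Python sketch comparing one 100-message chat with five 20-message chats. The 800-token summary size is my own assumption, and the sketch ignores the one extra turn it takes to ask for each summary.

```python
def total_tokens(message_count: int, tokens_per_exchange: int = 500,
                 carried_over: int = 0) -> int:
    """Cumulative tokens for one chat.

    carried_over is a hypothetical summary pasted as the first message.
    It is re-read on every turn, so it adds carried_over tokens per message.
    """
    history = tokens_per_exchange * message_count * (message_count + 1) // 2
    return history + carried_over * message_count


one_long_chat = total_tokens(100)

# Five chats of 20 messages; chats 2 to 5 each start from an assumed 800-token summary.
split_chats = total_tokens(20) + 4 * total_tokens(20, carried_over=800)

print(f"one 100-message chat:  {one_long_chat:,} tokens")
print(f"five 20-message chats: {split_chats:,} tokens")
```

Under these assumptions the single long chat processes about 2.5 million tokens, while the five shorter chats stay under 0.6 million, even with the summary re-read on every turn.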

This one habit alone can save you more tokens than everything else on this list combined.

3. Batch Your Questions Into One Message

Many people believe that splitting questions into separate messages leads to better results. Almost always, the opposite is true.

  • Three separate prompts = three context loads
  • One prompt with three tasks = one context load

You win twice: fewer context reloads burn fewer tokens, and you stay further from hitting your limit.

Instead of sending three messages:

  1. "Summarize this article"
  2. "Now list the main points"
  3. "Now suggest a headline"

Write one message: "Summarize this article, list the main points, and suggest a headline."
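Here's a small Python sketch of why batching wins. The token counts (a 3,000-token article, 300 tokens per question and answer) are made-up illustration values, and the model is simplified: every turn re-reads the article plus all earlier exchanges.

```python
def separate_prompts(article_tokens: int, exchange_tokens: list[int]) -> int:
    """Total tokens when each task is sent as its own message."""
    total = 0
    history = article_tokens
    for exchange in exchange_tokens:
        history += exchange   # the new question and its answer join the history
        total += history      # the whole history is processed on this turn
    return total


def batched_prompt(article_tokens: int, exchange_tokens: list[int]) -> int:
    """Total tokens when all tasks go into one message: one context load."""
    return article_tokens + sum(exchange_tokens)


# Hypothetical sizes: summary, main points, headline.
tasks = [300, 300, 300]
print("three messages:", separate_prompts(3000, tasks))
print("one message:   ", batched_prompt(3000, tasks))
```

With these numbers, three separate messages process 10,800 tokens, while the single batched message processes 3,900.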

Bonus: the answers often turn out better because Claude immediately sees the full picture.

Three questions. One prompt. Always!

4. Upload Recurring Files to Projects

If you upload the same PDF to multiple chats, Claude re-tokenizes that document every single time.

Use the Projects feature instead. Upload your file once → it gets cached. Every new conversation inside that project references it without burning tokens again.

Cached project content doesn't eat into your usage when you access it repeatedly.

If you work with contracts, briefs, style guides, or any long docs — this alone could cut your token spend dramatically.

5. Set Up Memory & User Preferences

Every new chat without saved context wastes 3–5 messages on setup: "I'm a marketer, I write in a casual style, I prefer short paragraphs…"

You've probably seen people start every prompt with "Act as a..." — that's tokens burned on repeat. Claude can remember this permanently.

Go to Settings → Memory and User Settings. Save your role, communication style, and preferences once. Claude will automatically apply them to every new chat.

No more wasted setup messages. No more "Act as a..." preambles.

6. Turn Off Features You're Not Actively Using

Web search, connectors, and "Explore" mode — all of these add tokens to every response, even if you don't need them.

Writing your own content? Turn off the "Search and Tools" feature.

The Extended Thinking feature also consumes tokens. Keep it off by default, and turn it on only if your first attempt was unsatisfactory.

Rule: if you didn't turn a feature on intentionally, turn it off.

7. Use Haiku for Simple Tasks

Grammar checking, brainstorming, formatting, quick translations, short answers — Haiku handles all of this at a much lower cost than Sonnet or Opus.

Choosing the right model is the most important decision you make every day.

Model    Best For                           Cost Level
Haiku    Quick tasks, drafts, formatting    Low
Sonnet   Real work, analysis, coding        Medium
Opus     Deep thinking, complex reasoning   High

Using Haiku for drafts and simple tasks frees up 50–70% of your budget for tasks that truly require powerful models.

You don't need a sports car to get groceries. Don't use Opus for simple tasks.

8. Spread Your Work Across the Day

Claude uses a rolling 5-hour window. It does not reset at midnight. Instead, your usage gradually rolls off: messages sent at 9 a.m. no longer count by 2 p.m.

If you use up your entire limit during a single morning session, most of your daily capacity will remain unused.

Divide your day into 2–3 sessions: morning, afternoon, and evening. By the time you return, your previous usage has rolled off, and you have a fresh limit.

Think of it like a refilling tank, not a daily quota.
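The refilling-tank idea can be sketched in a few lines of Python. This is a toy model of the rolling window as described above, not Anthropic's actual accounting; the event list and token counts are invented for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)  # window length taken from the article


def usage_in_window(events, now):
    """Add up tokens from events sent less than five hours before `now`."""
    return sum(
        tokens
        for sent_at, tokens in events
        if timedelta(0) <= now - sent_at < WINDOW
    )


# Hypothetical day: a heavy morning session and a lighter one after lunch.
events = [
    (datetime(2026, 4, 1, 9, 0), 40_000),
    (datetime(2026, 4, 1, 13, 0), 10_000),
]

for now in (datetime(2026, 4, 1, 13, 30), datetime(2026, 4, 1, 14, 0)):
    print(now.strftime("%H:%M"), usage_in_window(events, now))
```

At 13:30 both sessions still count (50,000 tokens). By 14:00 the 9 a.m. session has rolled off and only the 10,000 tokens from after lunch remain.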

9. Work During Off-Peak Hours

Since March 26, 2026, Anthropic counts usage against your 5-hour session limit more heavily during peak hours:

  • 5:00 AM – 11:00 AM Pacific Time (8:00 AM – 2:00 PM Eastern) on weekdays

Same query, same chat — but during peak hours, it impacts your limit more.

Your weekly limit remains the same. But how it's distributed has changed. Running resource-intensive tasks in the evening or on weekends will significantly stretch your plan.

If you're outside the U.S. (Europe, Latin America, or Asia), peak hours may fall in your afternoon or evening, so check the time zone math.
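If you'd rather not do the time zone math by hand, here's a small Python sketch using the standard-library zoneinfo module. The peak window comes from the times quoted above; treating "weekday" as a weekday in Pacific Time is my assumption, and the function names are hypothetical.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")
PEAK_START = time(5, 0)   # 5:00 AM Pacific, from the article
PEAK_END = time(11, 0)    # 11:00 AM Pacific, from the article


def is_peak(moment: datetime) -> bool:
    """True if a timezone-aware datetime falls in the weekday peak window."""
    pacific = moment.astimezone(PACIFIC)
    is_weekday = pacific.weekday() < 5  # Monday=0 ... Friday=4 (assumed: Pacific weekday)
    return is_weekday and PEAK_START <= pacific.time() < PEAK_END


def peak_window_in(zone_name: str, year: int, month: int, day: int):
    """Peak window for one Pacific calendar date, expressed in another zone."""
    zone = ZoneInfo(zone_name)
    start = datetime(year, month, day, 5, 0, tzinfo=PACIFIC)
    end = datetime(year, month, day, 11, 0, tzinfo=PACIFIC)
    return start.astimezone(zone), end.astimezone(zone)


start, end = peak_window_in("Europe/Berlin", 2026, 4, 1)
print(f"Peak hours in Berlin: {start:%H:%M} to {end:%H:%M}")
```

Replace "Europe/Berlin" with your own IANA zone name. Because the conversion goes through Pacific Time, daylight saving shifts on either side are handled for you.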

10. Enable Extra Usage as a Safety Net

Subscribers on the Pro, Max 5x, and Max 20x plans can enable the "Extra usage" feature in Settings → Usage.

When your session limit is reached, Claude won't block your access. It will switch to pay-as-you-go billing at API rates.

You set a monthly spending limit to avoid unexpected bills.

This isn't about saving tokens — it's about not losing your work at the worst possible moment.

The Bottom Line

At first, it will feel hard to follow all these rules. But once they become automatic habits, you'll almost never hit your limits.

You might even switch from the Max plan to a regular one — you'll have plenty of tokens.

Here's the takeaway you need to remember:

Claude doesn't count messages. It counts tokens. Use them wisely, and you'll never run out.
