Why Your Rails Error Logs Are Full of Noise
Every public-facing Rails application is under constant reconnaissance. Automated bot scanners continuously probe for exposed credentials, configuration files, and common CMS admin panels — often attempting thousands of requests per day for paths like /.env, /wp-admin, /aws/credentials, and /.git/config.
These requests don't simply bounce off your application with a harmless 404. They penetrate deep into the Rails stack, triggering a cascade of unnecessary operations:
- CSRF token verification failures when bots attempt to POST to admin panels that don't exist
- Routing exceptions as Rails tries to match /xmlrpc.php or /phpmyadmin/ against your route definitions
- Database queries executed by before_action filters in your ApplicationController (checking current user, loading site configuration, etc.) before Rails finally determines the route doesn't exist
- ActionController instrumentation overhead for every single bot request that reaches your controller layer
The result? Error logs flooded with exceptions for paths your application never intended to serve, obscuring genuine errors that need investigation. Your monitoring tools count these as application errors, skewing your metrics and potentially triggering false alerts.
Key insight: By the time Rails returns a 404 for /.env.backup, your application has already spent milliseconds executing filters, checking sessions, and querying databases — wasted cycles repeated thousands of times daily across bot traffic.
Why Rack Middleware Is the Right Layer for This
Rack sits between your web server (Nginx, Puma) and your Rails application, processing every HTTP request before it touches ActionDispatch, the router, or any controller code. This makes it the ideal interception point for bot scanner traffic.
When a bot requests /wp-admin/install.php, rejecting it at the Rack layer means you avoid:
- Router pattern matching across dozens or hundreds of routes
- ActionDispatch middleware chain (CSRF protection, session loading, parameter parsing)
- Controller instantiation and filter execution
- Database connection checkout
- ActiveRecord queries triggered by before_action callbacks
- View rendering and response serialisation
The request never enters your application logic. Compare this to alternatives:
Nginx rules block traffic earlier but require deployment-specific configuration, regexp expertise, and lack access to Rails conventions (you can't easily match "any path ending with a list of 40 config filename variants"). Changes require web server reloads rather than application deploys.
Controller filters run too late — the request has already consumed resources traversing the middleware stack and matching routes. CSRF failures still generate exceptions and error reports.
External WAFs (Cloudflare, AWS WAF) add latency, monthly costs, and complexity. They excel at volumetric attacks but are overkill for simple pattern matching against known bot signatures.
Rack middleware gives you Rails-native pattern matching with zero application overhead for rejected requests.
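Registering a middleware at the front of the stack is a one-line configuration change. A sketch — the `MyApp` module name is a placeholder, and `BotPathBlocker` refers to the class built later in this article:

```ruby
# config/application.rb (module name is a placeholder for your app)
module MyApp
  class Application < Rails::Application
    # Insert at position 0 so bot paths are rejected before any other
    # middleware (sessions, CSRF protection, etc.) ever runs.
    config.middleware.insert_before 0, BotPathBlocker
  end
end
```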
Designing the Middleware: What to Block and How
Effective bot blocking requires multiple detection layers — a single matching strategy will miss probe variants. The middleware uses three complementary techniques:
Extension matching catches technology-specific probes like .php, .asp, or .jsp files. Scanners blindly test for common web platforms, and these extensions have no legitimate place in a Rails application.
Dot-directory and dot-file detection blocks attempts to access hidden configuration directories (.git, .aws, .docker) and sensitive files (.env, credentials). This protects accidentally exposed repositories or credential files.
Filename prefix variants handle obfuscated naming patterns. Scanners don't just probe for .env — they also test .env.backup, .env-production, .env_local, and .env-script.js. The dash-separated variant is particularly insidious: requests for .env-script.js will reach your Rails router as JavaScript format, bypass CSRF protection, and trigger unnecessary database queries before returning 404.
```ruby
def bot_filename?(path)
  basename = path.split("/").last.to_s
  BOT_FILENAMES.any? { |name|
    basename == name || basename.start_with?("#{name}.")
  } ||
    basename.start_with?(".env-") ||
    basename.start_with?(".env_") ||
    basename.start_with?("ftpsync")
end
```
The middleware returns 403 Forbidden immediately when any pattern matches — before CSRF tokens are checked, database connections are established, or any Rails code executes.
For maximum protection, use Rack middleware as your primary defence and layer Nginx rules or CDN firewall rules upstream for defence in depth.
The Implementation: A Complete, Battle-Tested Middleware
Here's the complete middleware with inline commentary explaining each detection strategy:
```ruby
class BotPathBlocker
  # Extensions commonly probed by PHP/ASP exploit scanners
  BOT_EXTENSIONS = %w[.php .asp .aspx .cgi .jsp .py .pl].freeze

  # Hidden directories scanners enumerate for credentials
  BOT_DOTDIRS = %w[.git .svn .aws .ssh .docker .kube].freeze

  # Config files with common naming variants
  BOT_FILENAMES = %w[.env .htaccess credentials wp-config].freeze

  # Exact paths frequently targeted (WordPress, CMS admin panels)
  BOT_PATHS = %w[/wp-admin /wp-login.php /admin /phpmyadmin].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    return block_request if bot_path?(env["PATH_INFO"].to_s)

    @app.call(env)
  end

  private

  def bot_path?(path)
    # Extension check: catches exploit scanners probing PHP/ASP endpoints
    return true if BOT_EXTENSIONS.any? { |ext| path.end_with?(ext) }

    # Dot-directory check: detects enumeration of version control and config dirs
    return true if path.split("/").any? { |segment| BOT_DOTDIRS.any? { |dir| segment.start_with?(dir) } }

    # Exact path check: blocks common CMS admin panel probes
    return true if BOT_PATHS.include?(path)

    # Filename check with prefix variants: handles .env, .env.backup, .env-production, .env_local
    bot_filename?(path)
  end

  def bot_filename?(path)
    basename = path.split("/").last.to_s
    BOT_FILENAMES.any? do |name|
      basename == name ||
        basename.start_with?("#{name}.") ||
        basename.start_with?("#{name}-") ||
        basename.start_with?("#{name}_")
    end
  end

  # Return a minimal 403 with no body to avoid leaking server information
  def block_request
    [403, { "Content-Type" => "text/plain" }, []]
  end
end
```
The key insight: dash and underscore prefixes (-, _) were edge cases discovered in production. Scanners probe .env-production.js and .config_backup.php to bypass naive matchers checking only dot separators.
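Because the middleware depends only on the Rack env hash, you can exercise it without booting a server or Rails at all. A sketch using a trimmed-down copy of the blocker (extension and .env checks only) driven against a stub Rack app:

```ruby
# Trimmed-down copy of the blocker above (extension + .env checks only),
# driven against a stub Rack app — no server or Rails boot required.
class TinyBotBlocker
  BOT_EXTENSIONS = %w[.php .asp .aspx .cgi .jsp].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    path = env["PATH_INFO"].to_s
    basename = path.split("/").last.to_s
    if BOT_EXTENSIONS.any? { |ext| path.end_with?(ext) } || basename.start_with?(".env")
      [403, { "Content-Type" => "text/plain" }, []]
    else
      @app.call(env)
    end
  end
end

app = TinyBotBlocker.new(->(_env) { [200, {}, ["OK"]] })

app.call("PATH_INFO" => "/wp-login.php").first # => 403
app.call("PATH_INFO" => "/.env").first         # => 403
app.call("PATH_INFO" => "/dashboard").first    # => 200
```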
Edge Cases Discovered in Production
While the initial middleware implementation blocked obvious patterns, production log analysis revealed sophisticated scanner behaviour that bypassed naive matchers. The most common gap appeared in config file naming conventions — requests for wp-config-sample.php and .env-production.js sailed through the middleware because they used dashes or underscores between segments rather than dots.
The original bot_filename? method only checked for dot-separated variants:
```ruby
def bot_filename?(path)
  basename = path.split("/").last.to_s
  BOT_FILENAMES.any? { |n| basename == n || basename.start_with?("#{n}.") }
end
```
This caught .env.backup but missed .env-backup, .env_production, and config-script.js. Weekly log reviews showed these variants generating CSRF exceptions as they reached Rails controllers, triggering database queries and filter chains before ultimately failing.
The hardened version explicitly handles multiple separator patterns:
```ruby
def bot_filename?(path)
  basename = path.split("/").last.to_s
  BOT_FILENAMES.any? { |n| basename == n || basename.start_with?("#{n}.") } ||
    basename.start_with?(".env-") ||
    basename.start_with?(".env_") ||
    basename.start_with?("ftpsync")
end
```
Key lesson: Extension-only matching is insufficient. Bot scanners actively probe separator variants (., -, _) to evade basic pattern filters.
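The hardened matcher can be sanity-checked in isolation by copying the constant and method into a plain Ruby script:

```ruby
BOT_FILENAMES = %w[.env .htaccess credentials wp-config].freeze

def bot_filename?(path)
  basename = path.split("/").last.to_s
  BOT_FILENAMES.any? { |n| basename == n || basename.start_with?("#{n}.") } ||
    basename.start_with?(".env-") ||
    basename.start_with?(".env_") ||
    basename.start_with?("ftpsync")
end

bot_filename?("/.env.backup")      # => true  (dot separator)
bot_filename?("/.env-production")  # => true  (dash separator)
bot_filename?("/.env_local")       # => true  (underscore separator)
bot_filename?("/files/report.pdf") # => false (legitimate path)
```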
Logging and Observability: Knowing What You're Blocking
Visibility into blocked traffic is essential — not just to confirm your middleware is working, but to spot emerging scanner patterns that aren't yet on your blocklist. Structured logging allows you to aggregate blocked requests without drowning application logs in noise.
Add logging directly to your middleware using Rails' tagged logging:
```ruby
class BotPathBlocker
  def call(env)
    path = env["PATH_INFO"]
    if bot_path?(path)
      Rails.logger.info(
        "[BotBlocker] Blocked: #{env['REQUEST_METHOD']} #{path} " \
        "from #{env['HTTP_X_FORWARDED_FOR'] || env['REMOTE_ADDR']}"
      )
      return [403, {}, ["Forbidden"]]
    end
    @app.call(env)
  end
end
```
For production environments, consider incrementing metrics instead of (or alongside) log entries. A simple counter with path labels helps identify which patterns trigger most frequently:
```ruby
BLOCKED_COUNTER = Prometheus::Client::Counter.new(
  :blocked_bot_requests_total,
  docstring: 'Blocked bot scanner requests',
  labels: [:pattern_type, :extension]
)
```
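If a metrics client isn't wired up yet, a minimal thread-safe in-process tally can answer the same question during development — a stand-in sketch, not a production metrics system:

```ruby
# Thread-safe tally of blocked paths, keyed by file extension.
# A development stand-in for a real client such as prometheus-client.
class BlockTally
  def initialize
    @mutex = Mutex.new
    @counts = Hash.new(0)
  end

  def record(path)
    ext = File.extname(path)
    key = ext.empty? ? "(none)" : ext # dotfiles like /.env have no extname
    @mutex.synchronize { @counts[key] += 1 }
  end

  def snapshot
    @mutex.synchronize { @counts.dup }
  end
end

tally = BlockTally.new
tally.record("/wp-login.php")
tally.record("/xmlrpc.php")
tally.record("/.env")
tally.snapshot # => {".php"=>2, "(none)"=>1}
```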
Review blocked request logs weekly to identify new scanner patterns. If you see repeated requests for .envs or .aws-credentials, add them to your blocklist before they multiply.
Keep blocked request logs in a separate namespace (e.g., tagged with [BotBlocker]) so they can be filtered out of standard application monitoring without losing visibility altogether.
Testing the Middleware
Testing your middleware thoroughly prevents both security gaps and false positives that might block legitimate application routes. Start with unit tests that verify the middleware in isolation from Rails:
```ruby
RSpec.describe BotPathBlocker do
  let(:app) { ->(env) { [200, {}, ["OK"]] } }
  let(:middleware) { described_class.new(app) }

  def call_with_path(path)
    middleware.call({ "PATH_INFO" => path })
  end

  it "blocks .env file requests" do
    status, = call_with_path("/.env")
    expect(status).to eq(403)
  end

  it "blocks dash-separated config variants" do
    status, = call_with_path("/.env-production")
    expect(status).to eq(403)
  end

  it "allows legitimate application paths" do
    status, = call_with_path("/developers/calculator")
    expect(status).to eq(200)
  end
end
```
Integration tests verify that blocked requests never reach your Rails router. Mount a test route that increments a counter, then confirm bot paths never trigger it:
```ruby
it "prevents blocked paths from reaching Rails routing" do
  counter = 0
  Rails.application.routes.draw do
    get "/*path", to: ->(_) { counter += 1; [200, {}, []] }
  end

  get "/.aws/credentials"

  expect(response.status).to eq(403)
  expect(counter).to eq(0)
end
```
Critical: Test your blocklist against every route in rails routes. Patterns like /api/config or /files/.hidden might accidentally match bot detection rules, creating false positives in production.
Deployment Considerations and Complementary Measures
Roll out bot-blocking middleware using a staged approach to avoid false positives. Start in monitoring mode by logging matched paths rather than blocking them:
```ruby
def call(env)
  path = env["PATH_INFO"].to_s
  if bot_path?(path)
    Rails.logger.info("[BotBlocker] Would block: #{path}")
    # return [403, {}, ["Forbidden"]] # Commented out during monitoring
  end
  @app.call(env)
end
```
Review logs for a week, add legitimate patterns to an allowlist, then enable blocking.
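The allowlist check belongs at the top of bot_path? so it always wins. A self-contained sketch — the allowed path here is a hypothetical example, not a recommendation:

```ruby
ALLOWED_PATHS = %w[/legacy/export.php].freeze # hypothetical legitimate .php endpoint
BOT_EXTENSIONS = %w[.php .asp .jsp].freeze

def bot_path?(path)
  # The allowlist short-circuits every other check
  return false if ALLOWED_PATHS.include?(path)

  BOT_EXTENSIONS.any? { |ext| path.end_with?(ext) }
end

bot_path?("/legacy/export.php") # => false (allowlisted)
bot_path?("/xmlrpc.php")        # => true
```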
Defence in depth: Rack middleware is your first line of defence, not your only one.
Combine it with upstream rules for maximum efficiency. Configure Nginx to reject bot patterns before they reach your application server:
```nginx
location ~ /\.(env|git|aws) {
    return 403;
}

location ~ \.(php|asp|jsp)$ {
    return 403;
}
```
At the CDN level (Cloudflare, Fastly), create firewall rules using threat scores or known bot signatures. This blocks attacks at the edge, reducing bandwidth costs.
Add rate limiting as a complementary layer using Rack::Attack — middleware alone won't stop volumetric attacks from rotating IPs. For persistent offenders, implement IP-based blocking with temporary bans (e.g., 24-hour blocks after 100 rejected requests).
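With Rack::Attack, that ban policy might look like the following initializer. A sketch: the one-hour strike window (findtime) is an assumption, and the path patterns are a small subset of the middleware's lists:

```ruby
# config/initializers/rack_attack.rb
# Ban an IP for 24 hours once it accumulates 100 bot-path hits within an
# hour (the one-hour findtime window is an assumption, not from the text).
Rack::Attack.blocklist("ban persistent scanners") do |req|
  Rack::Attack::Allow2Ban.filter(req.ip, maxretry: 100, findtime: 3600, bantime: 86_400) do
    # Count a strike whenever the request matches a known bot pattern
    req.path.end_with?(".php") || req.path.include?("/.env")
  end
end
```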
Monitor your middleware's effectiveness by tracking 403 responses in your analytics pipeline. If you see a pattern bypass detection, extend your matcher patterns immediately.