A simple Nanoc filter to compress HTML using Nokogiri
When using Nanoc, there are standard filters for compressing CSS and minifying JavaScript, but there’s no equivalent for HTML. However, we can use Nokogiri to trim our HTML a little by removing some whitespace and stripping out comments.
To do this, we’re going to build a custom Nanoc filter. Create a new file in lib/filters and call it html_compress.rb.
# encoding: utf-8
require 'nokogiri'

class HTMLCompressFilter < Nanoc::Filter
  identifier :html_compress
  type :text

  def run(content, params = {})
    doc = Nokogiri::HTML(content)

    # Find comments.
    doc.xpath("//comment()").each do |comment|
      # Check it's not a conditional comment.
      if comment.content !~ /\A(\[if|<!\[endif)/
        comment.remove
      end
    end

    doc.to_html
  end
end
Simply parsing the HTML through Nokogiri removes excess whitespace in the <head> section with no extra steps necessary. This filter could be extended to remove whitespace from the <body> section too, but be careful with the content of any <pre> tags, for example.
The other reduction uses an XPath selector to find every comment and removes most of them. The only ones left are conditional comments, which are often used to include or exclude elements based on the Internet Explorer version. The following example is generated by Compass to include the ie.css stylesheet for Internet Explorer only:
<!--[if IE]>
<link href="/stylesheets/ie.css" media="screen, projection" rel="stylesheet" type="text/css" />
<![endif]-->
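To see why a comment like this survives, it helps to test the filter’s pattern on its own. Nokogiri hands back a comment’s text without the surrounding <!-- and --> delimiters, so the pattern is anchored to the start of that inner text. The sketch below uses the same pattern as the filter; the helper name conditional_comment? is made up for illustration.

```ruby
# The same pattern the filter uses to recognise conditional comments.
CONDITIONAL = /\A(\[if|<!\[endif)/

# Hypothetical helper: true when the comment text (without its
# <!-- and --> delimiters) starts like a conditional comment.
def conditional_comment?(text)
  CONDITIONAL.match?(text)
end

samples = [
  "[if IE]>\n<link href=\"/stylesheets/ie.css\" rel=\"stylesheet\" />\n<![endif]",
  "<![endif]",
  " an ordinary comment "
]

samples.each do |text|
  verdict = conditional_comment?(text) ? "kept" : "removed"
  puts "#{verdict}: #{text.inspect}"
end
```

Because of the \A anchor, a comment with whitespace before [if is treated as an ordinary comment and removed.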
Apply the filter as normal by placing filter :html_compress in the compile sections of your Rules file where you want it to apply.
As this filter works on the HTML of your pages, it’s best to apply it after any filters that generate the HTML, for example :erb or :kramdown.
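As a rough sketch, a compile rule with the filter in place might look like the following. The item pattern and the layout name are assumptions and depend on your site and Nanoc version. The filter sits after the layout step here because Nokogiri::HTML parses its input as a complete document, so it makes most sense to run it on the finished page.

```ruby
# Rules (fragment): the pattern and layout name below are placeholders.
compile '/**/*.html' do
  filter :erb               # generate the HTML first
  layout '/default.*'       # hypothetical layout; wraps the content in the full page
  filter :html_compress     # then compress the finished page
end
```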