The why, what and how of automated static asset pipelines
We are currently working on a big platform update. For this, we plan to drop one of our core features: the locally attached network storage. That means that future Apps will not have persistent storage any more, we call them "ephemeral Apps". This new architecture will be faster and even more stable.
That also means that those 12-factor-flavored Apps will no longer have convenient SSH/SFTP access; you'll deploy using Git only.
But not all parts of your App belong in Git: runtime data like log files and user uploads, as well as compiled static assets such as CSS/JS and images, live outside Git. In this post I would like to explore ways to generate and deploy those static files. Let's have a look back first:
A look back: Legacy CSS/JS workflows
Back then, front-end files were simply written by hand and uploaded as they were. That changed when I began working with CSS preprocessors: I suddenly needed a workflow to separate the authoring files from the ones used in production. So I found tooling to automate those build processes and even help do more. Things became "ugly".
Why to pipeline
The benefits of using automated build processes are obvious:
- make your life easier with better authoring tools
- speed up page delivery with optimized files
What to pipeline
Everything is possible. The most common tasks are:
Compiling CSS preprocessors: You enjoy authoring your CSS in Less, Sass or Stylus, so at some point you also need to convert it to plain CSS for production.
Concatenating: You author multiple individual JS or CSS files, but you don't want that many separate requests for external files in production, so the build pipeline joins them together into one big file.
Image optimizing: You can make your vector and raster images even smaller than your graphic editor's "save for web" export: use gifsicle, jpegtran, optipng, svgmin and pngquant.
Minifying: Get rid of all the characters that are only there to keep your files readable for humans. Remove all those white-spaces, line-breaks and redundant declarations: compress it, uglify it.
Gzipping: Now turn your one line of uglified code into mojibake to make it even smaller.
(Deploying: Get all the stuff up.)
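To make the chain concrete, here is a minimal sketch in plain shell. Everything is illustrative: the file names are made up, `tr` stands in for a real minifier like UglifyJS, and a real setup would run a preprocessor such as Sass first.

```shell
# Illustrative build pipeline; file names are placeholders.
mkdir -p src dist
printf 'body { color: red; }\n'  > src/a.css
printf 'h1   { color: blue; }\n' > src/b.css

# 1. Concatenate: many authoring files become one file
cat src/a.css src/b.css > dist/site.css

# 2. "Minify": squeeze whitespace (a real minifier does far more)
tr -s ' ' < dist/site.css | tr -d '\n' > dist/site.min.css

# 3. Gzip: pre-compress the result for delivery
gzip -c dist/site.min.css > dist/site.min.css.gz
```

In practice you would let a task runner or a Makefile watch `src/` and rerun these steps on every save instead of typing them by hand.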
How to pipeline
That's what you want to have done. But which technology stack should you use?
Local JS task runner
TL;DR: That's the popular choice these days. Just use Gulp.
Why not use front-end technologies when dealing with front-end files? Task runners like Gulp and Grunt are a popular choice to automate your development tasks. They are based on Node.js, but they work side by side with whatever kind of language you are coding in.
Some frameworks have built-in asset solutions. These probably fit better into your workflow and come with additional features like file versioning (for cache invalidation), dev/stage scenarios, link rewriting, deployment helpers …
Ruby on Rails
Let's remember: Sass was one of the first CSS preprocessors. It's written in Ruby. In addition, Compass was probably the first generation of mixin micro-frameworks. Rails itself handles CSS/JS automation with the Asset Pipeline.
Laravel
Laravel has a clever approach: it simply integrates predefined Gulp tasks. It even has a fancy name: Elixir.
How to deploy
Now you have automated workflows that generate optimized versions each time you save your original JS/Sass files. How do you deal with the results? Do you just put them in Git and deploy them along with the rest, or do you define yet another asset pipeline task to deploy them?
Static assets vs Git
Source control was designed to deal with code changes in your original authoring source files. It's actually not the place to put your ugly files in. If you just put your assets in Git you'll have to deal with bad side effects like:
- No diffing: Compiled static assets are binary or consist of a single line; diffing them is impossible, and not even necessary.
- Merge conflicts: You'll run into conflicts when everyone in your team has "different" versions of compiled files.
- Bloating: Your .git directory gets bigger, and that makes everything slower.
Shame on us! We are supporting you in such bad practices with "Git push to deploy". Everybody loves it, as it is such a convenient way to upload code changes. The only problem: it's a hack. Now what can we do about it?
1. Put assets in Git, work around quirks
You can at least make yourself more comfortable by dealing with some of the quirks. Andrew Ray describes in his blog how to:
- Exclude built files from diffing with an entry in .gitattributes,
- keep compiled files from conflicting in a merge or rebase with a custom merge driver in .gitattributes,
- rebuild files automatically with a Git hook.
Other solutions use separate branches or submodules for this.
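The first two tricks boil down to a couple of lines of Git configuration. A sketch, assuming your compiled bundles live under dist/ (adjust the patterns to your own build output):

```shell
# Demo repository; the dist/ patterns are placeholders.
git init -q quirks-demo

# Mark compiled files: hide them from diffs, and on merges just keep "our" copy
printf 'dist/*.min.js  -diff merge=ours\n'  >  quirks-demo/.gitattributes
printf 'dist/*.min.css -diff merge=ours\n'  >> quirks-demo/.gitattributes

# The trivial "ours" merge driver still has to be switched on explicitly
git -C quirks-demo config merge.ours.driver true
```

After a merge or rebase, a post-merge or post-checkout hook can then rebuild the bundles so they always match the current sources.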
2. Put assets in Git Large File Storage
That's just an idea that came up while researching this topic. After git-fat and git-annex, git-lfs is a new approach to actually deal with large binary files, but hey, why not use it for something like this? Git Large File Storage needs to be installed on both the client and the remote server to work.
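The setup would look roughly like this. It only does something useful when the git-lfs extension is installed (and the remote supports it), so the sketch guards for that; the dist/ path is a placeholder:

```shell
# Demo repository; routes compiled bundles through Git LFS if available.
git init -q lfs-demo

if command -v git-lfs >/dev/null 2>&1; then
    # enable the LFS filters for this repository only
    git -C lfs-demo lfs install --local
    # track compiled bundles; this writes a filter rule to .gitattributes
    git -C lfs-demo lfs track "dist/*.min.js"
else
    echo "git-lfs is not installed; the two commands above are the whole setup"
fi
```

From then on, matching files are stored as small pointer files in Git, while their content goes to the LFS store on the remote.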
3. Exclude from Git, generate static assets on remote
In general we assume that you compile your assets in your local development environment. You could also consider having the same setup on the remote, so that everything can be compiled there again after a (Git) deployment (or even live on each user request).
I don't think this makes much sense, as it is probably hard to debug errors, for instance when your local dependencies differ from the remote ones.
4. Exclude from Git, deploy separately
Exclude your static assets from Git altogether and deploy them in a different way, maybe even to a different space such as a cloud object storage. That means you have two deployments (with a platform like ours): the Git push and the "other one". The "other one" might be an rsync run or an upload to an external object storage provider, and it might be triggered by a Git hook.
That is probably the most professional and most complicated way. We do so with our web properties here — all JS, CSS and images are served from S3. This way you can also easily hook up a CDN for those assets. Serving those files from another domain improves performance (non-blocking).
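One way to wire up the "other" deployment is a local Git hook that ships the built files after every commit. A sketch: the host, path and bucket names are invented, and the actual transfer lines are commented out so the hook is harmless as written.

```shell
# Install a post-commit hook that would deploy compiled assets out-of-band.
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'HOOK'
#!/bin/sh
# Placeholder targets -- replace with your own server or bucket:
# rsync -az dist/ deploy@assets.example.com:/var/www/assets/
# aws s3 sync dist/ s3://example-asset-bucket/ --delete
echo "compiled assets would be deployed here"
HOOK
chmod +x .git/hooks/post-commit
```

With a real repository in place, Git runs this script after each commit; rsync or the AWS CLI then transfers only the files that changed.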
Now, with our upcoming 12-factorish App you can't SSH in any more, so you'll probably need some kind of external space.
A separate cloud storage might also help you with your runtime data.
That’s dogma over practicality.
John Albin Wilkins in a Drupal Community comment
Yes, asset pipelining with task runners makes sense. The automation of mundane tasks has not only helped us increase productivity; the file optimizations have also reduced page load times significantly.
It just seems to us that there are a thousand ways to do it. We are designing our service around deployment and hosting, so it's crucial that we get this right. What's your opinion and what's your practice? We are curious whether you are considering solution 4, where you separate assets from the Git deployment. If you use this in practice, which cloud storage provider are you using? If AWS S3, do you make use of IAM?
We are considering implementing a new "cloud storage" (working title) component, which basically makes using AWS S3 much more convenient, tightly integrated into fortrabbit.
The upcoming Amazon Elastic File System also looks promising. It could be a replacement for the persistent storage solution we currently have, although we are skeptical, as there are always two sides to it: NFS and the operating system. But for sure we'll keep an eye on it.