No, CSS isn’t always faster

Recently, a post appeared on Hacker News that was yet another exploration of crazy things that can be achieved with CSS. Like many others, this example was logos rendered using HTML & CSS. I quickly commented that I didn’t feel we needed more of these examples which, while cool, do almost nothing to advance web development in general. (Note: as Nicolas Gallagher mentioned in the comments, his article clearly states that his icons are an experiment and shouldn’t be deployed in a real-world environment; apologies if the initial text implied that these demonstrations are endorsements of the “use this in the wild” approach.) I suggested that the people involved in such endeavors would be better off devoting their efforts to demonstrating tips and techniques applicable to the real world.

While I’m sure I could devote an entire post to why CSS is the wrong tool for this job (see the “related reading” section below for a great argument there), one response to my comment in particular caught my attention:

A CSS logo would be useful for performance issues and overall page rendering speed. A CSS animated logo, such as the Atari example, even moreso than a Flash / JS counterpart. CSS renders more quickly than images, and definitely more quickly than embedded Flash.

It’s time for this misinformed, dogmatic approach to web performance to end.  

When Steve Souders came out with his book, High Performance Web Sites, and the accompanying YSlow tool, the race to reduce HTTP requests was on! Several years later, it’s somewhat unfortunate that in all of the great advice that Steve doled out, many people never make it past this first bullet point. It’s also unfortunate that this rule set has defined website performance for the masses primarily in terms of network-related bottlenecks. It’s time to start thinking about performance from a holistic point of view, rather than isolating things like network performance and elevating them above other concerns.

Note: I mention Steve Souders’ work here not to blame him for what’s happened, but to make the point that good practices (like those Steve suggests), put to use by people who don’t understand the full ramifications of that advice, can lead to some nasty misconceptions.

Thinking about performance holistically, I’m immediately reminded of the talks John Resig did shortly after Sizzle was released and competitors started popping up everywhere to convince people that selector engines were no longer the performance bottleneck in Javascript. People had become obsessed with a single point of performance and were ignoring the “bigger fish.”

So here we are in 2011, and people are still regurgitating the idea that “using CSS is faster than images” without truly understanding the implications of that statement (and how untrue it can be under the right circumstances).  An analogy that I like to use comes from the world of 3D gaming: Rendering something (for example, a drop shadow) in CSS and markup is the real-time, in-game rendering while downloading an appropriately compressed drop shadow png image is using a pre-rendered full-motion-video cut scene. Sure, the initial payload may be somewhat heavier; but again, looking at the complete picture, pre-rendering allows for overall higher performance.

I mention the drop shadow example because it’s absolutely real. When the team at Netflix was building the initial versions of our web-based UI for connected devices, we ran into serious performance problems on all sorts of devices like the PlayStation 3. One of the things we found that significantly sped up the application was to avoid CSS effects such as shadows and gradients, favoring pre-rendered effects instead. In some cases, the performance gains were nothing short of astonishing.

Turning to the examples provided by the initial Hacker News article, let’s look at the Twitter logo demonstration. It’s 27 DOM elements and over 4 kilobytes of CSS. So already we have a comparably sized payload and a more complex DOM, and now it’s time to add flow and paint operations to actually render the elements. There are a lot of calculations happening to position and style the elements so they appear as the Twitter logo. By using an image, we bypass the layout and painting calculations needed for that rendering; those calculations were already completed when the image was created.

Performance is more than optimizing for the network layer and initial load-time of a page. As with anything, there are tradeoffs to be made (add a little download time, speed up the rendering) and in different situations, there will be different speed winners based on those tradeoffs. More interesting when working with lower-end devices (such as mobile devices) are the tradeoffs between memory use and computation cycles that I hope to cover in depth in future blog posts.

Sadly, there are few resources that cover the topic of in-browser performance in any more depth than a cursory level. I’ve only touched on it here; though I do hope to continue writing about it. That said, before we can even start talking intelligently about working on performance from a holistic approach rather than focusing on individual concerns, we need to learn to get past the dogmatic rules that are holding us back.

Related Reading: Pure CSS Icons: Make the Madness Stop

Why are you still deploying overnight?

10/17/2011 – Welcome Hacker News folks!

There’s some discussion happening in the comments; but, as always, the better conversation is on the article page on Hacker News itself.

“3:00 am Deployment?  Why Not?”

That was the Facebook status of a former co-worker about a week ago. I happened to be awake and online at the same time (he’s on the East Coast; it was only midnight here) and immediately responded, “The better question is, ‘Why?’”

Deployment.  Production Push.  Go Live.  Rollout.  Whatever you call the process of turning your development codebase into a live, production application, I sincerely hope you’re not living in the Stone Age and doing it in the middle of the night under the guise of avoiding customer impact. Unfortunately, if my past experiences, and the experiences of many I’ve spoken to, are the norm, you very likely are.  If your strategy to avoid customer interruption is based solely on trying to avoid your customers, you’re setting yourself up for even more headaches and long-term failure.

The motivations for these overnight deployments are suspect at best. The claim is that by avoiding the daylight hours, fewer customers will be impacted by the rollout.  Problem 1: You presume there will be problems that impact availability.  You have no confidence in your code quality; or (or maybe, and), you have no confidence in your infrastructure and deployment process.  If you lack confidence that your new system is ready for production, you probably shouldn’t be pushing it to production!  If you think your servers aren’t ready or that your deployment process stinks, take the extra time now to improve them. I’ve seen great companies with absolutely terrible build and deployment processes who have nightmares getting code into production.  At the same time, these same companies refuse to devote more than a single person (or maybe only part of his or her time) to improving that process.

Perhaps even more suspect in the reasoning is the notion that because the process is complicated and volatile, it should be done in off-hours. Problem 2: You’ve got a complicated process and you’re sending over-tired, over-worked people to deal with it.   Imagine, for a moment, that your team is rolling out an update to a service that monitors life-support systems in hospitals.  Do you want tired, stressed, and unmotivated people working on the process?  If deployment is one of your most complicated procedures, why are you sending your people at their worst to handle it?

Earlier, I mentioned that teams are simply attempting to avoid customers by deploying overnight. Aside from this being a futile goal for any global business (and this is 2011), it likely suggests you’re missing two things. Problem 3: You have no means of doing a phased rollout or a quick rollback. Deployments in this world are likely one-way affairs with a lot of time devoted to pushing the new code and no clean way to revert those changes quickly if something goes south. Make no mistake, I’m not suggesting that deployments are easy (or even that they should be). Nor am I suggesting that everything should always be perfect when you deploy code. However, attempting to sneak code out in the middle of the night is hardly meeting the challenge head on.

Compounding all of these issues is the fact that there are some problems you can only see once a certain scale is achieved. By hiding from your customers during deployment, you may also be burying your head in the sand with regard to these potential bugs.

Plan For Success; React Quickly to Obstacles

There are several techniques I’ve seen employed that have had a great impact on improving the deployment process to the point teams have felt comfortable deploying while the sun is up.

Involve your QA team early so they have a full understanding of the feature and how to test it. Foster a partnership between QA engineers and developers so they work together to understand the full impact of the feature and ensure that your testing, especially regression, is thorough enough to develop high confidence in your quality. Always remember that the later in your process bugs are discovered and fixed, the more expensive they become (especially if these bugs make it all the way to production). Incentivize your people around delivering quality early, not finding bugs late.

Devote time and energy to your deployment processes; don’t shunt them off onto one guy working in isolation. Establish an owner; but, make sure this person is integrated with the rest of the development team and understands their pain points and needs.  Automate complicated manual processes to prevent mistakes (you know, the type of mistakes that happen when a tired engineer is sitting at his or her console at 3:00 am).

Decouple various parts of your system so they can be deployed and rolled back independently. There’s no sense in having to take your checkout process offline simply because of a regression in your unrelated public API.  This concept is often easier said than done; but it’s incredibly important and worth your team’s investment in time.

Use feature kill-switches aggressively; allow certain parts of your application to be turned on and off via runtime configuration.  Deprecate old functionality rather than destroying it in your codebase. Allow the feature switch to revert to the old code paths without forcing a code rollback or additional push.  Once you’re confident in your new functionality, the deprecated paths can be removed in your next deployment. In cases where this concept is prohibitively difficult or even impossible, modularize the code containing that feature and run both so you can quickly switch back to the old code if necessary.
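A minimal sketch of the kill-switch idea, assuming a hypothetical in-memory `runtimeConfig` object standing in for whatever runtime configuration source you use; the flag name and both checkout functions are made up for illustration. The deprecated path stays in the codebase, and the switch decides which one runs:

```javascript
// Hypothetical runtime configuration. In practice this would be loaded from
// a source that can change without a deployment (assumption, not from the post).
const runtimeConfig = { newCheckout: true };

function isEnabled(flag) {
  return runtimeConfig[flag] === true;
}

// Deprecated code path, kept until the new one has proven itself.
function legacyCheckout(cart) {
  const total = cart.reduce((sum, item) => sum + item.price, 0);
  return { total, path: "legacy" };
}

// New code path guarded by the switch.
function newCheckout(cart) {
  const total = cart.reduce((sum, item) => sum + item.price, 0);
  return { total, path: "new" };
}

function checkout(cart) {
  return isEnabled("newCheckout") ? newCheckout(cart) : legacyCheckout(cart);
}
```

Setting `runtimeConfig.newCheckout` to `false` sends every later call down the legacy path with no code rollback or additional push.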

Avoid unnecessary deployments. When I talked to my friend mentioned above about his 3:00 am deployment, he told me they had to do it to end a contest that their website had been running.  I’m sorry; that’s just not a reason to have to push new code to production. Feature switches targeting alternate code paths could have solved that problem. Moreover, they could have been set on a timer such that the moment the contest ended, the entry form was disabled.  Even following every suggestion here and others you can find, deployments are never going to become easy, only “easier.”  Don’t saddle your team with more of them than you need.
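The timer idea can be sketched as a switch that depends on the clock rather than on a manual flip. The `contest` object, its end time, and the section names are illustrative assumptions, not details of the actual site:

```javascript
// Hypothetical contest window; the end time is an illustrative value.
const contest = { endsAt: Date.parse("2011-10-31T23:59:59Z") };

// The switch is a pure function of the current time, so no push is needed
// at the moment the contest ends.
function contestIsOpen(now = Date.now()) {
  return now < contest.endsAt;
}

// Chooses which section of the page to render.
function renderContestSection(now = Date.now()) {
  return contestIsOpen(now) ? "entry-form" : "contest-closed-message";
}
```

Passing `now` as a parameter also makes the cutover easy to test ahead of time, in daylight.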

Release early, release often. I realize this mantra has been repeated over-and-over for years; but, that’s only because it’s such important advice. By releasing new code to production often, you’re shrinking the size of each deployment. The less stuff that changes, the less that can go wrong.

Create a system for phasing your rollouts. It’s a much better way to reduce the customer impact of issues than simply hiding from your customers. Take your time between each phase to really let issues surface. An example of such a plan would start with a small number, like 5-10%. This level of exposure likely still reaches fewer customers than a 100% rollout at 3:00 am, but it’s also likely large enough to alert you to any glaring issues quickly. Once you’ve cleared that hurdle, ramp up to a number that gives you some level of scale, say 50%. This level keeps your customer impact somewhat diminished if anything goes wrong, and it will expose issues that may not appear until your app is running “at scale” (such as a new API call taking far more cycles than intended because the caching isn’t working correctly. You may not notice these extra cycles at low volume; or worse, you may simply write it off that the service isn’t seeing enough traffic to really warm the cache). Once you’ve crossed that hurdle, you’re ready to ramp up to 100%. Each phase should be designed to contribute confidence along the way that once all customers have this new code, they’ll be getting the quality experience you intended to deliver.
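One common way to implement the phases above is to place each user in a stable bucket from 0 to 99 and compare that bucket against the current rollout percentage. This is a sketch under assumptions: the hash is a simple illustrative string hash, and `bucketFor` and `inRollout` are hypothetical names, not part of any particular system.

```javascript
// Maps a user ID to a stable bucket in the range 0..99.
// The hash is a simple multiply-and-add string hash, chosen for brevity.
function bucketFor(userId) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep within 32 bits
  }
  return hash % 100;
}

// A user sees the new code when their bucket falls below the rollout percentage.
function inRollout(userId, percent) {
  return bucketFor(userId) < percent;
}
```

Because the bucket is derived from the user ID, a user who gets the new code at 10% keeps it at 50% and 100%, so raising the percentage only ever adds users.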

Get Some Sleep (Or Maybe…)

Ultimately, deploying overnight is likely indicative of something (probably several somethings) being broken in your process. Luckily, by considering your deployment process an important part of your product and devoting time and energy to it, you can turn overnight deployment into a thing of the past and reclaim those late nights for important things like sleep.

Or Adult Swim.  Your choice.

Android “ended” and “pause” events on <video>

In working on implementing playlists for the Brightcove Smart Players, I discovered some pretty annoying discrepancies between the HTML5 video specification and how Android handles video objects.  I’ve created a droidfix GitHub repository which captures an example of how to solve these problems.

The Problems

On Android devices, similar to the iPhone and iPod touch, video is a modal experience; that is, rather than playing the video within the context of the page, the video is immediately expanded to a full-screen experience. We can debate whether this approach is right or wrong all day long if you’d like; but for now let’s just accept it and move on.

On the Android platform, the video player is opened as if it were a new page. To return to the page from the playing (or paused) video, the user must use the device’s back button.  Several things do (or don’t) happen when you hit the back button:

  1. An “ended” event fires
  2. The currentTime property of the video is reset to 0
  3. A “pause” event does not fire

The ended event is particularly troubling when building a playlist-driven player as the ended event can be used to signal to the playlist that it’s time to load the next video. It creates a funny (for the developer; probably not the user) inescapable scenario when the user pauses the video and the next item on the playlist loads immediately until the end of the playlist is reached.

Workaround

I’ll point you to the GitHub version of droidfix.js to see the code; including an example page showing the code in action. Here’s a quick description of the workaround:

First, when timeupdate events are fired, we store the furthest forward position in the video that has played. We have to do this because of item #2 above; once the video actually ends, we’ll lose the last time index.

Now, when an ended event is received, we check to ensure the video has really reached the state specified for an ended event. In other browsers, we check that the currentTime property of the video is equal to the video’s duration; if that’s true, we’ve reached the end of the video. For Android, we check that the furthest forward position variable we stored earlier is within 2 seconds of the end of the video. Why 2 seconds? Well, the Android browser doesn’t fire a timeupdate event at the end of the video; furthermore, it also doesn’t seem to be consistent as to how far from the end of the video it fires its last timeupdate. In my testing, 2 seconds was a long enough buffer to include all examples; it also has the side-effect of being short enough that we can reasonably assert that beyond that point, the user has “completed” the video.

If we determine that the ended event is legit, we fire a series of custom listeners set up in droidfix. However, if we determine this is Android telling us that the video has ended when it hasn’t, we create a custom pause event and fire it so handlers bound to pause will be triggered.
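The logic described above can be sketched roughly as follows. This is not the code from droidfix.js; `createEndedClassifier`, `attach`, and the `handlers` object are hypothetical names, and the sketch calls plain callbacks instead of creating and firing a custom pause event.

```javascript
const END_TOLERANCE_SECONDS = 2; // the buffer described in the text

function createEndedClassifier(isAndroid) {
  let furthestTime = 0; // furthest forward position seen so far

  return {
    // Call on every timeupdate event.
    onTimeUpdate(currentTime) {
      furthestTime = Math.max(furthestTime, currentTime);
    },
    // Call when an ended event arrives. Returns "ended" for a genuine end
    // of playback and "pause" when the event should be treated as a pause.
    classify(currentTime, duration) {
      const reachedEnd = isAndroid
        ? duration - furthestTime <= END_TOLERANCE_SECONDS
        : currentTime === duration;
      return reachedEnd ? "ended" : "pause";
    },
  };
}

// Browser wiring sketch; `video` is assumed to be an HTMLVideoElement and
// `handlers` an object with `ended` and `pause` callbacks.
function attach(video, isAndroid, handlers) {
  const classifier = createEndedClassifier(isAndroid);
  video.addEventListener("timeupdate", () => {
    classifier.onTimeUpdate(video.currentTime);
  });
  video.addEventListener("ended", () => {
    handlers[classifier.classify(video.currentTime, video.duration)]();
  });
}
```

A real implementation would also need to reset the stored position when a new video loads, which this sketch leaves out.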

Go fork and wreak havoc!

I put the example code on GitHub with a pretty descriptive README of how it works along with the current known issues / caveats. Feel free to grab it, modify, comment, etc. If nothing else, it provides an example that you can integrate with your own video solutions.

Incredibly Strange iOS Issue

Update 8/19/2010 @ 11:55 EDT

It looks like this may be a cross-domain issue after all. I changed my testing methodology and was able to reproduce this issue without Gmail being involved. That’s the good news; the bad news is that even in a cross-domain scenario, access should simply be denied to the properties or an exception should be thrown. In this case in Mobile Safari, the script dies at the point in execution where it attempts to access the property.

</update>

I’m posting this to my blog in hopes of finding some help; I’d love to file this issue as a bug report but I have no idea if it’s a bug in Gmail or a bug in Mobile Safari.  I’m inclined to think the latter; but the fact that I can consistently reproduce this issue in the manner described below leads me to think it could be something crazy that Gmail does.

Window.opener

The window.opener object is available on browser windows that have been spawned by other browser windows (either using Javascript or the classic target="_blank"). It’s a reference to a window object. In general, it behaves entirely as expected on iOS, except when you click a link from Gmail (and the issue occurs whether you’re using the Mobile Gmail version or the desktop version on an iOS device).

What Happens?

Clicking a link from Gmail on an iOS device causes the link to open in a new browser window. However, if you have a page that attempts to reference any properties of the window.opener object, you’ll receive the error:

TypeError: no default value

Here’s where it gets really strange: if I evaluate the window.opener object itself in any way, it reports that it’s a DOMWindow object. However, if I try to enumerate the properties of that object, none are found. If I try to access a property of that DOMWindow object by name, I receive the above error.

What I’ve Tried

I’ve tried and ruled out several possible methods of reproducing this issue without Gmail; in each case the window.opener object behaved normally and I was able to both enumerate the properties and read a property directly without an error:

  • I created a page with a link using target="_blank"
  • I created a page using Javascript’s window.open to open a link
  • I hosted the pages on different domains to see if there was some cross-domain issue affecting it.

I’ve also tried a few workarounds; none of which prevented the error:

  • Checking for the presence of window.opener.location (or any specific property) before using it in an assignment.
  • Wrapping any references to properties of window.opener in a try / catch.
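For reference, the two workarounds in the list above amount to something like the following sketch. `readOpenerHref` is a hypothetical helper, and `win` stands in for the global `window` so the function can be exercised outside a browser. In a browser behaving as expected this returns either the opener’s URL or null; as noted, neither guard prevented the error in the Gmail case.

```javascript
// Reads the opener's URL defensively. `win` is a window-like object.
function readOpenerHref(win) {
  try {
    // Workaround 1: check that the property exists before using it.
    if (win.opener && win.opener.location) {
      return String(win.opener.location.href);
    }
  } catch (err) {
    // Workaround 2: swallow any exception thrown while reading the property,
    // which is how denied cross-domain access is expected to surface.
  }
  return null;
}
```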

At this point, I’m pretty well stumped on what’s happening here, or whether there’s any way for me to work around it. I’ve set up a demo that I encourage you to email to yourself so you can see this bug in action. Turn on the debug console in Mobile Safari, click that link from the Gmail web app, and you’ll see the errors. Click that link from anywhere else, and you’ll get every property of the window.opener object written to your console.

You can test this on the devices themselves or the emulators; the behavior is consistent.

One more thing…

It just wouldn’t be an Apple-related post without that line: once you see the error, tap the icon to switch browser windows. Safari will crash.

Pass that Interview 1: Mimic a Class in Javascript

Javascript, unlike most object-oriented programming languages, does not have the concept of Classes. Instead, Javascript uses a model in which objects are created, cloned, and enhanced by creating copies of the objects (it’s loosely based on the Prototype pattern). There is a lot of power in this Prototypal system and people like Douglas Crockford have shown how Prototypal inheritance can be implemented in Javascript. This type of work is really cool; unfortunately it’s not generally well-known or well-understood and many developers wish to use Javascript to simulate the more familiar Classes & objects as seen in languages like Ruby and Java.
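As a small illustration of the prototypal model, here is a sketch using `Object.create`, the standardized counterpart of the kind of helper Crockford described. The `animal` and `dog` objects are invented for the example. Note that the new object delegates to its prototype instead of holding a copy of it.

```javascript
// A plain object that serves as the prototype.
const animal = {
  name: "generic animal",
  sound() { return "no sound"; },
  describe() { return this.name + " makes " + this.sound(); },
};

// A new object whose prototype is `animal`; nothing is copied.
const dog = Object.create(animal);

// Enhance the new object with its own properties.
dog.name = "dog";
dog.sound = function () { return "woof"; };
```

Calling `dog.describe()` finds `describe` on `animal` through the prototype chain, while `this.name` and `this.sound()` resolve to the properties defined on `dog` itself.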

Due to this simulation being common practice, the following has become a fairly common interview question:

How do you create a class in Javascript?

The answer is a bit wonky; but pretty simple to follow:

Continue reading