The blog of makandra, a Ruby on Rails development team
Known from Rails LTS, makandropedia and Nanomize

Now available: Post Rails Book Bundle

The Post Rails Book Bundle is a collection of books about growing Ruby on Rails applications as they become bigger and more successful.

The bundle is only available until this Friday (July 22nd 2016), so grab your copy while you can!

The following eight books are included at a heavily discounted price:

Fearless Refactoring Rails Controllers

This book guides you through the complicated task of cleaning up the mess that controllers often become in legacy Rails applications.

How do you extract the business essence from the controller while leaving the HTTP-related aspects in it untouched?

It also touches on the topic of making several implicit Rails conventions explicit, which simplifies further refactorings.

Trailblazer - A new architecture for Rails

Trailblazer introduces several new abstraction layers into Rails. It gives developers structure and architectural guidance and finally answers the question of "Where do I put this kind of code?" in Rails.

This book walks you through a realistic development of a Rails application with Trailblazer and discusses every bloody aspect of it.

Rails As She Is Spoke - How Rails gets OOP wrong but it works anyway

Do you want to understand Rails? Do you want to write Rails apps effectively, or see your own open source creations delight their users and enjoy wild success like Rails has? Have you noticed that Rails never quite makes perfect sense according to traditional object-oriented theory, or that the Ruby On Rails Guides never seem to quite explain the realities of cutting-edge Rails development?

Frontend friendly Rails - Better defaults for your sophisticated frontends

Upgrade Rails defaults and introduce cool features that'll help you make your apps more maintainable and faster to write.

Take your API to a higher level in terms of maintenance and provide user experience improvements.

Growing Rails Applications in Practice

Discover a simpler way to scale Rails codebases. Instead of introducing new patterns or service-oriented architecture, we will show how to use discipline, consistency and code organization to make your application grow more gently.

Rails TDD Video Class

The real goal of the class is to teach you how to combine Rails with TDD. Other things, like DDD and mutant, are also crucial. We don't want to present the one and only true way; we are going to present more techniques over time. Think of this class as a TV series that is easy to consume in short episodes every day.

Modular Rails - The complete guide to modular Rails applications

Wait! What's a modular application?!

It's pretty simple. Instead of putting everything into one project, you put your MVC components into specialized Rails engines packaged as gems. Module by module, you can define what your application will be!

Unfuck a Monorail for Great Justice

Monolithic Rails apps – or monorails – are a problem in the world of Rails development. This book doesn't just show you how to get them back on track. It shows you how to get them back on track more cleanly and more swiftly than you would have believed humanly possible.


All in all this is $458 worth of books for only $199. Offer is valid until this Friday (July 22nd 2016). More information at www.railsbookbundle.com

Using responsive images

Back in the web's youth, you'd embed an image with <img src="some/file.jpg" />. Everyone who read your site would download and see that image. After all, it was only a small file a few pixels high and wide.

Times have changed. Web pages contain huge images. People visit web pages from various devices with different screen sizes and resolutions. Moreover, hi-DPI screens have more pixels per area, requiring higher-resolution images.

Web pages need to adapt. But how?

Naive approach

The easiest approach, developer-wise, would be to simply provide images at double (or triple) the traditional resolution.

Unfortunately, this has the drawback of delivering 4x (or 9x) larger image files to everyone. Even people with a 3.5 inch display would need to load the full 4K landing page image over their prepaid mobile connection.

That's bad.

Flexible approach (the future)

What if we gave each client exactly the image size it needed? It would mean more work for the developer, but combine the best user experience with the best possible load time for the client.

That's called a "responsive image". Like a responsive web page, it sort of "adapts" to the client's screen size. Unlike a responsive web page, there is some effort required to achieve this.

Using responsive images is a trade-off between storage and complexity on the server side, and perceived performance and quality on the client side. Offering 100 versions for an image will give each device the optimum image file as quick as possible, but consume lots of space on the server and make rendering and maintenance a nightmare.

Hence, it is about finding the sweet spots of valuable image sizes. Nowadays, browsers are quite good at interpolating images, so you do not actually need to target every resolution (which, by the way, is impossible: you could never cover all mobile device resolutions).

Let's get to the meat. Here's the process (inspired by Matt Wilcox). Do this for each image location in your application (e.g. header banner, image thumbs, gallery, etc.):

  • Determine required image widths
  • Generate them
  • Render them

However, it is not quite as simple as that, so let's go through the process in detail.

1. Finish the design

You'll save yourself wasted effort if you only start optimizing image performance after everything else has settled, including the responsive design. Remember, premature optimization is the root of all evil.

2. Pick a standard version

You should always have a standard image that's just as large as it needs to be for a normal density desktop screen.

3. Determine responsive image versions

Next, resize the window through all breakpoints and make a list of the widths the image goes through. Then add pixel density variants to their right.

Example:

Breakpoint   x1     x1.5   x2      ← Pixel density
lg           1310   1965   2620    Standard image and retina image (see below)
md           1110   1665   2220    The versions above are still fine for these widths
sm+xs        890    1335   1780    One version for mobile screens; Retina devices can still use 1310
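
The arithmetic behind the table is simply breakpoint width times pixel density, rounded. A throwaway Ruby snippet like the following (the numbers are taken from the table above, the variable names are mine) enumerates the candidates:

# Candidate widths = breakpoint width x pixel density
breakpoint_widths = { 'lg' => 1310, 'md' => 1110, 'sm+xs' => 890 }
densities = [1, 1.5, 2]

breakpoint_widths.each do |breakpoint, width|
  candidates = densities.map { |density| (width * density).round }
  puts "#{breakpoint}: #{candidates.join(', ')}"
end
# lg: 1310, 1965, 2620
# md: 1110, 1665, 2220
# sm+xs: 890, 1335, 1780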

The selected versions are: 890, 1310 and 1965. For an example image, their sizes are 110kB, 212kB and 430kB.

Because width matters way more than height in responsive design, the image sizes are denoted with the new w unit. Thus, the selected image sizes above are called 890w, 1310w and 1965w.

How to choose image versions

How many versions you choose depends on the degree of client optimization you want to achieve. More images means possibly better performance, but also more complexity on the server.

  • When image sizes are close (as in 10%), drop the smaller one. E.g. 1200, 1280 → 1280
  • There are reasonable limits for image size and density factor. Read on.

Reasonable lower limit

Almost all desktop browsers are used at widths beyond 1000px. Most mobile devices sport double-density screens that are 320px or more in width. Thus, images below 640w are not useful for full width images. (Images smaller than full-width on a 320px screen are mostly so small in file size that you generally should not need to create variants. Usually these images are quite small on desktop, too, so you might simply reuse the desktop image on mobile.)

A 640w JPG with 65kB would be only 20kB at 320w. While this looks like two thirds saved (wow!), it's only 45kB. So you'd create an extra image version just to save 45kB per image on one of the few non-retina tiny screens. Probably there are places where you can optimize better.

Reasonable factor limit

The intuitive upper resolution limit is the max size an image can get, multiplied by the screen's pixel density. However, since browsers are quite good at interpolating images nowadays, you do not need to feed every client pixel. Under most circumstances, 1.5 times larger will do fine for retina double-density devices. (I've manually tested this.)

For an 800w JPG whose 2x version weighs 300kB, the 1.5x version weighs 187kB. That's a reduction of 40% or 113kB, which is quite a lot. Still, both variants are visually almost indistinguishable, with only very sharp edges slightly blurrier in the smaller image. I call it "lossless enough" ;-).

Considerations

The above limits and suggestions target the "average" case, where you want to improve bandwidth usage and retina user experience with little effort. Of course you need to derive your own strategy depending on your audience.

4. Generate image versions

Now that you know which image widths you need, you'll have to generate them.

I'm using Carrierwave to store image uploads and to generate these versions. In the respective uploader, add image versions like this:

version :banner_1965 do
  process resize_to_fill: [1965, 600]
end
version :banner_1310, from_version: :banner_1965 do
  process resize_to_fill: [1310, 400]
end
version :banner_890, from_version: :banner_1310 do
  process resize_to_fill: [890, 272]
end
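
For context, such version definitions live inside a CarrierWave uploader class. Here is a minimal sketch of where they go; the class name, the model and the MiniMagick processor are assumptions, not something this post prescribes:

# app/uploaders/banner_uploader.rb (illustrative sketch)
class BannerUploader < CarrierWave::Uploader::Base
  include CarrierWave::MiniMagick # assuming the MiniMagick processor is installed

  # ... the version definitions from above go here ...
end

# Mounted on a model (hypothetical):
# class Page < ActiveRecord::Base
#   mount_uploader :image, BannerUploader
# end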

Suffixing the version names with their width makes using them much easier, see below.

5. Render the image

We're almost done. However, the technically hardest part remains: How do we give the browser the right image variant? We cannot reliably know upfront what screen size the client has, nor how large the image will be rendered.

Fortunately, a solution is already in the making. It has quite complex powers, but in the form we'll be using it consists of two additional attributes on the img tag: srcset and sizes.

srcset
Contains a list of image URLs together with their width: srcset="large.jpg 1965w, medium.jpg 1310w, small.jpg 890w"
sizes
Contains a list of "reduced media queries" (which work slightly differently from CSS media queries) and the corresponding image size: sizes="(max-width: 900px) 500px, 1310px". An entry without a media query is taken as the default value.
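
Put together, a plain responsive image (without any JavaScript) could be rendered from Rails roughly like this; the file names and the sizes value are only illustrative:

# A minimal sketch: image_tag passes srcset and sizes through as plain attributes
image_tag 'banner_1310.jpg',
  srcset: 'banner_1965.jpg 1965w, banner_1310.jpg 1310w, banner_890.jpg 890w',
  sizes: '(max-width: 900px) 500px, 1310px'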

However, since I'm already using the awesome lazysizes (I can recommend this great library, btw) for lazy image loading, I'll also use it for managing responsive images. Because it lazy loads the image anyway, it can also select the appropriate size on the fly. Thus it makes the sizes attribute dispensable.

We'll use the following markup:

<img src="fallback/image.jpg" # Fallback image for old browsers
  class="lazyload" # Activate lazy image loading
  # A transparent 1px gif that keeps modern browsers from loading the src
  srcset="data:image/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw=="
  data-sizes="auto" # Tell lazysizes to manage the `sizes` attribute
  data-srcset="large.jpg 1965w, medium.jpg 1310w, small.jpg 890w" />

To generate this in Rails, we'll use the following helper …

# Expects the version names to end with _<image width>, e.g. banner_1965
def responsive_image_tag(uploader, versions:)
  url_widths = versions.map do |version|
    url = uploader.send(version).url
    width = version[/_(\d+)$/, 1]
  
    [url, width]
  end
  srcset = url_widths.map { |url, width| "#{ url } #{ width }w" }.join(', ')
  
  image_tag url_widths.first[0], # lazysizes recommends src == data-srcset[0]
    class: 'lazyload',
    # This transparent gif keeps modern browsers from early loading the
    # fallback `src` that is only for old browsers
    srcset: 'data:image/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==',
    data: {
      srcset: srcset,
      sizes: 'auto',
    }
end

… and use it like this:

= responsive_image_tag @page.image, versions: %i[banner_1965 banner_1310 banner_890]

Voilà! (This example image does not use lazysizes because it is not installed on this blog. Nonetheless it uses the srcset and sizes attributes to let the browser choose the appropriate version.)

Conclusion

We've built a simple system of responsive images that delivers appropriate image sizes to each client. It is as easy as determining the required image widths, defining them in a Carrierwave uploader and rendering them with the supplied helper method.

I hope you enjoyed this post. Send me a note when you've incorporated responsive images somewhere!


Your Rails 3.2 project should switch to Rails LTS

After many months of work, Rails 5 was released today. Thanks to almost a thousand committers, this new version brings us exciting new features such as ActionCable or the attributes API.

Unfortunately this also means your trusty Rails 3.2 app is no longer getting new security patches from the open-source community, which now has its hands full maintaining versions 4 and 5.

Rails 3.2 has been on limited maintenance for the past two years. With the Rails 5 release, this support has ended entirely. If your business uses Rails 3.2, you should act now to prevent data breaches and liability.

If your team can't or doesn't want to upgrade your application now, you can switch to Rails LTS and continue to receive security updates. Rails LTS is a drop-in replacement for the official Ruby on Rails gems.

For more information about Rails LTS visit https://railslts.com/.

Why I'm closing your Github issue

If I just closed your Github issue with a one-line comment, I'm sorry.

I open-source a lot of libraries on Github. This has been a huge source of stress and guilt for me: users demanding that I fix their edge-case issue, demanding compatibility with an unstable alpha of another library, demanding that I answer their project-specific question on Stack Overflow, demanding progress if I'm too slow to respond to any of the above.

I've been thinking a lot about how to not burn out over this. Many issue authors expect commercial levels of support for this free product they're using. But that's not something I can give. I open-source my projects so others can debug and fix things without depending on me.

So here's an exhaustive list of how I will be using Github issues from here on out:

  1. Discuss the best way to make a PR

I won't implement your feature, and I won't fix your edge-case bug. What I will do is discuss whether a proposed change is within the scope of the library, and explain the implementation to help you build your PR. Other than that, I just can't help you. I'm sorry.

PostgreSQL query optimization

With nanomize we built a URL shortener that complies strictly with the German Federal Data Protection Act. One core element of nanomize is the customer's analytics dashboard, which helps to understand the user flow behind links.

While the application scales perfectly well for the current requirements, we want to guarantee the same performance for companies with high traffic, too. Along the way we learned a lot about how to tune a PostgreSQL database with many rows.

Understand the architecture

A user doesn't notice much of what's going on when she's redirected from a short URL to its destination, but nanomize stores a bunch of data about this redirect: the related link, the referrer URL, the country code extracted from the IP address, and more. All in all, each redirect creates one row in a nanomize table on which analysis can be performed later.

Things become slow

To understand the upcoming bottlenecks, we use a simplified version of this nanomize table, which we call hits and fill with 100 million rows. The timestamp is distributed uniformly over one year, while link_id and country_code are chosen randomly (#link_ids = 100 and #country_codes = 250). There are no indices yet, so we can improve performance step by step. An extract from the hits table is shown below.

id   link_id   country_code   created_at
1    1         US             2016-03-07 10:51
2    1         US             2016-03-07 10:52
3    2         UK             2016-03-07 10:53
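
A comparable test table could be generated with something like this (a rough sketch, not our exact setup):

-- A sketch of a comparable test setup
CREATE TABLE hits (
  id           serial,       -- no primary key yet, so no index is created
  link_id      integer,
  country_code varchar(2),
  created_at   timestamp
);

-- 100 million rows: random link_id (1..100), roughly 250 two-letter codes,
-- timestamps spread uniformly over one year
INSERT INTO hits (link_id, country_code, created_at)
SELECT
  (random() * 99)::int + 1,
  chr(65 + (random() * 9)::int) || chr(65 + (random() * 24)::int),
  timestamp '2015-07-01' + random() * interval '365 days'
FROM generate_series(1, 100000000);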

Furthermore we take care of caching by dropping the buffer cache and the system cache before each query.

service postgresql stop
sync
echo 3 > /proc/sys/vm/drop_caches
service postgresql start

As no indices or triggers exist yet, an INSERT takes 171 ms, which is as fast as it gets. That's useful to note so we can compare it with a later constellation. The next query is a SELECT to count the total number of entries in the hits table.

SELECT count(*) FROM hits;
Time: 64306.54935 ms

It takes over one minute until a result is returned. For implementation reasons, counting is slow in Postgres, and there is no trick to speed it up significantly. Keep this in mind and make sure you really need an exact count; often an exact count points to a questionable implementation (e.g. infinite scrolling), or an estimate is enough (see the aside below). Nevertheless, we want to understand if and how an index could improve counting, so we need to analyse the query.
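
As an aside: if an estimate is good enough, Postgres already maintains one in its statistics catalog, which autovacuum and ANALYZE keep roughly up to date. Reading it is practically free:

-- Approximate row count from the planner statistics (no table scan involved)
SELECT reltuples::bigint AS approximate_count
FROM pg_class
WHERE relname = 'hits';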

Postgres execution plan

Postgres uses a planner to analyse a query and decide what the cheapest way of executing it might be. When executing a query, it is possible to output the planner's underlying decisions with the EXPLAIN ANALYSE statement, which combines the planner's estimates with the costs of the real execution. Slightly reformatted for readability, the output of the count query looks like this.

EXPLAIN ANALYSE SELECT count(*) FROM hits;
Aggregate
  (cost=1886943.00..1886943.01 rows=1 width=0)
  (actual time=63654.115..63654.115 rows=1 loops=1)
  ->  Seq Scan on hits
    (cost=0.00..1636943.00 rows=100000000 width=0)
    (actual time=48.896..57394.584 rows=100000000 loops=1)
Planning time: 50.857 ms
Execution time: 63654.274 ms

We read the plan from bottom to top. The query itself took 63654.274 ms, and the planner needed 50.857 ms before the actual execution could start. The plan is divided into blocks, each labeled with an action. The first block is a Seq Scan on hits, followed by two additional pieces of information in parentheses. For what follows, it is enough to understand the first two attributes.

  1. Estimated vs. actual costs: cost=0.00..1636943.00 vs. 48.896..57394.584
  2. Estimated vs. real number of rows output: rows=100000000 vs. rows=100000000

The planner's estimated costs have no unit, whereas the actual results are in milliseconds. The first value describes the elapsed time before the first output is produced; the last value measures the cost/time until the execution runs to completion. Most often we are interested in the real execution time, but always check that the planner's estimates are in the same ballpark. For the inspected block this means that the first row appeared after 48.896 ms, but the complete scan took over 57 seconds. After all rows were fetched, they were counted in the Aggregate block. The aggregation had to wait for the result of the sequential scan; otherwise it could have produced its result before 57394.584 ms.

Now that we have seen the output without an index, we add one in the hope of magically making execution faster. Creating a primary key in Postgres adds an index with a unique constraint.

ALTER TABLE hits ADD PRIMARY KEY (id);
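
You can verify that the index was actually created by looking at the pg_indexes view:

-- The primary key shows up as a unique btree index on id
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'hits';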

Unfortunately, running the same query again changes nothing. For further understanding we need to take a closer look at what a sequential scan actually is.

Sequential scan vs. index only scan

Postgres stores every table and index as an array of fixed-size pages on disk. The easiest way to find the row with ID 15 is to walk through every page and check whether it contains a row with that ID. If the row happens to be in the last page of the array, this gets expensive. The described method is called a sequential scan.

Now let's look at another approach called an index scan: if we know that we will be searching by ID, we can store each ID together with its page address in a special data structure. By default, Postgres uses a balanced tree (B-tree) for this. Searching a balanced tree is very fast, and a found tuple can be resolved to a page address from which the complete row can be read.

Furthermore, it's important to know that reading many pages in a row is much faster on a hard drive than reading the same pages in random order, as the disk head has to move less. The cheapest option of all is an index only scan, which doesn't need to load any table row at all, as all required columns are already contained in the index.

Let's go back to our hits table to see the above in practice. We postpone the count problem and look at a simple SELECT with a condition on the ID, the column we already created an index for. The first thing we do is compare the execution times of an index only scan and a sequential scan. To do that, we need to disable some plan types.

EXPLAIN ANALYSE SELECT id FROM hits WHERE id < 10000;
Index Only Scan using hits_pkey on hits
    (cost=0.57..274.59 rows=9487 width=4)
    (actual time=22.931..79.963 rows=9999 loops=1)
   Index Cond: (id < 10000)
   Heap Fetches: 0
 Planning time: 150.086 ms
 Execution time: 80.296 ms

SET enable_indexscan = OFF;
SET enable_bitmapscan = OFF;
EXPLAIN ANALYSE SELECT id FROM hits WHERE id < 10000;
 Seq Scan on hits
    (cost=0.00..2523885.70 rows=9487 width=4)
    (actual time=53.630..131530.808 rows=9999 loops=1)
   Filter: (id < 10000)
   Rows Removed by Filter: 99990001
 Planning time: 15.590 ms
 Execution time: 131531.165 ms

As you can see, there is a big difference between the index only scan (80 ms) and the sequential scan (131531 ms). Now we can compare a normal index scan with a sequential scan. To prevent Postgres from using an index only scan, we select more columns than the index contains.

EXPLAIN ANALYSE SELECT id, country_code FROM hits WHERE id < 10000;
Index Scan using hits_pkey on hits
  (cost=0.57..398.59 rows=9487 width=7)
  (actual time=0.011..126.566 rows=9999 loops=1)
 Index Cond: (id < 10000)
Planning time: 128.091 ms
Execution time: 126.919 ms

SET enable_indexscan = OFF;
SET enable_bitmapscan = OFF;

EXPLAIN ANALYSE SELECT id, country_code FROM hits WHERE id < 10000;
Seq Scan on hits
  (cost=0.00..2523885.70 rows=9487 width=7)
  (actual time=47.433..131746.804 rows=9999 loops=1)
 Filter: (id < 10000)
 Rows Removed by Filter: 99990001
Planning time: 128.130 ms
Execution time: 131747.157 ms

The index scan is much faster. The next step is to look at the same query with a condition that matches all rows.

SET enable_seqscan = OFF;
EXPLAIN ANALYSE SELECT id, country_code FROM hits WHERE id > 0;
Index Scan using hits_pkey on hits
  (cost=0.57..4120665.15 rows=99999976 width=7)
  (actual time=0.010..100631.736 rows=100000000 loops=1)
 Index Cond: (id > 0)
Planning time: 127.865 ms
Execution time: 103648.355 ms

SET enable_seqscan = ON;
EXPLAIN ANALYSE SELECT id, country_code FROM hits WHERE id > 0;
Seq Scan on hits
  (cost=0.00..2523885.70 rows=99999976 width=7)
  (actual time=0.344..128026.136 rows=100000000 loops=1)
 Filter: (id > 0)
Planning time: 3.234 ms
Execution time: 131079.683 ms

An index scan is still faster, but the relative difference is far smaller. The reason is the already mentioned cost of random reads. Since the planner chooses the sequential scan over the faster index scan here, you need to adjust the planner's cost settings to support it in making the right decision (see the sketch below). We skip the details, as this is a topic of its own. Also keep in mind that you should tune runtime resources to your system to get the best performance.
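
One of the relevant knobs is random_page_cost. Its default of 4.0 assumes spinning disks; on SSDs, lowering it towards seq_page_cost (1.0) makes index scans look cheaper to the planner. A sketch, where the concrete value and the database name are examples rather than measured recommendations:

-- Try it for the current session ...
SET random_page_cost = 1.1;

-- ... or persist it for one database (database name is hypothetical)
ALTER DATABASE nanomize SET random_page_cost = 1.1;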

Back to the count query: we encounter the same planner estimation problem as in the query before. The planner prefers a sequential scan over the much cheaper index only scan.

SET enable_seqscan = OFF;
EXPLAIN ANALYSE SELECT count(*) FROM hits;
Aggregate
  (cost=2846817.45..2846817.46 rows=1 width=0)
  (actual time=23349.313..23349.313 rows=1 loops=1)
 ->  Index Only Scan using hits_pkey on hits
   (cost=0.57..2596811.61 rows=100002336 width=0)
   (actual time=60.504..18335.820 rows=100000000 loops=1)
       Heap Fetches: 0
Planning time: 142.866 ms
Execution time: 23349.428 ms

At 23349 ms, the index only scan is more than twice as fast as the sequential scan (63654 ms). This value is a little misleading though, as we have so far ignored the heap fetches an index only scan may need to make. Their number depends on the visibility map, discussed in the next section.

Visibility Map (VM)

When a query runs, Postgres needs to check whether the rows found in an index were modified by other transactions in the meantime. For this it uses a compact visibility map (VM) that records for each page whether all of its rows are visible to every transaction; DML operations clear that flag for the pages they touch. That way, an index only scan knows for each page whether it has to visit the page on disk to check the contained rows or not. Such visits are called heap fetches, and an index only scan is only fast if it needs few of them. Postgres runs a background daemon called autovacuum that periodically updates the visibility map of a table; you can also run VACUUM manually. With an up-to-date visibility map, the planner can prefer an index only scan over a sequential scan, since not many heap fetches will be needed.
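
You can get a rough idea of how much of a table is currently marked all-visible from the statistics catalog, and refresh the visibility map manually with VACUUM:

-- Pages marked all-visible vs. total pages of the table (maintained by VACUUM)
SELECT relallvisible, relpages
FROM pg_class
WHERE relname = 'hits';

-- Update the visibility map manually
VACUUM hits;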

For our previous count, the number of heap fetches was 0, which means the hits table is fully vacuumed. Now we run a pseudo-modifying query on each row and check the execution time of the index only scan on the ID again.

UPDATE hits SET country_code = country_code;
SET enable_seqscan = OFF;
EXPLAIN ANALYSE SELECT count(*) FROM hits;
Aggregate
   (cost=5044201.45..5044201.46 rows=1 width=0)
   (actual time=602128.776..602128.776 rows=1 loops=1)
 ->  Index Only Scan using hits_pkey on hits
   (cost=0.57..4794195.61 rows=100002336 width=0)
   (actual time=154.424..595217.106 rows=100000000 loops=1)
       Heap Fetches: 199999944
Planning time: 180.733 ms
Execution time: 602129.478 ms

At 602129 ms it's nearly ten times slower than the sequential scan. In conclusion, the execution time of a count via an index depends on correct planner estimates and on the number of pages marked as visible in the VM. Let's leave the count query at this point and move on to bitmap index scans and multicolumn indices.

Bitmap Index Scan and Bitmap Heap Scan

Often you see a bitmap index scan followed by a bitmap heap scan. The output looks like the following.

SET enable_indexscan = OFF;
EXPLAIN ANALYSE SELECT id, country_code FROM hits WHERE id < 10000;
Bitmap Heap Scan on hits
   (cost=425.48..39329.72 rows=10440 width=7)
   (actual time=147.553..908.460 rows=9999 loops=1)
 Recheck Cond: (id < 10000)
 Heap Blocks: exact=66
 ->  Bitmap Index Scan on hits_pkey
   (cost=0.00..422.87 rows=10440 width=0)
   (actual time=81.640..81.640 rows=9999 loops=1)
       Index Cond: (id < 10000)
Planning time: 216.926 ms
Execution time: 909.083 ms

The index doesn't contain all requested columns, so the remaining ones need to be loaded from the table afterwards. While an index scan performs random reads, the bitmap heap scan reads the pages in sequential order. For this, a bitmap in physical page order is needed, and that is what the bitmap index scan returns: it checks the condition on every tuple in the index and saves the result as a compressed bitmap. That way, bitmaps from different indices can also be combined before the corresponding rows are loaded from disk (see the example below).
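
For example, a query with two ranges on the indexed id column can be answered by ORing two bitmaps and then making a single pass over the matching pages. Whether the planner actually picks such a plan depends on the statistics; a possible plan shape is sketched in the comment:

EXPLAIN ANALYSE
SELECT id, country_code
FROM hits
WHERE id < 10000 OR id > 99990000;
-- A possible plan (not measured here):
--   Bitmap Heap Scan on hits
--     -> BitmapOr
--          -> Bitmap Index Scan on hits_pkey (id < 10000)
--          -> Bitmap Index Scan on hits_pkey (id > 99990000)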

Coming back to our query from above, we can speed it up by adding a multicolumn index.

CREATE INDEX ON hits(id, country_code);
VACUUM hits;
EXPLAIN ANALYSE SELECT id, country_code FROM hits WHERE id < 10000;
 Index Only Scan using hits_id_country_code_idx on hits
    (cost=0.57..229.53 rows=7141 width=7)
    (actual time=38.125..110.524 rows=9999 loops=1)
   Index Cond: (id < 10000)
   Heap Fetches: 0
 Planning time: 346.084 ms
 Execution time: 110.878 ms

Note that we VACUUM the table again; otherwise the planner won't choose the index only scan. Finally, we INSERT a row again and see that with two indices the execution time has increased from 171 ms to 332 ms, since every index has to be maintained on write. Many queries in nanomize were more complex to optimize, but the simplified queries above required the same knowledge and strategy.
