Organise photos by date

I organise my photos by grouping them in a directory, where the directory takes on the name “created_date title” (e.g. 2016-05-16 Birthday Party). As it became tedious to create a directory and manually group photo sets, I wrote a Ruby script to assist with the heavy lifting. After that, it’s just a matter of providing a title to each created directory.

require 'exifr'
require 'fileutils'

Dir['*.jpg'].each do |file|
  date = EXIFR::JPEG.new(file).date_time
  if date
    folder = date.strftime('%Y-%m-%d')
    FileUtils.mkdir_p(folder) unless File.directory?(folder)
    FileUtils.mv(file, folder)
  end
end

Note: you will first need to run gem install exifr to install dependencies.

Resetting password of Devise user from the database

I recently encountered a situation where I was trying to access a Rails application but:

  • I didn’t know the password for the admin user
  • The reset password functionality was not working
  • I had no access to a Rails console

Knowing the User model was using Devise, I set a password for a user on a dev environment to find the encrypted_password:

u = User.find(1)
u.update(password: 'password')
u.encrpyted_password
=> "$2a$10$qjBArbMLISBaUDMKdNM0KuZNPsRzPMIINMsPT.NhE9UrIYnErVF2S"

I could then use SQL to update the password on the prod environment:
UPDATE users SET encrypted_password = '$2a$10$qjBArbMLISBaUDMKdNM0KuZNPsRzPMIINMsPT.NhE9UrIYnErVF2S' WHERE id = 1;

Get database version in Rails

For one of my applications, I recently created a status report page which shows crucial information for debugging, including software versions, service connectivity and filesystem path writability. One of the items was a database version string, and the standard ActiveRecord interface does not support this concept. While version methods usually exist in the connection adapters, they are usually protected/private and have different naming conventions.

As the application can be run against multiple backends as well as on MRI or JRuby, I needed to check what adapter was in use, then query the database:

adapter = ActiveRecord::Base.connection.adapter_name
sql = case adapter
  when 'MSSQL'
    'SELECT @@VERSION'
  when 'MySQL', 'Mysql2', 'PostgreSQL'
    'SELECT VERSION()'
  when 'OracleEnhanced'
    'SELECT * FROM V$VERSION'
  when 'SQLite'
    'SELECT SQLITE_VERSION()'
end
ActiveRecord::Base.connection.select_value(sql)

Optimising index view performance

I recently came across a scenario where my index views in Rails were taking a long time to render when there were large data sets. The problem stemmed having a lot of data presented that were from models/tables other than the primary model/table of interest.

For example, an “asset” index view may have the following columns:

  • Name (from asset.name)
  • Type (from asset.type.name)
  • Site (from asset.site.name)
  • Organisation (from asset.site.organisation.name)
  • System (from asset.system.name)
  • Status (from asset.status.name)

The first (naive) approach I used was to query using eager loading:

Asset.includes(:type, {site: :organisation}, :system, :status)
     .references(:type)
     .search_by('types.name', 'Pump')
     .order(:name).page(1).per_page(1000)

Using a data set of around 500,000 items, rack-mini-profiler indicated this took about 6.7 seconds to render the action with 5.6 seconds of that consumed by the SQL query.

When investigating, I noticed this was using a single query with LEFT OUTER JOINs. I then forced ActiveRecord to use separate queries to load the related models by using Rails 4’s preload instead of includes and used joins instead of references to force ActiveRecord to use a more performant INNER JOIN:

Asset.preload(:type, {site: :organisation}, :system, :status)
     .joins(:type)
     .search_by('types.name', 'Pump')
     .order(:name).page(1).per_page(1000)

This dropped the action render time to 1.7 seconds with 0.7 seconds consumed by the six SQL queries.

For my index views, I actually don’t need the entire object graph to be hydrated – I only really needed the names of the related object. So I tried using pluck, which returns an 2-dimensional array of values:

Asset.includes(:type, {site: :organisation}, :system, :status)
     .joins(:type).search_by('types.name', 'Pump')
     .order(:name).page(1).per_page(1000)
     .pluck(:name, 'types.name', 'sites.name', 'organisations.name', 'systems.name', 'statuses.name')

This actually increased the action render time to 2.1 seconds with 1.7 seconds consumed by ActiveRecord and the SQL query. The SQL query now uses LEFT OUTER JOINs again, which as we saw in the original case, is slow. But the differential between the total and SQL had decreased since ActiveRecord models did not have to be created.

While using the second approach would provide a significant speed boost without changing any code, I wondered if we could use a combination of the second and third approaches – use separate SQL queries instead of a single with LEFT OUTER JOINs and simply pass back an array and not have to hydrate ActiveRecord models.

The result: Preload Pluck.

On the same data set, Preload Pluck drops the action render time to 1.5 second with 0.7 seconds consumed by the SQL queries.

YMMV as the results above are for that particular data set, the queries I’ve used and the way my PostgreSQL database is setup.

Selecting string literals in Rails 4 using PostgreSQL

I recently ran into an issue with Rails 4 using the PostgreSQL adapter (MRI) where selecting a string literal in a query would return a warning about an unknown OID.

This can be reproduced by running the following in Rails Console:

Rails.logger.level = 2
ActiveRecord::Base.connection.select_all("SELECT 'test'")

=> unknown OID: ?column?(705) (SELECT 'test')

This occurs both with and without column aliases.

705 is PostgreSQL’s code for an unknown type and in order to fix this, what we want to do is register the “unknown” type. Based on this blog post, I put the following code in an postgres_oids.rb initializer:

# Treat any unknown types (e.g. string literals) as text
ActiveRecord::ConnectionAdapters::PostgreSQLAdapter::OID.alias_type 'unknown', 'text'

Nokogiri XML schema validation with multiple schema files

When using Nokogiri to validate an XML document against multiple XML Schema files using import declarations, ensure that you use File.open rather than File.read as shown in the Nokogiri::XML::Schema documentation. This will allow Nokogiri to navigate to and read these imported schemas.

Thanks to this StackOverflow post.

a.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:b="ns_b" targetNamespace="ns_a">
	<xs:import namespace="ns_b" schemaLocation="b.xsd"/>
	<xs:element name="Foo" type="b:BarType" />
</xs:schema>

b.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="ns_b">
	<xs:complexType name="BarType">
		<xs:simpleContent>
			<xs:extension base="xs:string" />
		</xs:simpleContent>
	</xs:complexType>
</xs:schema>

example.xml

<?xml version="1.0"?>
<Foo xmlns="ns_a">Hello World</Foo>

validate.rb

xsd = Nokogiri::XML::Schema(File.open('a.xsd'))
doc = Nokogiri::XML(File.read('example.xml'))
xsd.validate(doc).each do |error|
  puts error.message
end

Selecting by Option Value using Capybara

Capybara’s select command allows searching a HTML select field and selecting an option that matches the supplied value by name, id or label text. Additionally, the match option allows the user to indicate the behaviour if more than one option is found. For example, match: :first will select the first item out of multiple matches, while match: :one will throw an error if more than one item is found.

I was recently dealing with a time zone select list based on ActiveRecord::TimeZone. I was receiving an error when Capybara tried to select “Samoa” from the option list (“Samoa” due to test randomisation), since there were items named “American Samoa” and “Samoa”. I ended up having to select the option via its ‘value’ attribute instead:

time_zone = ActiveRecord::TimeZone.all.sample
find('select#time_zone').find("option[value='#{time_zone.name}']").select_option

Using Rails’ descendants in development

Rails provides a nice method descendants that returns all subclasses for a specified class. However, config.cache_classes = false is a default setting in development.rb (for the ability to reload classes on the fly) and as this tells Rails not to load classes until they are referenced, then calling descendants will generally return an empty array not the expected subclasses.

To be able to use descendants in development, you can add the following code to development.rb:

config.eager_load_paths += Dir['path/to/files/*.rb']
ActionDispatch::Reloader.to_prepare do
  Dir['path/to/files/*.rb'].each {|file| require_dependency file}
end

The first line tells Rails to load these particular files when Rails first starts up. The rest of the code tells Rails to require these particular files (including any new files/classes) on each request.

Note: if your path/to/files isn’t one of the standard directories that Rails watches (to trigger the reloader), you will need to add it to config.watchable_dirs:

config.watchable_dirs['path/to/files/'] = [:rb]

Moving to a Windows JRuby on Rails development environment

tl;dr I recently set up a JRuby on Rails development environment on Windows after using Mac OS X and Ubuntu VMs. It’s noticeably faster for development and more comfortable. I’ll stick with it.

When I was first starting JRuby on Rails development, I started with a Mac OS X environment running from VirtualBox. I chose Mac OS X to emulate as close as possible what the other members in my team were using, so I could get advice on the basics without too much friction. However, Mac OS X isn’t officially supported by VirtualBox and the environment was unwieldy – no ability to resize the VM window, no shared clipboard and occasionally crashes on boot were the main drawbacks.

After a few weeks, I moved to an Ubuntu VM. This did require minor adjustment in our codebase to get it working on a development environment. VirtualBox supports Ubuntu, and the drawbacks from the Mac OS X experience were eliminated. I used this environment for a good eight months, and became quite familiar with Unix environment. The terminal took the forefront (compared to Windows), apt-get package management was fantastic and everything felt solid. However, after copying the VM to my laptop, running Rails and tests was considerably slow.

Another reason I initially avoided Windows as a development environment was the common perception at the time was that Windows was a second-class citizen in the Ruby community, and performance was lacking. Supposedly even a Ubuntu VM would outperform a native Windows environment. Having gained significant experience with JRuby on Rails over the last few months, I felt comfortable enough to try my hand at setting up a Windows development environment. Here’s what I learnt from the process:

  • Use the Java SDK instead of Java JRE to run JRuby. I would occasionally get Cannot find Java 1.5 or higher error messages when running certain gems. Ensure JAVA_HOME is set to Program Files/Java/jdk* and PATH includes Program Files/Java/jdk*/bin.
  • JRuby on Windows requires the jruby-pageant gem to interact with net/ssh.
  • We are using Eco templates and I was encountering Encoding::Compatibility Error, incompatible character encodings: UTF-8 and Windows-1252 errors when trying to compile a Javascript template that had a UTF-8 character. UTF-8 characters in ERB templates displayed correctly. I couldn’t solve this one and ended up using the escaped HTML format instead.
  • Command Prompt and Powershell don’t natively support ANSI escape characters (e.g. colour codes). So running RSpec would insert a bunch of garbage characters in the output. Turning off --color helped, but there was still some around the progress bar itself, notwithstanding the loss of coloured output. I investigated ANSICON but it seemed too complex to install and I wasn’t entirely sure I could get it working with Powershell. So I decided to install Cygwin and while doing so, thought I’d use ConEmu to unify my Cygwin and Powershell terminals under one umbrella. That’s when I found Powershell, when run through ConEmu, supports colour codes and also allowed me to navigate through irb/rails console history.
  • I found PuTTY itself more convenient than running plink through Powershell as it seems to better interact with bash. PuTTY can also be set up as a task/jump-list in ConEmu to avoid typing the full command.
  • You have to add your private RSA key to Pageant. I always wondered what the computer icon with the hat in my taskbar was for.
  • Use PowerTab with PowerShell. It uses bash-like tab completion instead of the default cycling completion.

In the end, it’s about performance and friction. In comparison with the Ubuntu VM environment, JRuby start up time is noticeably quicker although running the entire test suite is mixed – its faster when using Poltergeist/PhantomJS for integration/feature tests, while slower using Selenium (which we’ve since abandoned). This leads me to think the original performance advice of using a Unix system was based on experience with MRI rather than JRuby.

The Windows environment itself is more familiar and I can now Alt-Tab between my IDE, terminals, file manager, and browser, and move applications to different monitors. As there are no showstoppers, I’ll be sticking with this environment for the foreseeable future.

Reducing Rails asset precompile times on JRuby

Rails asset precompile times on JRuby are considerably slower compared to MRI. I came across this post which provided suggestions on speeding up the asset precompile task.

Using the following options – using Node.js instead of therubyrhino for JS compilation, forcing the JVM to 32 bit (although this can be omitted on a 32 bit JVM) and not using JIT compilation – cut my asset precompile time from 4 mins 37 secs to 2 mins 8 secs. Using Node.js contributed to the majority of that time since I’m using a 32 bit VM.

EXECJS_RUNTIME='Node' JRUBY_OPTS="-J-d32 -X-C" rake assets:precompile