I was working on a Rails project for a client recently where I was trying to figure out why a model search operation was joining to the same table twice, resulting in a huge performance hit. The answer turned out to be an incomplete understanding of the ‘has_many’ ‘:through’ relationship.
Given a model with the following relationships (using the rails documentation example):
class Physician < ApplicationRecord has_many :appointments has_many :patients, through: :appointments ... end
I added a search method to return results based on web form input of a begin timestamp, end timestamp, and search term. I am also using the will_paginate gem. Also, my dataset is large enough that I do not want to return results without input parameters:
def self.search(search, current_page, begin_timestamp, end_timestamp) if search if search.length > 0 Physician.includes(:patients, :appointments).where('patients.updated_at BETWEEN ? AND ? and patients.name LIKE ?', "#{begin_timestamp}", "#{end_timestamp}", "%#{search}%").references(:patients, :appointments).paginate(page: current_page, per_page: 15).order('patients.name ASC').distinct end end end
This resulted in a query (as seen in the mysql console and the rails application logs) that joined to the appointments table twice. Once for the has_many, through association, and once for the includes, references as added to the model.
The answer ended up being to remove the reference to appointments in the Physician model.
def self.search(search, current_page, begin_timestamp, end_timestamp) if search if search.length > 0 Physician.includes(:patients).where('patients.updated_at BETWEEN ? AND ? and patients.name LIKE ?', "#{begin_timestamp}", "#{end_timestamp}", "%#{search}%").references(:patients).paginate(page: current_page, per_page: 15).order('patients.name ASC').distinct end end end
This optimization increased performance by a huge margin. The application is now usable.
Leave a Reply