Building a Feed for humans

Folks who know me (or have suffered through one of my rants on the state of social media and the Internet in general) are familiar with my distaste for what we call "feeds". Ostensibly they are collections of information that is relevant or useful to a visitor of a site.

In building Remark and Kore, my hypocrisy has never been lost on me, as the former is pretty much an Instagram clone, and the latter is a quaint (at least I think so) amalgam of a fitness social network and a dating app.

Remark's feed is fairly straightforward though. It just lists posts (I call them remarks 😌) by any account. No ads, no AI processing, etc. And that's enough, because nobody uses it except me. It's more of a digital photo album for yours truly than anything else.

Kore's feed was an interesting project because I didn't want to reproduce the infinite, eye-glazing, brain-melting slush of semi-relevant content that defines your average social media feed these days. Fortunately, I couldn't build that even if I tried, because we have no advertisers and very few users. But on principle, Kore's feed consists almost entirely of content produced by people you are friends with on the app. The only content you'll see from people you don't know is signals, which is the whole point of them.

Anyways, on to the technicals.

Technical approach

The prospect of building a "static" feed (one that doesn't morph every time you visit or refresh the page in a desperate attempt to keep hold of your attention) is an interesting one. I considered two primary approaches:

  • Record-generated feed
  • Query-generated feed

I'll spoil the ending: I went with a query.

The gist of the record-based approach was that any time Jane completed an activity, such as a run, a feed record would be generated for each of her friends so it could be displayed in their feeds. This approach is maybe a little simpler to understand and extend, but it means writing one feed record per friend for every piece of content, so the table balloons as users and content grow. Then there's the question of how to retroactively generate feed records for new friend connections, and having to delete them any time a friendship is removed.

Yuck.
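
To put rough numbers on that, here is a minimal plain-Ruby sketch of the fan-out-on-write idea. It isn't code from Kore: FeedRecord, fan_out, and the friend lists are hypothetical names and data made up for illustration.

```ruby
# Hypothetical stand-in for a row in a per-user feed table.
FeedRecord = Struct.new(:recipient_id, :record_type, :record_id)

# Fan-out on write: one feed record per friend of the author.
def fan_out(author_id, record_type, record_id, friendships)
  friendships.fetch(author_id, []).map do |friend_id|
    FeedRecord.new(friend_id, record_type, record_id)
  end
end

# Made-up data: user 1 (Jane) has three friends, user 2 has one.
friendships = { 1 => [2, 3, 4], 2 => [1] }

feed_table = []
feed_table.concat(fan_out(1, "Activity", 10, friendships)) # Jane's run
feed_table.concat(fan_out(1, "Post", 20, friendships))     # Jane's post
feed_table.concat(fan_out(2, "Activity", 11, friendships)) # user 2's run
puts "rows after three pieces of content: #{feed_table.size}"

# Unfriending means hunting down rows after the fact.
# Here user 4 drops Jane, so 4's rows for Jane's content must go.
janes_content = [["Activity", 10], ["Post", 20]]
feed_table.reject! do |row|
  row.recipient_id == 4 &&
    janes_content.include?([row.record_type, row.record_id])
end
puts "rows after one unfriending: #{feed_table.size}"
```

Every write touches as many rows as the author has friends, and every friendship change needs its own cleanup pass, which is the bookkeeping the query-based feed avoids.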

How the query-based feed works

Short version: polymorphic associations with a virtual record type called FeedItem.

Long version: to keep all of the sorting and merging of records from different tables inside the database, I had to write a custom query and pair it with a polymorphic association, so that a single query gives ActiveRecord a relation of what are essentially handles to the records of interest. That simplifies the rendering:

<% @feed_items.each do |item| %>
  <%= render item.record %>
<% end %>

This is sort of a hack, since I have to trick ActiveRecord into thinking that feed items actually exist. To do this, I created an empty view in a migration.

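class CreateFeedItemsView < ActiveRecord::Migration[7.1]
  def up
    # An empty view: it only exists so ActiveRecord can discover the columns.
    # The timestamp cast is added here so that column has a real type
    # instead of being left as an untyped NULL.
    execute <<~SQL
      CREATE OR REPLACE VIEW feed_items AS (
        SELECT
          NULL::bigint AS id,
          NULL::bigint AS owner_id,
          NULL::bigint AS record_id,
          NULL::varchar AS record_type,
          NULL::timestamp AS timestamp
        WHERE false
      )
    SQL
  end

  def down
    execute "DROP VIEW feed_items;"
  end
end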

This lets me define and use a FeedItem model without Rails complaining. The belongs_to in the model is the polymorphic association we use to get the actual records we're interested in.

class FeedItem < ApplicationRecord
  belongs_to :record, polymorphic: true
end
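
If polymorphic associations are unfamiliar, the idea is that record_type names a class and record_id names a row, and together they point at one record. The plain-Ruby sketch below mimics that lookup without Rails. It's a mental model rather than how ActiveRecord implements it, and the Activity, Post, STORE, and Handle definitions are stand-ins I've made up.

```ruby
# Stand-ins for real models; in Rails these would be ActiveRecord classes.
Activity = Struct.new(:id, :name)
Post     = Struct.new(:id, :body)

# A fake "database": class name => { id => record }.
STORE = {
  "Activity" => { 1 => Activity.new(1, "Morning run") },
  "Post"     => { 1 => Post.new(1, "Hello") }
}

# A feed item is only a handle: a type plus an id.
Handle = Struct.new(:record_type, :record_id) do
  # Roughly what `belongs_to :record, polymorphic: true` provides:
  # record_type picks the table, record_id picks the row.
  def record
    STORE.fetch(record_type).fetch(record_id)
  end
end

items = [Handle.new("Post", 1), Handle.new("Activity", 1)]
items.each { |item| puts item.record.class.name }
```

The view code never needs to know which type it is holding, which is what makes render item.record possible.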

Here's an abridged version of the query that generates the feed. I've excluded the friendship condition and simplified the sorting logic for brevity. But you can imagine a where clause in each subquery that filters to include only content by friends, if relevant, and some more mathy sorting to deal with the fact that signals occur in the future, and we don't want them all at the top of the feed.

WITH activities AS (
  SELECT
    a.user_id AS owner_id,
    'Activity' AS record_type,
    a.id AS record_id,
    a.started_at AS timestamp
  FROM
    activities a
),
posts AS (
  SELECT
    p.user_id AS owner_id,
    'Post' AS record_type,
    p.id AS record_id,
    p.created_at AS timestamp
  FROM
    posts p
),
signal_activities AS (
  SELECT
    s.user_id AS owner_id,
    'SignalActivity' AS record_type,
    s.id AS record_id,
    s.starts_at AS timestamp
  FROM
    signal_activities s
)
SELECT
  ROW_NUMBER() OVER (ORDER BY items.timestamp ASC) AS id,
  items.owner_id,
  items.record_type,
  items.record_id,
  items.timestamp
FROM
  (
    SELECT * FROM activities
    UNION ALL
    SELECT * FROM signal_activities
    UNION ALL
    SELECT * FROM posts
  ) items

Note: remember the view we created? It has exactly the columns we're generating here: id, owner_id, record_id, record_type, and timestamp. Without it, ActiveRecord thinks those columns (e.g. feed_items.record_id) don't exist and the query will fail.
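
For anyone who thinks better in Ruby than SQL, here is roughly what that query computes, done in memory with plain hashes. It's a sketch for intuition only: the sample rows and the friend_ids filter are invented, signals skip the friend filter because they're meant to come from anyone, and the real work stays in the database.

```ruby
# Invented sample rows, one array per source table.
activities = [{ id: 7, user_id: 2, started_at: Time.utc(2024, 5, 1, 8) }]
posts      = [{ id: 3, user_id: 2, created_at: Time.utc(2024, 5, 2, 9) },
              { id: 4, user_id: 9, created_at: Time.utc(2024, 5, 3, 9) }]
signals    = [{ id: 5, user_id: 9, starts_at: Time.utc(2024, 5, 4, 7) }]

# Each CTE maps its table onto the same four columns.
def to_rows(records, type, time_key)
  records.map do |r|
    { owner_id: r[:user_id], record_type: type,
      record_id: r[:id], timestamp: r[time_key] }
  end
end

friend_ids = [2] # hypothetical: the viewer's friends
from_friends = ->(row) { friend_ids.include?(row[:owner_id]) }

# UNION ALL is plain concatenation. Activities and posts are limited
# to friends; signals are allowed through from anyone.
rows = to_rows(activities, "Activity", :started_at).select(&from_friends) +
       to_rows(signals, "SignalActivity", :starts_at) +
       to_rows(posts, "Post", :created_at).select(&from_friends)

# ROW_NUMBER() OVER (ORDER BY timestamp ASC) becomes a sort plus an index.
feed = rows.sort_by { |r| r[:timestamp] }
           .each_with_index.map { |r, i| r.merge(id: i + 1) }

feed.each { |r| puts r.values_at(:id, :record_type, :record_id).join(" ") }
```

The id produced this way is only a position in the result, not a stable identifier, so it changes whenever the underlying rows do.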

Then it's just a matter of selecting the feed items with an outer query. To prevent N+1 queries, we tell ActiveRecord to preload associations. It's a little strange to read, because there's some overlap in association naming, but it works.

# `sql` holds the feed query above. The feed_items view is empty, so the
# FULL OUTER JOIN contributes only the rows from `results`.
FeedItem
  .joins("FULL OUTER JOIN (#{sql}) AS results ON true")
  .select("results.*")
  # signal activity preloads
  .includes(record: [:route, :participants, user: {avatar_attachment: :blob}])
  # activity, post preloads
  .includes(record: {user: {avatar_attachment: :blob}})
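
Why the preloading matters: without it, every item.record call in the view would trigger its own query. With a polymorphic association, the handles can instead be grouped by record_type and each type loaded in one batch. The plain-Ruby sketch below imitates that grouping against a fake in-memory table. FAKE_DB, load_batch, and the query counter are invented for illustration and aren't ActiveRecord APIs.

```ruby
# Fake tables keyed by type name, then by id.
FAKE_DB = {
  "Activity" => { 1 => "run", 2 => "ride" },
  "Post"     => { 1 => "hello" }
}

query_count = 0

# Pretend "SELECT * FROM <type> WHERE id IN (...)": one query per call.
load_batch = lambda do |type, ids|
  query_count += 1
  FAKE_DB.fetch(type).slice(*ids)
end

# Feed handles as [record_type, record_id] pairs, in feed order.
handles = [["Activity", 1], ["Post", 1], ["Activity", 2]]

# Group ids by type, then issue one batched lookup per type.
ids_by_type = handles.group_by(&:first)
                     .transform_values { |pairs| pairs.map(&:last) }
loaded = ids_by_type.to_h { |type, ids| [type, load_batch.call(type, ids)] }

# Resolve each handle from the preloaded batches, keeping feed order.
records = handles.map { |type, id| loaded.fetch(type).fetch(id) }

puts records.inspect
puts "queries: #{query_count}"
```

Three feed items cost two lookups here, one per type, and that count stays at the number of types no matter how long the feed gets.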

The result is an ActiveRecord relation of FeedItems and rendering is as simple as creating partials for the underlying records (Activity, Post, etc.) and writing:

<% @feed_items.each do |item| %>
  <%= render item.record %>
<% end %>