Mastodon full-text search with no dependencies

a guest
Mar 6th, 2023
FAQ

Q: What's this?
A: Dependency-free full-text search for Mastodon, built on PostgreSQL's own text search.
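
To give a sense of what actually reaches PostgreSQL: the patch below lowercases each post and blanks out non-word characters before calling to_tsvector. Here is a minimal standalone sketch of that normalization step in plain Ruby, without Rails or a database. The helper name normalize_for_search is hypothetical and does not appear in the patch.

```ruby
# Hypothetical helper (not part of the patch) mirroring the first two lines
# of index_for_search: lowercase the text, then replace every character
# that is not a word character with a space.
def normalize_for_search(text)
  text.downcase.gsub(/[^\w]/, ' ')
end

# Punctuation and the leading '#' become spaces, so only bare words remain.
words = normalize_for_search("Hello, Fediverse! #search").split
puts words.inspect

# Ruby's \w is ASCII-only ([a-zA-Z0-9_]), so accented letters are blanked too.
puts normalize_for_search("caf\u00e9").split.inspect
```

One thing the sketch makes visible: since \w only covers ASCII, posts written in non-Latin scripts would likely end up with little or nothing in the index.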

Q: Who can search what?
A: Logged-in users can search all public posts the server has seen.
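
The "logged-in users" rule comes from the full_text_searchable? check in the patch below. Here is a rough standalone sketch of that condition, assuming plain Ruby. The keyword arguments logged_in and query are stand-ins for the service's @account and @query, and the statuses_search? part of the real check is left out.

```ruby
# Hypothetical standalone version of SearchService#full_text_searchable?.
# A query runs as full-text search only when the user is logged in and the
# query is not a lone hashtag or handle (starts with '#' or contains '@',
# with no spaces).
def full_text_searchable?(logged_in:, query:)
  lone_tag_or_handle = (query.start_with?('#') || query.include?('@')) && !query.include?(' ')
  logged_in && !lone_tag_or_handle
end

puts full_text_searchable?(logged_in: true,  query: "cute cats")  # ordinary text query
puts full_text_searchable?(logged_in: false, query: "cute cats")  # anonymous visitor
puts full_text_searchable?(logged_in: true,  query: "#cats")      # lone hashtag
```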

Q: How do I install it?
A: $ patch -p1 < mastodon-search.patch
   $ RAILS_ENV=production bin/rails db:migrate
   $ RAILS_ENV=production bin/rails console
   > Status.reindex_all_for_search

Q: How long will reindexing take?
A: A slow server reindexes about ten thousand statuses per minute.
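
For a back-of-the-envelope estimate, the figure above can be turned into a small calculation. This is only a sketch: the 10,000 per minute rate is the rough slow-server number from this FAQ, the method name is hypothetical, and the status count in the example is invented.

```ruby
# Rough slow-server rate quoted in the FAQ; real throughput will vary.
STATUSES_PER_MINUTE = 10_000

# Hypothetical helper: estimated reindex time in whole minutes, rounded up.
def estimated_reindex_minutes(status_count, rate_per_minute = STATUSES_PER_MINUTE)
  (status_count.to_f / rate_per_minute).ceil
end

# Example with an invented count of 2.5 million public statuses.
puts estimated_reindex_minutes(2_500_000)
```

In a Rails console, Status.where(visibility: 'public').count should give the real number to plug in, since that is the set the patch reindexes.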

Q: Does this work with Mastodon forks?
A: Probably. It was built against Glitch and appears to work fine with stock Mastodon, but verify correctness yourself before running random code from anonymous strangers on production servers.

Q: Gargron says full-text search enables negative social dynamics. Is this a good idea?
A: I think it is. Letting strangers talk to each other over the internet enables negative social dynamics, but if you're using Mastodon you've evidently decided it's a net win. Search helps people find others with common interests, and helps server admins find content/people they want to remove.

Q: Why is this anonymous?
A: A few people who don't like search get really mad about it. I don't want to deal with the drama. https://cathode.church/fedi-scraper-counter.html

--- a/app/models/status.rb
+++ b/app/models/status.rb
@@ -116,6 +116,8 @@ class Status < ApplicationRecord
 
   scope :not_local_only, -> { where(local_only: [false, nil]) }
 
+  scope :search_for, ->(q) { where("tsvector @@ websearch_to_tsquery(?)", q) }
+
   cache_associated :application,
                    :media_attachments,
                    :conversation,
@@ -147,6 +149,8 @@ class Status < ApplicationRecord
 
     ids << account_id if local?
 
+    ids += Account.local.pluck(:id) if visibility == "public"
+
     if preloaded.nil?
       ids += mentions.joins(:account).merge(Account.local).active.pluck(:account_id)
       ids += favourites.joins(:account).merge(Account.local).pluck(:account_id)
@@ -324,6 +328,8 @@ class Status < ApplicationRecord
 
   around_create Mastodon::Snowflake::Callbacks
 
+  before_save :index_for_search
+
   after_create :set_poll_id
 
   class << self
@@ -503,6 +509,28 @@ class Status < ApplicationRecord
     update_attribute(:deleted_at, discard_time)
   end
 
+
+  def index_for_search
+    search_string = self.searchable_text.downcase
+    search_string.gsub!(/[^\w]/, ' ')
+    tsv = ActiveRecord::Base.connection.execute("select to_tsvector(#{ActiveRecord::Base.connection.quote(search_string)})").first['to_tsvector']
+    self.tsvector = tsv
+  end
+
+  def reindex_for_search
+    begin
+      self.update_attribute(:tsvector, self.index_for_search)
+    rescue StandardError
+      nil
+    end
+  end
+
+  def self.reindex_all_for_search
+    Status.where(visibility: 'public').in_batches.each_record do |status|
+      status.reindex_for_search
+    end
+  end
+
   def unlink_from_conversations!
     return unless direct_visibility?
 
@@ -511,6 +539,7 @@ class Status < ApplicationRecord
 
     inbox_owners.each do |inbox_owner|
       AccountConversation.remove_status(inbox_owner, self)
+
     end
   end
 

--- a/app/services/search_service.rb
+++ b/app/services/search_service.rb
@@ -35,6 +35,8 @@ class SearchService < BaseService
   end
 
   def perform_statuses_search!
+    return Status.where(visibility: 'public').search_for(@query).offset(@offset).limit(@limit) unless Chewy.enabled?
+
     definition = parsed_query.apply(StatusesIndex.filter(term: { searchable_by: @account.id }))
 
     definition = definition.filter(term: { account_id: @options[:account_id] }) if @options[:account_id].present?
@@ -54,6 +56,7 @@ class SearchService < BaseService
     results.reject { |status| StatusFilter.new(status, @account, preloaded_relations).filtered? }
   rescue Faraday::ConnectionFailed, Parslet::ParseFailed
     []
+
   end
 
   def perform_hashtags_search!
@@ -86,7 +89,6 @@ class SearchService < BaseService
   end
 
   def full_text_searchable?
-    return false unless Chewy.enabled?
 
     statuses_search? && !@account.nil? && !((@query.start_with?('#') || @query.include?('@')) && !@query.include?(' '))
   end

new file mode 100644
--- /dev/null
+++ b/db/migrate/20230226185028_add_tsvector_to_statuses.rb
@@ -0,0 +1,8 @@
+class AddTsvectorToStatuses < ActiveRecord::Migration[6.1]
+  disable_ddl_transaction!
+
+  def change
+    add_column :statuses, :tsvector, :tsvector
+    add_index :statuses, :tsvector, using: :gin, algorithm: :concurrently
+  end
+end