Let's say I have an array of name patterns, along with a regex union of them:
match_array = [/Dan/i, /Danny/i, /Daniel/i]
match_values = Regexp.union(match_array)
I'm using a regex union because the actual data set I'm working with contains strings that often have extraneous characters, whitespace, and varied capitalization.
I want to iterate over a series of strings to see if they match any of the values in this array. If I use .scan, only the first matching element is returned:
'dan'.scan(match_values) # => ["dan"]
'danny'.scan(match_values) # => ["dan"]
'daniel'.scan(match_values) # => ["dan"]
'dannnniel'.scan(match_values) # => ["dan"]
'dannyel'.scan(match_values) # => ["dan"]
I want to be able to capture all of the matches (which is why I thought to use .scan instead of .match), but I want to prioritize the closest/most exact matches first. If none are found, then I'd want to default to the partial matches. So the results would look like this:
'dan'.scan(match_values) # => ["dan"]
'danny'.scan(match_values) # => ["danny","dan"]
'daniel'.scan(match_values) # => ["daniel","dan"]
'dannnniel'.scan(match_values) # => ["dan"]
'dannyel'.scan(match_values) # => ["danny","dan"]
Is this possible?
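For reference, here is a minimal sketch of one way the desired output could be produced. It does not use the union at all: it tests each pattern in match_array separately and orders the hits longest first, on the assumption that "closest/most exact" can be approximated by match length. The helper name prioritized_matches is hypothetical, not part of Ruby.

```ruby
match_array = [/Dan/i, /Danny/i, /Daniel/i]

# Hypothetical helper: try every pattern on its own, then put the
# longest matched substring first. Assumes "most exact" means "longest".
def prioritized_matches(string, patterns)
  patterns
    .map { |pattern| string[pattern] } # String#[] returns the matched text or nil
    .compact                           # drop patterns that did not match
    .uniq                              # collapse identical matched substrings
    .sort_by { |match| -match.length } # longest match first
end

prioritized_matches('dan', match_array)       # => ["dan"]
prioritized_matches('danny', match_array)     # => ["danny", "dan"]
prioritized_matches('daniel', match_array)    # => ["daniel", "dan"]
prioritized_matches('dannnniel', match_array) # => ["dan"]
prioritized_matches('dannyel', match_array)   # => ["danny", "dan"]
```

The union returns only "dan" because alternation takes the first alternative that succeeds at a given position, and /Dan/i comes first. Note that sort_by gives no guaranteed order between matches of equal length, so a tie-breaker would be needed if that matters.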